query-f668a1e226d074763d01717835ec1430
inferring narrower occupations
Problem: We have large numbers of people whose sole occupation is "researcher" and whose description is either "researcher" or based on an ORCID. This makes disambiguation really hard.
Proposed solution: Most journals have a main subject, many of which are linked by P3095 to an occupation, so we can link a human through articles to journals, then to topics and occupations. If the person has 10 articles in Wikidata, picking the most common occupation linked to them should be a good approximation of their occupation.
Problem: So far the query I've got times out. How do I make it go faster so it doesn't time out? How do I ignore people who have the occupation "researcher" AND another occupation?
Use at
- https://query.wikidata.org/sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
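SELECT ?occupation ?author (COUNT(?article) AS ?count) WHERE
{
  # A topic is linked by P3095 to an occupation.
  ?topic wdt:P3095 ?occupation .
  # The journal's main subject (P921) is that topic.
  ?journal wdt:P921 ?topic .
  # Scholarly articles (Q13442814) published in (P1433) the journal, with their authors (P50).
  ?article wdt:P1433 ?journal ; wdt:P31 wd:Q13442814 ; wdt:P50 ?author .
  # Authors who are humans (Q5) with the occupation "researcher" (Q1650915).
  ?author wdt:P31 wd:Q5 ; wdt:P106 wd:Q1650915 .
}
GROUP BY ?occupation ?author
HAVING (?count > 10)
LIMIT 5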
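One possible way to approach the two questions, as an untested sketch rather than a known fix: pick a bounded batch of researchers in a subquery first, so the join to articles starts from a small set, and use FILTER NOT EXISTS to drop anyone who has a second occupation. The batch size of 1000 and the ?other variable are assumptions made for illustration. The sketch also drops the wdt:P31 wd:Q13442814 check, on the assumption that anything with both P50 and P1433 is close enough to an article, and it counts DISTINCT articles because a journal with several topics mapped to the same occupation would otherwise count an article more than once. Whether this stays under the timeout depends on the batch size and on how the query service orders the joins.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?occupation ?author (COUNT(DISTINCT ?article) AS ?count) WHERE
{
  # Step 1 (assumption): restrict to a bounded batch of researchers first.
  {
    SELECT ?author WHERE
    {
      ?author wdt:P106 wd:Q1650915 ; wdt:P31 wd:Q5 .
      # Ignore people who have "researcher" AND another occupation.
      FILTER NOT EXISTS
      {
        ?author wdt:P106 ?other .
        FILTER (?other != wd:Q1650915)
      }
    }
    LIMIT 1000   # illustrative batch size, not a tuned value
  }
  # Step 2: follow author -> article -> journal -> topic -> occupation.
  ?article wdt:P50 ?author ; wdt:P1433 ?journal .
  ?journal wdt:P921 ?topic .
  ?topic wdt:P3095 ?occupation .
}
GROUP BY ?occupation ?author
HAVING (COUNT(DISTINCT ?article) > 10)
```

Because the inner LIMIT picks an arbitrary batch, covering every researcher would mean running the query repeatedly over different batches (for example with OFFSET, which is itself only stable if the inner query is ordered). Picking the single most common occupation per author from the result rows is left to a later step.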
Query found at
graph TD
classDef projected fill:lightgreen;
classDef literal fill:orange;
classDef iri fill:yellow;
v5("?article"):::projected
v6("?author"):::projected
v7("?count")
v4("?journal")
v3("?occupation"):::projected
v2("?topic")
c6(["wd:Q13442814"]):::iri
c8(["wd:Q5"]):::iri
c10(["wd:Q1650915"]):::iri
f0[["?count > '10^^xsd:integer'"]]
f0 --> v7
v2 --"wdt:P3095"--> v3
v4 --"wdt:P921"--> v2
v5 --"wdt:P1433"--> v4
v5 --"wdt:P31"--> c6
v5 --"wdt:P50"--> v6
v6 --"wdt:P31"--> c8
v6 --"wdt:P106"--> c10
bind2[/"count(?article)"/]
v5 --o bind2
bind2 --as--o v7