query-f668a1e226d074763d01717835ec1430


Inferring narrower occupations

Problem: We have large numbers of people with a sole occupation of "researcher" and a description that is either "researcher" or based on an ORCID. This makes disambiguation really hard.

Proposed solution: Most journals have a main subject (P921), many of which are linked by a P3095 to an occupation, so we can link a human through articles to journals, then to topics and occupations. If the person has 10 articles in Wikidata, picking the most common occupation linked to them should be a good approximation of their occupation.

Problem: So far the query I've got times out. How do I make it go faster so it doesn't time out? How do I ignore people who have an occupation of "researcher" AND another occupation?

Use at

https://query.wikidata.org/sparql

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?occupation ?author (COUNT(?article) AS ?count)  WHERE
    {
        ?topic wdt:P3095 ?occupation .
        ?journal wdt:P921 ?topic .
        ?article wdt:P1433 ?journal ; wdt:P31 wd:Q13442814 ; wdt:P50 ?author .
        ?author wdt:P31 wd:Q5 ; wdt:P106 wd:Q1650915 .
    } 
GROUP BY  ?occupation ?author 
HAVING (?count > 10) LIMIT 5
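
The second question above (ignoring people who have "researcher" plus another occupation) could be approached with FILTER NOT EXISTS. The following is a minimal sketch, not a tested fix for the timeout. It assumes that dropping the two wdt:P31 type checks is acceptable, on the guess that anything with both P1433 and P50 is article-like and that P50 values carrying P106 are humans; that assumption may admit a few non-article or non-human items. Whether this is enough to finish within the endpoint's time limit is not verified. The variable ?other is introduced here for illustration only.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>

SELECT ?occupation ?author (COUNT(?article) AS ?count) WHERE {
  # Same chain as the original query: topic -> journal -> article -> author.
  ?topic wdt:P3095 ?occupation .
  ?journal wdt:P921 ?topic .
  ?article wdt:P1433 ?journal ;
           wdt:P50 ?author .
  # Assumption: the P31 checks on ?article and ?author are omitted to
  # reduce the number of joins; add them back if stray items show up.
  ?author wdt:P106 wd:Q1650915 .
  # Keep only authors whose sole occupation is "researcher":
  # reject any author that has a P106 value other than Q1650915.
  FILTER NOT EXISTS {
    ?author wdt:P106 ?other .
    FILTER (?other != wd:Q1650915)
  }
}
GROUP BY ?occupation ?author
# The aggregate is repeated here instead of using the ?count alias,
# which some SPARQL engines may not accept inside HAVING.
HAVING (COUNT(?article) > 10)
LIMIT 5
```

Like the original, this counts articles per (occupation, author) pair, so an author can still appear under several occupations; picking the single most common one per author would need a further step, such as ordering by ?count or a wrapping subquery.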

Query found at

https://www.wikidata.org/wiki/Wikidata:Request_a_query/Archive/2024/09

graph TD
  classDef projected fill:lightgreen;
  classDef literal fill:orange;
  classDef iri fill:yellow;
  v5("?article"):::projected
  v6("?author"):::projected
  v7("?count")
  v4("?journal")
  v3("?occupation"):::projected
  v2("?topic")
  c6(["wd:Q13442814"]):::iri
  c8(["wd:Q5"]):::iri
  c10(["wd:Q1650915"]):::iri
  f0[["?count > '10^^xsd:integer'"]]
  f0 --> v7
  v2 --"wdt:P3095"--> v3
  v4 --"wdt:P921"--> v2
  v5 --"wdt:P1433"--> v4
  v5 --"wdt:P31"--> c6
  v5 --"wdt:P50"--> v6
  v6 --"wdt:P31"--> c8
  v6 --"wdt:P106"--> c10
  bind2[/"count(?article)"/]
  v5 --o bind2
  bind2 --as--o v7