query-8acd3c540dee9fd7754a7cfed6abb47e


avoiding duplicates

I got a number of reverts due to duplicates, so I'll try to fix them. There are a number of "legitimate" duplicates due to several VIAF IDs. I wrote a query to find items that have two different statements but with the same value:

  select ?x ?xLabel ?s1 ?s2 ?v {
    ?x p:P7859 ?s1,?s2 filter(str(?s1)<str(?s2))
    ?s1 ps:P7859 ?v.
    ?s2 ps:P7859 ?v.
  } limit 100

1. I've created some due to QS race conditions: item is created, reference can't be created because it still doesn't see the new item; then I "Reset errors" on the 3 batches, which creates some duplicates. I won't "reset errors" anymore, so hopefully that'll stop.
2. @Gamaliel: is creating 800 duplicates through adding qualifier "quantity", see https://www.wikidata.org/wiki/Topic:Vkf0s6bt3kzm14j0. So I tried to filter out this reason but the query times out:

  select ?x ?xLabel ?s1 ?s2 ?v {
    ?x p:P7859 ?s1,?s2 filter(str(?s1)<str(?s2))
    ?s1 ps:P7859 ?v.
    ?s2 ps:P7859 ?v.
    ?s1 ps:P7859 ?v filter not exists {?s1 pq:P1114 ?q1}
    ?s2 ps:P7859 ?v filter not exists {?s2 pq:P1114 ?q2}
  } limit 100

@Gamaliel: is working to eliminate the duplicates. Out of the first 100 duplicates (query https://w.wiki/Msf), 16 don't have qualifier (I'm fixing those) and the rest have "quantity". --Vladimir Alexiev (talk) 07:53, 15 April 2020 (UTC)

Will this query include the legitimate duplicates due to multiple VIAF IDs? If I can get a query that includes no false positives I can run a QS batch to eliminate them. Gamaliel (talk) 14:40, 19 April 2020 (UTC)

@Gamaliel: This does not return legitimate duplicates: it looks for the same item ?x having the same WorldCat ?v through two different statements. You have to process them in slices of 100 else the query times out. It would be good if you could preserve my References but that's not crucial because the form of WorldCat id shows where it came from (VIAF or LCCN). Cheers! --Vladimir Alexiev (talk) 14:08, 20 April 2020 (UTC)


PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX p: <http://www.wikidata.org/prop/>
select ?x ?s1 ?s2 ?v ?q1 ?q2 {
  ?x p:P7859 ?s1,?s2
  filter(str(?s1)<str(?s2))
  ?s1 ps:P7859 ?v.
  ?s2 ps:P7859 ?v.
  optional{?s1 pq:P1114 ?q1}
  optional{?s2 pq:P1114 ?q2}
} limit 100
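The thread tallies one slice of 100 duplicate pairs by whether the statements carry the "quantity" qualifier (P1114). A minimal sketch of that tally follows, assuming the results of the query above were saved in the standard SPARQL JSON results format. The function name `tally_duplicate_pairs` and the sample rows are hypothetical and are not real Wikidata data.

```python
from collections import Counter


def tally_duplicate_pairs(results):
    """Count duplicate statement pairs by how many of the two carry P1114.

    `results` is a dict in the SPARQL JSON results format for the query
    above (variables x, s1, s2, v, q1, q2). Variables from an OPTIONAL
    that did not match are absent from a binding.
    """
    counts = Counter()
    for row in results["results"]["bindings"]:
        # 0 = neither statement has a quantity qualifier, 1 = one, 2 = both
        with_quantity = ("q1" in row) + ("q2" in row)
        counts[with_quantity] += 1
    return counts


def uri(value):
    return {"type": "uri", "value": value}


def lit(value):
    return {"type": "literal", "value": value}


# Hypothetical sample rows (placeholder IRIs and IDs, assumed for illustration).
sample = {
    "head": {"vars": ["x", "s1", "s2", "v", "q1", "q2"]},
    "results": {"bindings": [
        {"x": uri("http://example.org/item1"), "s1": uri("http://example.org/s1a"),
         "s2": uri("http://example.org/s1b"), "v": lit("viaf-000")},
        {"x": uri("http://example.org/item2"), "s1": uri("http://example.org/s2a"),
         "s2": uri("http://example.org/s2b"), "v": lit("lccn-x000"),
         "q2": lit("1")},
        {"x": uri("http://example.org/item3"), "s1": uri("http://example.org/s3a"),
         "s2": uri("http://example.org/s3b"), "v": lit("viaf-001"),
         "q1": lit("1"), "q2": lit("1")},
    ]},
}

counts = tally_duplicate_pairs(sample)
print("no qualifier:", counts[0], "| one:", counts[1], "| both:", counts[2])
```

Because the query is limited to 100 rows, a tally like this describes only one slice. An item with three or more identical statements contributes several pairs, so the counts are pairs rather than items.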

Query found at

graph TD
  classDef projected fill:lightgreen;
  classDef literal fill:orange;
  classDef iri fill:yellow;
  v5("?q1"):::projected
  v6("?q2"):::projected
  v1("?s1"):::projected
  v2("?s2"):::projected
  v4("?v"):::projected
  v3("?x"):::projected
  f0[["str(?s1) < str(?s2)"]]
  f0 --> v1
  f0 --> v2
  v3 --"p:P7859"--> v1
  v3 --"p:P7859"--> v2
  v1 --"p:statement/P7859"--> v4
  v2 --"p:statement/P7859"--> v4
  subgraph optional0["(optional)"]
    style optional0 fill:#bbf,stroke-dasharray: 5 5;
    v1 -."p:qualifier/P1114".-> v5
  end
  subgraph optional1["(optional)"]
    style optional1 fill:#bbf,stroke-dasharray: 5 5;
    v2 -."p:qualifier/P1114".-> v6
  end