01:40, 29 December 2020 (UTC)) talk (Tagishsimon --Very grateful for any suggestions - I banged my head against the walll for hours before giving up and writing a script to reassemble the lines (which merely changed the wall I was hitting my head against... :-).)11:08, 29 December 2020 (UTC)) talk (Andrew Gray (never changed seat or party). should have just one (Q9545)Tony Blair (all the same seat, but switched into and out of a party). Someone straightforward like should have three (Q3330215)Paul Marsden (two of which are the same seat different parties), and should have six (Q8016)Winston Churchill should have five entries (he had four periods of continuing service, but the last one was over two seats). (Q335791)Herbert Morrison The target here is one line per "period of continuing service in the same seat and party", so as a couple of other test cases 12:15, 29 December 2020 (UTC)) talk (Andrew GraySo we can now deduce what the end date should be for each start date line. The one clause in the outer query says "for a given start date, the end date has to be larger" (obviously), throwing out any that are earlier. The main SELECT clause then has a MIN(?end) element, meaning that it will select the lowest date left to it - which is the next one in sequence. It seems to give the right results for Morrison (below). but also for Churchill and Marsden : I think I've got it! This query runs the same item as two subqueries, one finding all the seat/party/start date pairs, and one finding all the seat/party/end date pairs. Each one throws out any with blank fields so each one is returning only "valid" sets. We know that the corresponding end date is logically the one immediately following the start date - because terms shouldn't overlap.Tagishsimon@
Use at
https://query.wikidata.org/sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX bd: <http://www.bigdata.com/rdf#>
SELECT distinct ?mp ?mpLabel ?partyLabel ?seatLabel ?start (min(?end) as ?end2) where {
# find all seat-party-start pairs
{ SELECT distinct ?mp ?mpLabel ?partyLabel ?seatLabel ?start
WHERE {
VALUES ?mp { wd:Q335791 }
?mp p:P39 ?ps. ?ps ps:P39 ?term . ?term wdt:P279 wd:Q16707842. ?ps pq:P768 ?seat . ?ps pq:P4100 ?party .
{ ?ps pq:P580 ?start. filter not exists { ?ps pq:P2715 ?elec . ?elec wdt:P31 wd:Q15283424 . ?mp p:P39 ?ps0 . ?ps0 ps:P39 ?term0 . ?term0 wdt:P156 ?term .
?ps0 pq:P4100 ?party . ?ps0 pq:P768 ?seat . ?ps0 pq:P1534 wd:Q741182 . } }
union
{ ?ps pq:P582 ?end. filter not exists { ?ps pq:P1534 wd:Q741182 . ?mp p:P39 ?ps2 . ?ps2 ps:P39 ?term2 . ?term2 wdt:P155 ?term .
?ps2 pq:P4100 ?party . ?ps2 pq:P768 ?seat . ?ps2 pq:P2715 ?elec . ?elec wdt:P31 wd:Q15283424 . } }
filter(bound(?start))
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}ORDER BY (?mp) ?start }
# and all possible end dates
{ SELECT distinct ?mp ?mpLabel ?partyLabel ?seatLabel ?end
WHERE {
VALUES ?mp { wd:Q335791 }
?mp p:P39 ?ps. ?ps ps:P39 ?term . ?term wdt:P279 wd:Q16707842. ?ps pq:P768 ?seat . ?ps pq:P4100 ?party .
{ ?ps pq:P580 ?start. filter not exists { ?ps pq:P2715 ?elec . ?elec wdt:P31 wd:Q15283424 . ?mp p:P39 ?ps0 . ?ps0 ps:P39 ?term0 . ?term0 wdt:P156 ?term .
?ps0 pq:P4100 ?party . ?ps0 pq:P768 ?seat . ?ps0 pq:P1534 wd:Q741182 . } }
union
{ ?ps pq:P582 ?end. filter not exists { ?ps pq:P1534 wd:Q741182 . ?mp p:P39 ?ps2 . ?ps2 ps:P39 ?term2 . ?term2 wdt:P155 ?term .
?ps2 pq:P4100 ?party . ?ps2 pq:P768 ?seat . ?ps2 pq:P2715 ?elec . ?elec wdt:P31 wd:Q15283424 . } }
filter(bound(?end))
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}ORDER BY (?mp)?end }
# now take our starts as the key, and match each to its appropriate end - the next one along
# this is the *smallest* end date which is still *larger* than the start date
# so filter by larger here, and smallest using min in the SELECT clause
filter(?end > ?start) . # note > not >=
} group by ?mp ?mpLabel ?partyLabel ?seatLabel ?start order by ?start