At WikiCon16, I met Tobias1984, who showed me how to use Pywikibot with the awesome PAWS interface (and who also wrote this great tutorial). I remembered that I had noticed an error with some taxon common name (P1843) statements a while ago: many of them contain the character “í” where they should contain an apostrophe – for example, “Darwinís Fox” for Darwin's fox (Q631167). At the time, I didn’t know how to fix it, but now I decided to try fixing this with Pywikibot.

We start off by opening PAWS. Log in, start your server, and create a new Python 3 notebook (New > Python 3). (Notebooks are public, so you can see the one I used here, and read along as this post unfolds. Note that the notebook includes a lot of output, so it will take some time to load.)

Insert some boring boilerplate code:

    import pywikibot as pwb
    site = pwb.Site("wikidata", "wikidata")
    repo = site.data_repository()

(You send off each input unit – which can contain multiple lines – with Shift+Enter.)

We first want to fix one item “manually”, because automated editing is dangerous! So let’s start with Darwin’s fox, Q631167. We obtain its item:

    darwins_fox = pwb.ItemPage(repo, "Q631167")

We can see the broken taxon name with:

    darwins_fox.get()["claims"]["P1843"][0].getTarget().text

Okay, that’s a lot of code out of nowhere. Where did it come from? I’m not going to write it all up, but essentially it’s a combination of:

- tab completion: enter darwins_fox., press Tab, and PAWS will tell you that the object has a get() method.
- printing a snippet of the code and seeing what’s available in it: for example, if you print darwins_fox.get(), you can see that the result is a dictionary which contains the key "claims".
- looking at the tutorial and copying snippets from there.
- listing the object: you can show the methods and properties of any object with dir(), but that’s usually a last resort because it includes a lot of useless internal results (__foobar).

Now we want to edit the item.
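Before editing anything, here is a small sketch of the exploration tricks listed above, so they can be tried without touching Wikidata. `FakeItem` and `public_names` are hypothetical names made up for this illustration; they are not part of Pywikibot, and the dictionary that `FakeItem.get()` returns only imitates the shape described above (a dictionary with a "claims" key).

```python
class FakeItem:
    """Hypothetical stand-in for an item object; NOT a Pywikibot class."""

    def __init__(self):
        self._secret = "internal detail"

    def get(self):
        # Imitates the shape described above: a dictionary with a "claims" key.
        return {"labels": {}, "claims": {"P1843": ["Darwinís Fox"]}}

    def editEntity(self):
        # Does nothing here; only present so that it shows up in the listing.
        return None


def public_names(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


item = FakeItem()
print(public_names(item))          # dir() output without the internal names
print(sorted(item.get().keys()))   # top-level keys of the returned dictionary
```

Filtering out names that start with an underscore makes the dir() listing much shorter, which is roughly what tab completion shows you anyway.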
I’m still playing around with Pywikibot, so let’s just see what happens if I do:

    darwins_fox.get()["claims"]["P1843"][0].getTarget().text = "Darwin’s fox"

Nothing happens so far. That’s good. I tried to see what methods darwins_fox has with tab completion. This one sounded promising:

    darwins_fox.editEntity()

No error message – and the item is updated! Whoa. That’s really easy. And also, I should probably stress that this was pretty irresponsible of me (sorry!). Let’s use the sandbox item for playing around some more.

    sandbox = pwb.ItemPage(repo, "Q4115189")
    sandbox.get()["claims"]["P1843"][0].getTarget().text

I had added a faux taxon common name (P1843) to the sandbox item, so this still printed “Darwinís fox”. (When you read this, that statement will probably be gone.)

Let’s try to fix up the name without hard-coding the proper solution. That is, instead of setting it to the fixed string “Darwin’s fox”, replace the broken character “í” with an apostrophe.

    target = sandbox.get()["claims"]["P1843"][0].getTarget()
    target.text = target.text.replace("í", "’")
    sandbox.editEntity(summary="test edit from pywikibot")

That looks good! Okay, let’s move on to SPARQL, since we’ll want to fix all items, not one. I had already written this SPARQL query back when I found this issue:

    # Use at
    # - https://query.wikidata.org/sparql
    PREFIX wdt: <http://www.wikidata.org/prop/direct/>
    SELECT ?taxon ?commonName ?fixedCommonName
    WHERE
    {
      ?taxon wdt:P1843 ?commonName.
      FILTER(CONTAINS(?commonName, "í") && CONTAINS(?commonName, " ")).
      BIND(REPLACE(REPLACE(?commonName, "í", "'"), "‰", "ä") AS ?fixedCommonName).
    }
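
To get a feeling for what the FILTER and BIND lines do, here is a rough Python equivalent that runs on a few sample strings instead of Wikidata. The function names `needs_fix` and `fix_common_name` and the sample names are made up for this sketch. SPARQL's REPLACE takes a regular expression, but since “í” and “‰” contain no special regex characters, a plain string replacement should behave the same way here.

```python
def needs_fix(common_name):
    """Mirror the query's FILTER: the name contains "í" and a space."""
    return "í" in common_name and " " in common_name


def fix_common_name(common_name):
    """Mirror the query's nested REPLACE calls: "í" -> "'", then "‰" -> "ä"."""
    return common_name.replace("í", "'").replace("‰", "ä")


# Sample data for illustration only; not taken from Wikidata.
names = ["Darwinís Fox", "Colibrí", "Red fox"]

for name in names:
    if needs_fix(name):
        print(name, "->", fix_common_name(name))
```

Only “Darwinís Fox” passes the filter: “Colibrí” contains “í” but no space, and “Red fox” has no “í” at all.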