query-a687e28995145c81a9071589a9c3afc2
Order text strings taking accents in accountBoth Listeria and Wikidata Query Service can sort text (labels) in alphabetical order. However, both of them place modified characters like à, è or é after z, when the correct alphabetical order in most languages that use those character (like Catalan, French or Spanish) those characters should be ordered as their unmodified equivalent (that is, "à" should be treated as if it were just an "a"). I'm looking for a way to correctly order text in Catalan or Spanish in a query. I can imagine several ways, but I would need help for every one of them. Most of them involve creating a new variable (without accents) to work as an index: Is there a way to execute several regex replacements on the labels inside a query? Is there a way to replace characters? Ideally the set "àáèéìíòóùúÀÈÌÒÙÁÉÍÓÚçÇñÑ" should be replaced by the set "aaeeiioouuAEIOUAEIOUcCnN", although for most purposes a simpler set could be enough. Is there a function to remove accents (that is, transform "bé" to "be")? Is there a way (in SPARQL or in Listeria) to just change the alphabetical order to that of a given language?, where Listeria puts "Alájar" after "Alonso" when it should be in the first place of the list befor "Aljaraque". The (sligthly modified) query from there is as follows: ca:Llista de municipis de HuelvaJust for context, we have recently run into this problem with
Use at
- https://query.wikidata.org/sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX bd: <http://www.bigdata.com/rdf#>
SELECT DISTINCT ?item ?itemLabel ?escut ?poblacio ?superficie ?data ?altitud ?codipostal ?territori ?comarca WHERE {
?item wdt:P31 wd:Q2074737;
#wdt:P131+ wd:Q95015;
wdt:P772 ?p772. # codi territorial utilitzat per selecciona pel prefix (INE -IDESCAT)
FILTER(STRSTARTS(?p772, "21")) # selecció
OPTIONAL { ?item p:P1082 [ps:P1082 ?poblacio; pq:P585 ?data; wikibase:rank wikibase:PreferredRank] . }
OPTIONAL { ?item wdt:P131 ?comarca.
?comarca wdt:P31 wd:Q3141478}
OPTIONAL { ?item wdt:P2046 ?superficie. }
OPTIONAL { ?item wdt:P2044 ?altitud. }
OPTIONAL { ?item wdt:P281 ?codipostal. }
OPTIONAL { ?item wdt:P94 ?escut. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?itemLabel