Wikidata does not primarily aim to store facts about the world; rather, it tries to collect links to references to knowledge.
Therefore, it's possible to have conflicting information in Wikidata, which gives rise to the statement ranks (preferred, normal and deprecated).
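The ranks can be inspected directly in a query. The following is a minimal sketch, assuming that every statement node carries a wikibase:rank triple; Switzerland's population (P1082) is chosen only for illustration, and the variable names are arbitrary:

```sparql
# List the population statements of Switzerland (Q39) together with their rank.
select
  ?population
  ?rank
{
  wd:Q39 p:P1082 ?stmt .          # all population statements, regardless of rank
  ?stmt  ps:P1082 ?population ;   # the statement's main value
         wikibase:rank ?rank .    # wikibase:PreferredRank, wikibase:NormalRank or wikibase:DeprecatedRank
}
order by
  ?rank
```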
Misc
SERVICE wikibase:label
service wikibase:label looks up labels, descriptions and/or alternative labels (AltLabel) for unbound variables whose names end in Label, Description and/or AltLabel:
select
?country
?countryLabel
# ?countryPrefLabel
?countryDescription
?countryAltLabel
# ?altLabelDirect
{
wd:Q188 wdt:P17 ?country .
# ?country <http://www.w3.org/2004/02/skos/core#altLabel> ?altLabelDirect . filter(lang(?altLabelDirect) = 'en')
service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
With this «service», it's not necessary to explicitly use rdfs:label, skos:altLabel and schema:description.
Default namespace: wd
By default, SERVICE wikibase:label only supplies labels for entities in the wd: namespace.
To get a label for a predicate such as wdt:P…, add ?prop wikibase:directClaim ?p to the query and select ?propLabel.
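As a rough sketch of this technique (the variable names ?p and ?prop are arbitrary, and Switzerland, Q39, is only used as an example subject), the following query lists the wdt: predicates used on one item together with the label of the corresponding property:

```sparql
select distinct
  ?p
  ?prop
  ?propLabel
{
  wd:Q39 ?p [] .                       # any predicate used on Switzerland
  ?prop wikibase:directClaim ?p .      # map wdt:P… back to its wd:P… entity
  service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

Predicates that are not wdt:P… (for example rdfs:label) have no wikibase:directClaim counterpart and are therefore dropped from the result.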
Lexemes (Lemmas)
Lexemes are not automatically looked up with this service. Thus, an optional { ?x wikibase:lemma ?xLabel } is required for lexemes:
select
?x
?xLabel
{
values (?x) {
(wd:L296666)
(wd:Q39 )
}
optional { ?x wikibase:lemma ?xLabel }
service wikibase:label { bd:serviceParam wikibase:language 'en,de,fr,rm' }
}
Search for lexemes in a few given languages:
select
?lem
(lang(?L) as ?lng)
{
?lem wikibase:lemma ?L .
{
select
?lem
{
{ ?lem wikibase:lemma "ding"@da } union
{ ?lem wikibase:lemma "ding"@en } union
{ ?lem wikibase:lemma "ding"@pt }
}
}
}
Search for a lexeme in any language:
select
?lem
(lang(?str) as ?lng)
{
?lem wikibase:lemma ?str . filter(str(?str) = "ding")
}
Wikidata Query Service (WDQS)
WDQS is the server that executes Wikidata queries formulated in SPARQL.
connecting wd:P… to their counterpart …:P… nodes
wikibase:directClaim connects wd:P… to its counterpart wdt:P….
The following query returns wdt:P10:
select * {
wd:P10 wikibase:directClaim ?y .
}
There are quite a few …:P… nodes, all of which can be reached from wd:P…. The following query returns true:
ask {
wd:P31 wikibase:claim p:P31 .
wd:P31 wikibase:directClaim wdt:P31 .
wd:P31 wikibase:novalue wdno:P31 .
wd:P31 wikibase:qualifier pq:P31 .
wd:P31 wikibase:qualifierValue pqv:P31 .
wd:P31 wikibase:reference pr:P31 .
wd:P31 wikibase:referenceValue prv:P31 .
wd:P31 wikibase:statementProperty ps:P31 .
wd:P31 wikibase:statementValue psv:P31 .
}
wikibase:badge
wikibase:badge assigns a badge (such as «good article badge») to a Wikipedia article.
The following query finds all(?) German featured articles:
select
?aboutTxt
?about
?featuredArticle
{
?featuredArticle wikibase:badge wd:Q17437796 ;
schema:inLanguage "de" ;
schema:about ?about .
?about rdfs:label ?aboutTxt .
filter(lang(?aboutTxt) = 'de')
}
order by lcase(?aboutTxt)
The following query lists the number of assigned badges for each badge:
select
?cnt
?badgeLabel
?badge
{
service wikibase:label { bd:serviceParam wikibase:language "[auto_language],en". }
{
select
(count(*) as ?cnt)
?badge
{
[] wikibase:badge ?badge
}
group by
?badge
}
}
order by
?badgeLabel
wikibase:Dump
The types (rdf:type) of wikibase:Dump are schema:Dataset and owl:Ontology:
select
?type
{
wikibase:Dump rdf:type ?type
}
TODO
select
?a ?b
{
wikibase:Dump ?a ?b .
}
select
?rel ?x
{
values (?rel) {
(<http://creativecommons.org/ns#license>)
(schema:softwareVersion )
(owl:imports )
}
wikibase:Dump ?rel ?x .
}
wikibase:identifiers
wikibase:identifiers indicates the number of identifiers for a topic.
The following query finds the maximum number of identifiers for a topic (which, as of 2022-11-17, is 833):
select
(max(?nofIdentifiers) as ?maxNofIdentifiers)
{
#
# Use a subquery to prevent a timeout:
#
{
select distinct
?nofIdentifiers
{
[] wikibase:identifiers ?nofIdentifiers .
}
}
}
The following query finds all topics that have 833 identifiers:
select
?x
{
?x wikibase:identifiers 833 .
}
As of 2022-11-17, this query returns one record: Q88174316.
wikibase:propertyType
The data type of a property item (wd:P…) can be queried with wikibase:propertyType.
As of October 2022, there are 17 property (or data) types:
http://wikiba.se/ontology#CommonsMedia
http://wikiba.se/ontology#ExternalId
http://wikiba.se/ontology#GeoShape
http://wikiba.se/ontology#GlobeCoordinate
http://wikiba.se/ontology#Math
http://wikiba.se/ontology#Monolingualtext
http://wikiba.se/ontology#MusicalNotation
http://wikiba.se/ontology#Quantity
http://wikiba.se/ontology#String
http://wikiba.se/ontology#TabularData
http://wikiba.se/ontology#Time
http://wikiba.se/ontology#Url
http://wikiba.se/ontology#WikibaseForm
http://wikiba.se/ontology#WikibaseItem
http://wikiba.se/ontology#WikibaseLexeme
http://wikiba.se/ontology#WikibaseProperty
http://wikiba.se/ontology#WikibaseSense
The list of these property types was obtained with the following query:
select distinct
?propertyType
{
[] wikibase:propertyType ?propertyType .
}
order by
?propertyType
A list of properties for a given property type is returned by the following query:
select
?prop
?propLabel
{
?prop wikibase:propertyType <http://wikiba.se/ontology#GlobeCoordinate> .
service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
wikibase:lemma
The wikibase:lemma relation connects entities whose type is ontolex:LexicalEntry to strings:
select
?sub
?typ
?lem
(datatype(?lem) as ?dtp)
{
?sub wikibase:lemma ?lem .
?sub rdf:type ?typ .
}
limit 100
wikibase:lexicalCategory
select
?category
?categoryLabel
{
hint:Query hint:optimizer "None".
{
select distinct
?category
{
[] wikibase:lexicalCategory ?category
}
}
service wikibase:label { bd:serviceParam wikibase:language 'en,de,fr,ru,tg' . }
}
order by
lcase(?categoryLabel)
wikibase:quantityAmount and wikibase:quantityUnit
select
?itemLabel
?elevation
?unitLabel
{
values (?item)
{
( wd:Q46588 ) # Cordillera Kimsa Cruz : a mountain range in Bolivia
( wd:Q499164 ) # Ascraeus Mons : a martian shield volcano
( wd:Q55615607 ) # Kalindi Pass : highest and most adventurous trekking trail in India (almost 6000 M)
}
?item p:P2044 ?eleStmts .
?eleStmts psv:P2044 ?eleValues .
?eleValues wikibase:quantityAmount ?elevation .
?eleValues wikibase:quantityUnit ?unit .
?item rdfs:label ?itemLabel .
?unit rdfs:label ?unitLabel .
filter(lang(?itemLabel) = 'en')
filter(lang(?unitLabel) = 'en')
}
wikibase:sitelinks
select
(max(?nofSitelinks) as ?maxNofSitelinks)
{
#
# Use a subquery to prevent a timeout:
#
{
select distinct
?nofSitelinks
{
[] wikibase:sitelinks ?nofSitelinks .
}
}
}
select
?x
{
?x wikibase:sitelinks 873 .
}
As of 2022-11-17, the topic with the most site links is Q105429923.
wikibase:* TODO
wikibase:statements, wikibase:quantityNormalized, wikibase:grammaticalFeature, wikibase:quantityLowerBound, wikibase:quantityUpperBound, wikibase:geoGlobe, wikibase:geoPrecision, wikibase:geoLongitude, wikibase:geoLatitude, wikibase:timeCalendarModel, wikibase:timeTimezone, wikibase:timePrecision, wikibase:timeValue
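Until these relations are documented here, the following sketch hints at how the wikibase:geo… relations might be used. It assumes that they hang off a value node reached via psv:, in the same way as wikibase:quantityAmount above; P625 («coordinate location») and Zurich (Q72) are chosen only for illustration:

```sparql
select
  ?lat
  ?lon
  ?precision
  ?globe
{
  wd:Q72  p:P625   ?stmt  .                   # coordinate statements of Zurich
  ?stmt   psv:P625 ?value .                   # full value node of the statement
  ?value  wikibase:geoLatitude  ?lat       ;
          wikibase:geoLongitude ?lon       ;
          wikibase:geoPrecision ?precision ;
          wikibase:geoGlobe     ?globe     .
}
```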
schema:about
schema:about links a Wikipedia URL to a «wd:…» number.
The following query translates the Hebrew Wikipedia page of Zurich to Q72:
select
?wdNr
{
<https://he.wikipedia.org/wiki/%D7%A6%D7%99%D7%A8%D7%99%D7%9A> schema:about ?wdNr .
}
schema:about can be combined with schema:isPartOf to query the Wikipedia URL of a given topic (Q-Nr) in a given Wikipedia language:
select
?wikipediaURL
{
?wikipediaURL schema:about wd:Q72 .
?wikipediaURL schema:isPartOf <https://he.wikipedia.org/> .
}
schema:Article
schema:Article is the type (rdf:type) of a Wikipedia article URL.
Thus, the following query returns true:
ask {
<https://en.wikipedia.org/wiki/Wikidata> rdf:type schema:Article .
}
schema:Dataset
There is only one record returned by the following query. The value of ?subj in that record is wikibase:Dump.
select *
{
?subj rdf:type schema:Dataset .
}
schema:inLanguage
schema:inLanguage relates a Wikimedia(?) article to a language.
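A minimal sketch of how the language of an article URL can be looked up (the German Wikipedia page of London is only an example; judging from the badge query above, the value is presumably a plain string such as "de"):

```sparql
select
  ?lang
{
  <https://de.wikipedia.org/wiki/London> schema:inLanguage ?lang .
}
```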
schema:name
schema:name relates a Wikimedia URL to a label.
The following query finds the URLs that are associated with Auguste Piccard in French and Oracle Database in English:
select
?name
?wikimediaURL
{
values (?name) {
("Auguste Piccard" @fr)
("Oracle Database" @en)
}
?wikimediaURL schema:name ?name
}
The following query finds the label that is associated with a given Wikimedia URL:
select
?label
(lang(?label) as ?lang)
{
<https://de.wikipedia.org/wiki/London> schema:name ?label .
}
schema:isPartOf
The schema:isPartOf relation connects a Wikimedia URL to a Wikimedia project.
The following query returns <https://de.wikisource.org/>:
select ?partOf
{
<https://de.wikisource.org/wiki/Das_Newe_Testament_Deutzsch> schema:isPartOf ?partOf
}
The following query uses SPARQL's distinct operator to select all Wikimedia(?)-related websites:
select distinct ?partOf
{
$item schema:isPartOf ?partOf
}
order by
?partOf
Compare schema:isPartOf with the P361 («part of») relation.
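For comparison, here is a sketch of a query on P361, which relates one Wikidata item to another rather than a URL to a project (the Solar System, Q544, is only used as an example):

```sparql
select
  ?whole
  ?wholeLabel
{
  wd:Q544 wdt:P361 ?whole .    # Q544 = Solar System, P361 = part of
  service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```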
schema:description
select
?descr
(lang(?descr) as ?lang)
{
wd:Q39 schema:description ?descr .
}
schema:* TODO:
schema:modified
, schema:version
, schema:description
Combining some schema:* relations
select
?term
(lang(?term) as ?termLang)
?lang
?wikimediaURL
?wikidataID
?project
{
values (?term) {
("Gift"@en)
("Gift"@de)
}
?wikimediaURL schema:name ?term .
?wikimediaURL schema:about ?wikidataID .
?wikimediaURL schema:isPartOf ?project .
?wikimediaURL schema:inLanguage ?lang .
}
rdfs:label
The rdfs:label predicate assigns a text and a language to a node. The language ID of the added text can be queried with the lang(…) function.
The following query selects translations of «Switzerland» and their language IDs and orders them by the language ID:
select
(lang(?label) as ?langId)
?label
{
wd:Q39 rdfs:label ?label .
}
order by
?langId
lang(…) can be used in a filter clause to limit the result to a given language (in the following example to Bengali, whose ID is bn):
select
?label
{
wd:Q39 rdfs:label ?label .
filter( lang(?label) = 'bn' )
}
See also the skos:altLabel predicate.
rdf:type
The value for ?type of any ?x rdf:type ?type triplet is one of:
wdno:P… | |
schema:Article | The URL of a Wikipedia article |
schema:Dataset | Only one triplet! |
wikibase:BestRank | Something like wds:Q103-08172073-0D68-4629-BE18-9CF0DD561EB1 |
wikibase:GeoAutoPrecision | Something like wdv:8000239f9ef48fc7fb846097dc6a3e12 |
wikibase:GlobecoordinateValue | Something like wdv:8000047f8684a2c2aebdc7ba7c394403 |
wikibase:Property | The type of some(?) wd:P… |
wikibase:QuantityValue | Something like wdv:800000dc9c08f381edb20359f260af3f |
wikibase:TimeValue | Something like wdv:8000170412b9aeb739d076fed903a0ff |
owl:Class | The type of wdno:P… |
owl:DatatypeProperty | The type of some wdt:P… |
owl:ObjectProperty | The type of some wdt:P… |
owl:Ontology | Only one triplet |
owl:Restriction | The type of something similar to http://www.wikidata.org/.well-known/genid/83cf7cf86e26b18479fbe7609e990e0e |
ontolex:Form | Forms of lexemes? (for example wd:L10033-F2 ) |
ontolex:LexicalEntry | Lexemes (for example wd:L484 ) |
ontolex:LexicalSense | For example wd:L17815-S1 |
owl:Ontology
There is only one record returned by the following query. The value of ?subj in that record is wikibase:Dump.
select *
{
?subj rdf:type owl:Ontology .
}
prov:* TODO
prov:wasDerivedFrom
skos:altLabel
skos:altLabel assigns a list of aliases to a node.
On a Wikidata entry page, these aliases appear under the column Also known as.
select
?alias
{
wd:Q39 skos:altLabel ?alias .
filter( lang(?alias) = 'en' )
}
See also the rdfs:label predicate.
ontolex:* TODO
ontolex:representation
, ontolex:lexicalForm
Lexemes
Query the grammatical gender (P5185) of the German word Aprikose:
select
?lexem
?gender
{
?lexem rdf:type ontolex:LexicalEntry .
?lexem dct:language wd:Q188 . # Q188: German
?lexem wikibase:lemma ?text . filter(str(?text) = "Aprikose")
?lexem wdt:P5185 ?gender_ . # P5185: grammatical gender
?gender_ rdfs:label ?gender . filter(lang(?gender) = 'de')
}
wdt:P…
wdt:P… connects a node to its so-called «truthy value».
p: connects wd:Q… to wds:…
select
?wds
?rel
?obj
{
wd:Q544 p:P2670 ?wds . # Q544 = Solar System
?wds ?rel ?obj .
}
ps: connects wds: to simple value
A «ps:…» relation connects a «wds:…» node to a simple value. A simple value is either another entity (a «Q number», as in the following query) or a literal:
select
?wd
{
wds:Q544-1edcc7b8-4eb3-28d9-35b5-b6ff5bbc8e38 ps:P2670 ?wd .
}
Note that a «wdt:» relation connects a «wd:» node to a simple value.
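As a sketch of this shortcut, the following query asks for the same property as the p:/ps: queries above, but goes directly from the «wd:» node to the value. The assumption is that wdt: only returns values of best-ranked, non-deprecated statements, so the result may be a subset of what p:/ps: returns:

```sparql
select
  ?wd
{
  wd:Q544 wdt:P2670 ?wd .   # Q544 = Solar System
}
```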
p: , ps: and pq:
The following query prints a line chart of the population growth in Switzerland since 1970 and tries to demonstrate the combined usage of p:, ps: and pq: relations.
First, wd:Q39 p:P1082 ?populationStmt extracts all statements that relate Switzerland (Q39) to its population (p:P1082).
Then, the pq:P585 relation extracts each statement's year, and ps:P1082 the statement's population value. pq: is needed for the year because the point in time (P585) is a qualifier of the population statement, whereas ps: yields the statement's main value.
Finally, the filter condition only selects values where the year is greater than or equal to 1970:
#defaultView:LineChart
select
?year
?population
{
wd:Q39 p:P1082 ?populationStmt .
?populationStmt pq:P585 ?year ;
ps:P1082 ?population .
filter(year(?year) >= 1970)
}
order by
?year
I believe that pq: stands for «property qualifier» and ps: for «property statement».
Note: this example also demonstrates a weakness of Wikidata: no population values are shown for the years 2015, 2016 and after 2017, probably because these values were imported before 2018 and are not maintained anymore.
psn:
TODO: does psn: stand for «normalized property statement»?
pqv:
TODO: does pqv: stand for «property qualifier value»?
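Whatever the abbreviation stands for, a pqv: relation appears to lead from a statement to a full value node of a qualifier, in the same way that psv: does for the statement's main value. The following sketch assumes this and reads the time value and its precision of the «point in time» qualifier (P585) on Switzerland's population statements:

```sparql
select
  ?population
  ?time
  ?precision
{
  wd:Q39      p:P1082  ?stmt       .
  ?stmt       ps:P1082 ?population ;
              pqv:P585 ?timeNode   .                 # value node of the qualifier
  ?timeNode   wikibase:timeValue     ?time      ;
              wikibase:timePrecision ?precision .
}
order by
  ?time
```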
P31 / instanceOf
P31 connects an object to the class of which it is an instance.
The following query finds the 100 classes with the most objects that belong to that class:
select
?cnt
?xLabel
?x
with {
select
(count(*) as ?cnt)
?x
{
[] wdt:P31 ?x .
}
group by ?x
order by desc(?cnt)
limit 100
}
as %I
{
include %I
service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
order by
desc(?cnt)
DESCRIBE
describe <entity> returns all triplets where <entity> is subject or object:
describe <http://www.wikidata.org/entity/Q72>
Wikidata categories
A Wikidata category is an instance of Q4167836.
As of October 2022, there are over 5 million categories:
select
(count(*) as ?cntWikidataCategories)
{
?category wdt:P31 wd:Q4167836 .
}
Disambiguation pages
A so-called disambiguation page should be an instance of wd:Q4167410.
ask {
wd:Q399841 wdt:P31 wd:Q4167410 .
}
See also «Wikipedia article covering multiple topics» (Q21484471), «main subject» (P921) and «has parts» (P527)
RDFS and OWL terms
The Wikidata schema avoids direct use of RDFS or OWL terms and redefines many of them, e.g. wdt:P31 defines a local property similar to rdf:type. There are attempts to connect Wikidata properties to RDFS/OWL and provide alternative exports of Wikidata data.
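As an illustration of such local counterparts, the following sketch uses P31 («instance of») where an RDFS-based dataset would use rdf:type, and P279 («subclass of») where it would use rdfs:subClassOf. Zurich (Q72) is only an example item:

```sparql
select
  ?class
  ?classLabel
  ?superClass
  ?superClassLabel
{
  wd:Q72 wdt:P31 ?class .                          # roughly: wd:Q72 rdf:type ?class
  optional { ?class wdt:P279 ?superClass . }       # roughly: ?class rdfs:subClassOf ?superClass
  service wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```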
Finding a Wikidata item's corresponding DBpedia resource
On the DBpedia SPARQL endpoint, a Wikidata item's corresponding DBpedia resource can be queried like so:
select
?sub
{
?sub owl:sameAs <http://www.wikidata.org/entity/Q507459> .
}
Some queries
Mountains higher than 4000 meters in Switzerland
select
?elevation
?name
?mountain
{
?mountain wdt:P31 wd:Q8502 ; # ?mountain is an "instance of" (P31) a "mountain" (Q8502)
wdt:P17 wd:Q39 ; # ?mountain has "country" (P17) "Switzerland" (Q39)
wdt:P2044 ?elevation ; # P2044: "elevation above sea level"
rdfs:label ?name . # Use rdfs:label to find the mountain's name
filter(?elevation >= 4000)
filter(lang(?name) = 'de')
}
order by
?elevation
Most common relations
The following query tries to determine the most common relations.
To prevent a query timeout error, we have to use a nested query to select a sample of 10 million relations before we group and count them:
select
?rel
(count(?rel) as ?cnt)
{
select ?rel
{
?x ?rel ?y .
}
limit 10000000
}
group by
?rel
order by
desc(count(?rel))
limit
100
On 2022-10-08, the query returned:
rdf:type | 2150595 |
wikibase:rank | 1307831 |
schema:inLanguage | 900424 |
schema:name | 900424 |
schema:isPartOf | 900424 |
schema:about | 888219 |
prov:wasDerivedFrom | 793082 |
pq:P585 | 76617 |
pqv:P585 | 76615 |
ps:P31 | 42471 |
pq:P580 | 34544 |
pqv:P580 | 34498 |
ps:P1082 | 33245 |
psv:P1082 | 33244 |
pq:P459 | 32322 |
ps:P106 | 30558 |
ps:P646 | 29962 |
psn:P646 | 29962 |
ps:P18 | 22791 |
ps:P373 | 21941 |
pq:P582 | 21832 |
pqv:P582 | 21598 |
ps:P47 | 18965 |
pq:P1810 | 18165 |
psn:P214 | 15899 |
2022-11-12: it seems that the subquery with the limit clause skewed the data. The following query presumably counts better:
select
?cnt
?rel_
?relLabel
with {
#
# Use named subquery (which is guaranteed to be run only once) to improve performance
#
select
?rel_
(count(*) as ?cnt)
{
[] ?rel_ []
}
group by ?rel_
order by desc(?cnt)
limit 400 # not every relation will be matched below...
} as %rels
where {
include %rels .
optional { ?rel wikibase:directClaim ?rel_ .
?rel rdfs:label ?relLabel . filter(lang(?relLabel) = 'en') . }
}
order by desc(?cnt)
The following query builds on the previous one, but only returns relations for which an English label is defined:
select
?cnt
?rel_
?relLabel
with { # Use named subquery (which is guaranteed to be run only once) to improve performance
select
?rel_
(count(*) as ?cnt)
{
[] ?rel_ []
}
group by ?rel_
order by desc(?cnt)
limit 400 # not every relation will be matched below...
} as %rels
where {
include %rels .
?rel wikibase:directClaim ?rel_ .
?rel rdfs:label ?relLabel . filter(lang(?relLabel) = 'en') .
}
order by desc(?cnt)
All objects of a given node
The following query returns all relations of a given node (here: Switzerland = Q39).
To be more readable, the query returns each relation's label.
?relType corresponds to the property type expected by the relation.
Because some relations point to another node and others just to a value, coalesce(…) returns either the label of the referenced node or the value.
select
?relText
(coalesce(?xText, ?x) as ?obj)
?rel_
?relType
# ?x
# ?xText
{
# wd:Q35120 ?rel ?x .
wd:Q39 ?rel ?x .
?rel_ wikibase:directClaim ?rel .
?rel_ rdfs:label ?relText . filter(lang(?relText) = 'en')
?rel_ wikibase:propertyType ?relType .
optional {
?x rdfs:label ?xText . filter(lang(?xText ) = 'en')
}
}
order by
lcase(?relText)