
Cleaned object properties extracted with mappings

Cleaned version of the high-quality statements with IRI object values that the mappings extraction produces from Wikipedia infoboxes.

Statements are based on input from mappingbased-objects-uncleaned after applying the following post-processing steps:

  1. Canonicalization: every object value is replaced by the target of its (transitive) redirect, e.g. http://dbpedia.org/resource/Barack_Obama_Jr is replaced by http://dbpedia.org/resource/Barack_Obama. The _transitive file of the corresponding language chapter from the redirects dataset is used to resolve the transitive redirects. See code for more details.
  2. Type consistency filtering: rdf:type statements extracted from instance-types are used to check each statement against the domain and range defined for its property in the DBpedia ontology. Statements with predicate p whose subject resource is of a different type than specified in the rdfs:domain of p are passed to the _disjointDomain files, whereas statements whose object resource is disjoint from the rdfs:range of p are passed to the _disjointRange files. Statements whose types match the expected ones or are subtypes of them are passed to the regular dataset files (without content variant). See code for more details. We keep the disjoint* files because they can also contain false positives caused by incomplete type information (e.g. no infobox exists for a specific resource, or the infobox class mapping is incomplete). The union of all three files is the same as the result of applying only step 1.
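
The two steps above can be illustrated with a minimal Python sketch. It is not the DBpedia extraction framework code; all function names and in-memory structures (`redirects`, `types`, `domain`, `range_`, `subclass_of`, `disjoint_pairs`) are hypothetical. It also makes three assumptions that the description does not spell out: disjointness is taken from explicitly declared class pairs, resources without type information go to the regular output, and the domain is checked before the range.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) IRIs


def resolve_redirect(iri: str, redirects: Dict[str, str]) -> str:
    """Step 1: follow redirects transitively; stop if a cycle is detected."""
    seen: Set[str] = set()
    while iri in redirects and iri not in seen:
        seen.add(iri)
        iri = redirects[iri]
    return iri


def ancestors(cls: str, subclass_of: Dict[str, List[str]]) -> Set[str]:
    """Return cls together with all of its (transitive) superclasses."""
    result: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(subclass_of.get(current, ()))
    return result


def is_disjoint(resource_types: Set[str], expected: Optional[str],
                subclass_of: Dict[str, List[str]],
                disjoint_pairs: Set[frozenset]) -> bool:
    """True if any type of the resource is declared disjoint with `expected`.

    Assumption: untyped resources and properties without a declared
    domain/range are never reported as disjoint.
    """
    if expected is None:
        return False
    expected_closure = ancestors(expected, subclass_of)
    for t in resource_types:
        for a in ancestors(t, subclass_of):
            if any(frozenset((a, e)) in disjoint_pairs for e in expected_closure):
                return True
    return False


def clean(triples: Iterable[Triple], redirects: Dict[str, str],
          types: Dict[str, Set[str]], domain: Dict[str, str],
          range_: Dict[str, str], subclass_of: Dict[str, List[str]],
          disjoint_pairs: Set[frozenset]) -> Dict[str, List[Triple]]:
    """Apply step 1 to every object, then route each triple to one output."""
    out: Dict[str, List[Triple]] = {
        "regular": [], "disjointDomain": [], "disjointRange": []}
    for s, p, o in triples:
        o = resolve_redirect(o, redirects)  # step 1: canonicalize the object
        # Step 2: assumed order is domain first, then range.
        if is_disjoint(types.get(s, set()), domain.get(p),
                       subclass_of, disjoint_pairs):
            out["disjointDomain"].append((s, p, o))
        elif is_disjoint(types.get(o, set()), range_.get(p),
                         subclass_of, disjoint_pairs):
            out["disjointRange"].append((s, p, o))
        else:
            out["regular"].append((s, p, o))
    return out
```

Because each statement lands in exactly one of the three lists, concatenating them gives back the canonicalized input, which mirrors the note that the union of the three files equals the result of step 1 alone.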