YAGO (Yet Another Great Ontology) is an open source knowledge base developed at the Max Planck Institute for Informatics in Saarbrücken. It is automatically extracted from Wikidata and Schema.org.
YAGO4, which was released in 2020, combines data extracted from Wikidata with relationship designators from Schema.org. The previous version, YAGO3, had knowledge of more than 10 million entities and contained more than 120 million facts about these entities. The information in YAGO3 was extracted from Wikipedia (e.g., categories, redirects, infoboxes), WordNet (e.g., synsets, hyponymy), and GeoNames.
The accuracy of YAGO was manually evaluated to be above 95% on a sample of facts. To integrate it into the linked data cloud, YAGO has been linked to the DBpedia ontology and to the Suggested Upper Merged Ontology (SUMO).
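As a rough illustration of what a sample-based accuracy figure means, the sketch below computes a Wilson score interval for the share of correct facts in a manually judged sample. The article does not state which statistical method the YAGO evaluation used, so the choice of interval, the function name `wilson_interval`, and the sample counts are all assumptions made for this example.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Return a (low, high) Wilson score interval for a proportion.

    correct: number of sampled facts judged correct (hypothetical input)
    total:   number of sampled facts that were judged
    z:       normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical sample: 96 of 100 judged facts were correct.
low, high = wilson_interval(96, 100)
print(f"observed accuracy 0.96, interval [{low:.3f}, {high:.3f}]")
```

A larger sample narrows the interval, which is why a claim such as "above 95%" depends on how many facts were judged as well as on the observed share of correct ones.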
YAGO3 is provided in Turtle and TSV formats. Dumps of the whole database are available, as well as thematic and specialized dumps. It can also be queried through various online browsers and through a SPARQL endpoint hosted by OpenLink Software. The source code of YAGO3 is available on GitHub.
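To give a sense of how a tab-separated dump of facts can be consumed, here is a minimal Python sketch that reads subject, predicate, object rows and groups them by subject. The three-column layout, the sample facts, and the names `read_facts` and `index_by_subject` are assumptions for illustration; the actual YAGO3 files may use a different column layout, so check a real dump before relying on this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data. The three-column layout (subject, predicate,
# object) is an assumption, not the documented YAGO3 file layout.
SAMPLE_TSV = (
    "<Albert_Einstein>\t<wasBornIn>\t<Ulm>\n"
    "<Albert_Einstein>\t<hasWonPrize>\t<Nobel_Prize_in_Physics>\n"
    "<Ulm>\t<isLocatedIn>\t<Germany>\n"
)


def read_facts(stream):
    """Yield (subject, predicate, object) tuples from a tab-separated stream."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip rows that do not match the assumed layout
        yield tuple(row)


def index_by_subject(facts):
    """Group (predicate, object) pairs under their subject."""
    index = defaultdict(list)
    for subject, predicate, obj in facts:
        index[subject].append((predicate, obj))
    return dict(index)


facts_by_subject = index_by_subject(read_facts(io.StringIO(SAMPLE_TSV)))
for predicate, obj in facts_by_subject["<Albert_Einstein>"]:
    print(predicate, obj)
```

For a real dump, the same functions could be fed an open file handle in place of the in-memory sample, which keeps memory use low because rows are read one at a time.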
YAGO has been used in the Watson artificial intelligence system.
[David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, Chris Welty. Building Watson: An Overview of the DeepQA Project. AI Magazine 31(3): 59–79 (2010)]
See also
* Commonsense knowledge bases
* Cyc
* Evi (software)
* DBpedia