Ontology alignment, or ontology matching, is the process of determining correspondences between
concept
Concepts are defined as abstract ideas. They are understood to be the fundamental building blocks of the concept behind principles, thoughts and beliefs.
They play an important role in all aspects of cognition. As such, concepts are studied by s ...
s in
ontologies
In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains ...
. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning, in
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
,
cognitive science or
philosophy
Philosophy (from , ) is the systematized study of general and fundamental questions, such as those about existence, reason, knowledge, values, mind, and language. Such questions are often posed as problems to be studied or resolved. Some ...
.
Computer science
For
computer scientist
A computer scientist is a person who is trained in the academic study of computer science.
Computer scientists typically work on the theoretical side of computation, as opposed to the hardware side on which computer engineers mainly focus (al ...
s, concepts are expressed as labels for data. Historically, the need for ontology alignment arose out of the need to
integrate heterogeneous
database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s, ones developed independently and thus each having their own data vocabulary. In the
Semantic Web context involving many actors providing their own
ontologies
In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains ...
, ontology matching has taken a critical place for helping heterogeneous resources to interoperate. Ontology alignment tools find classes of data that are
semantically equivalent {{about, semantic equivalence of metadata, the concept in mathematical logic, Logical equivalence
In computer metadata, semantic equivalence is a declaration that two data elements from different vocabularies contain data that has similar meaning. ...
, for example, "truck" and "lorry". The classes are not necessarily logically identical. According to Euzenat and Shvaiko (2007),
[Jérôme Euzenat and Pavel Shvaiko. 2013]
Ontology matching
, Springer-Verlag, 978-3-642-38720-3. there are three major dimensions for similarity: syntactic, external, and semantic. Coincidentally, they roughly correspond to the dimensions identified by Cognitive Scientists below. A number of tools and frameworks have been developed for aligning ontologies, some with inspiration from Cognitive Science and some independently.
Ontology alignment tools have generally been developed to operate on
database schema
The database schema is the structure of a database described in a formal language supported by the database management system (DBMS). The term "schema" refers to the organization of data as a blueprint of how the database is constructed (divide ...
s,
XML schema
An XML schema is a description of a type of Extensible Markup Language, XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed ...
s,
[D. Aumueller, H. Do, S. Massmann, E. Rahm. 2005]
Schema and ontology matching with COMA++
Proc. of the 2005 International Conference on Management of Data, pp. 906-908 taxonomies,
formal language
In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.
The alphabet of a formal language consists of symb ...
s,
entity-relationship models,
dictionaries
A dictionary is a listing of lexemes from the lexicon of one or more specific languages, often arranged alphabetically (or by radical and stroke for ideographic languages), which may include information on definitions, usage, etymologies, p ...
, and other label frameworks. They are usually converted to a graph representation before being matched.
Since the emergence of the Semantic Web, such graphs can be represented in the
Resource Description Framework The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata. It has come to be used as a general method for description and exchange of graph data. RDF provides a variety of ...
line of languages by triples of the form
, as illustrated in the Notation 3
Notation3, or N3 as it is more commonly known, is a shorthand non-XML serialization of Resource Description Framework models, designed with human-readability in mind: N3 is much more compact and readable than XML RDF notation. The format is being ...
syntax.
In this context, aligning ontologies is sometimes referred to as "ontology matching".
The problem of Ontology Alignment has been tackled recently by trying to compute matching first and mapping (based on the matching) in an automatic fashion. Systems like DSSim, X-SOM or COMA++ obtained at the moment very high precision and recall. Th
Ontology Alignment Evaluation Initiative
aims to evaluate, compare and improve the different approaches.
Formal definition
Given two ontologies and where is the set of classes, is the set of relations, is the set of individuals, is the set of data types, and is the set of values, we can define different types of (inter-ontology) relationships. Such relationships will be called, all together, alignments and can be categorized among different dimensions:
* similarity vs logic: this is the difference between matchings (predicating about the similarity of ontology terms), and mappings (logical axiom
An axiom, postulate, or assumption is a statement that is taken to be true, to serve as a premise or starting point for further reasoning and arguments. The word comes from the Ancient Greek word (), meaning 'that which is thought worthy or f ...
s, typically expressing logical equivalence
In logic and mathematics, statements p and q are said to be logically equivalent if they have the same truth value in every model. The logical equivalence of p and q is sometimes expressed as p \equiv q, p :: q, \textsfpq, or p \iff q, depending on ...
or inclusion among ontology terms)
* atomic vs complex: whether the alignments we considered are one-to-one, or can involve more terms in a query-like formulation (e.g., LAV/GAV mapping)
* homogeneous vs heterogeneous: do the alignments predicate on terms of the same type (e.g., classes are related only to classes, individuals to individuals, etc.) or we allow heterogeneity in the relationship?
* type of alignment: the semantics associated to an alignment. It can be subsumption
Subsumption may refer to:
* A minor premise in symbolic logic (see syllogism)
* The Liskov substitution principle in object-oriented programming
* Subtyping in programming language theory
* Subsumption architecture in robotics
* A subsumption ...
, equivalence, disjointness
In mathematics, two sets are said to be disjoint sets if they have no element in common. Equivalently, two disjoint sets are sets whose intersection is the empty set.. For example, and are ''disjoint sets,'' while and are not disjoint. A c ...
, part-of
In linguistics, meronymy () is a semantic relation between a meronym denoting a part and a holonym denoting a whole. In simpler terms, a meronym is in a ''part-of'' relationship with its holonym. For example, ''finger'' is a meronym of ''hand'' ...
or any user-specified relationship.
Subsumption, atomic, homogeneous alignments are the building blocks to obtain richer alignments, and have a well defined semantics in every Description Logic.
Let's now introduce more formally ontology matching and mapping.
An atomic homogeneous matching is an alignment that carries a similarity degree