Ontology Alignment

Ontology alignment, or ontology matching, is the process of determining correspondences between concepts in ontologies. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning in computer science, cognitive science, or philosophy.


Computer science

For computer scientists, concepts are expressed as labels for data. Historically, the need for ontology alignment arose from the need to integrate heterogeneous databases, which were developed independently and therefore each had its own data vocabulary. In the Semantic Web context, where many actors provide their own ontologies, ontology matching has become critical for helping heterogeneous resources interoperate. Ontology alignment tools find classes of data that are semantically equivalent, for example, "truck" and "lorry". The classes are not necessarily logically identical. According to Euzenat and Shvaiko (2007), there are three major dimensions for similarity: syntactic, external, and semantic (Jérôme Euzenat and Pavel Shvaiko, 2013, Ontology Matching, Springer-Verlag, ISBN 978-3-642-38720-3). Coincidentally, they roughly correspond to the dimensions identified by cognitive scientists below. A number of tools and frameworks have been developed for aligning ontologies, some inspired by cognitive science and some developed independently.

Ontology alignment tools have generally been developed to operate on database schemas, XML schemas (D. Aumueller, H. Do, S. Massmann, E. Rahm, 2005, "Schema and ontology matching with COMA++", Proc. of the 2005 International Conference on Management of Data, pp. 906-908), taxonomies, formal languages, entity-relationship models, dictionaries, and other label frameworks. These are usually converted to a graph representation before being matched. Since the emergence of the Semantic Web, such graphs can be represented in the Resource Description Framework line of languages by triples of the form <subject, predicate, object>, as illustrated in the Notation 3 syntax. In this context, aligning ontologies is sometimes referred to as "ontology matching".

The problem of ontology alignment has been tackled recently by trying to compute the matching first and then the mapping (based on the matching) in an automatic fashion. Systems such as DSSim, X-SOM, and COMA++ have obtained very high precision and recall. The Ontology Alignment Evaluation Initiative aims to evaluate, compare, and improve the different approaches.
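
As a rough illustration of how precision and recall are usually computed for an alignment, the sketch below compares a set of found correspondences against a reference alignment. Representing a correspondence as a plain pair of labels, the function name `precision_recall`, and the sample data are all assumptions made for this example; they are not taken from DSSim, X-SOM, COMA++, or the Ontology Alignment Evaluation Initiative.

```python
from typing import Set, Tuple

# Assumption: a correspondence is modeled as a pair of class labels.
Correspondence = Tuple[str, str]


def precision_recall(
    found: Set[Correspondence], reference: Set[Correspondence]
) -> Tuple[float, float]:
    """Return (precision, recall) of `found` against `reference`."""
    correct = found & reference
    # Precision: share of found correspondences that are in the reference.
    precision = len(correct) / len(found) if found else 0.0
    # Recall: share of reference correspondences that were found.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference alignment and matcher output.
reference = {("truck", "lorry"), ("car", "automobile"), ("person", "human")}
found = {("truck", "lorry"), ("car", "automobile"), ("car", "cart")}

p, r = precision_recall(found, reference)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.67"
```

Here two of the three found correspondences are correct, and two of the three reference correspondences are recovered, so both measures come out at two thirds.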


Formal definition

Given two ontologies $i=\langle C_i, R_i, I_i, T_i, V_i\rangle$ and $j=\langle C_j, R_j, I_j, T_j, V_j\rangle$, where $C$ is the set of classes, $R$ is the set of relations, $I$ is the set of individuals, $T$ is the set of data types, and $V$ is the set of values, we can define different types of (inter-ontology) relationships. Together, such relationships are called alignments, and they can be categorized along several dimensions:

* similarity vs logic: the difference between matchings (which predicate about the similarity of ontology terms) and mappings (logical axioms, typically expressing logical equivalence or inclusion among ontology terms)
* atomic vs complex: whether the alignments considered are one-to-one, or can involve more terms in a query-like formulation (e.g., LAV/GAV mapping)
* homogeneous vs heterogeneous: whether the alignments predicate on terms of the same type (e.g., classes are related only to classes, individuals to individuals, etc.) or heterogeneity is allowed in the relationship
* type of alignment: the semantics associated with an alignment, which can be subsumption, equivalence, disjointness, part-of, or any user-specified relationship

Subsumption, atomic, homogeneous alignments are the building blocks for obtaining richer alignments, and have well-defined semantics in every description logic.

An atomic homogeneous matching is an alignment that carries a similarity degree $s\in[0,1]$, describing the similarity of two terms of the input ontologies $i$ and $j$. A matching can be either computed, by means of heuristic algorithms, or inferred from other matchings.

Formally, a matching is a quadruple $m=\langle id, t_i, t_j, s\rangle$, where $t_i$ and $t_j$ are homogeneous ontology terms and $s$ is the similarity degree of $m$. A (subsumption, homogeneous, atomic) mapping is defined as a pair $\mu=\langle t_i, t_j\rangle$, where $t_i$ and $t_j$ are homogeneous ontology terms.
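
The matching quadruple and the mapping pair can be written down directly as small data structures. The following is a minimal sketch, assuming that terms are class labels and that the similarity degree comes from a purely syntactic heuristic (the ratio from Python's standard `difflib.SequenceMatcher`). The names `Matching`, `Mapping`, `label_similarity`, and `compute_matchings`, the threshold of 0.8, and the sample labels are hypothetical choices for illustration, not part of any particular alignment tool.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import product
from typing import List


@dataclass(frozen=True)
class Matching:
    """A matching m = <id, t_i, t_j, s> with similarity degree s in [0, 1]."""
    id: str
    t_i: str
    t_j: str
    s: float


@dataclass(frozen=True)
class Mapping:
    """A (subsumption, homogeneous, atomic) mapping mu = <t_i, t_j>."""
    t_i: str
    t_j: str


def label_similarity(a: str, b: str) -> float:
    # Assumption: a case-insensitive string ratio stands in for the heuristic.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compute_matchings(
    classes_i: List[str], classes_j: List[str], threshold: float = 0.8
) -> List[Matching]:
    """Compare every class of ontology i with every class of ontology j."""
    matchings: List[Matching] = []
    for t_i, t_j in product(classes_i, classes_j):
        s = label_similarity(t_i, t_j)
        if s >= threshold:
            matchings.append(Matching(f"m{len(matchings) + 1}", t_i, t_j, s))
    return matchings


# Hypothetical class labels from two independently developed ontologies.
classes_i = ["Truck", "Person", "PostalAddress"]
classes_j = ["Lorry", "person", "Postal_Address"]

matchings = compute_matchings(classes_i, classes_j)
for m in matchings:
    print(m.id, m.t_i, m.t_j, round(m.s, 3))

# A mapping carries no similarity degree, only the two related terms.
example_mapping = Mapping("PostalAddress", "Postal_Address")
```

With these labels the sketch pairs "Person" with "person" and "PostalAddress" with "Postal_Address", but it does not pair "Truck" with "Lorry": a string-based heuristic captures only the syntactic dimension of similarity, so semantically equivalent terms with unrelated spellings need external or semantic evidence.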


Cognitive science

For cognitive scientists interested in ontology alignment, the "concepts" are nodes in a semantic network that reside in brains as "conceptual systems." The focal question is: if everyone has unique experiences and thus different semantic networks, how can we ever understand each other? This question has been addressed by a model called ABSURDIST (Aligning Between Systems Using Relations Derived Inside Systems for Translation). Three major dimensions have been identified for similarity as equations for "internal similarity, external similarity, and mutual inhibition."


Ontology alignment methods

Two research subfields have emerged in ontology mapping: monolingual ontology mapping and cross-lingual ontology mapping. The former refers to the mapping of ontologies in the same natural language, whereas the latter refers to "the process of establishing relationships among ontological resources from two or more independent ontologies where each ontology is labelled in a different natural language". Existing matching methods in monolingual ontology mapping are discussed in Euzenat and Shvaiko (2007). Approaches to cross-lingual ontology mapping are presented in Fu et al. (2011): Fu B., Brennan R., O'Sullivan D., "Using Pseudo Feedback to Improve Cross-Lingual Ontology Mapping", in Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011), LNCS 6643, pp. 336-351, Heraklion, Greece, May 2011.
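
One simple way to picture the cross-lingual case is to translate the labels of one ontology into the language of the other and then apply a monolingual comparison. The sketch below assumes exactly that and nothing more: the toy dictionary `DE_TO_EN` stands in for a real translation step, the function `cross_lingual_candidates` and the sample labels are hypothetical, and this is not the method of Fu et al. (2011).

```python
from typing import Dict, List, Tuple

# Assumption: a tiny hand-written dictionary replaces a translation service.
DE_TO_EN: Dict[str, str] = {
    "Lastwagen": "truck",
    "Person": "person",
    "Adresse": "address",
}


def normalize(label: str) -> str:
    """Lower-case and trim a label so that comparison ignores case."""
    return label.strip().lower()


def cross_lingual_candidates(
    source_labels: List[str],
    target_labels: List[str],
    dictionary: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Translate each source label, then look for an equal target label."""
    target_index = {normalize(t): t for t in target_labels}
    pairs: List[Tuple[str, str]] = []
    for label in source_labels:
        translated = dictionary.get(label)
        if translated is None:
            continue  # no translation available, so no candidate is proposed
        match = target_index.get(normalize(translated))
        if match is not None:
            pairs.append((label, match))
    return pairs


# Hypothetical labels: one ontology labelled in German, one in English.
german = ["Lastwagen", "Person", "Fahrzeug"]
english = ["Truck", "Person", "Address"]

pairs = cross_lingual_candidates(german, english, DE_TO_EN)
print(pairs)  # prints [('Lastwagen', 'Truck'), ('Person', 'Person')]
```

The label "Fahrzeug" has no dictionary entry and therefore yields no candidate, which shows how the quality of the translation step bounds what a translation-based approach can find.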


See also

* Data conversion
* Graph isomorphism
* Minimal mappings
* Ontology (information science)
* Rule Interchange Format
* Semantic heterogeneity
* Semantic integration
* Semantic interoperability
* Semantic matching
* Semantic unification


Further reading


* Collection of surveys and research papers related to ontology mapping, matching, and alignment
* The Ontology Alignment Source
* ABSURDIST
* Ontology alignment for linked open data
* Instance-based ontology matching
* Noy, N. F. (2004). "Semantic integration: a survey of ontology-based approaches." SIGMOD Rec. 33(4): 65-70.
* Ontology mapping and alignment tools


External links


* ITM Align: semi-automated ontology alignment
* https://web.archive.org/web/20101103233846/http://www.stanford.edu/~sfalc/cogz/cogz.html (archived 2010-11-03)
* AgreementMaker: Matching for large real-world schemas and ontologies
* Biomixer: A web-based collaborative ontology visualization tool
* SDI (Semantic Data Integration) Tool
* Semantic mapping representation and generation tool using UML for system engineers