Semantic integration
   HOME

TheInfoList



OR:

Semantic integration is the process of interrelating information from diverse sources, for example calendars and to do lists, email archives, presence information (physical, psychological, and social), documents of all sorts, contacts (including social graphs), search results, and advertising and marketing relevance derived from them. In this regard,
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
focuses on the organization of and action upon
information Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, ...
by acting as an intermediary between heterogeneous data sources, which may conflict not only by structure but also context or value.


Applications and methods

In
enterprise application integration Enterprise application integration (EAI) is the use of software and computer systems' architectural principles to integrate a set of enterprise computer applications. Overview Enterprise application integration is an integration framework comp ...
(EAI), semantic integration can facilitate or even automate the communication between computer systems using
metadata publishing Metadata publishing is the process of making metadata data elements available to external users, both people and machines using a formal review process and a commitment to change control processes. Metadata publishing is the foundation upon which a ...
. Metadata publishing potentially offers the ability to automatically link
ontologies In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains ...
. One approach to (semi-)automated ontology mapping requires the definition of a semantic distance or its inverse,
semantic similarity Semantic similarity is a metric defined over a set of documents or terms, where the idea of distance between items is based on the likeness of their meaning or semantic content as opposed to lexicographical similarity. These are mathematical tool ...
and appropriate rules. Other approaches include so-called ''lexical methods'', as well as methodologies that rely on exploiting the structures of the ontologies. For explicitly stating similarity/equality, there exist special properties or relationships in most ontology languages.
OWL Owls are birds from the order Strigiformes (), which includes over 200 species of mostly solitary and nocturnal birds of prey typified by an upright stance, a large, broad head, binocular vision, binaural hearing, sharp talons, and feathers a ...
, for example has "owl:equivalentClass", "owl:equivalentProperty" and "owl:
sameAs In data science, sameAs or exactMatch is a method of indicating that the subject of, or entity represented by, two resources is considered to be one and the same thing. It is a key part of the Semantic Web. Uses The concept of sameAs exists in ...
". Eventually system designs may see the advent of composable architectures where published semantic-based interfaces are joined together to enable new and meaningful capabilities. These could predominately be described by means of design-time declarative specifications, that could ultimately be rendered and executed at run-time. Semantic integration can also be used to facilitate design-time activities of interface design and mapping. In this model, semantics are only explicitly applied to design and the run-time systems work at the
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituenc ...
level. This "early semantic binding" approach can improve overall system performance while retaining the benefits of semantic driven design.


Semantic integration situations

From the industry use case, it has been observed that the semantic mappings were performed only within the scope of the ontology class or the
datatype In computer science and computer programming, a data type (or simply type) is a set of possible values and a set of allowed operations on it. A data type tells the compiler or interpreter how the programmer intends to use the data. Most progra ...
property. These identified semantic integrations are (1) integration of ontology class instances into another ontology class without any constraint, (2) integration of selected instances in one ontology class into another ontology class by the range constraint of the property value and (3) integration of ontology class instances into another ontology class with the value transformation of the instance property. Each of them requires a particular mapping relationship, which is respectively: (1) equivalent or subsumption mapping relationship, (2) conditional mapping relationship that constraints the value of property (data range) and (3) transformation mapping relationship that transforms the value of property (unit transformation). Each identified mapping relationship can be defined as either (1) direct mapping type, (2) data range mapping type or (3) unit transformation mapping type.


KG vs. RDB approaches

In the case of integrating supplemental data source, * KG(
Knowledge graph The Google Knowledge Graph is a knowledge base from which Google serves relevant information in an infobox beside its search results. This allows the user to see the answer in a glance. The data is generated automatically from a variety of sou ...
) formally represents the meaning involved in information by describing concepts, relationships between things, and categories of things. These embedded semantics with the data offer significant advantages such as reasoning over data and dealing with heterogeneous data sources. The rules can be applied on KG more efficiently using graph query. For example, the graph query does the data inference through the connected relations, instead of repeated full search of the tables in relational database. KG facilitates the integration of new heterogeneous data by just adding new relationships between existing information and new entities. This facilitation is emphasized for the integration with existing popular linked open data source such as Wikidata.org. * SQL query is tightly coupled and rigidly constrained by datatype within the specific database and can join tables and extract data from tables, and the result is generally a table, and a query can join tables by any columns which match by datatype.
SPARQL SPARQL (pronounced " sparkle" , a recursive acronym for SPARQL Protocol and RDF Query Language) is an RDF query language—that is, a semantic query language for databases—able to retrieve and manipulate data stored in Resource Description ...
query is the standard query language and protocol for Linked Open Data on the web and loosely coupled with the database so that it facilitates the reusability and can extract data through the relations free from the datatype, and not only extract but also generate additional knowledge graph with more sophisticated operations(logic: transitive/symmetric/inverseOf/functional). The inference based query (query on the existing asserted facts without the generation of new facts by logic) can be fast comparing to the reasoning based query (query on the existing plus the generated/discovered facts based on logic). * The information integration of heterogeneous data sources in traditional database is intricate, which requires the redesign of the database table such as changing the structure and/or addition of new data. In the case of semantic query, SPARQL query reflects the relationships between entities in a way that aligned with human's understanding of the domain, so the semantic intention of the query can be seen on the query itself. Unlike SPARQL, SQL query, which reflects the specific structure of the database and derived from matching the relevant primary and foreign keys of tables, loses the semantics of the query by missing the relationships between entities. Below is the example that compares SPARQL and SQL queries for medications that treats "TB of vertebra".
SELECT ?medication
WHERE
SELECT DRUG.medID
FROM DIAGNOSIS, DRUG, DRUG_DIAGNOSIS
WHERE DIAGNOSIS.diagnosisID=DRUG_DIAGNOSIS.diagnosisID
AND DRUG.medID=DRUG_DIAGNOSIS.medID
AND DIAGNOSIS.name=”TB of vertebra”


Examples

The
Pacific Symposium on Biocomputing The Pacific Symposium on Biocomputing (PSB) is an annual multidisciplinary scientific meeting co-founded in 1996 by Dr. Teri Klein, Dr. Lawrence Hunter and Sharon Surles. The conference is to presentation and discuss research in the theory and a ...
has been a venue for the popularization of the ontology mapping task in the biomedical domain, and a number of papers on the subject can be found in its proceedings.


See also

*
Data integration Data integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, which include both commercial (such as when two similar companies ...
*
Dataspaces Dataspaces are an abstraction in data management that aim to overcome some of the problems encountered in data integration system. The aim is to reduce the effort required to set up a data integration system by relying on existing matching and ma ...
*
Enterprise integration Enterprise integration is a technical field of enterprise architecture, which is focused on the study of topics such as system interconnection, electronic data interchange, product data exchange and distributed computing environments. It is a c ...
*
Ontology-based data integration Ontology-based data integration involves the use of one or more ontologies to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-Vie ...
*
Ontology alignment Ontology alignment, or ontology matching, is the process of determining correspondences between concepts in ontologies. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning, in computer science, ...
* Ontology engineering *
Ontology matching Ontology alignment, or ontology matching, is the process of determining correspondences between concepts in ontologies. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning, in computer science, ...
*
Semantic heterogeneity Semantic heterogeneity is when database schema or datasets for the same domain are developed by independent parties, resulting in differences in meaning and interpretation of data values. Beyond structured data, the problem of semantic heterogen ...
* Semantic technology *
Semantic translation Semantic translation is the process of using semantic information to aid in the translation of data in one representation or data model to another representation or data model. Semantic translation takes advantage of semantics that associate meani ...
* Semantic unification


References


External links


Semantic Integration: Loosely Coupling the Meaning of DataOntology Mapping: The State of the Art
(2005 paper)
2010 paper by Carl HewittOpenCyc to Oracle Interface
{{DEFAULTSORT:Semantic Integration Ontology (information science) Data management Semantics Bioinformatics