A heterogeneous database system is an automated (or semi-automated) system for the
integration of heterogeneous, disparate
database management system
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and an ...
s to present a user with a single, unified query interface.
Heterogeneous database systems (HDBs) are computational models and software implementations that provide heterogeneous database integration.
Problems of heterogeneous database integration
This article does not contain details of
distributed database management systems (sometimes known as
federated database system
A federated database system (FDBS) is a type of Meta (prefix), meta-database management system (DBMS), which transparently maps multiple autonomous Database management system, database systems into a single federated database. The constituent data ...
s).
Technical heterogeneity
Different
file format
A file format is a Computer standard, standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary format, pr ...
s, access
protocols, query languages etc. Often called syntactic heterogeneity from the point of view of data.
Data model heterogeneity
Different ways of representing and storing the same data. Table decompositions may vary, column names (data labels) may be different (but have the same semantics), data
encoding
In communications and Data processing, information processing, code is a system of rules to convert information—such as a letter (alphabet), letter, word, sound, image, or gesture—into another form, sometimes data compression, shortened or ...
schemes may vary (i.e., should a measurement scale be explicitly included in a field or should it be implied elsewhere). Also referred as schematic heterogeneity.
Semantic heterogeneity
Data across constituent databases may be related but different. Perhaps a database system must be able to integrate genomic and proteomic data. They are related—a gene may have several protein products—but the data are different (
nucleotide
Nucleotides are Organic compound, organic molecules composed of a nitrogenous base, a pentose sugar and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both o ...
sequences and
amino acid
Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although over 500 amino acids exist in nature, by far the most important are the 22 α-amino acids incorporated into proteins. Only these 22 a ...
sequences, or hydrophilic or -phobic amino acid sequence and positively or negatively charged amino acids). There may be many ways of looking at semantically similar, but distinct, datasets.
The system may also be required to present "new" knowledge to the user. Relationships may be inferred between data according to rules specified in domain
ontologies
In information science, an ontology encompasses a representation, formal naming, and definitions of the categories, properties, and relations between the concepts, data, or entities that pertain to one, many, or all domains of discourse. More ...
.
See also
*
Big data
Big data primarily refers to data sets that are too large or complex to be dealt with by traditional data processing, data-processing application software, software. Data with many entries (rows) offer greater statistical power, while data with ...
*
Expert system
In artificial intelligence (AI), an expert system is a computer system emulating the decision-making ability of a human expert.
Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as ...
*
Knowledge base
In computer science, a knowledge base (KB) is a set of sentences, each sentence given in a knowledge representation language, with interfaces to tell new sentences and to ask questions about what is known, where either of these interfaces migh ...
*
Ontology
Ontology is the philosophical study of existence, being. It is traditionally understood as the subdiscipline of metaphysics focused on the most general features of reality. As one of the most fundamental concepts, being encompasses all of realit ...
References
{{reflist
Database management systems