The semantic spectrum (sometimes referred to as the ontology spectrum, the smart data continuum, or semantic precision) is a series of increasingly precise, or rather semantically expressive, definitions for data elements in knowledge representations, especially for machine use.
At the low end of the spectrum is a simple binding of a single word or phrase to its definition. At the high end is a full ontology that specifies relationships between data elements using precise URIs for relationships and properties.
With increased specificity comes increased precision and the ability to use tools to automatically integrate systems, but also increased cost to build and maintain a metadata registry.
Some steps in the semantic spectrum include the following:
# glossary: A simple list of terms and their definitions. A glossary focuses on creating a complete list of domain-specific terms and acronyms. It is useful for creating clear and unambiguous definitions for terms, and because it can be created with simple word-processing tools, few technical tools are necessary.
# controlled vocabulary: A simple list of terms, definitions and naming conventions. A controlled vocabulary frequently has some type of oversight process for adding or removing data element definitions to ensure consistency. Terms are often defined in relationship to each other.
# data dictionary: Terms, definitions, naming conventions and one or more representations of the data elements in a computer system. Data dictionaries often define data types, validation checks such as enumerated values, and the formal definitions of each of the enumerated values.
# data model: Terms, definitions, naming conventions and one or more representations of the data elements, as well as the beginning of a specification of the relationships between data elements, including abstractions and containers.
# taxonomy: A complete data model in an inheritance hierarchy where all data elements inherit their behaviors from a single "super data element". The difference between a data model and a formal taxonomy is the arrangement of data elements into a formal tree structure where each element in the tree is a formally defined concept with associated properties.
# ontology: A complete, machine-readable specification of a conceptualization using URIs (and later IRIs) for all data elements, properties and relationship types. The W3C standard language for representing ontologies is the Web Ontology Language (OWL). Ontologies frequently contain formal business rules formed in discrete logic statements that relate data elements to one another.
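The progression through these steps can be illustrated by describing one data element at several points on the spectrum. The following is a minimal sketch only: the element name `PersonGenderCode`, its codes, the `DataElement` class, and the `http://example.org/registry#` namespace are all hypothetical, and a real ontology would normally be written in OWL rather than as Python tuples.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Glossary: a bare binding of a term to its definition.
glossary: Dict[str, str] = {
    "PersonGenderCode": "A code that identifies the gender of a person.",
}


# Data dictionary and data model: the same element gains a data type,
# enumerated values with their own definitions, and a link to a parent.
@dataclass
class DataElement:
    name: str
    definition: str
    data_type: str
    enumeration: Dict[str, str] = field(default_factory=dict)
    parent: Optional[str] = None  # taxonomy: one parent per element

    def is_valid(self, value: str) -> bool:
        """Accept any value when no enumeration is defined, else only listed codes."""
        return not self.enumeration or value in self.enumeration


gender = DataElement(
    name="PersonGenderCode",
    definition=glossary["PersonGenderCode"],
    data_type="string",
    enumeration={"F": "Female", "M": "Male", "U": "Unknown"},  # hypothetical codes
    parent="Code",
)

# Ontology: every element and relationship type is named by a URI.
BASE = "http://example.org/registry#"  # hypothetical namespace
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"  # RDFS property

Triple = Tuple[str, str, str]
triples: Set[Triple] = {
    (BASE + "PersonGenderCode", SUBCLASS_OF, BASE + "Code"),
    (BASE + "PersonGenderCode", BASE + "describes", BASE + "Person"),
}


def objects(subject: str, predicate: str) -> Set[str]:
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}
```

Each step keeps the information of the previous one and adds something a machine can act on: the glossary supports lookup only, the enumerated element supports validation, and the URI-named triples support queries over relationships.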
Typical questions for determining semantic precision
The following is a list of questions that may arise in determining semantic precision.
;correctness: How can correct syntax and semantics be enforced? Are tools (such as XML Schema) readily available to validate the syntax of data exchanges?
;adequacy/expressivity/scope: Does the system represent everything that is of practical use for the purpose? Is an emphasis being placed on data that is externalized (exposed or transferred between systems)?
;efficiency: How efficiently can the representation be searched or queried and, possibly, reasoned on?
;complexity: How steep is the learning curve for defining new concepts, querying for them or constraining them? Are there appropriate tools for simplifying typical workflows? (See also: ontology editor)
;translatability: Can the representation easily be transformed (e.g. by vocabulary-based transformation) into an equivalent representation so that semantic equivalence is ensured?
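The translatability question can be made concrete with a small sketch of a vocabulary-based transformation. It assumes that semantic equivalence statements are held as a simple mapping from element names in a source vocabulary to element names in a target vocabulary. The element names and the `transform` function are hypothetical and are not taken from any particular registry product.

```python
from typing import Dict, List, Tuple

# Hypothetical semantic equivalence statements: source element -> target element.
EQUIVALENT: Dict[str, str] = {
    "GivenName": "first_name",
    "FamilyName": "last_name",
}


def transform(record: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Rename each field using the equivalence table.

    Fields with no declared equivalent are reported rather than silently
    dropped, so that gaps in the vocabulary mapping stay visible.
    """
    translated: Dict[str, str] = {}
    unmapped: List[str] = []
    for name, value in record.items():
        target = EQUIVALENT.get(name)
        if target is None:
            unmapped.append(name)
        else:
            translated[target] = value
    return translated, unmapped


source = {"GivenName": "Ada", "FamilyName": "Lovelace", "Nickname": "AAL"}
translated, unmapped = transform(source)
```

Here `translated` holds the two mapped fields under their target names and `unmapped` lists `Nickname`, which has no equivalence statement. A transformation of this kind preserves meaning only to the extent that the equivalence statements themselves are correct.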
Determining location on the semantic spectrum
Many organizations today are building a metadata registry to store their data definitions and to perform metadata publishing. The question of where they are on the semantic spectrum frequently arises. To determine where your systems are, some of the following questions are frequently useful.
# Is there a centralized glossary of terms for the subject matter?
# Does the glossary of terms include precise definitions for each term?
# Is there a central repository to store data elements that includes data type information?
# Is there an approval process associated with the creation of and changes to data elements?
# Are coded data elements fully enumerated? Does each enumeration have a full definition?
# Is there a process in place to remove duplicate or redundant data elements from the metadata registry?
# Are there one or more classification schemes used to classify data elements?
# Are document exchanges and web services created using the data elements?
# Can the central metadata registry be used as part of a model-driven architecture?
# Are there staff members trained to extract data elements that can be reused in metadata structures?
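Some of these questions, such as whether coded data elements are fully enumerated and whether duplicate data elements exist, can be checked mechanically once the registry is machine-readable. The following is a minimal sketch under the assumption that the registry is available as a dictionary of entries; the entry layout, the element names, and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical registry: element name -> definition and enumerated codes.
registry: Dict[str, dict] = {
    "OrderStatusCode": {
        "definition": "The processing state of an order.",
        "enumeration": {"O": "Open", "C": ""},  # code "C" lacks a definition
    },
    "OrderStateCode": {
        "definition": "The processing  state of an order.",
        "enumeration": {"O": "Open", "C": "Closed"},
    },
    "CustomerName": {
        "definition": "The full name of a customer.",
        "enumeration": {},
    },
}


def undefined_codes(reg: Dict[str, dict]) -> Dict[str, List[str]]:
    """Map each element to its enumerated codes that have an empty definition."""
    report: Dict[str, List[str]] = {}
    for name, entry in reg.items():
        missing = sorted(
            code for code, text in entry["enumeration"].items() if not text.strip()
        )
        if missing:
            report[name] = missing
    return report


def duplicate_candidates(reg: Dict[str, dict]) -> List[List[str]]:
    """Group elements whose definitions match after normalizing case and spacing."""
    by_definition: Dict[str, List[str]] = defaultdict(list)
    for name, entry in reg.items():
        key = " ".join(entry["definition"].lower().split())
        by_definition[key].append(name)
    return [sorted(names) for names in by_definition.values() if len(names) > 1]
```

Matching on normalized definition text is a deliberately crude heuristic. It only flags candidates for human review and will miss duplicates that are worded differently.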
Strategic nature of semantics
Today, much of the World Wide Web is stored as Hypertext Markup Language. Search engines are severely hampered by their inability to understand the meaning of published web pages. These limitations have led to the advent of the Semantic Web movement.
In the past, many organizations that created custom database applications used isolated teams of developers that did not formally publish their data definitions. These teams frequently used internal data definitions that were incompatible with other computer systems. This made enterprise application integration and data warehousing extremely difficult and costly. Many organizations today require that teams consult a centralized data registry before new applications are created.
The job title of an individual who is responsible for coordinating an organization's data is data architect.
History
The first reference to this term was at the 1999 AAAI Ontologies Panel. The panel was organized by Chris Welty, who at the prodding of Fritz Lehmann and in collaboration with the panelists (Fritz, Mike Uschold, Mike Gruninger, and Deborah McGuinness) came up with a "spectrum" of kinds of information systems that were, at the time, referred to as ontologies. The "ontology spectrum" picture appeared in print in the introduction to ''Formal Ontology and Information Systems: Proceedings of the 2001 Conference''. The ontology spectrum was also featured in a talk by Deborah McGuinness at the Semantics for the Web meeting at Dagstuhl in 2000. McGuinness produced a paper describing the points on that spectrum, which appeared in the book that emerged (much later) from that workshop, called ''Spinning the Semantic Web''. Later, Leo Obrst extended the spectrum into two dimensions (which technically is no longer a spectrum) and added considerably more detail, which was included in his book, ''The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management''.
The concept of semantic precision in business systems was popularized by Dave McComb in his 2003 book ''Semantics in Business Systems: The Savvy Managers Guide'', where he frequently uses the term "semantic precision".
This discussion centered around a 10-level partition that included the following levels (listed in order of increasing semantic precision):
# Simple Catalog of Data Elements
# Glossary of Terms and Definitions
# Thesauri, Narrow Terms, Relationships
# Informal "Is-a" relationships
# Formal "Is-a" relationships
# Formal instances
# Frames (properties)
# Value Restrictions
# Disjointness, Inverse, Part-of
# General Logical Constraints
Note that there was a special emphasis on adding formal ''is-a'' relationships to the spectrum, which seems to have been dropped.
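The step from informal to formal ''is-a'' relationships, and the later disjointness level, can be sketched in a few lines. The example below is illustrative only. It assumes a single-parent hierarchy with no cycles, and the concept names and function names are hypothetical. A formal ''is-a'' is treated as transitive, and a declared disjointness is treated as inherited by subclasses.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical hierarchy: child concept -> parent concept (no cycles assumed).
IS_A: Dict[str, str] = {"Sedan": "Car", "Car": "Vehicle", "Truck": "Vehicle"}

# Declared disjointness: no instance may belong to both concepts in a pair.
DISJOINT: Set[FrozenSet[str]] = {frozenset({"Car", "Truck"})}


def ancestors(concept: str) -> List[str]:
    """Walk the parent links upward and return them nearest first."""
    result: List[str] = []
    while concept in IS_A:
        concept = IS_A[concept]
        result.append(concept)
    return result


def is_a(child: str, parent: str) -> bool:
    """Formal is-a: reflexive and transitive over the parent links."""
    return child == parent or parent in ancestors(child)


def are_disjoint(a: str, b: str) -> bool:
    """Two concepts are disjoint if they, or any of their ancestors, are declared so."""
    for x in [a] + ancestors(a):
        for y in [b] + ancestors(b):
            if frozenset({x, y}) in DISJOINT:
                return True
    return False
```

With an informal ''is-a'', a tool cannot safely conclude that a `Sedan` is a `Vehicle`. With the formal version, that conclusion and the conclusion that a `Sedan` cannot also be a `Truck` both follow from the declarations.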
The company Cerebra has also popularized this concept by describing the data formats that exist within an enterprise in terms of their ability to store semantically precise metadata. Their list includes:
# HTML
# PDF
# Word Processing documents
# Microsoft Excel
# Relational databases
# XML
# XML Schema
# Taxonomies
# Ontologies
What these concepts have in common is the ability to store information with increasing precision to facilitate intelligent agents.
See also
* Enterprise messaging system
* Semantics
* SKOS
* Web service
* Classification scheme (information science)
References
* ''Semantics in Business Systems: The Savvy Managers Guide'', Dave McComb, 2003
* ''Ontologies Come of Age'', Deborah L. McGuinness. Figure 2 includes the Ontological Spectrum.