OntoClean is a methodology for analyzing ontologies based on formal, domain-independent properties of classes (the metaproperties), developed by Nicola Guarino and Chris Welty.

Overview and History

OntoClean was the first attempt to formalize notions of ontological analysis for information systems. The idea was to justify the kinds of decisions that experienced ontology builders make, and to explain the common mistakes of the inexperienced. Alan Rector, during a debate at the KR-2002 conference in Toulouse, said, "What you have done is reduce the amount of time I spend arguing with medics."

The notions Guarino & Welty focused on were drawn from philosophical ontology. They were not after the seemingly endless arguments about what the right ontology of the universe is, but rather the techniques these philosophers use to analyze, support, and criticize each other's arguments. These techniques make very little, if any, commitment to a particular ontology; instead, they expose what are often very subtle distinctions.

The ideas underlying OntoClean first appeared in the literature in a series of three papers published in 2000. The name ''OntoClean'' does not appear in the literature until 2002. According to Thomson-ISI, work on OntoClean was the most cited of academic papers on ontology. OntoClean was important as the first formal methodology for ontology engineering, applying scientific principles to a field whose practice was mostly art.

Note on terminology

In logic, a property is a unary predicate in intension; in other words, a property is ''what it means'' to be a member of a class. For example, we say that instances of the Person class have the property of "being a person." In the semantic web, by contrast, a property is a binary relation. The distinction between property and class is subtle and probably not critical to understanding OntoClean. This article follows the OntoClean publications and consistently uses "property" in its original, logical meaning, so one can treat "property" and "class" as synonymous. Thus a metaproperty is a property of a property or class.

Metaproperties

OntoClean is based on domain-independent properties of classes, the OntoClean metaproperties: identity, unity, rigidity, and dependence. Later work by Welty & Andersen added two more metaproperties: permanence and actuality.

Identity

Identity is fundamental to ontology, and especially to information systems ontologies. Identity is well known in metaphysics and in database conceptual modeling. In the latter case, it is an accepted best practice to specify a primary key for rows in a table. If "two" rows have identical primary keys, they are considered the same row.

More important for ontology are questions of identity that expose the existence of, or at least the need to represent, other entities. Here the issue at stake is finding the conditions under which a proposed entity would be both the same and different. The classic example is an amount of clay that is shaped into a statue. If you use the ''same'' clay but reshape it into a ''different'' statue, is it the same entity? If so, how could it be ''different''? If not, how could it be ''the same''? In conceptual modeling, it is understood that when such an ambiguity arises, one should treat it as two different entities to account for a situation where one changes and the other stays the same.

In OntoClean, ''identity criteria'' are associated with, or carried by, some classes of entities, called ''sortals''. A sortal is a class all of whose instances are identified in the same way. In information systems, these criteria are often extrinsic, like a social security number or universally unique ID, which is not interesting from an ontological point of view. Identity criteria should be ''informative'': they should help us and others understand what a class means. A triangle, for example, can be identified by the lengths of its three sides, or by two sides and an interior angle, etc. This says a lot about what is ''intended'' by the triangle class here, e.g. the same triangle could be in many places at the same time. Someone else may have an ontology in which the triangle class has different identity criteria, such that different drawings are always different triangles, even if they are the same size.

Identity criteria (and OntoClean, for that matter) do not tell you that one of these definitions of triangle is right or wrong, just that they are different and thus that the classes are different. Identity criteria and sortals are intuitively meant to account for the linguistic habit of associating identity with certain classes. In the classical statue and clay example, we naturally say "the same ''clay''" or "the same ''statue''", indicating that there are identity criteria that are peculiar to each class.

''Being a sortal'' is the first OntoClean metaproperty, indicated with the +I superscript (−I for non-sortals) on a class in the original notation. +I (but not −I) is inherited down the class hierarchy: if a class is a sortal, then all its subclasses are as well.
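The triangle example can be made concrete with a small sketch. The names below (`Drawing`, `shape_key`, `drawing_key`) are hypothetical and not part of OntoClean; each key function simply stands in for one possible identity criterion, and the sketch counts how many distinct triangles a pair of drawings yields under each.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Drawing:
    """A hypothetical drawn triangle: side lengths plus where it was drawn."""
    sides: Tuple[float, float, float]
    location: str


def shape_key(d: Drawing) -> Tuple[float, ...]:
    # Criterion A: a triangle is identified by its three side lengths alone.
    return tuple(sorted(d.sides))


def drawing_key(d: Drawing) -> Tuple[Tuple[float, ...], str]:
    # Criterion B: each drawing is its own triangle, so location matters too.
    return (tuple(sorted(d.sides)), d.location)


drawings = [
    Drawing((3.0, 4.0, 5.0), "whiteboard"),
    Drawing((5.0, 3.0, 4.0), "notebook"),
]

# Collect the distinct identities produced by each criterion.
by_shape = {shape_key(d) for d in drawings}
by_drawing = {drawing_key(d) for d in drawings}

print(len(by_shape), len(by_drawing))
```

Under criterion A the two drawings are one triangle that exists in two places; under criterion B they are two triangles of the same size. Neither is wrong, but the two criteria define different classes.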

Unity

There are certain properties that hold only of individuals that are ''wholes''. In formal ontology, wholes are often distinguished from ''mere sums'', which are individuals whose boundaries are, in a sense, arbitrary. For example, consider the class ''clay''. An instance of this class might be some amount of the material (this is only one possible meaning, of course), such that any (in fact, every) arbitrary subsection of the amount would be a different instance of the same class. By contrast, instances of the class Person are, typically, not decomposable in this fashion.

For the purposes of OntoClean, wholes are individuals all of whose parts are related to each other, and only to each other, by some distinguished relation. This relation can be viewed as a ''generalized connection'' relation. Mere sums have no such relation, since any decomposition of a mere sum is connected to any larger sum, which is not one of its parts, by the same relation.

Unity is the metaproperty, indicated by +U, of classes all of whose individuals are wholes under the same relation. As with identity, OntoClean does not require that the relation itself be specified; often it is enough to know that the relation exists. Intuitively, a class has unity if all its instances are the same type of whole; this is typically true of classes of natural objects. Non-unity, indicated by −U, is the metaproperty of classes whose instances are not all wholes, or not all wholes by the same relation. A further and more useful refinement of non-unity is anti-unity, indicated by ~U, the metaproperty of classes none of whose instances are wholes, such as classes of mere sums. +U and ~U (but not −U) are inherited down the class hierarchy.

Rigidity

Leibniz's law, which holds that identical entities share all their properties, makes good sense when first considered. However, it does not take long to see how considerations of time cause problems between most ontologies (especially semantic web ontologies) and Leibniz's law. For example, I might have a beard on one day and shave it off the next, yet I am the same entity at both times. How is it possible for me to be the ''same'' if I have ''changed''? There are many logical approaches to this classic dilemma (including simply ignoring it). The most common is to consider some properties to be ''essential'': an essential property (and, q.v. terminology above, properties are unary predicates) of an entity is a property that cannot change, and these are the properties for which Leibniz's law holds. Other properties of an entity, those that can change, are non-essential and cannot be involved in identity.

Some properties are essential to all their instances. Think of the property of ''being a person'', usually represented by the class Person. For every entity that has this property, the property is essential. So at least one of the properties that has not changed about me when I shave my beard is that I am a person. These properties, which are essential to all their instances, are ''rigid properties''. Rigid properties are designated by +R, and properties that are not rigid by −R. An important specialization of non-rigid properties is ''anti-rigid'' properties (~R), which are properties that must be changeable. Think of ''being a student'': every student must be able to cease being a student. ~R (but not −R or +R) is inherited down the class hierarchy.

Note that these are just examples; it is certainly possible to have an ontology in which Person is anti-rigid. Imagine an ontology of mystical beliefs, for example, in which an entity changes from Person to Spirit upon death. In order for the individual to be the same across this change, being a person must not be essential and furthermore must be changeable (i.e. anti-rigid).

Rigidity should not be confused with Kripke's notion of rigid designators, which are particulars. The term "rigid" in OntoClean is meant to describe the instanceOf link between an individual and a rigid class: it cannot be broken.
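The three rigidity metaproperties can be stated compactly in modal notation. The following is a sketch of one common rendering, not a quotation from the OntoClean papers. It assumes that $\phi$ is a property (a unary predicate), that $\Box$ reads "necessarily", and that $\Diamond$ reads "possibly".

```latex
\begin{align*}
\text{rigid } (+R):           &\quad \forall x\, \bigl(\phi(x) \rightarrow \Box\, \phi(x)\bigr) \\
\text{non-rigid } (-R):       &\quad \exists x\, \bigl(\phi(x) \wedge \Diamond\, \neg\phi(x)\bigr) \\
\text{anti-rigid } ({\sim}R): &\quad \forall x\, \bigl(\phi(x) \rightarrow \Diamond\, \neg\phi(x)\bigr)
\end{align*}
```

On this reading, an anti-rigid property with at least one instance is also non-rigid, which matches the description of ~R as a specialization of −R. It also suggests why ~R is inherited: if a subclass is necessarily included in an anti-rigid class, any instance that can leave the superclass can thereby leave the subclass as well.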

Dependence

Dependence is a varied notion. In the core OntoClean papers, Guarino & Welty used a kind of dependence that captures a metaproperty of certain relational roles. A property is dependent if each instance of it implies the existence of another entity. The property Student, for example, is dependent, since to be a student there must be a teacher; for every instance of Student there is at least one instance of Teacher. In later work for DOLCE, this was noted to subsume two kinds of property dependence: ''specific'' constant dependence and ''generic'' constant dependence. The former accounts for dependence on specific entities, e.g. each person is dependent on having a particular brain. The latter accounts for the Student/Teacher case, where any instance of Teacher will do.

There are many other kinds of dependence; see Fine and Smith, 1983, and especially Simons, 1987 (Simons, P., 1987, Parts: A Study in Ontology, Oxford: Clarendon Press). It is an open problem to adapt them into the OntoClean framework. Being dependent is indicated with +D, being independent with −D. +D (but not −D) is inherited down the class hierarchy.
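The inheritance rules stated for the four metaproperties can be turned into a simple consistency check over a class hierarchy. The sketch below is illustrative and not an implementation taken from the OntoClean papers. The function name `find_violations`, the dictionary layout, and the tags assigned to the example classes are all assumptions, and the ASCII `-` stands in for the "−" used in the text. A class is flagged when one of its own tags contradicts a tag it would inherit from an ancestor.

```python
# Metaproperty values that the text says are inherited by subclasses.
INHERITED = {"+I", "+U", "~U", "~R", "+D"}

# For each inherited value, the values a subclass cannot then carry.
# (-U under ~U and -R under ~R are weaker, not contradictory, so they pass.)
CONFLICTS = {
    "+I": {"-I"},
    "+U": {"-U", "~U"},
    "~U": {"+U"},
    "~R": {"+R"},
    "+D": {"-D"},
}


def find_violations(parents, tags):
    """Return sorted (class, ancestor, inherited_tag, conflicting_tag) tuples."""
    violations = []
    for cls in tags:
        seen = set()
        stack = list(parents.get(cls, []))
        while stack:  # walk every ancestor of cls exactly once
            anc = stack.pop()
            if anc in seen:
                continue
            seen.add(anc)
            stack.extend(parents.get(anc, []))
            for tag in tags.get(anc, set()) & INHERITED:
                for bad in CONFLICTS[tag] & tags[cls]:
                    violations.append((cls, anc, tag, bad))
    return sorted(violations)


# Hypothetical tags; the hierarchy deliberately puts Person under Student.
tags = {
    "LivingBeing": {"+I", "+R"},
    "Student": {"+I", "~R", "+D"},
    "Person": {"+I", "+R"},
}
parents = {"Person": ["Student"], "Student": ["LivingBeing"], "LivingBeing": []}
violations = find_violations(parents, tags)

# The same classes with Student placed under Person instead.
fixed_parents = {"Student": ["Person"], "Person": ["LivingBeing"], "LivingBeing": []}
fixed_violations = find_violations(fixed_parents, tags)

print(violations, fixed_violations)
```

In the first hierarchy, Person would inherit ~R from Student while being tagged +R, so the check reports that pair. Reversing the link removes the conflict, because +R is not inherited and nothing prevents an anti-rigid class from specializing a rigid one.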
