Glossary

This glossary summarizes the terminology of methods and techniques for defining, sharing, and merging ontologies. These definitions, which were written by John F. Sowa, are based on discussions in the ontology working group of the NCITS T2 Committee on Information Interchange and Interpretation.

alignment.

A mapping of concepts and relations between two ontologies A and B that preserves the partial ordering by subtypes in both A and B. If an alignment maps a concept or relation x in ontology A to a concept or relation y in ontology B, then x and y are said to be equivalent. The mapping may be partial: there could be many concepts in A or B that have no equivalents in the other ontology. Before two ontologies A and B can be aligned, it may be necessary to introduce new subtypes or supertypes of concepts or relations in either A or B in order to provide suitable targets for alignment. No other changes to the axioms, definitions, proofs, or computations in either A or B are made during the process of alignment. Alignment does not depend on the choice of names in either ontology. For example, an alignment of a Japanese ontology to an English ontology might map the Japanese concept Go to the English concept Five. Meanwhile, the English concept for the verb go would not have any association with the Japanese concept Go.
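
The order-preservation requirement can be checked mechanically. The following is a minimal sketch, not a definitive implementation: it assumes each ontology is given as a dictionary from a type to its direct supertypes, and it reads "preserves the partial ordering" as requiring that two mapped types stand in the subtype relation in A exactly when their images do in B. The names is_subtype and preserves_order, and the toy types Kazu, Number, and Action, are hypothetical.

```python
def is_subtype(supertypes, x, y):
    """Return True if x equals y or y is reachable from x via supertype links."""
    seen, stack = set(), [x]
    while stack:
        t = stack.pop()
        if t == y:
            return True
        if t not in seen:
            seen.add(t)
            stack.extend(supertypes.get(t, ()))
    return False


def preserves_order(mapping, supertypes_a, supertypes_b):
    """Check that a (possibly partial) mapping from A to B keeps the subtype ordering."""
    for x in mapping:
        for y in mapping:
            in_a = is_subtype(supertypes_a, x, y)
            in_b = is_subtype(supertypes_b, mapping[x], mapping[y])
            if in_a != in_b:
                return False
    return True


# Hypothetical toy ontologies: each type maps to its set of direct supertypes.
japanese = {"Go": {"Kazu"}, "Kazu": set()}
english = {"Five": {"Number"}, "Number": set(), "Go": {"Action"}, "Action": set()}

by_meaning = {"Go": "Five", "Kazu": "Number"}  # Japanese Go maps to English Five
by_name = {"Go": "Go", "Kazu": "Number"}       # naive mapping by identical names

print(preserves_order(by_meaning, japanese, english))  # True
print(preserves_order(by_name, japanese, english))     # False
```

The mapping by name fails because the English verb Go is not a subtype of Number, which illustrates why alignment does not depend on the choice of names.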

differentiae.

The properties, features, or attributes that distinguish a type from other types that have a common supertype. The term comes from Aristotle's method of defining new types by stating the genus or supertype and stating the differentiae that distinguish the new type from its supertype. Aristotle's method of definition has become the de facto standard for natural language dictionaries, and it is also widely used for AI knowledge bases and object-oriented programming languages. For a discussion and comparison of various methods of definition, see the notes on definitions by Norman Swartz.

formal ontology.

A terminological ontology whose categories are distinguished by axioms and definitions stated in logic or in some computer-oriented language that could be automatically translated to logic. There is no restriction on the complexity of the logic that may be used to state the axioms and definitions. The distinction between terminological and formal ontologies is one of degree rather than kind. Formal ontologies tend to be smaller than terminological ontologies, but their axioms and definitions can support more complex inferences and computations. The two major contributors to the development of formal ontology are the philosophers Charles Sanders Peirce and Edmund Husserl. Examples of formal ontologies include theories in science and mathematics, the collections of rules and frames in an expert system, and specification of a database schema in SQL.

hierarchy.

A partial ordering of entities according to some relation. A type hierarchy is a partial ordering of concept types by the type-subtype relation. In lexicography, the type-subtype relation is sometimes called the hypernym-hyponym relation. A meronomy is a partial ordering of concept types by the part-whole relation. Classification systems sometimes use a broader-narrower hierarchy, which mixes the type and part hierarchies: a type A is considered narrower than B if A is a subtype of B or any instance of A is a part of some instance of B. For example, Cat and Tail are both narrower than Animal, since Cat is a subtype of Animal and a tail is a part of an animal. A broader-narrower hierarchy may be useful for information retrieval, but the two kinds of relations should be distinguished in a knowledge base because they have different implications.
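
A small sketch of why the two relations should be kept apart, using the Cat, Tail, and Animal example. It assumes only direct links stored as pairs and ignores transitive closure. The function names are hypothetical, and the choice of property inheritance as the "different implication" is an illustrative assumption rather than something the definition states.

```python
# Hypothetical direct links, taken from the Cat/Tail/Animal example.
SUBTYPE_OF = {("Cat", "Animal")}
PART_OF = {("Tail", "Animal")}


def narrower_kind(a, b):
    """Return 'subtype', 'part', or None for the ordered pair (a, b)."""
    if (a, b) in SUBTYPE_OF:
        return "subtype"
    if (a, b) in PART_OF:
        return "part"
    return None


def is_narrower(a, b):
    """Broader-narrower test: true for either kind of link."""
    return narrower_kind(a, b) is not None


def inherits_properties(a, b):
    """Assumption: only a subtype link licenses inheriting the properties of b."""
    return narrower_kind(a, b) == "subtype"


print(is_narrower("Cat", "Animal"), is_narrower("Tail", "Animal"))
print(inherits_properties("Cat", "Animal"), inherits_properties("Tail", "Animal"))
```

Both pairs pass the broader-narrower test, which is enough for retrieval, but only the subtype pair supports the inference that a cat has the properties of an animal.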

identity conditions.

The conditions that determine whether two different appearances of an object represent the same individual. Formally, if c is a subtype of Continuant, the identity conditions for c can be represented by a predicate Id_c. Two instances x and y of type c, which may appear at different times and places, are considered to be the same individual if Id_c(x,y) is true. As an example, a predicate Id_Human, which determines the identity conditions for the type HumanBeing, might be defined by facial appearance, fingerprints, DNA, or some combination of all those features. At the atomic level, the laws of quantum mechanics make it difficult or impossible to define precise identity conditions for entities like electrons and photons. If a reliable identity predicate Id_t cannot be defined for some type t, then t would be considered a subtype of Occurrent rather than Continuant.

integration.

The process of finding commonalities between two different ontologies A and B and deriving a new ontology C that facilitates interoperability between computer systems that are based on the A and B ontologies. The new ontology C may replace A or B, or it may be used only as an intermediary between a system based on A and a system based on B. Depending on the amount of change necessary to derive C from A and B, different levels of integration can be distinguished: alignment, partial compatibility, and unification. Alignment is the weakest form of integration: it requires minimal change, but it can only support limited kinds of interoperability. It is useful for classification and information retrieval, but it does not support deep inferences and computations. Partial compatibility requires more changes in order to support more extensive interoperability, even though there may be some concepts or relations in one system or the other that could create obstacles to full interoperability. Unification or total compatibility may require extensive changes or major reorganizations of A and B, but it can result in the most complete interoperability: everything that can be done with one can be done in an exactly equivalent way with the other.

knowledge base.

An informal term for a collection of information that includes an ontology as one component. Besides an ontology, a knowledge base may contain information specified in a declarative language such as logic or expert-system rules, but it may also include unstructured or unformalized information expressed in natural language or procedural code.

lexicon.

A knowledge base about some subset of words in the vocabulary of a natural language. One component of a lexicon is a terminological ontology whose concept types represent the word senses in the lexicon. The lexicon may also contain additional information about the syntax, spelling, pronunciation, and usage of the words. Besides conventional dictionaries, lexicons include large collections of words and word senses, such as WordNet from Princeton University and EDR from the Japan Electronic Dictionary Research Institute, Ltd. Other examples include classification schemes, such as the Library of Congress subject headings or the Medical Subject Headings (MeSH).

mixed ontology.

An ontology in which some subtypes are distinguished by axioms and definitions, but other subtypes are distinguished by prototypes. The top levels of a mixed ontology would normally be distinguished by formal definitions, but some of the lower branches might be distinguished by prototypes.

partial compatibility.

An alignment of two ontologies A and B that supports equivalent inferences and computations on all equivalent concepts and relations. If A and B are partially compatible, then any inference or computation that can be expressed in one ontology using only the aligned concepts and relations can be translated to an equivalent inference or computation in the other ontology.

primitive.

A category of an ontology that cannot be defined in terms of other categories in the same ontology. An example of a primitive is the concept type Point in Euclid's geometry. The meaning of a primitive is not determined by a closed-form definition, but by axioms that specify how it is related to other primitives. A category that is primitive in one ontology might not be primitive in a refinement of that ontology.

prototype-based ontology.

A terminological ontology whose categories are distinguished by typical instances or prototypes rather than by axioms and definitions in logic. For every category c in a prototype-based ontology, there must be a prototype p and a measure of semantic distance d(x,y,c), which computes the dissimilarity between two entities x and y when they are considered instances of c. Then an entity x can be classified by the following recursive procedure:

1. Suppose that x has already been classified as an instance of some category c, which has subcategories s_1, ..., s_n.
2. For each subcategory s_i with prototype p_i, measure the semantic distance d(x, p_i, c).
3. If d(x, p_i, c) has a unique minimum value for some subcategory s_i, then classify x as an instance of s_i, and call the procedure recursively to determine whether x can be further classified by some subcategory of s_i.
4. If c has no subcategories or if d(x, p_i, c) has no unique minimum for any s_i, then the classification procedure stops with x as an instance of c, since no finer classification is possible with the given selection of prototypes.
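
The procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions: entities and prototypes are dictionaries of numeric features, and the distance is a weighted sum of absolute differences whose weights depend on the category c. The feature names, the weights, and the category tree are all hypothetical, and in practice the distance measure might be learned from examples instead.

```python
def classify(x, c, subcategories, prototypes, distance):
    """Descend from category c while some subcategory's prototype is uniquely nearest to x."""
    subs = subcategories.get(c, [])
    if not subs:
        return c  # c has no subcategories
    scored = [(distance(x, prototypes[s], c), s) for s in subs]
    best = min(d for d, _ in scored)
    winners = [s for d, s in scored if d == best]
    if len(winners) != 1:
        return c  # no unique minimum, so no finer classification
    return classify(x, winners[0], subcategories, prototypes, distance)


# Hypothetical data: which features matter depends on the category c.
WEIGHTS = {
    "Animal": {"legs": 1, "wings": 1, "darkness": 0},
    "Cat": {"legs": 0, "wings": 0, "darkness": 1},
}
SUBCATEGORIES = {"Animal": ["Cat", "Bird"], "Cat": ["BlackCat", "OrangeCat"]}
PROTOTYPES = {
    "Cat": {"legs": 4, "wings": 0, "darkness": 0.5},
    "Bird": {"legs": 2, "wings": 2, "darkness": 0.5},
    "BlackCat": {"legs": 4, "wings": 0, "darkness": 1.0},
    "OrangeCat": {"legs": 4, "wings": 0, "darkness": 0.3},
}


def distance(x, y, c):
    """Weighted dissimilarity of x and y when both are viewed as instances of c."""
    w = WEIGHTS[c]
    return sum(w[f] * abs(x[f] - y[f]) for f in w)


dark_cat = {"legs": 4, "wings": 0, "darkness": 0.9}
ambiguous = {"legs": 3, "wings": 1, "darkness": 0.5}

print(classify(dark_cat, "Animal", SUBCATEGORIES, PROTOTYPES, distance))   # BlackCat
print(classify(ambiguous, "Animal", SUBCATEGORIES, PROTOTYPES, distance))  # Animal
```

The second entity is equally distant from the Cat and Bird prototypes, so the procedure stops at Animal.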

As an example, a black cat and an orange cat would be considered very similar as instances of the category Animal, since their common catlike properties would be the most significant for distinguishing them from other kinds of animals. But in the category Cat, they would share their catlike properties with all the other kinds of cats, and the difference in color would be more significant. In the category BlackEntity, color would be the most relevant property, and the black cat would be closer to a crow or a lump of coal than to the orange cat. Since prototype-based ontologies depend on examples, it is often convenient to derive the semantic distance measure by a method that learns from examples, such as statistics, cluster analysis, or neural networks.

Quine's criterion.

A test for determining the implicit ontology that underlies any language, natural or artificial. The philosopher Willard van Orman Quine proposed a criterion that has become famous: "To be is to be the value of a quantified variable." That criterion makes no assumptions about what actually exists in the world. Its purpose is to determine the implicit assumptions made by the people who use some language to talk about the world. As stated, Quine's criterion applies directly to languages like predicate calculus that have explicit variables and quantifiers. But Quine extended the criterion to languages of any form, including natural languages, in which the quantifiers and variables are not stated as explicitly as they are in predicate calculus. For English, Quine's criterion means that the implicit ontological categories are the concept types expressed by the basic content words in the language: nouns, verbs, adjectives, and adverbs.

refinement.

An alignment of every category of an ontology A to some category of another ontology B, which is called a refinement of A. Every category in A must correspond to an equivalent category in B, but some primitives of A might be equivalent to nonprimitives in B. Refinement defines a partial ordering of ontologies: if B is a refinement of A, and C is a refinement of B, then C is a refinement of A; if two ontologies are refinements of each other, then they must be isomorphic.

semantic factoring.

The process of analyzing some or all of the categories of an ontology into a collection of primitives. Combinations of those primitives generate a hierarchy, called a lattice, which includes the original categories plus additional ones that make it more symmetric. The techniques of semantic factoring can be applied to any level of an ontology from the highest, most general concept types to the lowest, most specialized types. The methods can be automated, as in formal concept analysis, which is a systematic technique for deriving a lattice of concept types from low-level data about individual instances.
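
As a rough illustration of formal concept analysis, the sketch below enumerates the formal concepts of a tiny context by brute force. It assumes the data is a dictionary from each instance to its set of attributes, with the attributes playing the role of primitives. The function name formal_concepts and the sample data are hypothetical, and the exhaustive enumeration over subsets is chosen for clarity, not efficiency.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all (extent, intent) pairs of a context given as object -> set of attributes."""
    objects = list(context)
    all_attrs = set().union(*context.values())
    concepts = set()
    for r in range(len(objects) + 1):
        for group in combinations(objects, r):
            # Intent: the attributes shared by every object in the group.
            intent = set(all_attrs)
            for o in group:
                intent &= context[o]
            # Extent: every object that has all attributes of the intent.
            extent = {o for o in objects if intent <= context[o]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical low-level data about individual instances.
context = {
    "cat": {"animal", "furry"},
    "crow": {"animal", "feathered"},
    "coal": {"mineral"},
}

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the resulting pairs by inclusion of their extents gives the lattice. Besides one concept per instance, it contains the derived concept with extent {cat, crow} and intent {animal}, plus a top and a bottom element, which correspond to the additional categories mentioned above.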

semiotic.

The study of signs in general, their use in language and reasoning, and their relationships to the world, to the agents who use them, and to each other. It was developed independently by the logician Charles Sanders Peirce, who called it semeiotic, and by the linguist Ferdinand de Saussure, who called it sémiologie; other variants are the terms semiotics and semiology. Peirce developed semiotic into a rich, highly nuanced foundation for formal ontology, starting with three metalevel categories, which he called Firstness, Secondness, and Thirdness. Specialized examples of these categories include Aristotle's triad of Inherence, Directedness, and Containment in Figure 1 and the triad of Independent, Relative, and Mediating in Figure 6. One of Peirce's most famous examples is the triad of Icon, Index, and Symbol.

terminological ontology.

An ontology whose categories need not be fully specified by axioms and definitions. An example of a terminological ontology is WordNet, whose categories are partially specified by relations such as subtype-supertype or part-whole, which determine the relative positions of the concepts with respect to one another but do not completely define them. Most fields of science, engineering, business, and law have evolved systems of terminology or nomenclature for naming, classifying, and standardizing their concepts. Axiomatizing all the concepts in any such field is a Herculean task, but subsets of the terminology can be used as starting points for formalization. Unfortunately, the axioms developed from different starting points are often incompatible with one another.

unification.

A one-to-one alignment of all concepts and relations in two ontologies that allows any inference or computation expressed in one to be mapped to an equivalent inference or computation in the other. The usual way of unifying two ontologies is to refine each of them to more detailed ontologies whose categories are one-to-one equivalent.

Send comments to John F. Sowa.
