Common syntactic formats are necessary for one system to read data from another. To perform compatible operations on the data, the systems must also agree on the semantics. For over fifty years, researchers and developers have proposed methods for specifying and using semantics to support interoperability. To emphasize the community of people and projects, this web page lists some significant documents they produced in roughly chronological order. The comments show the narrative thread that relates and motivates them.
By the 1970s, the developers of database and knowledge base systems recognized that standards for semantics are necessary for systems to process shared data correctly. The database community proposed a conceptual schema for representing the semantics of shared data. The diagram on the left of Figure 1 shows the mappings from the conceptual schema to the internal schemata of different databases and the external schemata of different applications. The diagram on the right shows the conceptual schema as the heart of an application system. It specifies the types, definitions, constraints, and relationships of the data stored in the databases and used by the application programs and the human interfaces. For the knowledge bases in artificial intelligence and the Semantic Web, an ontology specified in some version of logic contains equivalent information.
Figure 1. The conceptual schema
Workshop on Data Abstraction, Databases and Conceptual Modelling, an ACM workshop in 1980 that brought together researchers in database systems, artificial intelligence, and programming languages. Although databases have grown from megabytes to terabytes and beyond, many of the problems are still open research issues. The methods that proved to be useful are running in computer systems today.
First-order theories of individual concepts and propositions, an article written by John McCarthy in 1979 and updated many times over the years. This version from 2000 is closely related to nearly all the other documents mentioned on this web page. In 1989, he proposed Elephant 2000: A programming language based on speech acts, which shows how these ideas can be adapted to programming systems. In 1993, he wrote Notes on contexts, which develops earlier ideas and adapts them to ongoing research. In 2007, he published From here to human-level AI, a summary of his thoughts about intelligent systems. In 2008, he discussed Elephant 2000 and related issues in an interview and slides.
The Cyc project, founded by Doug Lenat in 1984, designed and implemented the world’s largest and most detailed formal ontology and reasoning system. Members of the Cyc project have contributed to many of the documents cited on this web page. A large subset of the Cyc ontology and software is freely available in the OpenCyc Platform. For older Cyc reports, see a collection cited on the W3C Wiki. For a discussion of design issues, see a review of the Cyc project in 1993 and some recent observations.
The Society of Mind in 1986 and The Emotion Machine in 2006 by Marvin Minsky. In contrast to McCarthy’s emphasis on logic, Minsky emphasized the need for a diversity of heterogeneous methods. For a summary of the issues and a comparison with other methods, see Examining the society of mind by Push Singh in 2003. For a joint article by McCarthy, Minsky, and their colleagues in 2002, see An architecture of diversity for commonsense reasoning.
Conceptual Schema and Relational Database Design, by Sjir Nijssen and Terry Halpin in 1989. The Natural language Information Analysis Method (NIAM) for analyzing informal specifications to produce a conceptual schema. To illustrate NIAM, the SILC report analyzes a one-page English description to produce a two-page specification. Then it systematically maps the complex English sentences to simple sentences, a NIAM diagram, business rules, a database design, process specifications, and screen shots for an application interface.
Shared Reusable Knowledge Bases (SRKB), a DARPA-sponsored project from 1991 to 1996. These emails from 1994 to 1996 discuss many of the issues that led to the DAML project in 2001. The two major deliverables of the SRKB project were the Knowledge Interchange Format (KIF) and the Knowledge Query and Manipulation Language (KQML). The emails in the Interlingua Archive discuss design issues for KIF, many of which were addressed in similar or different ways in Common Logic and IKL.
The KIF of death, by Matt Ginsberg in 1991. An article that was highly critical of an early version of KIF. As an alternative, Ginsberg proposed a foundation in first-order logic, but with methods for extending the logic to represent other semantics. Many of the people involved with SRKB were annoyed by Ginsberg’s criticisms, but others recognized the need for more expressive power. For emails by Ginsberg and the responses to them, search for "Ginsberg" in the SRKB and Interlingua archives.
IRDS conceptual schema, a working paper from 1991 for an ANSI standard. In a coalition for logic-based standards, some participants in SRKB joined the standards committee to develop parallel standards for KIF and Conceptual Graphs. The X3T2 committee also sponsored workshops on ontology, which involved participants from the database community and the SRKB community. In 1995, the proposal for a conceptual schema ended with a technical report. But in 2007, the proposed standards for KIF and CGs were merged in the ISO/IEC standard 24707 for Common Logic.
Data Semantics (DS) conferences, sponsored by IFIP Working Group 2.6 from 1985 to 2004. These conferences addressed issues related to the conceptual schema, knowledge representation, and implementations that support DBs and KBs. Many of the participants in these conferences also wrote or contributed to other documents cited on this page. After 2004, conferences on semantic technologies and the Semantic Web covered similar issues.
Semantic networks, by John Sowa in 1992. This article in the Wiley Encyclopedia of AI surveyed the development of network notations for logic and ontology since the 1960s. It relates networks used in AI and machine translation to the foundations in logic, linguistics, and philosophy. This version has been updated with references and discussion of more recent developments in knowledge representation, linguistics, and the Semantic Web.
International Conferences on Conceptual Structures from 1993 to 2014. A series of conferences on conceptual graphs, formal concept analysis (FCA), and related issues of knowledge representation and natural language processing. FCA was developed by Bernhard Ganter and Rudolf Wille as a mathematical theory of concepts and concept hierarchies. FCA tools for constructing lattices are used to generate and check the consistency of type hierarchies for many versions of logic, including OWL.
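As a rough illustration of the FCA idea mentioned above, the following sketch enumerates the formal concepts of a tiny context by brute force. The example objects and attributes, and the function names `formal_concepts` and `is_subconcept`, are hypothetical and chosen only for illustration. Practical FCA tools use much more efficient algorithms than this exhaustive enumeration.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a formal context.

    `context` maps each object to the set of attributes it has.
    This is a naive enumeration, suitable only for very small contexts.
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        shared = set(attributes)
        for obj in objs:
            shared &= context[obj]
        return frozenset(shared)

    def extent(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(obj for obj in objects if attrs <= context[obj])

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(sorted(objects), size):
            shared = intent(subset)
            concepts.add((extent(shared), shared))
    return concepts


def is_subconcept(lower, upper):
    """A concept is a subconcept of another if its extent is a subset."""
    return lower[0] <= upper[0]


# Hypothetical example context: three objects, three attributes.
example_context = {
    "sparrow": {"animal", "flies"},
    "bat": {"animal", "flies", "mammal"},
    "dog": {"animal", "mammal"},
}

concepts = formal_concepts(example_context)
for ext, itt in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

Ordering the resulting concepts by `is_subconcept` gives the concept lattice, which can be read as a type hierarchy over the objects.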
Formalizing Contexts, a workshop at the AAAI Fall Symposium in 1995. Most of the talks and papers for this workshop addressed issues that McCarthy analyzed in his papers of 1979, 1989, and 1993. Those were among the issues addressed by the IKRIS project and the related documents in Section 2 below.
Meta Content Framework (MCF) by R. V. Guha and Tim Bray in 1997. Guha had been an associate director of Cyc, where he implemented an early version of McCarthy’s theory of contexts for the Cyc microtheories. In 1995, Guha joined Apple, where he designed MCF as a simple subset of logic with a network representation. In 1997, he joined Netscape, where he collaborated with Bray to develop an XML-based version of MCF, which was later renamed RDF (Resource Description Framework). Figure 2 is an example of MCF for describing a document.
Figure 2. An example expressed in MCF, which evolved into RDF
Semantic web development, the original proposal submitted by Tim Berners-Lee in February 2000. Its central feature was the Semantic Web Logic Language (SWeLL) as a “unifying language for classical logic.” It proposed SWeLL as an “augmented language” designed to support “the power of KR systems.” As examples of the logics that SWeLL must support, the proposal cited KIF, KQML, Prolog, LOOM, semantic networks, higher-order logics, nonmonotonic logics, and context logics.
Agent Based Computing, slides by Jim Hendler in 2000. While on a two-year assignment at DARPA, Hendler was the original project manager for DAML. In these slides, he summarized the motivation for the project and some of its intended applications. One of Hendler’s slides (Figure 3) is based on a diagram by Tim Berners-Lee. The Classical Logic Interchange Level corresponds to SWeLL in the proposal, but it adds support for fuzzy logic. In 2001, Hendler wrote a short article on Agents and the Semantic Web.
Figure 3. DAML and the Semantic Web (Hendler 2000)
An Axiomatic Semantics for RDF, RDF Schema, and DAML-ONT, by Richard Fikes and Deborah McGuinness in December 2000. An early report for the DAML project, when KIF was being used as the basis for SWeLL. But the requirement for quantifying over relations led Guha and Hayes to develop the LBase semantics.
LBase: Semantics for Languages of the Semantic Web, a W3C working group note by R. V. Guha and Patrick Hayes in 2003. This note presented LBase as a model-theoretic semantics for SWeLL, as described in the DAML proposal. But it was not approved as a W3C recommendation, since the OWL-DL designers wanted a less expressive semantics. However, LBase is consistent with RDF, RDFS, OWL Full, and the semantics adopted for Common Logic.
SCL: A Logic standard for semantic integration, by Chris Menzel and Pat Hayes. The original specification of the Common Logic semantics. It was called Simplified Common Logic (SCL) to distinguish it from an earlier version of CL based on KIF semantics. The goal of making CL compatible with LBase was the primary reason for the new semantics.
Ontologies for semantically interoperable systems, by Leo Obrst in 2003. A distinction between levels of interoperability: “weak semantics” with underspecified ontologies for information retrieval, and “strong semantics” with detailed axioms for applications that require exact conformance to precise specifications. For a longer article on related issues, see Ontologies for corporate web applications, by Obrst, Liu, and Wray.
MIT-W3C DAML program: Final Report, the results in December 2005. The “layer cake” diagrams in Figure 4 show how the Semantic Web evolved from the proposal in 2000 to a widely used slide in 2001 and the final report in 2005. In 2000, the yellow arrow for the “unifying language for classical logic” dominates the diagram. In 2001, the box labeled “Logic” has shrunk. In 2005, the box for the unifying logic is smaller than the logics it’s supposed to unify.
Figure 4. Evolution of the Semantic Web from 2000 to 2005
The Project for Interoperable Knowledge Representation for Intelligence Support (IKRIS) was sponsored by the US Department of Defense in 2005 and 2006. IKRIS brought together an impressive group of researchers with backgrounds in artificial intelligence, knowledge representation, logic, and ontology. Its primary products were the specification of the IKL language and the evaluation of IKL as a basis for interoperability among systems that use different versions of logic.
Since the IKRIS project was not funded after the first two years, the IKRIS group disbanded, and most of the documents they produced have disappeared from the WWW. Following are some of the surviving reports:
IKL Specifications, by Pat Hayes and Chris Menzel with the IKRIS Interoperability Group (Bill Andersen, Richard Fikes, Patrick Hayes, Charles Klein, Deborah McGuinness, Christopher Menzel, John Sowa, Christopher Welty, Wlodek Zadrozny).
IKL User Guide, by Pat Hayes with the IKRIS Interoperability Group.
Evaluation Working Group Report, by David A. Thurman, Alan R. Chappell, and Chris Welty with Selmer Bringsjord, Andrew J. Cowell, Jennifer J. Ockerman, Chris Deaton, Ian Harrison, John Byrnes, and Mario Inchiosa.
IKRIS Scenarios Inter-Theory (ISIT), by Jerry Hobbs with Danny Bobrow, Chris Deaton, Mike Gruninger, Pat Hayes, Arun Majumdar, David Martin, Drew McDermott, Sheila McIlraith, Karen Myers, David Morley, John Sowa, Marco Valtorta, and Chris Welty.
MITRE Support to IKRIS, Final Report, by Brant Cheikes.
Common Logic, slides by Pat Hayes in 2004. An overview of the new semantics for Common Logic (based on SCL) and its relationship to KIF, Conceptual Graphs, RDF, OWL, and IKRIS.
Computational Context Logic and Species of ist, by Selene Makarios, Karl Heuer, and Richard Fikes. Issues and proposals for the IKRIS context logic.
Context Mereology by Pat Hayes, presented at the AAAI Spring Symposium at Stanford in 2005, but with some updates in 2006. Logical issues about the use of IKL for representing contexts.
IKRIS Challenges, Schedule, and Goals, slides by Chris Welty in February 2006. IKRIS Foundation Technology and Lessons Learned, slides by Richard Fikes in March 2006. Overviews of the IKRIS project, technology, and developments.
IKL, Common Logic on Steroids, slides by Pat Hayes in 2006. IKL, A Logic for Interoperation, revised slides for a workshop on ontology. A survey of requirements for IKL; the use of IKL for representing metalanguage, contexts, quantification over functions and relations, and versions of temporal and modal logic; implications for specifying ontologies for interoperable systems.
Propositions, by John Sowa. Equivalence classes specified by meaning-preserving translations — a method used with IKL to distinguish sentences and propositions. This short article is an excerpt from Section 4 of Worlds, models, and descriptions, which shows how logics such as IKL can be used to define the semantics of modal logics.
New Developments in Interoperability, slides by Selmer Bringsjord, Andrew Shilliday, and Joshua Taylor. A summary of work in the IKRIS interoperability group and its relationship to the Slate Project at RPI. Provability-based semantic interoperability between knowledge bases and databases via translation graphs, MS thesis at RPI by Joshua Taylor. This thesis provides more detail than the slides.
Operationalizing semantic technologies, a report in 2007 by the Best Practices Committee of the Semantic Interoperability Community of Practice (SICoP). A discussion of the IKRIS project and its implications for interoperable systems, the Semantic Web, ontologies, natural language processing, Cyc, and WordNet.
Systems, capabilities, operations, programs, and enterprises (SCOPE) model for interoperability assessment, a report in 2008 for the Network-Centric Operations Industry Consortium (NCOIC). Figure 5 by Hans Polzer (page 12 of the SCOPE report) shows a map of approaches to interoperability. It’s an attempt “to position specific concepts, initiatives, technologies, and products within that space.” Methods at the top of Figure 5 address semantic issues related to the conceptual schema (Figure 1). Methods in the middle address the external schema; those at the bottom address the internal schema.
Figure 5. A map of approaches to interoperability
English expresses first-order logic with the words and, or, not, if-then, some, and every. Other languages use equivalent words to do the same. But natural languages have many syntactic and semantic features that go far beyond the expressive power of FOL. With its support for metalanguage, IKL uses a generalization of FOL to make statements about contexts and theories expressed in IKL. By using statistical, default, modal, or fuzzy information at the metalevel, IKL can express and reason with and about a wide range of logics and theories. The IKRIS project showed that this combination is sufficiently powerful to serve as an interlingua among the logics of the Semantic Web, Cyc, and other large knowledge bases.
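As a small worked example of the correspondence between those English words and FOL, consider the sentence "Every cat is on some mat." The predicate names Cat, Mat, and On are hypothetical, chosen only for this illustration:

```latex
\[
  \forall x \,\bigl(\mathrm{Cat}(x) \rightarrow
    \exists y \,(\mathrm{Mat}(y) \land \mathrm{On}(x, y))\bigr)
\]
```

Here "every" maps to the universal quantifier, "some" to the existential quantifier, the implicit if-then to the implication, and the implicit "and" to the conjunction.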
Computational logic: memories of the past and challenges for the future, by John Alan Robinson in 2000. A wide range of logics considered as “first order theories, syntactically sugared in notationally convenient forms. From this point of view, higher order logic is essentially first order set theory.” Robinson surveyed theorem proving, logic programming, and heuristic methods from the 1960s to 2000. At the end, he stated eight challenges for future research.
The ISO/IEC standard 24707 for Common Logic was approved in 2007. For a tutorial presented at a semantic technology conference in 2008, see the slides by John Sowa, Pat Hayes, and Chris Menzel. Figure 6 is a diagram used in slide 5 by Sowa and slide 2 by Hayes.
Figure 6. Common Logic as a basis for semantic interoperability
Fads and Fallacies About Logic, by John Sowa in 2007. A clarification of issues that are often confused or misstated. In particular, this article explains that the speed or efficiency in solving a problem cannot be improved by reducing the expressive power of a logic. The only effect of reducing expressive power is to make some problems, definitions, and theorems impossible to state.
Concept lattices and their applications, the many ways of implementing and using lattices of concepts, classes, types, or sorts. Among them are Formal Concept Analysis (FCA) and Order-Sorted Feature logics (OSF). For example, the CEDAR system by Hassan Aït-Kaci is a version of OSF and Constraint Logic Programming (CLP) that supports very large ontologies. For 6000 to 900,000 sorts or classes, CEDAR is comparable to the fastest OWL reasoners in constructing the hierarchy. For queries, however, CEDAR is orders of magnitude faster.
Conceptual Graphs, by John Sowa in 2008. A survey of Common Logic and the IKL extensions as expressed in the CGIF and CLIF dialects. A more recent article, From existential graphs to conceptual graphs, relates CL and IKL semantics to Peirce’s existential graphs, Kamp’s discourse representation structures, and logics for contexts, metalanguage, and modality.
A Satisfiability-Preserving Reduction of IKL to Common Logic, by Pat Hayes in 2009. A proof that any model for IKL can be mapped to a model for CL that preserves satisfiability. Hayes also shows how the IKL model theory avoids traditional paradoxes, such as “This sentence is false.” The proof depends on the IKL distinction between sentences and propositions: Any attempt to translate that English sentence to IKL produces the equivalent of “There exists a proposition p, p is true, and p says that p is false.” This sentence is false because no such proposition can exist. Therefore, the paradox vanishes.
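The argument in the entry above can be sketched in symbols. This is only an informal rendering, not IKL's concrete syntax: the truth predicate T, the operator "that" written as a function from sentences to propositions, and the schema in the second line are assumptions made for this illustration.

```latex
\begin{align*}
  &\text{Translated sentence:}
    && \exists p \,\bigl(T(p) \land p = \mathrm{that}(\lnot T(p))\bigr) \\
  &\text{Assumed schema:}
    && T(\mathrm{that}(\varphi)) \leftrightarrow \varphi \\
  &\text{Suppose such a } p \text{ exists:}
    && T(p) \ \text{and} \ p = \mathrm{that}(\lnot T(p)) \\
  &\text{Substituting for } p \text{:}
    && T(\mathrm{that}(\lnot T(p))) \\
  &\text{By the schema:}
    && \lnot T(p), \ \text{which contradicts } T(p)
\end{align*}
```

Under these assumptions no such proposition exists, so the existential sentence is simply false rather than paradoxical.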
Knowledge Representation, the World Wide Web, and the evolution of logic, by Chris Menzel in 2011. The theory, motivation, and implications of the Common Logic semantics. In Completeness theorems for logic with a single type, he presents a more detailed analysis of the theoretical issues. Menzel claims that the traditional syntax for first-order logic is a vestige of a metaphysics that “stipulates an inviolable gulf between individuals and properties.” He shows how CL supports quantifiers over functions and relations while retaining a first-order style of model theory and proof theory.
Schema.org, a large, but simple ontology developed in a collaboration of search engines (Bing, Google, Yahoo! and Yandex) with later support by the W3C. After Schema.org was announced in 2011, Guha presented a talk at Ontolog Forum to explain the motivation and applications of the ontology. The online chat discussed the contrast between the “weak” semantics of Schema.org and the “strong” or “deep” semantics of systems like Cyc. The JSON-LD notation is a humanly readable format that is compatible with Schema.org, RDF, and HTML5. It’s consistent with the original goals for MCF (Figure 2), but more detail is necessary for strong semantics.
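To make the JSON-LD notation mentioned above concrete, here is a minimal sketch that builds a Schema.org description of a document and flattens it into subject-predicate-object triples. The type `Article` and the properties `headline`, `author`, `name`, and `datePublished` are Schema.org terms. The document values, the function names, the node identifier `_:doc`, and the flattening scheme are hypothetical simplifications, not a conforming JSON-LD processor.

```python
import json


def document_as_jsonld(title, author_name, date_published):
    """Describe a document with Schema.org terms in JSON-LD form."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": title,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }


def to_triples(node, subject="_:doc"):
    """Flatten a JSON-LD-like dict into (subject, predicate, object) triples.

    Simplified for illustration: nested nodes get made-up identifiers,
    and values are kept as plain strings.
    """
    triples = []
    for key, value in node.items():
        if key == "@context":
            continue
        predicate = "rdf:type" if key == "@type" else "schema:" + key
        if isinstance(value, dict):
            child = subject + "_" + key
            triples.append((subject, predicate, child))
            triples.extend(to_triples(value, child))
        else:
            obj = "schema:" + value if key == "@type" else value
            triples.append((subject, predicate, obj))
    return triples


# Hypothetical document values, for illustration only.
doc = document_as_jsonld("An example document", "A. N. Author", "2016-01-01")
triples = to_triples(doc)

print(json.dumps(doc, indent=2))
for triple in triples:
    print(triple)
```

The triples state only what type each node has and how the nodes are related. Nothing in them says what an author or a publication date means, which is the kind of detail that strong semantics would have to add as axioms.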
Hets for Common Logic Users, by Till Mossakowski, Christian Maeder, Mihai Codescu, Eugen Kuksa, and Christoph Lange in 2013. The Heterogeneous Tool Set, which includes a parser for CLIF and mappings to theorem provers for first-order logic and higher-order logic. Hets also includes translators to CLIF from OWL and propositional logic. The report on proposed revisions to the ISO/IEC standard 24707 for Common Logic cites applications of Common Logic, tools such as Hets, and some requirements for a forthcoming revision of the CL standard.
Semantics of a foundational subset for executable UML models (fUML), by the Object Management Group in 2013. An application of Common Logic for specifying a subset of UML diagrams. The goal is to provide “maximum flexibility to modify the organization of the data without affecting the definition of an algorithm.” This statement is consistent with the goals for the conceptual schema (Figure 1).
Every logician can read, write, and speak some natural language more fluently than any version of logic. As Robinson pointed out, FOL is sufficient to express everything a digital computer can do. But much or even most of what people do, say, and think is difficult or impossible to express in any known version of logic. To program a computer, people must learn to think like a computer. A foundation in logic and ontology is essential for precision, but the human interfaces are critical for usability. The documents in this section address the challenge of designing computer systems that meet people halfway.
Tossing algebraic flowers down the great divide, by Joseph Goguen in 1997. Observations about the “great divide” between research in computer science and practice in mainstream IT. Goguen emphasized the need for ontological diversity with “support for multiple evolving ontologies.” He noted “that translations among such theories will necessarily be partial and incomplete, and that we should provide tools to help construct such partial mappings.” To support the mappings, he applied category theory to translations among a wide range of declarative (logic based) and procedural (automata based) systems. One of the earliest applications was his PhD dissertation on fuzzy mappings; Lotfi Zadeh was his thesis adviser. A more recent application is the Distributed Ontology, Model, and Specification Language (DOL).
An architecture of diversity for commonsense reasoning, by John McCarthy, Marvin Minsky, and colleagues in 2002. This article addresses issues related to IKRIS, but it was written before IKRIS began. It proposes a “multilevel cognitive architecture” that “would exploit many existing artificial intelligence techniques for commonsense reasoning and knowledge representation, such as case-based reasoning, logic, neural nets, genetic algorithms, and heuristic search.”
Architectures for intelligent systems, by John Sowa in 2002. The design of a Flexible Modular Framework (FMF) for communication among heterogeneous modules or agents. The FMF was influenced by McCarthy’s Elephant 2000, Minsky’s Society of Mind, KQML, Common Logic, and the Linda system by Carriero and Gelernter. Versions of the FMF have been implemented in systems that support multiple languages and multiple paradigms.
Human rationality challenges universal logic, by Brian Gaines in 2010. A review of formal logic and its relationship to the methods of informal reasoning in science and everyday life. Among the methods for semi-automated knowledge acquisition, Gaines and Shaw have applied the personal construct psychology and the repertory grid by George Kelly.
The era of big data, by Michael Jordan, the president of the International Society for Bayesian Analysis in 2011. He observed that the large volumes of data are overwhelming the traditional methods of analysis. He asked “If data are our principal resource, why should having too much data cause such embarrassment?” In an interview in 2014, he discussed “cartoon models” of the brain: “There is progress at the very lowest levels of neuroscience. But for issues of higher cognition — how we perceive, how we remember, how we act — we have no idea how neurons are storing information, how they are computing, what the rules are, what the algorithms are, what the representations are, and the like.”
Why has AI failed? And how can it succeed? an article and slides by John Sowa in 2014. This article and the slides analyze trends that have limited the potential of AI technology. They extend an earlier history and survey of semantic systems. For an update on these issues, see the slides about Natural language understanding by John Sowa in December 2015. For an overview of related issues by a lexicographer and computational linguist, see “I don’t believe in word senses” by Adam Kilgarriff. The title of that article is a quotation by Sue Atkins, who devoted her career to defining word senses and who was a president of the European Association for Lexicography.
Scientism and its discontents, slides by the philosopher Susan Haack in 2016. Although philosophy is a primary source of fundamental ideas in ontology, she criticizes “scientistic” attempts to make it look scientific just by adding layers of abstract formalism. In slide 84, she says “the idea that philosophy can be conducted purely a priori is an illusion... but a seductive one.” To illustrate the point, the slide shows Bertrand Russell sitting in an armchair.
In the 1970s, Michael Stonebraker developed Ingres, a pioneering relational database system. In the 1980s, he extended it to Postgres. In 2010, he introduced the term NoSQL, which means "Not only SQL". For his views in 2018, see an interview, A Short History of Database Systems, and the slides for Ten fears about the future of the DBMS field: The Big Three: 1. Research on core issues is declining; 2. Industries ignore core issues and buy proprietary software; 3. A diarrhea of trivial papers by students and assistant professors who divide their papers into “Least Publishable Units.” The other fears are corollaries of the Big 3. Result: Fundamental R & D is abandoned.
Note: Since resources on the WWW tend to disappear, many of the documents cited above have been copied to this web site. If the authors or copyright holders have any objections, please send a note to John Sowa: