Semantic Foundations of Contexts

John F. Sowa

Abstract. In logic, the first representation of context as a formal object was by the philosopher C. S. Peirce; but for nearly eighty years, his treatment was unknown outside a small group of Peirce aficionados. In the 1980s, three different approaches led to related notions of context: Kamp's discourse representation theory; Barwise and Perry's situation semantics; and Sowa's conceptual graphs, which explicitly introduced Peirce's theories to the AI community. During the 1990s, John McCarthy and his students developed a closely related notion of context as a basis for organizing and partitioning knowledge bases. Each of the theories has distinctive, but complementary ideas that can enrich the others, but the relationships between them are far from clear. This paper discusses several approaches to the semantics of contexts and related notions in logic and model theory: the possible worlds of Leibniz, Kripke, and Montague; the model sets by Hintikka; the situations of Barwise and Perry; and the contexts by Peirce and McCarthy. It concludes with a formal theory of contexts that can support all the above as special cases.

Contents:

1. Search for a Theory of Contexts

2. Syntax of Contexts

3. Possible Worlds, Situations, and Contexts

4. Laws and Legislation

5. Peirce's Classification of Contexts

6. Nested Graph Models

7. Nesting and Flattening

References

This paper is a revised merger of two publications by Sowa (1995) and Sowa (1997b), with some new material in Sections 6 and 7. For related background on this topic, see Chapter 5 of the book Knowledge Representation.

1. Search for a Theory of Contexts

The notion of context is indispensable for any theory of meaning, but no consensus has been reached about the formal treatment of context. Some of the conflicting approaches result from an ambiguity in the informal senses of the word. Dictionaries list two major senses:

The basic sense is a section of linguistic text or discourse that surrounds some word or phrase of interest.
The derived sense is a nonlinguistic situation, environment, domain, setting, background, or milieu that includes some entity, subject, or topic of interest.

The word context may refer to the text, to the information contained in the text, to the thing that the information is about, or to the possible uses of the text, the information, or the thing itself. The ambiguity about contexts results from which of these aspects happens to be the central focus. These informal senses of the word suggest criteria for distinguishing the formal functions:

Syntax. The syntactic function of context is to group, quote, delimit, or package a section of text.
Semantics. The quoted text of a context may describe or refer to some real or hypothetical situation. That nonlinguistic referent, which may also be called the context, constitutes the derived meaning of the term.
Pragmatics. The word interest, which occurs in both senses of the English definition, suggests some reason or purpose for distinguishing the section of linguistic text or nonlinguistic situation. That purpose is the pragmatics or the reason why the text is being quoted.

The three-way classification of syntax, semantics, and pragmatics is based on Peirce's categories of Firstness, Secondness, and Thirdness. Peirce's categories, his emphasis on purpose, his graph logic with its notation for contexts, his work on modal and intentional logics, his context-dependent references or indexicals, and his notion of laws as governing Thirdness constitute a rich foundation that can help unify the disconnected research of more recent logicians.

Peirce's friend and fellow pragmatist William James (1897) gave an example of the importance of purpose in determining what should be included in a context:

Can we realize for an instant what a cross-section of all existence at a definite point of time would be? While I talk and the flies buzz, a sea gull catches a fish at the mouth of the Amazon, a tree falls in the Adirondack wilderness, a man sneezes in Germany, a horse dies in Tartary, and twins are born in France. What does that mean? Does the contemporaneity of these events with one another, and with a million others as disjointed, form a rational bond between them, and unite them into anything that means for us a world? Yet just such a collateral contemporaneity, and nothing else, is the real order of the world. It is an order with which we have nothing to do but to get away from it as fast as possible. As I said, we break it: we break it into histories, and we break it into arts, and we break it into sciences; and then we begin to feel at home. We make ten thousand separate serial orders of it, and on any one of these we react as though the others did not exist. (p. 119)

The real world or any imaginary, hypothetical, or planned world is far too big and complex for any agent, human or robotic, to comprehend in all its richness. Smaller chunks are easier to think about than infinite worlds, but for a theory of semantics, there is an even more fundamental issue than size: the purpose that explains why an agent has selected one chunk rather than another. A cat, for example, will pay much more attention to the sound of a can opener than to a human conversation or a musical selection. For any agent, purpose determines what aspects of the world constitute a meaningful situation.

Theories of semantics based on possible worlds cannot derive purpose or intention from collections of worlds, no matter how big, or from collections of situations, no matter how small. In their theory of situations, Barwise and Perry (1983) tried to use finite situations rather than infinite worlds as a basis for deriving the semantics of propositional attitude verbs, such as hope, wish, want, or fear. But the situations they used as examples were not arbitrary, random chunks of the world. For every sample situation in their book, there was an unstated reason why some agent focussed attention on that situation rather than any other. The reason why an agent selects a situation as a topic of interest is a more fundamental clue to its meaning than the situation itself.

This paper discusses several models for languages that go beyond classical first-order logic: the possible worlds of Leibniz, Kripke, and Montague; the model sets by Hintikka; the situations of Barwise and Perry; and the contexts by Peirce and McCarthy. Of these, contexts are the most convenient and computable. But contexts alone do not introduce any more purpose than possible worlds, model sets, or situations. Purpose or intentionality, which can only be introduced by some agent (who need not be human), must be incorporated into the fundamental structures of a semantic theory. In his theories of logic and semiotics, C. S. Peirce addressed the relationship between a universe of discourse or state of affairs and the intention of some agent who selects it. His approach can be combined with techniques introduced by Michael Dunn (1973), which enable modal semantics based on possible worlds to be reinterpreted in terms of the laws that determine the modality. The next step from modality to intentionality requires a shift of focus from the laws to the agents who legislate them.

2. Syntax of Contexts

In 1883, Peirce invented the algebraic notation for predicate calculus. A dozen years later, he developed a graphical notation that more clearly distinguished contexts. Figure 1 shows his graph notation for delimiting the context of a proposition to be discussed. In explaining that graph, Peirce (1898) said "When we wish to assert something about a proposition without asserting the proposition itself, we will enclose it in a lightly drawn oval." The line attached to the oval links it to a relation that makes a metalevel assertion about the nested proposition.

Figure 1: One of Peirce's graphs for talking about a proposition

The oval supports the basic syntactic function of grouping related information in a package. But besides notation, Peirce also developed a theory of the semantics and pragmatics of contexts and the rules of inference for importing and exporting information into and out of the contexts. To support first-order logic, the only necessary metalevel relation is negation. By combining negation with the existential-conjunctive subset of logic, Peirce developed his existential graphs (EGs), which are based on three logical operators and an open-ended number of relations:

Existential quantifier: A bar or linked structure of bars, called a line of identity, represents $.
Conjunction: The juxtaposition of two graphs in the same context represents Ù.
Negation: An oval enclosure with no lines attached to it represents ~ or the denial of the enclosed proposition.
Relation: Character strings represent the names of propositions, predicates, or relations, which may be attached to zero or more lines of identities.

In Figure 1, the character string "You are a good girl" is the name of a medad, which represents a proposition or zero-adic relation; the string "is much to be wished" is the name of a monad or monadic predicate or relation. In the EG on the left of Figure 2, "farmer" and "donkey" are monads; "owns" and "beats" are dyads, which represent dyadic relations. When combined with relations in all possible ways, the three logical operators can represent full first-order logic. When used to state propositions about nested contexts, they form a metalanguage that can be used to define modal and higher-order logic. For Peirce's own tutorial on existential graphs and their rules of inference, see the annotated version of manuscript 514.

To illustrate the use of negative contexts for representing FOL, Figure 2 shows an existential graph and a conceptual graph for the sentence If a farmer owns a donkey, then he beats it. This sentence is one of a series of examples used by medieval logicians to illustrate issues in mapping language to logic. The EG on the left has two ovals with no attached lines; by default, they represent negations. It also has two lines of identity, represented as linked bars: one line, which connects farmer to the left side of owns and beats, represents an existentially quantified variable ($x); the other line, which connects donkey to the right side of owns and beats represents another variable ($y).

Figure 2: EG and CG for "If a farmer owns a donkey, then he beats it."

When the EG of Figure 2 is translated to predicate calculus, farmer and donkey map to monadic predicates; owns and beats map to dyadic predicates. If a relation is attached to more than one line of identity, the lines are ordered from left to right by their point of attachment to the name of the relation. With the implicit conjunctions represented by the Ù symbol, the result is an untyped formula:

~($x)($y)(farmer(x) Ù donkey(y) Ù owns(x,y) Ù ~beats(x,y)).

In CGs, a context is defined as a concept whose referent field contains nested conceptual graphs. Since every context is also a concept, it can have a type label, coreference links, and attached conceptual relations. Syntactically, Peirce's ovals are squared off to form boxes, and the negation is explicitly marked by a ¬ symbol in front of the box. The primary difference between EGs and CGs is in the treatment of lines of identity. In EGs, the lines serve two different purposes: they represent existential quantifiers, and they show how the arguments are connected to the relations. In CGs, those two functions are split: the concepts [Farmer] and [Donkey] represent typed quantifiers ($x:Farmer) and ($y:Donkey), and arcs marked with numbers or arrows show the order of the arguments connected to the relations. In the inner context, the two concepts represented as [T] are connected by coreference links to concepts in the outer context. The CG maps to a typed formula that is equivalent to the untyped formula for the EG:

~($x:Farmer)($y:Donkey)(owns(x,y) Ù ~beats(x,y)).

The arrow pointing toward the relation indicates the first arc, and the arrow pointing away indicates the last arc; if a relation has n>2 arcs, they are numbered from 1 to n. For more examples of CGs and their translation to predicate calculus and the Knowledge Interchange Format (KIF), see the tutorial.

A nest of two ovals, as in Figure 2, is what Peirce called a scroll. It represents implication, since ~(pÙ~q) is equivalent to pÉq. Using the É symbol, the two formulas may be rewritten

("x)("y)((farmer(x) Ù donkey(y) Ù owns(x,y)) É beats(x,y)).
("x:Farmer)("y:Donkey)(owns(x,y) É beats(x,y)).

The algebraic formulas with the É symbol illustrate a peculiar feature of predicate calculus: in order to keep the variables x and y within the scope of the quantifiers, the existential quantifiers in the phrases a farmer and a donkey must be moved to the front of the formula and be translated to universal quantifiers. This puzzling feature of logic has posed a problem for linguists and logicians since the middle ages.

Besides attaching a relation to an oval, Peirce also used colors or tinctures to distinguish contexts other than negation. Figure 3 shows one of his examples with red to indicate possibility. The graph contains four ovals: the outer two form a scroll for if-then; the inner two represent possibility (red) and impossibility (red inside a negation). The outer oval may be read If there exist a person, a horse, and water; the next oval may be read then it is possible for the person to lead the horse to the water and not possible for the person to make the horse drink the water.

Figure 3: EG for "You can lead a horse to water, but you can't make him drink."

The notation "¾leads¾to¾" represents a triad or triadic relation leadsTo(x,y,z), and "¾makes¾drink¾" represents makesDrink(x,y,z). In the algebraic notation with the symbol à for possibility, Figure 3 maps to the following formula:

~($x)($ y)($z)(person(x) Ù horse(y) Ù water(z) Ù ~(àleadsTo(x,y,z) Ù ~àmakesDrink(x,y,z)) ).

With the symbol É for implication, this formula becomes

("x)("y)("z)((person(x) Ù horse(y) Ù water(z)) É (àleadsTo(x,y,z) Ù ~àmakesDrink(x,y,z)) ).

This version may be read For all x, y, and z, if x is a person, y is a horse, and z is water, then it is possible for x to lead y to z, and not possible for x to make y drink z. These readings, although logically explicit, are not as succinct as the proverb You can lead a horse to water, but you can't make him drink.

Discourse representation theory. The logician Hans Kamp once spent a summer translating English sentences from a scientific article to predicate calculus. During the course of his work, he was troubled by the same kinds of irregularities that puzzled the medieval logicians. In order to simplify the mapping from language to logic, Kamp (1981a,b) developed discourse representation structures (DRSs) with an explicit notation for contexts. In terms of those structures, Kamp defined the rules of discourse representation theory for mapping quantifiers, determiners, and pronouns from language to logic (Kamp & Reyle 1993).

Although Kamp had not been aware of Peirce's existential graphs, his DRSs are structurally equivalent to Peirce's EGs. The diagram on the left of Figure 4 is a DRS for the donkey sentence, If there exist a farmer x and a donkey y and x owns y, then x beats y. The two boxes connected by an arrow represent an implication where the antecedent includes the consequent within its scope.

Figure 4: EG and DRS for "If a farmer owns a donkey, then he beats it."

The DRS and EG notations look quite different, but they are exactly isomorphic: they have the same primitives, the same scoping rules for variables or lines of identity, and the same translation to predicate calculus. Therefore, the EG and DRS notations map to the same formula:

~($x)($y)(farmer(x) Ù donkey(y) Ù owns(x,y) Ù ~beats(x,y)).

Peirce's motivation for the EG contexts was to simplify the logical structure and rules of inference. Kamp's motivation for the DRS contexts was to simplify the mapping from language to logic. Remarkably, they converged on isomorphic representations. Therefore, Peirce's rules of inference and Kamp's discourse rules apply equally well to contexts in the EG, CG, or DRS notations. For notations with a different structure, such as predicate calculus, those rules cannot be applied without major modifications.

Resolving indexicals. Besides inventing a logical notation for contexts, Peirce coined the term indexical for context-dependent references, such as pronouns and words like here, there, and now. In CGs, the symbol # represents the general indexical, which is usually expressed by the definite article the. More specific indexicals are marked by a qualifier after the # symbol, as in #here, #now, #he, #she, or #it. Figure 5 shows two conceptual graphs for the sentence If a farmer owns a donkey, then he beats it. The CG on the left represents the original pronouns with indexicals, and the one on the right replaces the indexicals with the coreference labels ?x and ?y.

Figure 5: Two conceptual graphs for "If a farmer owns a donkey, then he beats it."

In the concept [Animate: #he], the label Animate indicates the semantic type, and the indexical #he indicates that the referent must be found by a search for some type of Animate entity for which the masculine gender is applicable. In the concept [Entity: #it], the label Entity is synonymous with T, which may represent anything, and the indexical #it indicates that the referent has neuter gender. The search for referents starts in the inner context and proceeds outward to find concepts of an appropriate type and gender. The CG on the right of Figure 5 shows the result of resolving the indexicals: the concept for he has been replaced by [?x] to show a coreference to the farmer, and the concept for it has been replaced by [?y] to show a coreference to the donkey.

Predicate calculus does not have a notation for indexicals, and its syntax does not show the context structure explicitly. Therefore, the CG on the left of Figure 5 cannot be translated directly to predicate calculus. After the indexicals have been resolved, the CG on the right can be translated to the following formula:

("x:Farmer)("yDonkey)("z:Own) (expr(z,x) Ù thme(z,y)) É ($w:Beat)(agnt(w,x) Ù ptnt(w,y)) ).

Note that this formula and the graph it was derived from are more complex than the CG in Figure 2. In order to compare the EG and CG directly, Figure 2 represented the verbs by relations Owns and Beats, which do not explicitly show the linguistic roles. In Figure 5, the concept Own represents a state with an experiencer (Expr) and a theme (Thme). The concept Beat, however, represents an action with an agent (Agnt) and a patient (Ptnt). In general, the patient of an action is more deeply affected or transformed than a theme. For further discussion of these linguistic relations, see the web page on thematic roles.

In analyzing the donkey sentences, the scholastics defined transformations or conversion rules from one logical form to another. As an example, a sentence with the word every can be converted to an equivalent sentence with an implication. The sentence Every farmer who owns a donkey beats it is equivalent to the one represented in Figures 2, 4, and 5. In CGs, the word every maps to a universal quantifier in the referent of some concept:

[ [Farmer: l]¬(Expr)¬[Own]®(Thme)®[Donkey]: "]-
   (Agnt)¬[Beat]®[Entity: #it].

In this graph, the quantifier " does not range over the type Farmer, but over the subtype defined by the nested lambda expression: just those farmers who own a donkey. The quantifier " is an example of a defined quantifier, which is not one of the primitives in the basic CG notation. It is defined by a rule, which generates the following CG in the if-then form:

[If: [Farmer: *x]¬(Expr)¬[Own]®(Thme)®[Donkey]
   [Then: [?x]¬(Agnt)¬[Beat]®[Entity: #it]] ].

This graph, which may be read If a farmer x owns a donkey, then x beats it, is halfway between the two graphs in Figure 5. The indexical that relates the nested agent of beating to the farmer has already been resolved to the coreference pair *x-?x by the macro expansion. The second indexical for the pronoun it remains to be resolved to the donkey. This example shows how two sentences that have different surface structures may be mapped to different semantic forms, which are then related by a separate inference step.

The expansion of a universal quantifier to an implication has been known since medieval times. But the complete catalog of all the rules for resolving indexicals is still an active area of research in linguistics and logic. For the sentence You can lead a horse to water, but you can't make him drink, many more conversions must be performed to generate the equivalent of Peirce's EG in Figure 3. The first step would be the generation of a logical form with indexicals, such as the CG in Figure 6, which may be read literally It is possible (Psbl) for you to lead a horse to water, but it is not possible (¬Psbl) for you to cause him to drink the liquid. The relation ¬Psbl is defined by a lambda expression in terms of ¬ and Psbl:

¬Psbl  º  ¬[Proposition: (Psbl)®[Possible: l]].

Figure 6: CG for "You can lead a horse to water, but you can't make him drink."

A parser and semantic interpreter that did a purely local or context-free analysis of the English sentence could generate the four concepts marked as indexicals by # symbols in Figure 6:

The two occurrences of you would map to the two concepts of the form [Person: #you].
The pronoun him represents a masculine animate indexical in the objective case, whose concept is [Animate: #he].
The missing object of the verb drink would be supplied by a canonical graph for the type Drink, which would show an expected patient of type Liquid. Since a context-free parser could not determine the correct referent, the concept [Liquid: #] is marked with an indexical.

The indexicals would have to be resolved by a context-dependent search, proceeding outward from the context in which each indexical is nested.

Conversational implicatures. Sometimes no suitable referent for an indexical can be found. In such a case, the person who hears or reads the sentence must make some further assumptions about implicit referents. The philosopher Paul Grice (1975) observed that such assumptions, called conversational implicatures, are often necessary to make sense out of the sentences in ordinary language. They are justified by the charitable assumption that the speaker or writer was trying to make a meaningful statement, but for the sake of brevity, happened to leave some background information unspoken. To resolve the indexicals in Figure 6, the listener would have to make the following kinds of assumptions to fill in the missing information:

The two concepts of the form [Person: #you] would normally be resolved to the listener or reader of the sentence. Since no one is explicitly mentioned in any containing context, some such person must be assumed. That assumption corresponds to drawing an if-then nest of contexts with a hypothetical person x coreferent with you:
```
[If: [Person: *x]- - -[Person: #you]
  [Then:  ...]].
```
The entire graph in Figure 6 would be inserted in place of the three dots in the then part; then every occurrence of #you could be replaced by ?x. The resulting graph could be read If there exists a person x, then x can lead a horse to water, but x can't make him drink the liquid.
The concept [Animate: #he] might be resolved to either a human or a beast. Since the reader is referred to as you, the most likely referent is the horse. But in both CGs and DRSs, coreference links can only be drawn between concepts under one of the following conditions:
- The antecedent concept [Horse] and the indexical [Animate: #he] both occur in the same context.
- The antecedent occurs in a context that includes the context of the indexical.
In Figure 6, neither of these conditions holds. To make the second condition true, the antecedent [Horse] can be exported or lifted to some containing context, such as the context of the hypothetical reader x. This assumption has the effect of treating the horse as hypothetical as the person x. After a coreference label is assigned to the concept [Horse: *y], the indexical #he could be replaced by ?y.
The liquid, which had to be assumed to make sense of the verb drink, might be coreferent with the water. But in order to draw a coreference link, another assumption must be made to lift the antecedent concept [Water: *z] to the same hypothetical context as the reader and the horse. Then the concept [Liquid: #] would become [Liquid: ?z].

The result would be the following CG with all indexicals resolved:

[If: [Person: *x] [Horse: *y] [Water: *z]
  [Then:
    [Proposition:
      (Psbl)®[Proposition: [Person: ?x]¬[Lead]-
        (Thme)®[Horse: ?y]
        (Dest)®[Water: ?z] ]]®(But)-
    [Proposition:
      (¬Psbl)®[Proposition: [Person: ?x]¬(Agnt)¬[Cause]-
        (Rslt)®[Situation:
          [Animate: ?y]¬(Agnt)¬[Drink]®(Ptnt)®[Liquid: ?z]] ]]]].

This CG may be read If there exist a person x, a horse y, and water z, then the person x can lead the horse y to water z, but the person x can't make the animate being y drink the liquid z. This graph is more detailed than the EG in Figure 3, because it explicitly shows the conjunction but and the linguistic roles Agnt, Thme, Ptnt, Dest, and Rslt. Before the indexicals are resolved, the type labels are needed to match the indexicals to their antecedents. Afterward, the bound concepts [Person: ?x], [Horse: ?y], [Animate: ?y], [Water: ?z], and [Liquid: ?z] could be simplified to just [?x], [?y], or [?z].

As this example illustrates, indexicals frequently occur in the intermediate stages of translating language to logic, but their correct resolution may require nontrivial assumptions. Many programs in AI and computational linguistics are able to follow the rules of discourse representation theory to resolve indexicals. The problem of making the correct assumptions about conversational implicatures is more difficult. The kinds of assumptions needed to understand ordinary conversation are similar to the assumptions that are made in nonmonotonic reasoning. Both of them depend partly on context-independent rules of logic and partly on context-dependent background knowledge.

3. Possible Worlds, Situations, and Contexts

Leibniz introduced possible worlds as the foundation for modal semantics: a proposition p is necessarily true in the real world if it is true in every possible world, and p is possible in the real world if there is some accessible world in which it happens to be true. In his algebraic notation for predicate calculus, Peirce followed Leibniz by representing necessity with a universal quantifier P_w, in which the variable w ranges over all "states of affairs." In his graphic notation for logic, Peirce used a pad of paper instead of a single "sheet of assertion." Graphs that are necessarily true are copied on every sheet; those that are possibly true are drawn on some, but not all sheets. The top sheet contains assertions about the actual state of affairs, and the other sheets, which may be potentially infinite, describe related states of affairs that are possible relative to the actual state.

Axioms for modal logic. The philosopher Clarence Irving Lewis (1918), who was strongly influenced by Peirce, introduced the diamond symbol à for representing possibility in the algebraic notation. If p is any proposition, then àp means p is possibly true. For necessity, a box op is used to mean p is necessarily true. Either symbol, à or o, can be taken as a primitive, and the other can be defined in terms of it:

A statement is necessarily true if and only if it is not possibly false:
op º ~à~p.
A statement is possibly true if and only if it is not necessarily false:
àp º ~o~p.

Besides these definitions, many other axioms have been discussed since the time of Aristotle and the medieval Scholastics. A basic version of modal logic called System T has the following three axioms:

Anything that is provable is necessarily true: if p is a theorem, then op.
Anything that is necessarily true is true: opÉp.
If it is necessarily true that p implies q, then if p is necessary, q is also necessary:
o(pÉq) É (opÉoq).

An important theorem of System T is that anything true is possible: pÉàp. This theorem, like the axioms, are patterns or schemata for deriving an arbitrary number of new axioms and theorems by substituting any formula for the variable p.

System T does not include axioms for iterated modalities, such as àoàp, which says that p is possibly necessarily possible. Such mind-boggling combinations seldom occur in English, but they may arise in the intermediate stages of a proof. To relate iterated modalities to simple modalities and to one another, Lewis and Langford (1932) defined two additional axioms, called S4 and S5, which may be added to System T:

S4. If p is necessary, then p is necessarily necessary: opÉoop.
S5. If p is possible, then p is necessarily possible: àpÉoàp.

For quantified modal logic, Ruth Barcan Marcus proposed the following axiom, which has become known as the Barcan formula:

BF. If for every x, P(x) is necessarily true, then it is necessary that for every x, P(x):
("x)oP(x) É o("x)P(x).

For different applications, various combinations of axioms may be assumed. A version with fewer axioms is said to be weaker, and a version with more axioms is said to be stronger.

System T combined with axioms S4, S5, and BF is one of the strongest versions of modal logic, but it is often too strong. In a version of modal logic called deontic logic, op is interpreted p is obligatory, and àp is interpreted p is permissible. In a perfect world, all the axioms of System T would be true. But since people are sinners, some axioms and theorems of System T cannot be assumed. The axiom, opÉp, would be violated by a sin of omission because some obligatory actions are not performed. The theorem, pÉàp, would be violated by a sin of commission because some actions that are performed are not permitted.

Kripke's worlds. Peirce's notation with a universal quantifier P_w for necessity and an existential quantifier S_w for possibility cannot be used in statements that contain one modal operator within the scope of another. The iterated modalities oo in Axiom S4 and oà in Axiom S5 would be represented by a sequence of two quantifiers for the same variable w, which would make one of quantifiers redundant or irrelevant. To interpret such iterated modal operators, Saul Kripke (1963a,b) discovered an ingenious technique based on model structures having three components:

Possible worlds. A set K of entities called possible worlds, of which one privileged world w₀ represents the real world. Nothing in Kripke's theory depends on any particular world being privileged as w₀, but for many applications, it is useful to designate some world in K as the actual world.
Accessibility relation. A relation R(u,v) defined over K, which says that world v is accessible from world u. In effect, R(u,v) stipulates that v is sufficiently similar to u to be "accessible" in some way from u.
Evaluation function. A function F(p,w), which maps a proposition p and a possible world w to one of the two truth values {T,F}. The world w is said to semantically entail the proposition p if F(p,w) has the value T, and w entails ~p if F(p,w) has the value F. The double turnstile operator |= represents semantic entailment:
w|=p º F(p,w)=T.
w|=~p º F(p,w)=F.

For any world u, the evaluation function F determines whether a proposition p is actually true or false in u. To determine whether p is necessary or possible, Kripke defined àp and op by considering the truth of p in any world v that is accessible from u:

Possibility. p is possible in the world u if p is true in some world v accessible from u:
àp º ($v)(R(u,v) Ù F(p,v)=T).
Necessity. p is necessary in the world u if p is true in every world v accessible from u:
op º ("v)(R(u,v) É F(p,v)=T).

By these definitions, Kripke restricted Peirce's quantification over all possible worlds to just those worlds v that are accessible from a given world u. With the notation for restricted quantifiers, the accessibility relation R(u,v) can be written as a restriction on the range of the quantified variable:

àp º ($v:R(u,v)) F(p,v)=T.
op º ("v:R(u,v)) F(p,v)=T.

The accessibility relation R(u,v) introduces the extra variables needed to distinguish different ranges for different quantifiers. Therefore, if p is necessarily necessary, the iterated modality oop can be defined:

oop º ("v:R(u,v))("w:R(v,w)) F(p,w)=T.

Now the two quantifiers are distinct because one ranges over v and the other ranges over w. Kripke's most important contribution was to show how Lewis's axioms determine constraints on the accessibility relation R:

System T. Two basic axioms of System T are opÉp (Necessity implies truth) and pÉàp (Truth implies possibility). They require every world to be accessible from itself; hence, R must be reflexive:
reflexive(R) º ("w)R(w,w).
System S4. System T with Lewis's axiom S4, opÉoop, requires R to be transitive:
transitive(R) º ("u,v,w)((R(u,v) Ù R(v,w)) É R(u,w)).
System S5. System S4 with axiom S5, àpÉoàp, requires R to be symmetric:
symmetric(R) º ("u,v)(R(u,v) É R(v,u)).

For System S5, the properties of reflexivity, transitivity, and symmetry make R an equivalence relation. Those properties cause the collection of all possible worlds to be partitioned in disjoint equivalence classes. Within each class, all the worlds are accessible from one another, but no world in one class is accessible from any world in another class.

The world of Sherlock Holmes stories, for example, is similar enough to the real world w₀ that it could be in the same equivalence class. The proposition that Sherlock Holmes assisted Scotland Yard would be possible in w₀ if there were some accessible world w in which it is true:

($w:R(w₀,w)) F("Sherlock Holmes assisted Scotland Yard",w)=T.

A world with cartoon characters like talking mice and ducks, however, is too remote to be accessible from the real world. Therefore, it is not possible for ducks to talk in w₀. Business contracts further partition the cartoon worlds into disjoint classes: the world of Disney characters is not accessible from the world of Looney Tune characters. Therefore, Donald Duck can talk to Mickey Mouse, but he can't talk to Bugs Bunny or Daffy Duck.

Criticisms of possible worlds. Possible worlds are a metaphor for interpreting modality, but their ontological status is dubious. Truth is supposed to be a relationship between a statement and the real world, not an infinite family of fictitious worlds. In Candide, Voltaire satirized Leibniz's notion of possible worlds. In that same tradition, Quine (1948) ridiculed the imaginary inhabitants of possible worlds:

Take, for instance, the possible fat man in that doorway; and, again, the possible bald man in that doorway. Are they the same possible man, or two possible men? How do we decide? How many possible men are there in that doorway? Are there more possible thin ones than fat ones? How many of them are alike? Or would their being alike make them one?

After Kripke developed his model structures for possible worlds, Quine (1972) noted that models prove that the axioms are consistent, but they don't explain what the modalities mean:

The notion of possible world did indeed contribute to the semantics of modal logic, and it behooves us to recognize the nature of its contribution: it led to Kripke's precocious and significant theory of models of modal logic. Models afford consistency proofs; also they have heuristic value; but they do not constitute explication. Models, however clear they be in themselves, may leave us at a loss for the primary, intended interpretation.

Quine was never sympathetic to modal logic and the semantics of possible worlds, but even people who are actively doing research on the subject have difficulty in making a convincing case for the notion of accessibility between worlds. Following is an attempted explanation by two authors of a widely used textbook (Hughes and Cresswell 1968):

This notion of one possible world's being accessible to another has at first sight a certain air of fantasy or science fiction about it, but we might attach quite a sober sense to it in the following way. We can conceive of various worlds which would differ in certain ways from the actual one (a world without telephones, for example). But our ability to do this is at least partly governed by the kind of world we actually live in: the constitution of the human mind and the human body, the languages which exist or do not exist, and many other things, set certain limits to our powers of conceiving. We could then say that a world, w₂, is accessible to a world, w₁, if w₂ is conceivable by someone living in w₁, and this will make accessibility a relation between worlds... (p. 77)

Hughes and Cresswell explained the dyadic accessibility relation in terms of a triadic relation of "conceivability" by someone living in one imaginary world who tries to imagine another one. Their explanation suggests that the person who determines which worlds are conceivable is at least as significant to the semantics of modality as the worlds themselves. To capture the "primary, intended interpretation," the formalism must show how the accessibility relation can be derived from some agent's conceptions.

By relating the modal axioms to model structures, Kripke showed the interrelationships between the axioms and the possible worlds. But the meaning of those axioms remains hidden in the accessibility relation R and the evaluation function F. The functional notation F(w,p)=T gives the impression that F computes a truth value. But this impression is an illusion: the set of worlds K is an undefined set given a priori; the relation R and function F are merely assumed, not computed. Nonconstructive assumptions cannot be used to compute anything, nor can they explain how Quine's possible fat men and thin men might be "accessible" from the real world with an empty doorway.

Hintikka's model sets. Instead of assuming possible worlds, Jaakko Hintikka (1961, 1963) independently developed an equivalent semantics for modal logic based on collections of propositions, which he called model sets. He also assumed an alternativity relation between model sets, which serves the same purpose as Kripke's accessibility relation between worlds. As collections of propositions, Hintikka's model sets describe Kripke's possible worlds:

("w:World)($M:ModelSet) M = { p | w|=p}.

This formula defines a mapping from Kripke's worlds to Hintikka's model sets: for any possible world w, there exists a model set M, which consists of all propositions p that are semantically entailed by w. In effect, the model set M describes everything that can be known about w.

By replacing the imaginary worlds with sets of propositions, Hintikka took an important step toward making them more formal. The mapping from possible worlds to model sets enables any theory about real or imaginary worlds to be restated in terms of the propositions that describe those worlds. But that mapping, by itself, does not address Quine's criticisms. Hintikka's alternativity relation between model sets is just as mysterious and undefined as Kripke's accessibility relation between worlds. Sets of formulas with an undefinable relation between them do not explain why one set is considered "accessible" from another.

Barwise and Perry's situations. To avoid infinite worlds with all the complexity of William James's example, Barwise and Perry (1983) proposed situation semantics as a theory that relates the meaning of sentences to smaller, more manageable chunks called situations. Each situation is a configuration of some aspect of a world in a bounded region of space and time. It may be a static configuration that remains unchanged for some period of time, or it may be a process that is causing changes. It may include people and things with their actions and speech; it may be real or imaginary; and its time may be past, present, or future.

In their book, Barwise and Perry identified a situation with a bounded region of space-time. But as William James observed, an arbitrary region of space and time contains "disjointed events" with no "rational bond between them." A meaningful situation is far from arbitrary, as the following examples illustrate:

A college lecture could be considered a situation bounded by a 50-minute time period in a spatial region enclosed by the walls of a classroom. But if the time were moved forward by 30 minutes, the region would include the ending of one lecture and the beginning of another. That time shift would create an unnatural "situation."
If the space were shifted left by half the width of a classroom, it would include part of one class listening to one teacher, part of another class listening to a different teacher speaking on a different topic, and a wall between the two lectures. That shift would create an even more unnatural situation than the time shift.
Another transformation might fix the coordinate system relative to the sun instead of the earth. Then the region that included the class at the beginning of the lecture would stay behind as the earth moved. Within a few minutes, it would be in deep space, containing nothing but an occasional hydrogen atom.

Even more complex situations would be needed for the referents of the Sherlock Holmes stories or the U.S. legal history. The first is fictional, and the second is intertwined with all the major events that happened in the United States from 1776 to the present. The space-time region for a fictional situation does not exist, and the space-time region for the U.S. legal history cannot be separated from the region of its political, economic, or cultural history.

In discussing the development of situation theory, Keith Devlin (1991a) observed that the definitions were stretched to the point where situations "include, but are not equal to any of simply connected regions of space-time, highly disconnected space-time regions, contexts of utterance (whatever that turns out to mean in precise terms), collections of background conditions for a constraint, and so on." After further discussion, Devlin admitted that they cannot be defined: "Situations are just that: situations. They are abstract objects introduced so that we can handle issues of context, background, and so on."

McCarthy's contexts. John McCarthy is one of the founding fathers of AI, whose collected work (McCarthy 1990) has frequently inspired and sometimes revolutionized the application of logic to knowledge representation. In his "Notes on Formalizing Context," McCarthy (1993) introduced the predicate ist(C,p), which may be read "the proposition p is true in context C." For clarity, it will be spelled out in the form isTrueIn(p,C). As illustrations, McCarthy gave the following examples:

isTrueIn("Holmes is a detective", contextOf("Sherlock Holmes stories")).
isTrueIn("Holmes is a Supreme Court Justice", contextOf("U.S. legal history")).

In these examples, the context disambiguates the referent of the name Holmes either to the fictional character Sherlock Holmes or to Oliver Wendell Holmes, Jr., the first appointee to the Supreme Court by President Theodore Roosevelt. In effect, names behave like indexicals whose referents are determined by the context.

One of McCarthy's reasons for developing a theory of context was his uneasiness with the proliferation of new logics for every kind of modal, temporal, epistemic, and nonmonotonic reasoning. The ever-growing number of modes presented in AI journals and conferences is a throwback to the scholastic logicians who went beyond Aristotle's two modes necessary and possible to permissible, obligatory, doubtful, clear, generally known, heretical, said by the ancients, or written in Holy Scriptures. The medieval logicians spent so much time talking about modes that they were nicknamed the modistae. The modern logicians have axiomatized their modes and developed semantic models to support them, but each theory includes only one or two of the many modes. McCarthy (1977) observed,

For AI purposes, we would need all the above modal operators in the same system. This would make the semantic discussion of the resulting modal logic extremely complex.

Instead of an open-ended number of modes, McCarthy hoped to develop a simple, but universal mechanism that would replace modal logic with first-order logic supplemented with metalanguage about contexts. His student R. V. Guha (1991) implemented contexts in the Cyc system and showed that a first-order object language supplemented with a first-order metalanguage could support versions of modal, temporal, default, and higher-order reasoning. Stuart Shapiro and his colleagues have implemented versions of propositional semantic networks, which support similar structures in a form that maps more directly to logic (Shapiro 1979; Maida & Shapiro 1982; Shapiro & Rappaport 1992). Shapiro's propositional nodes serve the same purpose as Peirce's ovals and McCarthy's contexts.

McCarthy, Shapiro, and their colleagues have shown that contexts are valuable for building knowledge bases, but they have not clearly distinguished the syntax of contexts from the semantics of some subject matter. McCarthy's predicate isTrueIn mixes the syntactic notion of containment (is-in) with the semantic notion of truth (is-true-of). One way to resolve the semantic status of contexts is to derive them from Barwise and Perry's situations:

("s:Situation)($C:Context) { p | isTrueIn(p,C)} = { q | s|=q}.

This formula maps situations to contexts: for every situation s, there exists a context C, whose set of true propositions is the same as the set of propositions entailed by s. Devlin (1991b) coined the term infon for a propostion that is entailed by a situation. With that terminology, the formula could be summarized by the following English statement:

The set of all infons entailed by a situation constitute the set of all true propositions of some context.

This principle enables the research on situation theory to be translated to equivalent results about contexts, but it still does not answer the question of how situations or contexts should be defined. For Devlin, situations are undefinable objects whose purpose is to simplify the problems of reasoning about contexts. For McCarthy, contexts are undefinable objects whose purpose is to simplify the problems of reasoning about situations. Peirce and James, the two founders of pragmatism, focused on what is common in these two circular definitions: the notion of purpose. Space-time coordinates are not sufficient to distinguish a meaningful situation from a disjointed collection of events. Some agent for some purpose must pick and choose what is relevant.

4. Laws and Legislation

Computability, although desirable, is not sufficient to explain meaning. Truth is more significant than computability, but truth conditions, by themselves, cannot determine relevance. As William James observed, infinitely many true statements could be made about the real world or any model of it, and the overwhelming majority of them are irrelevant to what anyone might want to say or do. The verbs that express the kind of relevance, such as wanting, fearing, or hoping, are called propositional attitudes. But their semantics depends critically on the agent whose intention or attitude toward some situation for some purpose determines the relevance of propositions about it.

Figure 7 shows how different kinds of contexts may be distinguished by their relationship to what is actual and to the intentions of some agent. An actual context represents something that is true. A modal context represents something that is related to what is actual by some modality, such as possibility or necessity. An intentional context is related to what is actual by some agent who determines what is intended.

Figure 7: Three kinds of contexts

In the upper left of Figure 7, the context labeled actual contains a graph that represents something that is true of some aspect of the world. That graph might be an EG or CG that states a proposition, or it might be a Tarski-style model, in which the nodes and arcs represent individuals and relations in the world. For an actual context, Tarski's model theory or Peirce's logically equivalent method of endoporeutic can determine the truth in terms of an actual state of affairs without considering any other possibilities or anyone's intentions about them.

In the upper right, the modal context represents a possibility relative to what is actual. To define the semantics of modality, Kripke extended Tarski's single model to an open-ended, possibly infinite family of models related by a dyadic accessibility relation R(w₀,w₁), which says that the world (or model) w₁ is accessible by some modification of the actual world w₀. Kripke's theory, however, treats the relation R as an undefined primitive: it does not specify the conditions for an accessible modification of w₀ to form w₁.

The diagram at the bottom shows an agent whose intention relates an actual context to an intended context. The simplest way to extend Kripke's theory to handle intentionality is to add an extra argument in the accessibility relation to name the agent. Philip Cohen and Hector Levesque (1990) used that approach to define two new kinds of accessibility relations:

Belief accessibility relation. B(w₁, p, t, w₂) relates a course of events w₁, an agent p, and a time point t to some course of events w₂, which is accessible from w₁ according to p's beliefs.
Goal accessibility relation. G(w₁, p, t, w₂) relates a course of events w₁, an agent p, and a time point t to some course of events w₂, which is accessible from w₁ according to p's goals.

By including the agent, this approach takes a step in the right direction, but it does not say how the agent determines what is accessible. Like Kripke's relation R, Cohen and Levesque's relations B and G are undefined primitives, which do not specify the conditions for an accessible modification of w₁ to form w₂. Putting the agent into the accessibility relation is important, but more information is needed to show how the agent's beliefs or goals affect the accessibility.

Dunn's laws and facts. If the accessibility relation is assumed as a primitive, modality and intentionality cannot be explained in terms of anything more fundamental. To make accessibility a derived relation, Michael Dunn (1973) replaced Kripke's undefined worlds a more detailed construction in terms of laws and facts. For every Kripke world w, Dunn assumed a pair <M,L>, where M is a Hintikka-style model set called the facts of w and L is a subset of M called the laws of w. Finally, Dunn showed how the accessibility relation from one world to another can be derived from constraints on which propositions are chosen as laws. As a result, the accessibility relation is no longer primitive, and the modal semantics does not depend on imaginary worlds. Instead, modality depends on the choice of laws, which could be laws of nature or merely human rules and regulations.

Philosophers since Aristotle have recognized that modality is related to laws; Dunn's innovation lay in making the relationships explicit. Let <M₁,L₁> be a pair of facts and laws that describe a possible world w₁, and let the pair <M₂,L₂> describe a world w₂. Dunn defined accessibility from the world w₁ to the world w₂ to mean that the laws L₁ are a subset of the facts in M₂:

R(w₁,w₂) º L₁ÌM₂.

According to this definition, the laws of the first world w₁ remain true in the second world w₂, but they may be demoted from the status of laws to just ordinary facts. Dunn then restated the definitions of possibility and necessity in terms of laws and facts. In Kripke's version, possibility àp means that p is true of some world w accessible from the real world w₀:

àp º ($wWorld)(R(w₀,w) Ù w|=p).

By substituting the laws and facts for the possible worlds, Dunn derived an equivalent definition:

àp º ($M:ModelSet)(laws(M)ÌM₀ Ù pÎM).

Now possibility àp means that there exists a model set M whose laws are a subset of the facts of the real world M₀ and p is a fact in M. By the same substitutions, the definition of necessity becomes

op º ("M:ModelSet)(laws(M)ÌM₀ É pÎM).

Necessity op means that in every model set M whose laws are a subset of the facts of the real world M₀, p is also a fact in M.

Dunn performed the same substitutions in Kripke's constraints on the accessibility relation. The result is a restatement of the constraints in terms of the laws and facts:

System T. The two axioms opÉp and pÉàp require every world to be accessible from itself. That property follows from Dunn's definition because the laws L of any world are a subset of the facts: LÌM.
System S4. System T with Lewis's axiom S4, opÉoop, requires that R must also be transitive. It imposes the tighter constraint that the laws of the first world must be a subset of the laws of the second world: L₁ÌL₂.
System S5. System S4 with axiom S5, àpÉoàp, requires that R must also be symmetric. It constrains both worlds to have exactly the same laws: L₁=L₂.

In Dunn's theory, the term possible world is an informal metaphor that does not appear in the formalism: the semantics of op and àp depends only on the choice of laws and facts. In the pairs <M,L>, all formulas are purely first order, and the symbols o and à never appear in any of them.

Dunn's theory is a compatible refinement of Kripke's theory, since any Kripke model structure (K,R,F) can be converted to one of Dunn's model structures in two steps:

Replace every world w in K with the set M of propositions that are true in w and the set L of propositions that are necessary in w.
Define Kripke's primitive accessibility relation R(u,v) by the constraint that the laws of u are be true in v.

Every axiom and theorem of Kripke's theory remains true in Dunn's version, but Dunn's theory makes the reasons for modality available for further examination. For theories of intentionality, Dunn's approach can relate the laws to the goals and purposes of some agent, who in effect legislates which propositions are to be considered laws.

Databases and Knowledge bases. For computational purposes, Kripke's possible worlds must be converted to symbolic representations that can be stored and manipulated in a database or knowledge-based system. The mapping from Kripke's models to Dunn's models is essential for replacing the physical world or some imaginary world with a computable symbolic representation. Following are the correspondences between Dunn's semantics and the common terminology of databases and knowledge bases:

Laws. An ontology in AI terminology or a conceptual schema in database terminology is a collection of axioms and definitions that determine the laws for any of Kripke's real or imaginary worlds. Those laws, which have also been called database constraints or business rules, must be true of every database that correctly represents something in the real world or a possible world.
As an examples, a law or DB constraint might state that every person has two parents, one male and one female; and each person's age must be less than the age of either parent. Even if the facts stored in the database are incomplete, there must be room for adding the names and birthdates of the parents when they become known.
Facts. The tables in a relational database correspond to ground level facts in a theory. All the laws (axioms and definitions or constraints and business rules) must be true of the database at all times. Therefore, anything deducible from the laws and the database must also be a fact. In Dunn's models, the set of all facts is the deductive closure of the laws and the ground-level facts.
The ground-level facts, for example, might state the names and birthdates for everyone known to the system. The laws could be used to verify that nobody has more than two parents and to deduce family relationships such as siblings, grandparents, and cousins. During normal operations, the laws would not change, but the ground-level facts could be updated to record births, deaths, and marriages.
Accessibility relation. Kripke's accessibility relation R(u,v) from a world u to a world v corresponds to a permissible update from a database that represents u to a database that represents v. In database terms, accessibility corresponds to a update that observes the constraints: every law of u must remain true in the updated database that represents v.
If some person already has two parents named in the database, no update that named a third parent for that person would be permitted.

After the possible worlds have been mapped to collections of laws and facts in a database or knowledge base, the possible worlds become unnecessary and ontologically dubious baggage. All further theories and computation can be stated in terms of the laws and facts.

The various axioms for modality correspond to different options for permissible changes to a database or knowledge base. In effect, they specify policies for how a knowledge engineer or database administrator could modify the laws to accommodate changes in the known scientific principles or changes in the business or social structures. Following are the policies that correspond to the modal systems T, S4, and S5:

System T. The basic modal logic, System T, would allow the DB administrator to promote any fact in the current database to the privileged status of a law or to demote any law to the status of a fact. However, no such change would modify the truth value of any law or fact, since both facts and laws are assumed to be true.
System S4. The axiom S4, when added to System T, would allow the DB administrator to promote facts to laws, but not demote any laws to the status of mere facts. S4 would allow new laws to be made, but no laws could be repealed.
System S5. The addition of axiom S5 to System S4 would require all accessible databases to have exactly the same laws. Therefore, the DB administrator would not be allowed make any changes to the collection of laws.

As an example, the constraint that every person must have two parents of opposite sexes might be modified to accommodate sex-change operations. The possibility of cloning might mean that some people might have only one parent. The axioms of System T would permit such modifications to the laws, but System S4 would prohibit them because earlier laws could not be deleted. S5 would prohibit any changes to the laws.

Mapping Possible Worlds to Contexts. The primary difference between model sets and contexts is size: Hintikka defined model sets as maximally consistent sets of propositions that could describe everything in the real world or any possible world. But as William James observed, a collection of information about the entire world is far too large and disjointed to be comprehended and manipulated in any meaningful way. A context is an excerpt from a model set in the same sense that a situation is an excerpt from a possible world. It could contain a finite set of propositions that describe some situation, even though the deductive closure of that set might be infinite.

Figure 8: Ways of mapping worlds to contexts

Figure 8 shows mappings from a Kripke possible world w to a description of w as a Hintikka model set M or a finite excerpt from w as a Barwise and Perry situation s. Then M and s may be mapped to a Peirce-McCarthy context C. This is an example of a commutative diagram, which shows a family of mappings that lead to the same result by multiple routes. Those routes result from the different ways of combining two kinds of mappings:

Excerpt. The two arrows that point to the right take excerpts: a situation is a part of a world, and a context is a subset of the propositions in a model set. For any part extracted by the top arrow w®s, there is a corresponding subset extracted by the bottom arrow M®C.
Entailment. The two arrows that point down represent the operator |= for semantic entailment. The arrow w®M maps the world w to the set of propositions entailed by w, and the arrow s®C maps the situation s to the set of propositions entailed by s.

The combined mappings in Figure 8 replace the mysterious possible worlds with finite contexts. Hintikka's model sets support operations on formulas instead of possible worlds, but they may still be infinite. Situations are finite, but they consist of physical or fictitious objects that are not computable. The contexts in the lower right of Figure 8 are the only things that can be represented and manipulated in a digital computer or the human brain.

Completing the pushout. In the commutative diagram of Figure 8, the downward arrow on the left corresponds to Dunn's mapping of possible worlds to laws and facts, and the rightward arrow at the top corresponds to Barwise and Perry's mapping of possible worlds to situations. The branch of mathematics called category theory has methods of completing such diagrams by deriving the other mappings. Given the two arrows at the left and the top, the technique called a pushout defines the two arrows on the bottom and the right:

Left. The downward arrow on the left side of the diagram maps each possible world w to a model set M of facts that describe w. Since every proposition p in M must be true of w, the evaluation function F(p,w) corresponds to membership: pÎM. The laws L of w can be chosen as any subset of propositions in M whose deductive closure is in the intersection of all model sets for worlds accessible from w.
Top. The rightward arrow at the top maps w to a finite region s selected from w called a situation. The excerpt is chosen by some agent who decides how much of the world is relevant and what level of granularity is appropriate for its description.
Bottom. The rightward arrow on the bottom is determined by the choice of s at the top. Every proposition p in M that is true of s (s|=p) is copied to the context C. The laws of C are those laws in M that happened to be copied: CÇL.
Right. The downward arrow at the right maps the situation s to a context C that describes s. It produces the same result as the mapping on the bottom, because it must satisfy the same constraints: every proposition p in M that is true of s must be in C; the laws of C are the laws in M that are included in CÇL.

Since the domains of each of the four mappings in Figure 8 are noncomputable structures, the mappings themselves are not computable. Their purpose is not to support computation, but to determine how theories about noncomputable possible worlds and situations can be adapted to computable contexts. After the theories have been transferred to contexts, the possible worlds and situations are unnecessary for further reasoning and computation.

Situations as pullbacks. The inverse of a pushout, called a pullback, is an operation of category theory that "pulls" some structure or family of structures backward along an arrow of a commutative diagram. For the diagram in Figure 8, the model set M and the context C are symbolic structures that have been studied in logic for many years. The situation s, as Devlin observed, is not as clearly defined. One way to define a situation is to assume the notion of context as more basic and to say that a situation s is whatever is described by a context C. In terms of the diagram of Figure 8, the pullback would start with the two mappings from w to M and from M to C. Then the situation s in the upper right and the two arrows w®s and s®C would be derived by a pullback from the starting arrows w®M and M®C.

The definition of situations in terms of contexts may be congenial to logicians for whom abstract propositions are familiar notions. For people who prefer to think about physical objects, the notion of a situation as a chunk of the real world may seem more familiar. The commutative diagram provides a way of reconciling the two views: starting with a situation, the pushout determines the propositions in the context; starting with a context, the pullback defines the situation. The two complementary views are useful for different purposes: for a mapmaker, the context is derived as a description of some part of the world; for an architect, the concrete situation is derived by some builder who follows an abstract description.

Legislating the laws. Although Dunn's semantics explains the accessibility relation in terms of laws, the source of the laws themselves is never explained. In the semantics for intentionality, however, the laws are explicitly chosen by some agent who may be called the lawgiver. The entailment operator s|=p relates an entity s to a proposition p that is entailed by s. A triadic relation legislate(a,p,s) could be used to relate an agent a who legislates a proposition p as a law, rule, or regulation for some entity s. The following formula says that Tom legislates some proposition as a rule for a lottery game:

($p:Proposition)($s:LotteryGame)(person(Tom) Ù legislate(Tom,p,s)).

By Dunn's convention, the laws L of any entity s must be a subset of the facts entailed by s. That condition may be stated as an axiom:

("a:Agent)("p:Proposition)("s:Entity)(legislate(a,p,s) É s|=p)).

This formula says that for every agent a, proposition p, and entity s, if a legislates p as a law of s, then s entails p. Together with Dunn's semantics, the triadic legislation relation formalizes the informal suggestion by Hughes and Cresswell: "a world, w₂, is accessible to a world, w₁, if w₂ is conceivable by someone living in w₁." Some agent's conceptions become the laws that determine what is necessary and possible in the imaginary worlds. The next step of formalization is to classify the kinds of conceptions that determine the various kinds of modality and intentionality.

5. Peirce's Classification of Contexts

In 1906, Peirce introduced colors into his existential graphs to distinguish various kinds of modality and intentionality. Figure 3, for example, used red to represent possibility in the EG for the sentence You can lead a horse to water, but you can't make him drink. To distinguish the actual, modal, and intentional contexts illustrated in Figure 7, three kinds of colors would be needed. Conveniently, the heraldic tinctures, which were used to paint coats of arms in the middle ages, were grouped in three classes: metal, color, and fur. Peirce adopted them for his three kinds of contexts, each of which corresponding to one of his three categories: Firstness (independent conception), Secondness (relative conception), and Thirdness (mediating conception).

Actuality is Firstness, because it is what it is, independent of anything else. Peirce used the metallic tincture argent (white background) for "the actual or true in a general or ordinary sense," and three other metals (or, fer, and plomb) for "the actual or true in some special sense."
Modality is Secondness, because it distinguishes the mode of a situation relative to what is actual: whenever the actual world changes, the possibilities must also change. Peirce used four heraldic colors to distinguish modalities: azure for logical possibility (dark blue) and subjective possibility (light blue); gules (red) for objective possibility; vert (green) for "what is in the interrogative mood"; and purpure (purple) for "freedom or ability."
Intentionality is Thirdness, because it depends on the mediation of some agent who distinguishes the intended situation from what is actual. Peirce used four heraldic furs for intentionality: sable (gray) for "the metaphysically, or rationally, or secondarily necessitated"; ermine (yellow) for "purpose or intention"; vair (brown) for "the commanded"; and potent (orange) for "the compelled."

Throughout his analyses, Peirce distinguished the logical operators, such as Ù, ~, and $, from the tinctures, which, he said, do not represent

...differences of the predicates, or significations of the graphs, but of the predetermined objects to which the graphs are intended to refer. Consequently, the Iconic idea of the System requires that they should be represented, not by differentiations of the Graphs themselves but by appropriate visible characters of the surfaces upon which the Graphs are marked.

In effect, Peirce did not consider the tinctures to be part of logic itself, but of the metalanguage for describing how logic applies to the universe of discourse:

The nature of the universe or universes of discourse (for several may be referred to in a single assertion) in the rather unusual cases in which such precision is required, is denoted either by using modifications of the heraldic tinctures, marked in something like the usual manner in pale ink upon the surface, or by scribing the graphs in colored inks.

Peirce's later writings are fragmentary, incomplete, and mostly unpublished, but they are no more fragmentary and incomplete than most modern publications about contexts. In fact, Peirce was more consistent in distinguishing the syntax (oval enclosures), the semantics ("the universe or universes of discourse"), and the pragmatics (the tinctures that "denote" the "nature" of those universes).

Classifying contexts. The first step toward a theory of context is a classification of the types of contexts and their relationships to one another. Any of the tinctured contexts may be nested inside or outside the ovals representing negation. When combined with negation in all possible ways, each tincture can represent a family of related modalities:

The first metallic tincture, argent, corresponds to the white background that Peirce used for his original existential graphs. When combined with existence and conjunction, negations on a white background support classical first-order logic about what is actually true or false "in an ordinary sense." Negations on the other metallic backgrounds support FOL for what is "actual in some special sense." A statement about the physical world, for example, would be actual in an ordinary sense. But Peirce also considered mathematical abstractions, such as Cantor's hierarchy of infinite sets, to be actual, but not in the same sense as ordinary physical entities.
In the algebraic notation, àp means that p is possible. Then necessity op is defined as ~à~p. Impossibility is represented as ~àp or equivalently o~p. Instead of the single symbol à, Peirce's five colors represent different versions of possibility; for each of them, there is a corresponding interpretation of necessity, impossibility, and contingency:
- Logical possibility. A dark blue context, Peirce's equivalent of àp, would mean that p is consistent or not provably false. His version of op, represented as dark blue between two negations, would therefore mean that p is provable. Impossible ~àp would mean inconsistent or provably false.
- Subjective possibility. In light blue, àp would mean that p is believable or not known to be false. op would mean that p is known or not believably false. This interpretation of à and o is called epistemic logic.
- Objective possibility. In red, àp would mean that p is physically possible. As an example, Peirce noted that it was physically possible for him to raise his arm, even when he was not at the moment doing so. op would mean physical necessity according to the laws of nature.
- Interrogative mood. In green, àp would mean that p is questioned, and op would mean that p is not questionably false. This interpretation of àp corresponds to a proposition p in a Prolog goal or the where-clause of an SQL query.
- Freedom. In purple, àp would mean that p is free or permissible; op would mean that p is obligatory or not permissibly false; ~àp would mean that p is not permissible or illegal; and à~p would mean that p is permissibly false or optional. This interpretation of à and o is called deontic logic.
The heraldic furs represent various kinds of intentions, but Peirce did not explore the detailed interactions of the furs with negations or with each other. Don Roberts (1973) suggested some combinations, such as negation with the tinctures gules and potent to represent The quality of mercy is not strained.

Although Peirce's three-way classification of contexts is useful, he did not work out their implications in detail. He wrote that the complete classification of "all the conceptions of logic" was "a labor for generations of analysts, not for one."

Multimodal reasoning. As the multiple axioms for modal logic indicate, there is no single version that applies to all problems. The complexities increase when different interpretations of modality are mixed, as in Peirce's five versions of possibility, which could be represented by colors or by subscripts, such as à₁, à₂, ..., à₅. Each of those modalities is derived from a different set of laws, which interact in various ways with the other laws:

The combination o₃à₁p, for example, would mean that it is subjectively necessary that p is logically possible.
According to the definition of o₃, someone must know that à₁p.
Since what is known must be true, the following theorem would hold for that combination of modalities:
o₃à₁p É à₁p.

Similar analysis would be required to derive the axioms and theorems for all possible combinations of the five kinds of possibility with the five kinds of necessity. Since subjective possibility depends on the subject, the number of combinations increases further when multiple agents interact.

By introducing contexts, McCarthy hoped to reduce the proliferation of modalities to a single mechanism of metalevel reasoning about the propositions that are true in a context. By supporting a more detailed representation than the operators à and o, the dyadic entailment relation and the triadic legislation relation support metalevel reasoning about the laws, facts, and their implications. Following are some implications of Peirce's five kinds of possibility:

Logical possibility. The only statements that are logically necessary are tautologies: those statements that are entailed by the empty set. No special lawgiver is needed for the empty set; alternatively, every agent may be assumed to legislate the empty set:
{} = {p:Proposition | ("a:Agent)("x:Entity)legislate(a,p,x)}.
The empty set is the set of all propositions p that every agent a legislates as a law for every entity x.
Subjective possibility. A proposition p is subjectively possible for an agent a if a does not know p to be false. The subjective laws for any agent a are all the propositions that a knows:
SubjectiveLaws(a) = {p:Proposition | know(a,p)}.
That principle of subjective possibility can be stated in the following axiom:
("a:Agent)("p:Proposition)("x:Entity) (legislate(a,p,x) º know(a, x|=p)).
For any agent a, proposition p, and entity x, the agent a legislates p as a law for x if and only if a knows that x entails p.
Objective possibility. The laws of nature define what is physically possible. The symbol God may be used as a place holder for the lawgiver:
LawsOfNature = {p:Proposition | ("x:Entity)legislate(God,p,x)}.
If God is assumed to be omniscient, this set is the same as everything God knows or SubjectiveLaws(God). What is subjective for God is objective for everyone else.
Interrogative mood. A proposition is not questioned if it is part of the common knowledge of the parties to a conversation. For two agents a and b, common knowledge can be defined as the intersection of their subjective knowledge or laws:

CommonKnowledge(a,b) = SubjectiveLaws(a) Ç SubjectiveLaws(b).
Freedom. Whatever is free or permissible is consistent with the laws, rules, regulations, ordnances, or policies of some lawgiver who has the authority to legislate what is obligatory for x:
Obligatory(x) = {p:Proposition | ($a:Agent) (authority(a,x) Ù legislate(a,p,x)}.
This interpretation, which defines deontic logic, makes it a weak version of modal logic since consistency is weaker than truth. The usual modal axioms opÉp and pÉàp do not hold for deontic logic, since people can violate the laws.

Reasoning at the metalevel of laws and facts is common practice in courts. In the United States, the Constitution is the supreme law of the land; any law or regulation of the U.S. government or any state, county, or city in the U.S. must be consistent with the U.S. Constitution. But the tautologies and laws of nature are established by an even higher authority. No one can be forced to obey a law that is logically or physically impossible.

6. Nested Graph Models

To prove that a syntactic notation for contexts is consistent, it is necessary to define a model-theoretic semantics for it. But to show that the model captures "the primary intended interpretation," it is necessary to show how it represents the entities of interest in the application domain. For consistency, this section defines model structures called nested graph models (NGMs), which can serve as the denotation of logical expressions that contain nested contexts. Nested graph models are general enough to represent a variety of other model structures, including Tarski-style "flat" models, the possible worlds of Kripke and Montague, and other approaches discussed in this paper. The mapping from those model structures to NGMs shows that NGMs are at least as suitable for capturing the intented interpretation. Dunn's semantics allows NGMs to do more: the option of representing metalevel information in any context enables statements in one context to talk about the laws and facts of nested contexts and about the intentions of agents who may have legislated the laws.

To illustrate the formal definitions, Figure 9 shows an informal example of an NGM. Every box or rectangle in Figure 9 represents an individual entity in the domain of discourse, and every circle represents a property (monadic predicate) or a relation (predicate or relation with two or more arguments) that is true of the individual to which it is linked. The arrows on the arcs are synonyms for the integers used to label the arcs: for dyadic relations, an arrow pointing toward the circle represents the integer 1, and an arrow pointing away from the circle represents 2; relations with more than two arcs must supplement the arrows with integers. Some boxes contain nested graphs: they represent individuals that have parts or aspects, which are individual entities represented by the boxes in the nested graphs. The relations in the nested boxes may be linked to boxes in the same graph or to boxes in some area outside the box in which they are contained. No relations are ever linked to boxes that are more deeply nested than they are.

Figure 9: A nested graph model (NGM)

Formally, an NGM can be defined in equivalent ways with the terminology of either hypergraphs or bipartite graphs. For convenience in relating the formalism to diagrams such as Figure 9, a nested graph model G is defined as a bipartite graph with four components, G=(A,B,C,L):

Arcs. A is a set of arcs, each of which is a pair (c,b), where c is a node in C and b is either a node in B or a box node of some NGM in which the NGM G is nested.
Boxes. B is a set of nodes called boxes. If b is any box in B, there may be a nested graph model H that is said to be contained in b and directly nested in G. An NGM is said to be nested in G either if it is directly nested in G or if it is directly nested in some other NGM that is nested in G. The NGM G may not be nested in itself, and any NGM H nested in G shall be contained in exactly one box of G or some NGM nested in G.
Circles. C is a set of nodes called circles. If c is any circle in C, any arc (c,b) in A, for which b is some box in B, is said to belong to c. For any circle c, the number n of arcs that belong to c is finite; and for each i from 1 to n, there is one and only one arc a_i, which belongs to c and for which label(a_i)=i.
Labels. L is a set of entities called labels, for which there exists a function label: AÈBÈC®L. If a is any arc in A, label(a)=i is a positive integer; no two arcs that belong to the same circle may have identical labels. If b is any box in B, label(b) is said to be an individual name; no two boxes in B may have identical labels. If c is any circle in C, label(c) is said to be a relation name; multiple circles in C may have identical labels.

The terms box and circle for the two kinds of nodes in the bipartite graphs suggest a way of drawing them in informal diagrams, such as Figure 9. Those diagrams, however, have no formal status in the theory, and any other notation that shows the equivalent relationships would serve equally well. Any NGM that is drawn on paper or stored in a computer must be finite, but for generality, there is no theoretical limit on the cardinality of any of the sets A, B, C, or L. In computer implementations, character strings are usually chosen as names to label the boxes and circles, but in theory, any sets, including images or even uncountably infinite sets, could be used as names.

An NGM may consist of any number of levels of nested NGMs, but no NGM is nested within itself, either directly or indirectly. If infinite nesting depth is permitted, an NGM could be isomorphic to another NGM nested in itself. In a computer implementation, such nesting could be simulated with a pointer from an inner node to an outer node; but in theory, the outer NGM and the nested NGM are considered to be distinct. In any computer implementation, there must be exactly one outermost NGM in which all the others are nested. In theory, however, infinite NGMs with no outermost level could be considered.

Mapping other models to NGMs. Nested graph models are set-theoretical constructions that can serve as models for a wide variety of logical theories. They can be specialized in various ways to represent many other model structures:

Tarski's flat models. A Tarski-style model M=(D,R) consists of a set D called a domain of individuals and a set R of relations defined over D. By the usual methods for translating relations to graphs, the model M can be represented as a flat NGM G=(A,B,C,L), which contains no nested NGMs:
1. Each individual x in the domain D is represented by exactly one box b in B whose label is x: label(b)=x. For any x in D, the inverse label^-1(x) is the box in B that represents x.
2. Each tuple t=(x₁,...,x_n) of any relation r in R is represented by a circle c in C, which is labeled with the name of r and for which there are n arcs in A, which belong to c.
3. For each i from 1 to n, the i-th arc of the circle c for the tuple t is the pair (c, label^-1(x_i)). That arc is labeled with the integer i, and it is said to link c to the box labeled x_i.
4. The sets A, B, C, and L contain no elements other than those specified in steps 1, 2, and 3.
If the sets D and R happen to be uncountably infinite, there would not be enough character strings to serve as labels in L. Therefore, the elements of D and R themselves may be used as labels for the boxes and circles.
Kripke's possible worlds. Any Kripke-style model structure M=(K,R,Φ) can be translated to a nested graph model G=(A,B,C,L) with one level of nesting. G may be constructed from M in the following steps:
1. For each possible world w_i in K, the set B shall have a box b_i, which contains a flat NGM W that represents a Tarski-style model for w_i. The label of the box b_i shall be the world w_i.
2. For each tuple in the accessibility relation R, the set C shall have a circle c with the label "R", and the set A shall have two arcs: an arc (c,b_i) with the label "1" and an arc (c,b_j) with the label "2".
3. The sets A, B, and C shall have no other elements than those specified in steps 1 and 2. The set L shall be the union of all the labels specified in those steps: L=K∪{"R","1","2"}. The arcs, boxes, circles, and labels in any NGM W nested in a box b_i in B are derived from the Tarski-style model for the world w_i.
For finite models, these steps can be translated to a computer program that constructs G from M. For infinite models, they should be considered a specification rather than a construction.
Models for quantified modal logic. Kripke's original model theory was designed for propositional modal logic, which does not support quantifiers and variables that range over individuals. With quantified modal logic, the quantifiers may interact with the modal operators, as in the Barcan formula:
("x)oP(x) É o("x)P(x).
This axiom says that if for every x, some predicate P(x) is necessarily true, then it is necessary that for every x, P(x). It implies that all worlds accessible from a given world must have exactly the same individuals.
To allow quantification over the individuals in the possible worlds, A model G=(A,B,C,L) can be specified by starting with the first three steps for constructing an NGM H for a Kripke-style model and continuing with the following steps:
1. Let I be the union of the labels of all the boxes of every NGM W nested in H. Let the labels of L be the union of I with the labels of H.
2. For every label i in I, let there be a box b_i whose label is i. Let the boxes of B be the union of these boxes with the boxes of H.
3. For every box b of every NGM W nested in H, add a circle c with label "Identity" to the circles of the corresponding NGM V nested in G. Add two arcs to the set of arcs of V: an arc (c,b) with label "1", and an arc (c,d) with label "2", where d is the box in B for which label(d)=label(b).
4. The sets of G and of all NGMs nested in G shall be the union of the corresponding sets in H with all and only the additions specified in steps 1, 2, and 3.
Since the Barcan formula requires the possible worlds to be partitioned in equivalence classes that have the same individuals, any NGM that satisfies it would require the nested NGMs to be partitioned in equivalence classes with the same labels on their boxes. Nested graph models, however, can support more general models than those that satisfy the Barcan formula. They could, for example, support the notion of counterparts: some privileged NGM V₀ would represent the real world, and its boxes would be linked to boxes in the outer NGM G by circles with the label "Identity"; the boxes in other nested NGMs would be linked to the outer boxes by circles with the label "Counterpart". Two individuals in different possible worlds would be considered identical if their corresponding boxes were linked to the same box in G by circles labeled "Identity"; they would be considered counterparts if one circle was labeled "Counterpart" and the other was labeled either "Identity" or "Counterpart".
Montague's intensional models.
Barwise and Perry's situations.
Models for temporal logic.
Models for access-limited logics. Crawford and Kuipers (1991) defined access-limited logics, which partition the propositions of a knowledge base into frame slots, which are similar to the contexts of Peirce, Shapiro, Sowa, and McCarthy.

[This section is incomplete.]

As Quine observed, consistency in terms of a model does not ensure that the model captures "the primary intended interpretation." Six different reasons, organized in three pairs:

Partitioning the world.
Partitioning the namespace.
Grouping closely related entities.
Factoring out common information.
Reducing the interconnections between groups.
Reducing the search space for relevant information.

Representing Situations and Contexts. The conceptual graph in Figure 10 shows how conceptual graphs can make implicit semantic relationships explicit. At the top is a concept of type Situation, linked by two image relations (Imag) to two different images of that situation: a picture and the associated sound. The description relation (Dscr) links the situation to a proposition that describes some aspect of it. That proposition is linked by three statement relations (Stmt) to statements of the proposition in three different languages: an English sentence, a conceptual graph, and a formula in the Knowledge Representation Language (KIF).

Figure 10: A CG representing a situation of a plumber carrying a pipe

The Imag relation links an entity to an icon that shows what it looks like or sounds like. The Dscr relation or the corresponding predicate dscr(x,p) links an entity x to a proposition p that describes some aspect of x. In the metatheory about logic, the symbol |=, called the double turnstile, is used to say that some proposition p is entailed by some entity x. Semantic entailment x|=p means that the proposition p makes a true assertion about some entity x; an alternate terminology is to say that the entity x satisfies p. Semantic entailment is equivalent to the description predicate dscr(x,p):

("x:Entity)("p:Proposition)(dscr(x,p) º x|=p).

Literally, for every entity x and proposition p, x has a description p if and only if x semantically entails p. Informally, the terms semantic entailment, description, and satisfaction have been used by different philosophers with different intuitions, but formally, they are synonymous.

As Figure 10 illustrates, the proposition expressed in any of the three languages represents a tiny fraction of the total information available. Both the sound image and the picture image capture information that is not in the sentence, but even they are only partial representations. A picture may be worth a thousand words, but a situation can be worth a thousand pictures. Yet the less detailed sentences have the advantage of being easier to think about, talk about, and compute.

7. Nesting and Flattening

[This section is incomplete.]

A Tarski-style model or any of its generalizations by Hintikka, Kripke, Montague and others is a formal basis for determining the truth of a statement in terms of a model of the world.

In his presentation of model-theoretic semantics, Tarski (1935) insisted that his definition of truth applied only to "formalized languages" and that any attempt to apply it to natural language is fraught with "insuperable difficulties." He concluded that "the very possibility of a consistent use of the expression "true sentence" which is in harmony with the laws of logic and the spirit of everyday language seems to be very questionable, and consequently the same doubt attaches to the possibility of constructing a correct definition of this expression." As many logicians have observed, the most that model theory can do is to demonstrate the consistency of one set of axioms relative to another set that is better known or more widely accepted. For pure mathematics, for which applications are irrelevant, consistency is sufficient for truth: it guarantees that the axioms of a theory are satisfiable in some Platonic realm of ideas.

The requirement for introducing agents and their intentions demotes model theory from its role as the source of all meaning, but a model is still useful as a consistency check. For the theory of contexts presented in this paper, a single example is sufficient to prove consistency:

Let the language L₀ be ordinary first-order logic in any suitable notation, and let the domain D₀ be some collection of physical objects, such as the SHRDLU blocks world.
Let the context C₀ be the deductive closure of some statements written in L₀ that happen to be true of the objects in D₀. By construction, C₀ must be consistent.
Let the language L₁ be an extension of L₀ and its domain to include the blocks of D₀, all the syntactic features of L₀, and the predicate isTrueIn(p,C₀) for any formula p in L₀.
Let the context C₁ be a statement of Tarski's model theory for L₀ formulated in language L₁ with the predicate isTrueIn(p,C₀) replacing the predicate isTrue(p). If Tarski's model theory is consistent, then context C₁ must also be consistent.
To continue this technique, any established logical theory could be stated in contexts that include C₀ and C₁. A context C₂, for example, could state the first-order rules of inference for language L₀ and prove that those rules preserve truth as defined by the model theory in C₁.
The framework also allows statements in a context C_n to use the triadic legislate predicate to formalize theories about the agents in lower contexts and their knowledge and intentions. At level n, such theories would be stated in first-order logic, but they would have the effect of a modal or higher-order logic in contexts below n.

This construction demonstrates that the framework of nested contexts by itself does not introduce any inherent inconsistencies. People who use that framework, however, might introduce inconsistent statements into one or more of the contexts, but such inconsistencies would be the fault of the people who use the framework, not of the framework itself. The framework provides a mechanism for talking about relationships between languages and the world and about the agents who use those languages for talking about the world to one another. It even permits agents at one level to talk about inconsistencies at lower levels and about other agents (or themselves) who may be responsible for introducing them.

For applied mathematics, the truth of a theory requires a correspondence with structures that are more tangible than Platonic ideas. For applications, model theory must be supplemented with methods of observation and measurement for determining how well the abstract symbols of the theory match their real-world referents and the predicted relationships between them. Yet as philosophers from Hume to Quine have insisted, such a correspondence falls short of an explanation: a mere correspondence with observations could be accidental. A famous example is Bode's "law" for predicting the distance of planets from the sun; it matched the observed orbits for the planets up to Uranus, but it failed when Neptune and Pluto were discovered.

In Peirce's terms, correspondence is an example of Secondness: a dyadic relationship between symbols and their referents. That relationship is a prerequisite for truth, but not an explanation. Explanation requires Thirdness: a triadic predicate that relates a law-like regularity in the universe to the symbols and their referents. For physical laws, the lawgiver who is responsible for that regularity may be personified as God or Nature. For legal, social, contractual, and habitual regularities, the lawgiver is some mortal agent ¾ human, animal, or robot. Any theory of meaning that goes beyond a simple catalog of observations must be stated in terms of agents and their deliberate or habitual legislations.

Stratified levels. To simplify metalevel reasoning, Tarski advocated a method of separating or stratifying the metalevels and the object level. If the object language L₀ refers to entities in a universe of discourse D, the metalanguage L₁ refers to the symbols of L₀ and their relationships to D. The metalanguage is still first order, but its universe of discourse is enlarged from D to L₀ÈD. The metametalanguage L₂ is also first order, but its universe of discourse is L₁ÈL₀ÈD. To avoid paradoxes, Tarski insisted that no metalanguage L_n could refer to its own symbols, but it could refer to the symbols or the domain of any language L_i where 0£i<n.

In short, metalevel reasoning is first-order reasoning about the way statements may be sorted into contexts. After the sorting has been done, the propositions in a context can be handled by the usual FOL rules. At every level of the Tarski hierarchy of metalanguages, the reasoning process is governed by first-order rules. But first-order reasoning in language L_n has the effect of higher-order or modal reasoning for every language below n. At every level n, the model theory that justifies the reasoning in L_n is a conventional first-order Tarskian theory, since the nature of the objects in the domain D_n is irrelevant to the rules that apply to L_n.

Example. To illustrate the interplay of the metalevel transformations and the object-level inferences, consider the following statement, which includes direct quotation, indirect quotation, indexical pronouns, and metalanguage about belief:

Joe said "I don't believe in astrology, but they say that it works even if you don't believe in it."

This statement could be translated word-for-word to a conceptual graph in which the indexicals are represented by the symbols #I, #they, #it, and #you. Then the resolution of the indexicals could be performed by metalevel transformations of the graph. Those transformations could also be written in stylized English:

First mark the indexicals with the # symbol, and use square brackets to mark the multiple levels of nested contexts:

Joe said
  [#I don't believe [in astrology]
    but #they say
      [[#it works]
        even if #you don't believe [in #it]]].

The indexical #I can be resolved to the speaker Joe, but the other indexicals depend on implicit background knowledge. The phrase "they say" in English, like the French "on dit" or the German "man sagt," refers to the commonly accepted wisdom of Joe's community; it could be translated "every person believes." The two occurrences of #it refer to astrology, but the three nested contexts about astrology have different forms; for simplicity, they could all be rewritten "astrology works." When no explicit person is being addressed, the indexical #you can be interpreted as a reference to any or every person who may be listening. For this example, it could be assumed to be coreferent with "every person" in the community. With these substitutions, the statement becomes
```
Joe said
  [Joe doesn't believe [astrology works]
    but every person x believes
      [[astrology works]
        even if x doesn't believe
          [astrology works] ]].
```
If Joe's statement was sincere, Joe believes what he said. The word but could be replaced with the word and, which preserves the propositional content, but omits the contrastive emphasis. A statement of the form "p even if q" means that p is true independent of the truth value of q. It is equivalent to ((qÉp) Ù ((~q)Ép)), which implies p by itself. The statement can therefore be rewritten
```
Joe believes
  [Joe doesn't believe [astrology works]
    and every person x believes
      [astrology works] ].
```
Inside the context of Joe's beliefs, the detailed syntax of the nested context [astrology works] can be ignored. Therefore, a first-order rule of inference can be applied to substitute the constant "Joe" for the quantifier "every person x":
```
Joe believes
  [Joe doesn't believe [astrology works]
    and Joe believes [astrology works] ].
```
At this stage, the context of Joe's beliefs can be translated to propositional logic by using the symbol p for the sentence "Joe believes [astrology works]":
```
Joe believes [p Ù ~p].
```
This transformation exposes the contradiction in the context of Joe's beliefs.

For computer analysis of language, the most difficult task is to determine the conversational implicatures and the background knowledge needed for resolving indexicals. After the implicit assumptions have been made explicit, the translation to logic and further deductions in logic are straightforward.

In the process of reasoning about Joe's beliefs, the context [astrology works] is treated as an encapsulated object, whose internal structure is ignored. When the levels interact, however, further axioms are necessary to relate them. Like the iterated modalities ààp and àop, iterated beliefs occur in statements like Joe believes that Joe doesn't believe that astrology works. One reasonable axiom is that if an agent a believes that a believes p, then a believes p:

("a:Agent)("p:Proposition)(believe(a,believe(a,p)) É believe(a,p)).

This axiom enables two levels of nested contexts to be collapsed into one. The converse, however, is less likely: many people act as if they believe propositions that they are not willing to admit. Joe, for example, might read the astrology column in the daily newspaper and follow its advice. His actions could be considered evidence that he believes in astrology. Yet when asked, Joe might continue to insist that he doesn't believe in astrology.

References

All references have been moved to the combined bibliography.

Send comments to John F. Sowa.

Last Modified: