Automating Ontology Development


John F. Sowa



Source:  http://www.jfsowa.com/pubs/autotalk.htm

Abstract:  Large ontologies are required for two different but related purposes: understanding unrestricted natural-language text, and merging and aligning independently developed knowledge bases and databases. The largest ontologies currently available for these purposes (Cyc, WordNet, and EDR) have been developed at enormous expense by organizing and encoding the most critical information by hand. The high cost and slow pace of that development indicate that hand-coding techniques are obsolete, inflexible, and inappropriate for large-scale ontology development. Although fully automated ontology development is not yet feasible, many automated and semiautomated techniques have been implemented for performing some of the required subtasks. This talk surveys some of those techniques and proposes a framework for integrating them into a system of tools that could support more efficient and flexible methods of ontology development and customization.





 

Questions to Consider







 

Large Hand-Coded Ontologies

Three large ontologies: Cyc, WordNet, and EDR.

Building ontologies of this size by hand requires a great deal of time and money.







 

Small Hand-Coded Ontologies

Can such simple systems coexist peacefully with the grand ontologies?







 

Purpose of This Talk







 

2N+2 Hierarchies

Two hierarchies for each natural language:

Two language-independent hierarchies:







 

Some Automated Techniques

A small sample of many techniques, some similar, some related, and some radically different:







 

MindNet

Automated development of semantics for the MS-NLP project







 

Ariosto-Lex and Trevi

Ongoing project to develop integrated tools for NLP engineering







 

Formal Concept Analysis

Techniques for analyzing concepts and creating lattices







 

Table of Attributes and Categories

                Attributes
Concept Types   nonalcoholic   hot   alcoholic   caffeinic   sparkling
HerbTea              x          x        -           -           -
Coffee               x          x        -           x           -
MineralWater         x          -        -           -           x
Wine                 -          -        x           -           -
Beer                 -          -        x           -           x
Cola                 x          -        -           x           x
Champagne            -          -        x           -           x

Table of beverage types and attributes
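
As a minimal sketch of how formal concept analysis turns a table like this into the nodes of a lattice, the Python below computes every formal concept: a set of concept types paired with exactly the attributes they all share. The dictionary encoding and the function names (`extent`, `intent`, `formal_concepts`) are illustrative assumptions, not something given in the talk, and the brute-force enumeration is only suitable for small tables.

```python
from itertools import combinations

# Cross table from the slide: concept type -> attributes marked with "x".
CONTEXT = {
    "HerbTea": {"nonalcoholic", "hot"},
    "Coffee": {"nonalcoholic", "hot", "caffeinic"},
    "MineralWater": {"nonalcoholic", "sparkling"},
    "Wine": {"alcoholic"},
    "Beer": {"alcoholic", "sparkling"},
    "Cola": {"nonalcoholic", "caffeinic", "sparkling"},
    "Champagne": {"alcoholic", "sparkling"},
}


def all_attributes(context):
    """Union of every attribute that appears in the table."""
    return frozenset().union(*context.values())


def extent(attrs, context):
    """Concept types that have every attribute in attrs."""
    return frozenset(obj for obj, has in context.items() if attrs <= has)


def intent(objs, context):
    """Attributes shared by every concept type in objs."""
    shared = set(all_attributes(context))
    for obj in objs:
        shared &= context[obj]
    return frozenset(shared)


def formal_concepts(context):
    """Close every attribute subset; each distinct (extent, intent) is a concept."""
    attrs = sorted(all_attributes(context))
    concepts = set()
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            ext = extent(frozenset(subset), context)
            concepts.add((ext, intent(ext, context)))
    return concepts


# List the concepts from the most general (largest extent) downward.
for ext, itt in sorted(formal_concepts(CONTEXT),
                       key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair corresponds to one node of the lattice. Beer and Champagne always appear together in an extent, because the table gives them identical attribute sets.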







 

A Lattice of Beverages


Problem:  No distinction between Beer and Champagne.







 

Revised Lattice of Beverages


Solution:  Add attributes madeFromGrapes and madeFromGrain.
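
The effect of the two added attributes can be checked mechanically. The sketch below groups concept types by their attribute sets and reports any that cannot be told apart. Which beverage receives which new attribute is an assumption made here for illustration (Wine and Champagne as madeFromGrapes, Beer as madeFromGrain), and the helper name `indistinguishable` is hypothetical.

```python
from collections import defaultdict

# Original table: concept type -> attributes.
original = {
    "HerbTea": {"nonalcoholic", "hot"},
    "Coffee": {"nonalcoholic", "hot", "caffeinic"},
    "MineralWater": {"nonalcoholic", "sparkling"},
    "Wine": {"alcoholic"},
    "Beer": {"alcoholic", "sparkling"},
    "Cola": {"nonalcoholic", "caffeinic", "sparkling"},
    "Champagne": {"alcoholic", "sparkling"},
}


def indistinguishable(context):
    """Return groups of concept types that share exactly the same attributes."""
    groups = defaultdict(list)
    for obj, attrs in context.items():
        groups[frozenset(attrs)].append(obj)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Revised table: copy the original, then add the two new attributes.
# The assignment below is an assumption, not taken from the slide.
revised = {obj: set(attrs) for obj, attrs in original.items()}
revised["Wine"].add("madeFromGrapes")
revised["Champagne"].add("madeFromGrapes")
revised["Beer"].add("madeFromGrain")

print(indistinguishable(original))  # [['Beer', 'Champagne']]
print(indistinguishable(revised))   # []
```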







 

Disagreements Lead to Distinctions







 

Collaborative Development

Semiautomated development:

Collaborative development:







 

BIAIT

Business Information Analysis and Integration Technique, based on seven binary distinctions:

  1. Bill.  Does the supplier bill the customer, or does the customer pay by cash?

  2. Future.  Does the supplier deliver the product at some time in the future, or does the customer take the order from stock?

  3. Profile.  Does the supplier keep a profile of the customer, or is every transaction a surprise?

  4. Negotiate.  Is the price negotiated or fixed?

  5. Rent.  Is the product rented or purchased?

  6. Track.  Does the supplier keep track of the product after it is sold or not?

  7. Make to order.  Is the product made to order, or prefabricated?
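
Seven yes/no questions yield 2^7 = 128 possible combinations of answers. As a rough sketch, each combination can be represented as a mapping from distinction to a boolean. The identifier names and the convention that `True` means the first alternative in each question are assumptions made for this illustration.

```python
from itertools import product

# One key per BIAIT distinction; True selects the first alternative
# in each question (e.g. bill=True means the supplier bills the customer).
DISTINCTIONS = (
    "bill", "future", "profile", "negotiate",
    "rent", "track", "make_to_order",
)


def all_profiles():
    """Enumerate every combination of answers to the seven questions."""
    return [
        dict(zip(DISTINCTIONS, answers))
        for answers in product((False, True), repeat=len(DISTINCTIONS))
    ]


profiles = all_profiles()
print(len(profiles))  # 128

# Hypothetical example: a cash sale from stock at a fixed price,
# with no customer profile, no rental, no tracking, nothing made to order.
cash_and_carry = dict.fromkeys(DISTINCTIONS, False)
print(cash_and_carry in profiles)  # True
```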






 

Building Theories with BIAIT

Combinatorial construction of theories from conjunctions of axioms:
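
One way to picture this construction, offered as a sketch under stated assumptions rather than as the method used in the talk: treat each answer to a BIAIT question as an axiom, and a theory as the conjunction (here, a set) of the axioms it asserts. A theory that leaves some questions open is more general than any theory that adds axioms to it. The names `theories` and `generalizes` are hypothetical.

```python
from itertools import product

DISTINCTIONS = (
    "bill", "future", "profile", "negotiate",
    "rent", "track", "make_to_order",
)


def theories():
    """Every consistent conjunction of axioms.

    Each distinction is asserted (True), denied (False), or left open (None),
    so a theory never contains both an axiom and its negation.
    """
    result = []
    for choice in product((None, True, False), repeat=len(DISTINCTIONS)):
        axioms = frozenset(
            (name, value)
            for name, value in zip(DISTINCTIONS, choice)
            if value is not None
        )
        result.append(axioms)
    return result


def generalizes(general, special):
    """A theory generalizes another if its axioms are a subset of the other's."""
    return general <= special


all_theories = theories()
print(len(all_theories))                             # 3**7 = 2187
print(sum(1 for t in all_theories if len(t) == 7))   # 128 fully specified

general = frozenset({("bill", False)})
special = frozenset({("bill", False), ("future", True)})
print(generalizes(general, special))  # True
print(generalizes(special, general))  # False
```

The empty set of axioms generalizes every theory in the list, and the 128 fully specified theories are the most specialized ones.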







 

The Lattice of Theories

An infinite lattice of all possible theories, also called a Lindenbaum lattice:







 

Navigating the Lattice of Theories


Example:  By analogy, the earth and sun map to the electron and nucleus of the hydrogen atom.







 

Special-Purpose Theories

Belief-revision operators can accommodate such theories







 

Summary







 

References

For the slides used in this talk, see

http://www.jfsowa.com/pubs/autotalk.htm

For further discussion of the hierarchies, see

http://www.jfsowa.com/pubs/signtalk.htm

For even more detail, see the [unfinished] paper:

http://www.jfsowa.com/pubs/signproc.htm

All other references are [or will be] in the bibliography:

http://www.jfsowa.com/bib.htm






Copyright ©2001, John F. Sowa