A fundamental concept of AFS is that all storage internal to the system is managed automatically: the programmer refers to data and other objects by symbolic names rather than by physical addresses. Storage management would extend over levels from high-speed registers and monolithic memories up through Comanche files, optical storage devices, and even cataloged off-line storage such as tape libraries. Logically, all such storage is an integral part of the system, distinctions between levels are invisible to the programmer, and it is considered almost unlimited in size.
When independent formulations of a problem give rise to similar concepts, those concepts probably contain an essential element of the problem that is invariant under change of notation or frame of reference. The problem of distinguishing between objects and the mechanism for referencing them is a fundamental one that every computer system, programming language, and theory of computation must face:
For defining indices and pointers, storage addresses are useful, but the housekeeping they entail far outweighs their usefulness. The storage cell in AFS is a logical location capable of holding any object or collection of objects, no matter how large; its characteristics of a location simplify the definition of indices and pointers, but it involves no housekeeping burden because the storage management system makes the cell appear as large as necessary and automatically moves it to any device that may need to process its contents.
Definition: A storage cell is a logical location identified by a unique cell name. Each storage cell contains one and only one object; there is no upper limit on the size of a storage cell. The cell name is an internal identifier (abbreviated iid) whose representation is invisible to the user.
This definition is nonconstructive: it defines a storage cell by axioms or characteristics that are visible to the programmer, not by an explicit construction from something more primitive. The advantage of nonconstructive definitions is that the implementer has maximum freedom in his choice of representations and hardware design. The disadvantage of such definitions is that they don't prove that an efficient implementation (or even any implementation) is possible. To remedy that situation, the informal notes between definitions will illustrate the abstract concepts with a sample implementation. Since the illustration will not necessarily be the optimum engineering solution, the implementers are free to choose any design that satisfies the axioms.
Definition: A buffer is a temporary storage cell created for the purpose of holding an object until it can be processed or moved.
Buffers are intimately related to the mechanism for passing messages between objects such as arguments to functions and results from functions. Normally what is passed is the cell name of some storage cell containing the message. In computing X[I], for example, the select function returns the cell name for the storage cell containing X[I]. However, when the sum function computes (A+B), there is no permanently allocated storage cell containing the result. Therefore, the interpreter that is interpreting the function obtains a temporary storage cell, called a buffer, to hold the result. Buffers correspond to I/O buffers in current systems as well as to registers in the CPU or on a pushdown stack.
A particular implementation of the storage cell concept is discussed in the System Architecture Manual. The Storage Management Subsystem (SMS) described there provides spaces identified by unique space numbers; each space is linearly addressable by an offset from the beginning of the space. A collection of storage cells can be implemented as a space divided into a number of fixed-length blocks holding object images, also known as DAPOVs (Descriptor And Pointer Or Value). The cell name corresponds to the space number and offset to the object image; the uniqueness of space numbers guarantees the uniqueness of cell names.
The concept of process is fundamental to all levels of an information handling system: CPU, channels, operating systems, and external devices. A rationally designed system must have a precise concept of process and of the possible interactions between processes. In AFS, the definition of process is based on the well developed foundation of automata theory and is designed to facilitate the implementation of multiprocessing systems.
Definition: A process is an automaton with a set of states S and a set of states W contained in S, in which it waits for input. Processes can be best described by assuming they have three parts:
The process status record keeps track of all information that defines the current state of a process. In automata theory, a PSR is analogous to the instaneous description of a Turing machine. In a System/360 CPU, a PSR is analogous to the program status word with the contents of the fixed and floating registers. In the CDC 7600, the exchange jump package is the equivalent of the PSR. In the Burroughs 6700, the pushdown stack together with control words that may be stored in it form the equivalent of a PSR.
Above the SL level, a procedural description could be a read-only program. Beneath that level, procedural descriptions may be in microcode or hard wiring. The reason for separating the procedural description from other parts of a process is to allow a number of re-entrant processes to use the same description simultaneously. For primitive objects, the hardware may take shortcuts during high-speed execution and not separate the three parts of a process. For error logouts or responses to a diagnostic programmer, however, the system must generate a PSR that effectively represents the current state of an object.
The interpreter is the motive power that causes a process to move from one state to the next. It is the logical abstraction of active servers like CPUs and channels, but is more general since it includes software interpreters as well as special devices that may be attached as RPQs. The AFS logical architecture has deliberately avoided the concept of a CPU. Instead, the more general concept of process allows the engineer greater freedom to build distributed execution units, special purpose devices, and multiple processing units to improve performance without changing any logical interfaces.
The definition of process sets the stage for later discussion of wait states, exceptions, and suspensions. When a process needs input, it stays in one of its wait states indefinitely. A waiting process is considered asleep, and sending it input corresponds to a wake-up call. Exceptions are unusual conditions like arithmetic overflow or violations of access rights. When an exception occurs, the process in which it occurs generates a message for another process called a monitor and then goes into a wait state until it receives a message from the monitor. A suspension occurs when the motive force, the interpreter, is removed from a process, and the process stops because there is nothing to make it go. Suspensions result from time-sharing the interpreter among many processes so that only one can be running at any given time, but they can also occur when a process has run out of money (using too much time or space) or when it is stopped because of some other event like an attention signal from the programmer who started it.
Processes occur at all levels of a system. When concepts are not clearly defined, engineers and programmers working on different levels may be unaware that they are working on similar problems and duplicating functions that are performed on other levels. In System/370, for example, there are processes executing in channels and I/O untits, in microcode in the CPU, and at the instruction level for subroutines and tasks. The concepts, terminology, and data formats at the various levels completely obscure any similarities among these processes:
In AFS, the object is a generalization from two sources: descriptor/value pairs and resource/process associations. Descriptors are maintained with data in data management systems, APL, EULER, and the dynamically varying parts of PL/I. The type field in a descriptor can be interpreted as the name of a machine for accessing the value part. Although the few bits that describe a floating-point number don't exhibit many characteristics of a procedure, the generality of an access machine or procedure is valuable for complex arrays and structures and is essential for the intricate relationships in a large data base.
The association of a process with every resource derives from Dijkstra's approach in T.H.E. Multiprogramming System and from Ole-Johann Dahl's approach in Simula 67. Dijkstra associates a process with every resource in his system; the process is solely responsible for allocating that resource and acts as a central clearinghouse for all accesses to it. Chapter 2.5 shows that all objects in AFS have the properties of Dijkstra's resources and naturally fit into a general scheme of resource management. Alan Perlis suggested that simulation languages might provide a suitable basis for an operating systems language since they have the best developed concepts of event and process. The AFS concept of objects as processes is a generalization of the objects in the simulation language Simula 67.
Definition: An object is the basic entity in the system. It has an active part called an access machine and a passive part called an owned resource. Its active part responds to requests by other objects and may in turn generate requests of its own.
Since this definition is general enough to accommodate source-sink I/O devices as well as objects as powerfulas a Turing machine, it can include any conceivable device within the standard accessing and allocating method. For a floating-point number, the implementation could specify a fixed-length bit string as the resource and a few bits to identify a hardware unit as the access machine. For I/O devices, the object internal to the system would be called a port, whose resource would be a logical connection to the external device and whose access machine could be a hardware or microcoded control unit. Since the internal structure of an object is invisible to the caller, an object implemented in hardware or microcode on one system could be implemented in software on another.
As in Simula 67, a software access machine is a procedure that defines a potentially infinite class of activations. An object corresponds to a process executing in one such activation. A ready state is a point in the procedure where the process waits for input. And the resource is a set of [local] variables used by the activation. Logically, all objects are processes. Even a floating-point variable is a process that is normally waiting, but must occasionally answer requests to deliver a value or to stow one away.
Definition: A primitive object is one that cannot be constructed from other objects in the system. The PSR, interpreter, and procedural description that make up its access machine are not objects normally defined in the logical architecture.
Somewhere underneath all the logical data structures, there must be primitive building blocks from which everything else can be constructed by software. Although the logical definitions of primitive objects are parallel to the constructions of other objects, their substructure is visible only to the engineers and diagnostic programmers.
Definition: A reducible object is one that can be constructed from other objects. The PSR, interpreter, and procedural description of its access machine are AFS objects that can be manipulated by SL.
Primitive objects are defined axiomatically in terms of their effects on other parts of the system. Sometimes, reducible objects are defined axiomatically, but most reducible objects are defined by an explicit construction in terms of primitive objects. All primitive objects are implementation defined. Many reducible objects are implementation defined, and others can be user defined. For efficiency, reducible implementation-defined objects may be built out of hardware or microcode even though they can be constructed out of more primitive objects. Logically, however, all reducible objects have the same status whether they are implementation defined or user defined.
Definition: The primitive object nil has an access machine with only one state. For every request, nil returns a copy of itself. For operations on lists, nil has the properties of a zero element list.
In APL\360, the empty vectors are similar to nil, but they have additional type information. The empty charcter vector has a description that indicates that it is of type character, and it expands into blanks. The empty numeric vector is of type numeric, and it expands into zeroes. [In AFS,] nil is of type any, and it expands into a list of undefined objects. [Note: a better option might be to have nil expand into a list of nil objects rather than undefined objects.]
Definition: The primitive object undef has an access machine with only one internal state. For every request except destroy, undef raises an error exception.
Logical storage cells can never be empty. If nothing else has been put in them, they contain an undefined variable object. The object nil is a general neutral element. It responds without error exception to any request, although some functions such as + or − may themselves raise error exceptions when given a nil operand. The object undef is a general undefined element; it always raises error exceptions except when being copied or destroyed.
Primitve objects are so basic to the structure of the system that they cannot be constructed by software. Hardware devices may not be primitive in the same sense because a disk drive, for example, could be simulated by a software routine that duplicates the interface and uses the storage management system to perform the same functions; but there is no sequence of instructions that could createa new disk drive in the corner of the machine room and physically attach it to the computer. Therefore, certain objects must be built in from the beginning, and others may be attached as the system expands or be removed when they fail. As long as the physical interface provides circuitry that matches voltage levels and makes the device look like a procedure, the logical interface can make room for it in the object base and can define synonyms and access machines that make it respond to any protocol expected of it.
Definition: A port is an object that communicates with the world outside the system. Its access machine handles the interface, and its owned resource is a logical connection to a physical object.
Since ports are objects, they have the same interface as all other objects. They have a well-defined status with respect to the accessibility graph, environment tree, and dependency graph, and they respond to requests in the same way as other objects. Therefore, it is always possible to replace a port with a software object that has the same interface. Programmers can create logical printers, simulated 2314 disks, and even simulated networks of machines. If a graphic object has an unusual interface, the real port to the device can be replaced by a logical port that behaves like a printer, but that contains a program to massage control information passed with a request and send it to the graphic device in the appropriate format. To make network communication more transparent to the user, the system will provide identical interfaces for a virtual System/370 emulated inside the system and for a real System/370 at the far end of a telephone line.
If communication with a system were in the character set format of typewriters and printers, the internal representation of an object would be of no concern to programmers and could remain totally invisible. But since data may be interchanged between systems, either conversationally or by removable storage media, there must be a standard representation of an object that can be recorded on an external medium and be reconstructed on a different system. This standard representation is called an object image. Every system is free to use its own internal forms, but they must all be directly mappable to the standard form for an object image. [Note: the term "object image" was adopted from the APL\360 system, but the basic idea was implemented in the 1950s for LISP. The currently popular term is "serialization".]
Definition: An object image is an external representation of an object. The object image has two parts corresponding to the two parts of the object: a descriptor that specifies the access machine and a representation of the owned resource.
The object image is an external form of the DAPOV (Descriptor And Pointer Or Value) discussed in the System Architecture Manual. Although a DAPOV on a small system may be different from a DAPOV on a large one, the object images will be the same for all. The object image may be considered as the DAPOV for an abstract implementation of the AFS. It may turn out to be identical to the internal DAPOVs of one or more actual implementations, or it may be a compressed encoding of the internal DAPOVs.
Definition: The object base is the set of all objects in the system.
The term object base is more general than the term data base, since it also includes the logical interfaces to hardware resources. Because of the generality, all hardware devices have descriptors and can have synonyms defined upon them. Whenever a device breaks down, its descriptor can be changed to point to another device or to a software simulator that can replace it. All of the advantages of late binding therefore apply to devices as well as data. Instead of doing a SYSGEN for every configuration, implementers can provide standard logical facilities, make descriptors for nonexistent facilities point to substitutes, and keep the logical appearance constant as descriptors are changed one by one to reflect the current configuration.
The definition of object given above implies that all objects are serially reusable resources. Nonreusable objects can be implemented by making the access machine destroy the object after its first (or n-th) use. No requests can bypass this check, since an object cannot be used except through its access machine.
[In effect,] re-entrant procedures and time-shared devices correspond to a potentially infinite class of serially reusable objects. By subdividing storage, a single re-entrant procedure can provide [local] variables for as many activations as requested. By subdividing time, a time-slicing routine can provide multiple logical devices that all perform the same function as a single physical device. The AFS view of objects as processes treats the problem of resource management as a problem of interprocess communication.
Definition: A request on an object is a triple (T;P;D), where T identifies the request type, P is information proper to that type, and D is the destination or object that is to receive the answer. Normally, the access machine of the object will execute the request and return a result to the object D. In some cases, the access machine will cause an event called an exception. See Section 2.4.1 for a definition of exceptions and the ensuing events.
[Note: If an object x sends a request (T;P;D) to an object y, the destination D of the result may be the sender x or some third object z. This convention allows objects to support any kind of synchronous or asynchronous protocol. In a simple synchronous request, the object x would send a request (T;P;x), in which the destination is the sender; after sending the result, x could wait until the answer came back. To allow more parallelism, the sender x could send a request (T;P;x) and not wait for the answer; at some later time, x could check its input queue to see whether the answer was present. The definition, however, is general enough to support a wide range of protocols for blackboards or whiteboards that may hold messages that might be answered by objects whose identity was unknown to the object that sent the original request. For example, the sender x might generate a request (T;(P;x);y) for a blackboard manager y, which might use some private method for determining which object z could process the request type T. Then y would send the request (T;P;x) to z, which would send the result back to x.]
Definition: The dependency graph is a structure defined over the object base. If an object x has a request on its input queue that specifies an object y as its destination, then y is said to depend on x, and (y,x) is an edge of the dependency graph.
Later chapters will bring out the implications of the dependency graph in resource management, process dispatching, and deadlock determination. Chains of subroutine calls form a subgraph of the dependency graph called the activation tree. If x is an activation of a program that calls a subroutine y, then x is dependent on an activation of y until it returns. [Note that a re-entrant program p may have many activations, each of which is an object that is distinct from p and from any other activation of p.]
Since every object has an access machine, it always has an active element available to perform necessary functions. A typical function is that of monitoring. During debug mode, the programmer may wish to monitor all accesses to a particular variable and then perform a specific action such as recording the access, calling some procedure, or waiting for instructions from the terminal. For sensitive data, all requests on an object may cause its access machine to check the identity of the caller and to notify a security officer of an access attempt by an unauthorized user. For proprietary software on lease, the access machine might destroy the object after thousand uses. All these applications rely on the invisibility of an object's internal structure — when an ordinary variable is replaced by one that is being monitored, its normal interface remains unchanged.
Definition: An access machine has the following external interface:
Definition: In order to specify requests, a primitive request constant is defined for each of the request types. The names of the request constants are formed by adding 's' to the corresponding request names: authorizes, copies, deletes, destroys, evaluates, identifies, inserts, selects, starts, and stows.
[More to come.]
Definition: A function is
Definition: The triadic function request
Definition: The monadic function evaluate
Definition: The dyadic function stow
Definition: A synonym is
Definition: The dyadic function authorize
Definition: The monadic function syn
Theorem: If a request of type T is made on an object through a chain of synonyms, then T must be in the intersection of the rights of all synonyms in the chain.
Definition: A metonym is
Definition: A collective object is
Definition: If x is a collective object
Definition: An elementary object is
Definition: The ownership relation between collective objects and storage cells has the following properties:
Theorem: Every object except the system root is an element of one and only one collective object.
Theorem: The ownership relation defines a tree structure over the object base. The system root is the root of the tree, collective objects are at branching nodes, and elementary objects are at leaves of the tree. Call this tree the ownership tree.
Definition: The index set of an object x is
Definition: A list L is
Definition: The monadic function ilist
Definition: The dyadic function select
Definition: An object x is directly accessible from y if
Definition: An object x is indirectly accessible from y if
Definition: An object x is accessible from y if
Definition: The accessibility graph is a union of the ownership tree and the chains of synonyms: (x,y) is an edge of the graph if either x is a synonym for y, or y is an element of x.
Theorem: If x is an element of y or if x is indirectly accessible from some object z which is an element of y, then there exists an element n in the index set of y such that x=select(n;y). Call n a simple name for accessing x from y.
Definition: A path from an object y to an object x is
Theorem: If x is accessible from y, then there exists a path name from y to x.
Definition:
Definition:
Definition:
Definition:
Theorem:
Theorem:
2.1.7.1 Lists
Definition:
2.1.7.2 Structures
Definition:
Definition:
Definition:
Definition:
Definition:
2.1.7.3 Arrays
Definition:
Definition:
Definition:
Definition:
Definition:
Definition:
Definition:
Definition:
[There is more to this chapter that has not yet been transcribed.]