William Kent, "A Framework for Object Concepts", HPL-90-30, Hewlett-Packard Laboratories, April 1990. [10 pp]
> ABSTRACT
> 1 INTRODUCTION
> 2 WHY FOCUS ON BEHAVIOR?
> 3 THE CORE BEHAVIOR
> 4 DIFFERENT ROLES IN THE PARADIGM
> 5 DIFFERENT SORTS OF OPERANDS
> 6 DIFFERENT NOTIONS OF METHOD
> 7 CLIENTS
> 8 CONTEXT
> 9 CHANNELS
>> 9.1 Method Selection Modes
>> 9.2 Other Channel Activities
>>> 9.2.1 Binding
>>> 9.2.2 Accessibility Control
>>> 9.2.3 Marshaling Arguments
>>> 9.2.4 Communication With the Method
>>> 9.2.5 Returning Results
> 10 CONCLUSIONS
> 11 ACKNOWLEDGMENTS
> 12 REFERENCES
A framework is proposed in which to compare what the essential concept of "object" is in various object models.
Many people know what an object is, but few agree. We use a common vocabulary - identity, encapsulation, inheritance, messages, polymorphism, types and classes, etc. - but we're not talking about the same thing. The problem goes deeper than the different definitions of such things as type and inheritance. We have fundamentally different views as to what an object is in the first place.
At a meeting not too long ago, a group of us agreed that portability of objects was an essential requirement. A little probing behind this apparent consensus revealed that some of us were talking about moving documents around, others thought we were talking about installing the document class, while yet another group thought we were talking about porting the editor software.
This paper tries to establish a framework within which various object models can be placed, in order to clarify their differences. The most useful organizing principle seems to be one centered on the behavior of objects, i.e., how they are involved in the servicing of requests. We begin the framework by describing various roles in a scenario of request processing. Object models will differ at the outset in their assumptions as to which role is the essence of "objectness". Then we will examine alternative definitions of the various roles. Even when models agree on the role played by objects, they often assume significantly different interpretations of those roles.
The work is by no means finished; it needs to be validated, and probably modified. Proponents of various object models are invited to test whether and how their model fits into the framework. If the fit is not too good, then the framework needs adjusting. If the model does fit, then we have a reasonable basis for comparison with other models. The differences can be identified and articulated.
This paper has a surprisingly narrow focus. It does not address different interpretations of such concepts as identity, encapsulation, inheritance, polymorphism, types and classes, etc. There's enough challenge in just exploring the essential nature of what an object is.
Object-orientation, at least in the database world, is divided into "structural" and "behavioral" approaches.
The behavioral view is more fundamental, and explains both.
The structural approach extends the paradigm of programming language data structures and database data models. It is essentially a visual, spatial metaphor, resting on a static vision of how things are laid out in space. Structural object orientation is attained by complex structures capable of mirroring the complex configurations of the things many applications deal with, such as documents and engineering designs. Such structures are made uniformly available for all application domains. The difference between structurally object-oriented data models and prior data models is the complexity of their data structures.
Behavioral object orientation, in contrast, extends the paradigm of the abstract data type, founded on the notion that you know a thing by what it does rather than how it looks. Behavioral object orientation allows users to create operations designed to be semantically meaningful in their specific application domains. Instead of inserting tuples into relations, they hire employees, connect electrical circuits, and insert diagrams into documents. These operations may be realized in various combinations of programmed procedures and data structures - which may or may not be structurally object oriented.
Is the distinction here really between structure and behavior?
How do we know that we are manipulating a data structure, such as a relation? Because we have operations that make us believe we are inserting and retrieving tuples in them, and because we have user interfaces that paint pictures of tables with rows and columns on our screens. But most of us know very well that you won't really find tables with rows and columns in the machine. The data is really scattered around in different sorts of address spaces, pages, buffers, and physical device configurations, differently in different implementations. The whole point of the relational model, or any data model, is that it is an abstraction designed to provide common behaviors on top of a variety of different physical implementations.
The only way we know we're dealing with a relational model is because we have relational operations providing its behavior. There's a lot of software working very hard down there to provide this illusory behavior.
Data structures are sometimes considered essential to explain state and update, but these can be characterized in purely behavioral terms. State can be described in terms of the results returned by operations. Update, or change of state, can be described as alterations in the behavior of operations. State essentially captures the cumulative effect of operations.
"Structural vs. behavioral" is a handy slogan, but it misses the point. The real distinction has to do with who gets to define the operations. The "structures" of structural object orientation are embodied in a fixed set of operations defined by the builders of the data management system. The operations are intended to be universal, equally useful in any application domain. Structural object orientation simply means that the operations available to users are defined by the builders of the data management systems rather than by the users themselves.
We therefore rest the general object paradigm on a behavioral foundation.
A request, referring to an operation and some operands, issued by a client in some context, is communicated via a channel to a method which does something in response, perhaps returning some result to the client. The choice of method may be influenced by the classes to which the operands belong as instances, as well as the operation and the context.
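The roles named in this paragraph can be laid out as a minimal Python sketch. It is an illustration under stated assumptions, not a definition from this paper: the `Request` and `Channel` classes, the `Article` and `Printer` operand classes, and the `print_article` method are all hypothetical names. The channel here selects a method only from the operation and the exact classes of the operands; the context is carried along but not consulted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class Request:
    operation: str                       # which operation is being requested
    operands: Tuple[Any, ...]            # the things the operation applies to
    client: str                          # who issued the request
    context: Dict[str, Any] = field(default_factory=dict)  # e.g. time, place


class Channel:
    """Carries a request to a method chosen from the operation and operand classes."""

    def __init__(self) -> None:
        self._methods: Dict[Tuple[Any, ...], Callable[..., Any]] = {}

    def register(self, operation: str, operand_classes: Tuple[type, ...],
                 method: Callable[..., Any]) -> None:
        self._methods[(operation, *operand_classes)] = method

    def send(self, request: Request) -> Any:
        key = (request.operation, *(type(o) for o in request.operands))
        method = self._methods[key]      # KeyError here means no method applies
        return method(*request.operands)


# Hypothetical operand classes and method, for illustration only.
class Article:
    def __init__(self, title: str) -> None:
        self.title = title


class Printer:
    def __init__(self, name: str) -> None:
        self.name = name


def print_article(article: Article, printer: Printer) -> str:
    return f"{article.title} printed on {printer.name}"


channel = Channel()
channel.register("Print", (Article, Printer), print_article)
request = Request("Print", (Article("Report"), Printer("lp0")),
                  client="retrieval program")
result = channel.send(request)
```

Note that nothing in the sketch is labeled "the object": the request, the operands, the client, the channel, and the method are all ordinary values, so any of them could be given that title.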
The first question for the reader is whether this operational paradigm is the right organizing principle for comparing different concepts as to what an object essentially is. If not, what is?
It would be useful to determine, for example, whether some readers thought the essence of object orientation had something to do with complexity of data structures, or iconic styles of user interface, or management of multi-media data.
The reader will note a deliberate omission. The operational paradigm doesn't mention objects. That leaves room to discuss a central ambiguity: which of those roles epitomizes the notion of object? If you were to put the word "object" into the core paradigm, where would you put it?
That's the second question.
Consider a request from an information retrieval program to print a certain article on a certain printer. There is an editor which does printing as well as other document services. Other programs also provide document services, such as an independent spelling checker.
Different models tend to focus on different participants as being "the object" in this scenario. Prime candidates are:
The exact question isn't about which of these is an object. All of them might be treated as an object at one time or another. They might all be the operand of a destroy operation. The client and the method might each be the operand of compile or trace or edit operations. The operation and the channel are themselves objects in some models.
The question has more to do with which of these roles the model tends to focus on as the essence of "objectness". This is often implicit in underlying assumptions. For example, in saying that "objects must be portable", the speaker might have in mind the ability to move the article elsewhere, or the ability to install the document class, or the ability to port either the information retrieval program or the editor program.
A neutral view would take that scenario to be the behavioral core of the object paradigm without prejudice as to which are the objects. However, this paper might reveal a bias toward thinking of the operands in the request as being the focal objects.
We next examine alternative definitions for each of the roles in the paradigm.
When the focus is on the operands of operations, we tend to include the results of the operations as well. Even here, there is room for a variety of interpretations, starting with this general one:
Focusing on objects as the passive subjects of activity, we arrive at a narrower notion of object:
Thus, for example, in a Connect(wire1,pin2) operation, wire1 plays the first role and pin2 plays the second role.
That distinction between operands and results becomes significant in some models which define object interfaces or protocols only in terms of operations for which they are operands, not for which they are results.
The messaging paradigm is introduced by distinguishing one operand as the recipient of the operation. The operation can then be called a message, and the other operands are then called parameters. Thus the further restriction:
One syntactic convention is to so distinguish the first operand. Under this convention, the Connect(wire1,pin2) operation is considered a message to wire1, taking pin2 as a parameter. Another syntax places the recipient first: wire1.Connect(pin2).
The operation and messaging paradigms are indistinguishable for unary (single-operand) operations. Thus Length(wire1) is both an operation applied to wire1 and a message sent to wire1, though in message syntax it may take the form wire1.Length.
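The two syntaxes can be illustrated in Python, where a call through the class and a call through the instance reach the same method. This is only a sketch: the `Wire` and `Pin` classes and the lower-case method names `connect` and `length` are assumptions made for the example.

```python
class Pin:
    pass


class Wire:
    def __init__(self, length: float) -> None:
        self._length = length
        self._pins = []

    def connect(self, pin: Pin) -> None:
        self._pins.append(pin)

    def length(self) -> float:
        return self._length


wire1, pin2 = Wire(2.5), Pin()

# Operation syntax: the recipient is simply the first operand.
Wire.connect(wire1, pin2)

# Message syntax: the recipient is written first, the rest are parameters.
wire1.connect(pin2)

# For a unary operation the two readings coincide.
same = Wire.length(wire1) == wire1.length()
```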
The behavior of some operations may change. If IsConnected(wire1,pin2) returns True or False depending on whether they are connected, then its behavior may be changed by Connect(wire1,pin2). Operations whose behavior can be so modified are side-effectable, and operations causing such modification are said to have side effects. Thus IsConnected is side-effectable, and Connect has side effects.
Side-effectable operations provide the basis for a behavioral characterization of state. The state of a set of objects can be defined as the current values of a set of side-effectable operations taking those objects as operands. Thus the current value of IsConnected(wire1,pin2) tells us something about the current state of wire1 and pin2, namely whether or not they are connected to each other.
This leads to a refinement of (2):
In this form, we have one kind of encapsulation. The states of objects are known only by the behaviors of operations, and not in terms of internal structure. The states of objects are thus collectively encapsulated from users of the objects - but not from each other. State, as in the connection example, may jointly characterize several objects.
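A minimal sketch of this behavioral view of state follows. The function names `connect` and `is_connected` and the use of strings to stand for objects are assumptions for illustration. Clients observe state only through the result of `is_connected`; the set that records connections is an implementation detail, and each entry characterizes a wire and a pin jointly rather than either one alone.

```python
# Hidden implementation: clients never see this set, only the operations below.
_connections = set()


def connect(wire: str, pin: str) -> None:
    """Operation with side effects: it changes what is_connected returns."""
    _connections.add((wire, pin))


def is_connected(wire: str, pin: str) -> bool:
    """Side-effectable operation: its result is the observable state."""
    return (wire, pin) in _connections


before = is_connected("wire1", "pin2")
connect("wire1", "pin2")
after = is_connected("wire1", "pin2")
```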
A stronger form of encapsulation, common in many object models, also isolates the states of objects from each other. The IsConnected(wire1,pin2) operation, perceived as a message to wire1, is perceived to reflect the state of wire1 but not of pin2. This isolated-state encapsulation model is obtained as the intersection of (3) and (4):
If all the side-effectable operations are unary, such as Length, then (4) and (5) are the same.
Operations may return large or small literal values (numbers, strings), aggregate objects (sets, lists), or any other sorts of objects. But objects so far have no sense of content. Operations such as Display, Move, or Copy don't have any meaning. We can introduce content by means of:
Thus a file may be modelled as an object having a Content operation which returns a long string.
The above definitions apply equally to numbers under the arithmetic operations; numbers are sometimes considered to be objects without state. Some models reserve the term "object" for things which are explicitly created. Things which are syntactically recognizable, such as numeric values and extensional sets, are not considered objects. We thus have
That restriction can be combined with some of the other definitions given above.
Finally, to accommodate object models founded on a structural approach, we offer a different kind of definition:
The method is on the implementation side of the encapsulation boundary, and is not directly apparent to clients of objects, i.e., invokers of operations.
Some object models assume a method implements one operation. Others assume a method implements all operations for a class; this is a common model in the PC environment, where all services for a given object are provided in a single document processor (editor) or spreadsheet processor.
A third possibility is that a method might support several but not all operations for a class. A stand-alone spelling checker might provide additional services not provided by an editor.
An operation might be implemented in several alternative methods, giving rise to polymorphism (generic operations). An important task of the channel is to select the appropriate method, i.e., bind the operation.
Methods are usually defined explicitly, as procedures associated with classes. They can also be specified implicitly, via data bindings. For example, the implementation of an operation might be specified by binding its result to a slot in a frame structure; it might also be specified by binding its operand to the key of a relational table and its result to another column. The retrieval and update techniques are implicitly understood.
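The relational-table case of an implicit method can be sketched as follows. The `employees` table, its column names, and the `bind_to_table` helper are hypothetical; the point is only that the retrieval logic is derived from the binding rather than written by hand for each operation.

```python
# A hypothetical relational table, represented as a list of rows.
employees = [
    {"emp_id": 1, "name": "Ada", "salary": 5200},
    {"emp_id": 2, "name": "Grace", "salary": 6100},
]


def bind_to_table(table, key_column, result_column):
    """Derive a retrieval method from a data binding.

    The operand is bound to key_column and the result to result_column.
    """
    def method(operand):
        for row in table:
            if row[key_column] == operand:
                return row[result_column]
        raise KeyError(operand)
    return method


# Two operations specified purely by bindings, with no per-operation code.
salary = bind_to_table(employees, "emp_id", "salary")
name = bind_to_table(employees, "emp_id", "name")
```

A client calling `salary(2)` sees an ordinary operation and cannot tell whether it is backed by a procedure or by a binding.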
A method for one operation can be an operand or result object for another, as when compiling a method, installing it, or querying its status.
Clients are the external users of objects, from whom implementations are hidden by encapsulation. They issue requests. They are not to be confused with clients in a client/server architecture.
The following are possible clients issuing requests:
Not all models support all these kinds of clients.
A request occurs in a context, a loosely defined notion which may include such information as the time and place of invocation, the identity of the client, resource availability, communication modes, etc.
A context might identify what mode of method selection is currently in effect (see below). It can also influence the choice of method, which might depend on the identity of the client, the hardware or software environment, available resources, etc.
"Channel" is a very broad concept, involving many things related to choosing a method and initiating its activity. It involves infrastructures, binding, type checking, inheritance, location services, and many other things. Various channel mechanisms coupled with a variety of request forms account for a good many model variants.
Many major features of object orientation come into play in the behavior of channels. We will just touch on them in passing to position their role in the framework.
There are various modes of determining the appropriate method.
In call mode, the method is directly designated by the operation in the request, as in a remote procedure call.
In session mode, the channel would be a tightly bound link between the client and method, such that the request unconditionally flows to a predetermined method. This might be the case, for example, in an edit session that has been established with an editor. The edit commands simply flow to the editor without any significant mediation or intervention. In general, such a session would have been established by a prior request, and may now be thought of as part of the context in which the client is issuing subsequent requests. There may still be something in the channel which is listening for a request to end the session.
Another example might be a link between a semantic object (or data object) and a presentation object. Pipes and hypermedia links might also be examples.
In operation binding mode, the request has an operation and several operands. The channel chooses the method by a complex algorithm that might depend on the operation, the operands and their types or classes, and various factors in the context such as location and resource availability. The channel now serves as an infrastructure, involving various network protocols, authorization checks, concurrency management, and marshaling of arguments (collecting and transmitting certain information about the arguments).
Message binding is like operation binding, except that the recipient is the only operand which influences the choice of method. The behavior of the channel is sometimes described as routing the request to the recipient, which then chooses the appropriate method for the operation. This variant may even take the view that the recipient contains its set of methods.
Early binding is a variant of either operation binding or message binding. It means that some of the binding decisions have been made prior to the issuance of the request, e.g., at compile time. The channel itself then has less to decide.
Not all object models acknowledge all these communication modes. Request processing is sometimes only described in terms of the last three binding modes, where it is assumed that the other modes of communication are detected and managed by other facilities.
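The difference between operation binding and message binding can be sketched with two lookup tables. All names here (`Wire`, `Pin`, `Socket`, the two tables, and the two binding functions) are assumptions for illustration, and the sketch ignores context, inheritance, and early binding.

```python
class Wire:
    pass


class Pin:
    pass


class Socket:
    pass


# Operation binding: the classes of ALL operands take part in the choice.
OPERATION_METHODS = {
    ("Connect", Wire, Pin): lambda wire, pin: "solder wire to pin",
    ("Connect", Wire, Socket): lambda wire, socket: "plug wire into socket",
}

# Message binding: only the recipient's class takes part in the choice.
MESSAGE_METHODS = {
    ("Connect", Wire): lambda wire, other: "wire decides how to connect",
}


def operation_binding(operation, *operands):
    method = OPERATION_METHODS[(operation, *(type(o) for o in operands))]
    return method(*operands)


def message_binding(operation, recipient, *parameters):
    method = MESSAGE_METHODS[(operation, type(recipient))]
    return method(recipient, *parameters)
```

Under operation binding, connecting a wire to a pin and to a socket selects two different methods; under message binding both requests reach the same method, which must itself distinguish the parameter if it cares.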
In greater detail, a request event causes the following to happen, not necessarily in the given order:
Much of the variation among object models and systems lies in the sequence and components or locations in which these occur.
Binding, i.e., choice of method for executing a request, was largely covered under Method Selection Modes. Object models vary considerably in their treatment of inheritance and polymorphism, intimately involved in method selection. They also differ significantly in their treatments of early and late binding.
Failure to find any applicable method corresponds to a (run-time) type violation, i.e., the operation is not meaningful for the given operands. This is sometimes described as "message not understood".
Availability of several applicable methods without a "natural" selection criterion characterizes the "multiple inheritance" problem, dealt with in various ways by various object models.
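Python offers one concrete instance of both situations, shown here only as an example of how a single language resolves them, not as the approach of any model discussed in this paper. The class and method names are hypothetical. When two parent classes supply the same method, Python picks one by a fixed linearization of the class hierarchy; when no method is found, it raises `AttributeError`, its counterpart of "message not understood".

```python
class Printable:
    def describe(self) -> str:
        return "printable"


class Storable:
    def describe(self) -> str:
        return "storable"


class Document(Printable, Storable):
    """Inherits two applicable describe methods."""


doc = Document()

# Two methods apply; Python's method resolution order selects Printable's.
chosen = doc.describe()
resolution_order = [cls.__name__ for cls in Document.__mro__]

# No method applies: the request fails at run time.
try:
    doc.spell_check()
    understood = True
except AttributeError:
    understood = False
```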
A number of phenomena are included here, all affecting whether the operation will be allowed to proceed. It is these things which are most likely to occur in different sequences in different systems. In particular, there may be significant differences between what happens at compile time and at run time.
Type violations arise if there does not exist a method which implements the operation for the given operands. Type checking attempts to ensure that requests which would encounter such violations are not permitted to execute.
Access control determines whether the invoker is authorized to apply the operation to the given operands.
Constraints prevent an operation from executing (or undo its effect) if constraint specifications are violated.
Locking may arise if other invokers are executing conflicting operations.
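The four checks above can be sketched as a single admission step. This is a hedged illustration: the tables, the `Document` class, and the `admit` and `outcome` functions are hypothetical, the order of the checks is arbitrary, and everything happens at run time, whereas a real system might perform some checks at compile time. Constraints are modeled here only as preconditions, not as undoing an effect after the fact.

```python
class Rejected(Exception):
    """Raised when the channel refuses to let an operation proceed."""


class Document:
    def __init__(self, pages: int) -> None:
        self.pages = pages


def print_document(doc: Document) -> str:
    return f"printed {doc.pages} pages"


METHODS = {("Print", Document): print_document}        # what is type-correct
GRANTS = {("alice", "Print")}                          # who may do what
CONSTRAINTS = {"Print": [lambda doc: doc.pages > 0]}   # preconditions
LOCKED = set()                                         # ids of operands in use


def admit(operation, operand, invoker):
    """Return the method to run, or raise Rejected."""
    method = METHODS.get((operation, type(operand)))
    if method is None:
        raise Rejected("type violation")
    if (invoker, operation) not in GRANTS:
        raise Rejected("access denied")
    if not all(check(operand) for check in CONSTRAINTS.get(operation, [])):
        raise Rejected("constraint violated")
    if id(operand) in LOCKED:
        raise Rejected("locked")
    return method


def outcome(operation, operand, invoker):
    """Run the operation if admitted, otherwise report why it was refused."""
    try:
        return admit(operation, operand, invoker)(operand)
    except Rejected as reason:
        return str(reason)
```

For example, `outcome("Print", Document(3), "alice")` runs the method, while the same request from an invoker without a grant is refused before any method executes.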
Certain data associated with the operand objects may have to be assembled for transmission to the method. This depends on the assumptions in the object model about the "natural" default contents and components of objects. It could involve complicated export/import.
There are potentially many things involved here, and it would be particularly interesting to see how various models compare in this area. Some relevant factors:
Who are the client and method for such requests? Can a passive method execute an activation request?
There are assorted techniques for returning results and reporting status conditions to clients.
Things in the object world don't seem to mesh, due to different interpretations of shared terminology and concepts. We've tried to provide a unified framework in which such variants can be articulated and explained, if not reconciled.
This only sets the stage. We need to determine whether and how various object models can be explained as some variant of the core behavior.
Essential questions:
You are invited to participate.
This material evolved while working with colleagues in the Object Management Group [OMG] and the ANSI/X3/SPARC/DBSSG Object-Oriented Database Task Group [OODBTG]. Foremost among the many colleagues with whom discussions were most profitable are Alan Snyder of H-P Labs and Allen Otis of Servio Logic.
[OODBTG] "Technical Report of the ANSI/X3/SPARC/DBSSG Object-Oriented Database Task Group", (in preparation).
[OMG] "Object Management Group Standards Manual", (in preparation).