Database Technology Department
Palo Alto, California
> 1 INTRODUCTION
> 2 CONTEXT
> 3 OBJECTS IN INFORMATION SYSTEMS
>> 3.1 Why?
>> 3.2 Where Is It?
>> 3.3 Adaptive Precision
>> 3.4 Spheres
>> 3.5 How Do We Know It's There?
>> 3.6 But Is It Really There?
>> 3.7 How Does It Look and Act?
>>> 3.7.1 State
>>> 3.7.2 Behavior
>>> 3.7.3 Specifying Behavior
>>> 3.7.4 Initiating Behavior
>> 3.8 Encapsulation
This is a "naive" exploration of object orientation. On the face of it, it could simply be a tutorial for the uninitiated. But it is also a sanity check for practitioners, examining various assumptions about first principles and why they matter.
What is an object?
Why do you ask?
If we're just making small talk (not to be confused with Smalltalk), we might say that an object is an entity - which is no help at all (shades of [D&R]!). We might as well say it's a thing. Dictionaries provide scores of definitions of these terms, and we could argue forever about how to interpret them and which is right. Is the sky an object? Are clouds? Is green? Is two? Is the Hewlett-Packard Corporation? Or the United States of America? Am I as an employee and I as a parent the same object? Am I the man I used to be? How many objects will fit on the head of a pin?
In broadest terms, we could apply the "it" test: if it ever makes sense to refer to it as "it" (or "he" or "she"), then it is an object. Does that help? Not much.
Why are we asking about objects? The context is information processing systems.
That narrows our scope a bit. The term "object" is acquiring a technical meaning, as a construct which systems can handle effectively to do the information processing we want. The exact definition is still evolving, because we have different kinds of information processing as goals, and different perceptions of what it is that systems can do effectively.
We have various sorts of motivations in the context of information systems. We want programs and data to be correct. We want programs to execute efficiently, and data to be maintained efficiently. We want the development and maintenance of programs and data to be economical. We'd like to limit the impact of change. We'd like to minimize the quantity of programs and data that need our attention, yet maximize the amount of information processing we can do. We want to manage data as a persistent resource, shareable among many programs.
Let's explore an example. Let's say that lots of programs need to know people's ages. There are many ways they might get them.
Age might be maintained in a single table of data about people, or in separate tables for teachers, students, salesmen, etc. It might be maintained in a column called Age (or something else), or in a parameterized form (column 1 contains the person identifier, column 2 contains the parameter name, e.g., Age, and column 3 contains the value). Age might also be computed from birthdates, with the same set of options all over again for obtaining birthdates.
Age might also be computed differently, e.g., for dead people it is the age at death, and for fictional people it is a fixed quantity stored in a table. The definition may also change with time; in some Oriental cultures, it was the custom to consider a person to be one year old at birth.
Things change, in time and space. The techniques used may vary with time, and from place to place. To the extent that programs are explicit about these techniques, they are vulnerable. They have to be changed when things change, and multiple versions are needed for environments that use different techniques.
A program that simply says what it wants, namely someone's age, and says the least it can about how to get it, will be most robust. It can be used in the greatest variety of situations, and it can survive the greatest range of changes intact.
This principle goes beyond the programs that use such information. A necessary corollary is that the information must be made available in such a way, i.e., there must be a program which can be called to get the ages of people, which will take care of all the messy details. If things change, the damage is controlled: only this program has to be changed, not any of the programs that use the information. Or if things are done differently in different environments, it is only this program which has to be customized to each environment.
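As a rough sketch of this idea in Python (the names `age_of`, `AGE_COLUMN`, and `BIRTHDATES`, and the sample data, are hypothetical and chosen only for illustration), a single function can hide whether an age is stored directly or derived from a birthdate:

```python
from datetime import date

# Hypothetical storage layouts; callers never see which one is in use.
AGE_COLUMN = {"dick": 40}                   # ages stored directly
BIRTHDATES = {"jane": date(1962, 3, 15)}    # ages derived from birthdates

def age_of(person_id: str, today: date) -> int:
    """The single entry point for ages; only this function knows the layout."""
    if person_id in AGE_COLUMN:
        return AGE_COLUMN[person_id]
    born = BIRTHDATES[person_id]
    # Subtract one if this year's birthday has not yet occurred.
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
```

A caller writes `age_of("jane", today)` and is unaffected if the tables are later reorganized; only `age_of` would need to change.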
It's a good idea to make such programs as general as possible, to minimize the number of such programs that have to be maintained, as well as to minimize the number of choices that users have to make. Don't maintain separate programs for the ages of teachers, students, engineers, etc., when you could write one for the ages of people. Just make sure that someone wanting the age of a salesman gets connected to the program that provides the ages of people. (This leads to the principle of class hierarchies. A program that provides the territories of salesmen cannot be called to provide information about people who aren't salesmen.)
It's a good idea to be clear about what things are visible to consumers of the information, and what things are peculiar to a given implementation. One reason is that programmers are tempted to take shortcuts, sometimes in the name of efficiency. Why call a whole other procedure when I can look up the table myself? (This leads to the discipline of encapsulation.)
Another useful principle has already been illustrated: whenever possible, hide distinctions from the programmer. If ages are computed differently for different kinds of people, don't make him choose the appropriate method. He might choose the wrong one, he might not anticipate all the options, and the decision criteria might change. (This leads to the principle of polymorphism.)
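One way this could look, sketched in Python under assumed names (the classes `Person`, `DeceasedPerson`, and `FictionalPerson` and the sample people are hypothetical), is to let each kind of person carry its own way of answering the same request:

```python
from datetime import date

def _years(born: date, on: date) -> int:
    """Whole years elapsed from `born` to `on`."""
    return on.year - born.year - ((on.month, on.day) < (born.month, born.day))

class Person:
    def __init__(self, name, born):
        self.name, self.born = name, born

    def age(self, today):
        return _years(self.born, today)

class DeceasedPerson(Person):
    def __init__(self, name, born, died):
        super().__init__(name, born)
        self.died = died

    def age(self, today):
        return _years(self.born, self.died)   # age at death

class FictionalPerson(Person):
    def __init__(self, name, fixed_age):
        super().__init__(name, born=None)
        self.fixed_age = fixed_age

    def age(self, today):
        return self.fixed_age                 # a fixed, stored quantity

today = date(2000, 6, 1)
people = [
    Person("Dick", date(1960, 5, 1)),
    DeceasedPerson("Grandpa", date(1900, 1, 10), date(1980, 3, 1)),
    FictionalPerson("Sherlock", 33),
]
# The caller asks the same question of everyone and never picks a method.
ages = [p.age(today) for p in people]
```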
Having the requestor simply say what he wants also provides a higher level of semantic abstraction, closing the gap between requirements specification and data modelling. The need to obtain people's ages has been around for a long time, and is often identified during the analysis and design phases of application development. However, the actual construction of applications has involved bridging a gap to the facilities of the system. Fortunately, increasing levels of abstraction have narrowed that gap. Applications have progressed through various stages, such as searching for specific addresses on magnetic tape, seeking logical records on random access devices, requesting data from buffer managers, requesting data from named tables, etc. As the level of abstraction increases, the amount of translation between requirements and applications decreases. The object-oriented paradigm may be achieving the ultimate level of abstraction, expressing requirements and applications in the same language, requiring no transformation step. That's certainly debatable, but it may at least be true within the realm of formal languages; further improvements will go more in the direction of natural language. Perhaps what we are saying is that the formal language for requirements and applications can now be the same.
In developing requirements, we might say that there are people, who have names which are character strings and ages which are integers. There are sales territories which also have character-string names. There are salesmen, who are assigned sales territories. Salesmen are people (hence they also have names and ages, which we don't have to say).
It would be nice now if an application program only had to ask for the age and sales territory of the salesman named Dick, without saying anything more about how to get them. That's approximately what object-orientation provides.
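To make the correspondence concrete, here is a minimal Python sketch of those requirements. The class names follow the text, but the sample data, the `salesmen` list, and the helper `salesman_named` are assumptions made for illustration:

```python
from dataclasses import dataclass

@dataclass
class Territory:
    name: str              # sales territories have character-string names

@dataclass
class Person:
    name: str              # people have names (character strings)
    age: int               # and ages (integers)

@dataclass
class Salesman(Person):    # salesmen are people: name and age are inherited
    territory: Territory   # and they are assigned a sales territory

# Hypothetical sample data.
salesmen = [Salesman("Dick", 40, Territory("Northwest"))]

def salesman_named(name: str) -> Salesman:
    """Find a salesman by name; how the lookup works is hidden from callers."""
    return next(s for s in salesmen if s.name == name)

dick = salesman_named("Dick")
```

The application then asks only for `dick.age` and `dick.territory`, which reads much like the requirements statement itself.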
Roughly what we've said so far is that objects are things to which operations can be applied in order to obtain and maintain information. Implementation is hidden as much as possible. Operations may be applicable to different kinds of things, sometimes in different ways. Is that enough?
We're not sure where the objects are. Are we talking about things inside or outside the machine (information processing system)?
Let's use construct to mean something inside the machine. Some constructs (representatives) are used to represent other things (subjects). The subjects might be outside the machine - employees, departments, projects, bridges, engines, circuit boards. The subjects might themselves be constructs inside the machine - files, directories, programs, sessions, transactions, queues, arrays, records. Sometimes it's not clear. Is the workstation in the information system, or vice versa? How about the network? Is a contract, or memo, or book in the system? Even if the text isn't there?
Does "object" mean "representative" or "subject"? Or both? Sometimes we know perfectly well. When we create a person object, we know we haven't given birth to flesh inside the machine. It's quite clear that we have only created a construct to represent the person, capable of capturing the properties of the person - but it is not the person.
But what if we create a compiler object? That operation might create an actual program capable of being executed. Or it might create a representative object which lets us collect properties, such as language, operating system, programmer, department, completion date, etc. Or it might do both - and if it does, is that one or two objects?
Similar questions arise when we ask to create a new file, or a new database.
We can get away with simply talking about "objects" up to a point. But when we try to do things like making a new file or a new database, we need to clarify whether we are really creating the "subject" itself, or merely a representative for it, or both.
This will happen to us many times. We'll try to keep a certain concept as simple as we can for as long as we can.
I don't know what to assume at this point. Some models seem to deal with the subjects themselves; others seem to be dealing only with representatives. Others don't seem to care, or are ambivalent.
If pressed, I would probably lean toward thinking of objects as representatives, but that could be negotiated.
Time out for a meta-discussion about levels of abstraction in this exposition.
We observe the principle of adaptive precision. Economy motivates us to stay as casual, abstract, and informal as possible, whenever possible. This lets us work with the fewest concepts, and leaves the most options open. But proper understanding sometimes requires more formality, precision, and detail in explaining what's really happening.
When we run into an ambiguity, or a misunderstanding because we've been making different assumptions, we'll refine our concepts. We'll get more detailed, looking a little more under the covers at what is "really" happening.
So, for example, we'll talk about objects whenever we can. Once in a while, though, it makes a difference to know whether we're talking about a subject or a representative. When I destroy a file object, what have I done?
We'll run into many more cases of adaptive precision as we go along...
...as in the case of "information systems", and the very notion of "existence".
If a database has information about a file which does not exist in the operating system directory, does the file exist?
The answer, obviously, is yes and no. It exists in the database, but not in the directory.
Let's introduce the notion of a sphere, suggested by the phrases "sphere of knowledge" or "sphere of influence". We will speak of an object existing (or being known) within a certain sphere. I don't know yet to what extent we might formalize this notion.
We sometimes refine the notion of "information system" or "machine" to a more specific sphere, such as a database or a directory.
For greater precision, we combine ideas. For a given file, we might say that the subject exists in the sphere of the directory system, while a representative exists in the sphere of a given database. But we won't say that unless misunderstanding makes us.
How do we know that something exists in the system? There is very little we can observe directly. We get character strings displayed and printed out, and images and sounds can be generated, as well as movements of physical devices. For our purposes, we can assume for now that the only things which exist directly in the system are strings of some kind. We can even go pretty far by saying that the constructs in the system are exactly the things which we can see passing in and out of the system.
We can only know that other things exist in the system indirectly. We only know that Dick exists because the construct "Dick" comes out as an appropriate response to some operation, or because the system does not complain when we ask it to do something with Dick, such as tell us his age.
The nature of an object is only known by its behavior, namely how it responds to operations. We can tell the system to remember someone's birthday, and we can later ask that person's age. If we ask the system to set the variable x to the husband of Jane, and then ask for the age of x, we're not even sure that the whole object (representative) for Dick was assigned to x. The most we can be sure of is that something got bound to x which subsequently served as a reference to Dick. That something is itself a construct in the machine; we will call such a construct a handle.
Here's another case of increasing refinement. We casually say we're asking for the age of a person. A measure of "goodness" of a model or system is the extent to which it supports that metaphor, letting us keep thinking in those simple terms. But, if pressed, we may have to admit that we are asking for something associated with a representative of that person. If it matters, we might even acknowledge that we're not in direct touch with that representative; all we have is a handle that refers to it. Well, actually, when we make a reference, we are using a copy of the handle. (If I delete the value of that variable x, I may only have erased that copy of the handle; I didn't delete the handle from the system.)
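The distinction between a representative, its handle, and a copy of the handle can be sketched in Python. This is only an illustration: the `registry` dictionary, the integer handles, and the functions `create` and `name_of` are all hypothetical.

```python
class Representative:
    """A construct inside the system standing for a real-world subject."""
    def __init__(self, name):
        self.name = name

registry = {}   # hypothetical system-wide store: handle -> representative

def create(name):
    handle = len(registry) + 1       # an opaque token; here just an integer
    registry[handle] = Representative(name)
    return handle

def name_of(handle):
    return registry[handle].name

dick = create("Dick")
x = dick        # x holds a copy of the handle, not the representative
del x           # erases only that copy of the handle
still_there = name_of(dick)   # the representative is still known to the system
```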
There's precious little we've actually said about objects so far. A handle refers to a representative of an object; operations can be applied to a handle, to get the effect of operations on the object.
(So, where's the object?)
There are static and dynamic aspects to an object. Statics have to do with the form and content of an object, and dynamics with what it can do, or what you can do to it.
It is a good idea to separate semantics from implementation, with implementation consisting of things which can be changed without altering semantics. But there is little agreement as to what this means or how to do it.
At a fairly high level of abstraction, the description of an object is mostly dynamic. (Not to be confused with procedural; behavior can be specified declaratively.) If certain operations are performed on an object, certain results will occur. The form and content that facilitate the results are hidden inside, as a matter of implementation; they could change without changing the semantic results of the operations. We might say that people have birthdates, and the system has to remember any birthdates it's been told (i.e., a behavior), but we don't need to specify how it remembers. To say that people have ages means you'll get an answer if you ask someone's age; we won't say whether it is looked up in memory or calculated from their birthdate.
Lower levels of abstraction are increasingly concerned with form and content.
There are at least two different levels of abstraction at which the static aspect of objects in the system can be described.
The form, or state, of an object is often perceived as a discrete chunk of stored data that belongs to the object, organized as variables whose values are properties of the object. Thus the "state" of Dick might include the facts that his name is "Dick", his wife is Jane, and he was born May 1, 1960. Such properties correspond to facts which can only be known by assertion, i.e., because someone says so, and can't be deduced from other information. Other properties might be defined by algorithms, such as the derivation of a person's age from his birthdate. (If an object has no assertable properties, it may have no state.)
The notion of state as stored data seems to mix semantics and implementation. For efficiency reasons, we might wish to store a person's age instead of having to compute it every time (we remember people's ages), so long as we take care to update it once a year. A semantic specification might say that one or both of Birthdate and Age are assertable properties, and describe how they are related. It would be up to the implementation to decide whether to store one or the other, or both, and how to manage changes in values. Furthermore, these implementation decisions could be changed without affecting the semantics of the object. Thus stored data need not be the same as assertable properties; which of these concepts corresponds to "state"?
Also, the notion that state belongs to an object constrains the treatment of shared information, i.e., relationships. The fact that Dick and Jane are married might be expressed as a Wife variable in Dick's object, a Husband variable in Jane's object, or both. If I want to know if Dick and Jane are married, I have to know whether to ask Dick or Jane; I can't just ask. If they get divorced, I have to know whether to change the state of Dick's object or Jane's object, or both; I can't just announce the divorce.
At a higher level of abstraction, we might simply say that the "state" of an object requires some memory of the current values of assertable properties, without specifying any structure for that memory. We might specify families of propositions which are simultaneously true or false, such as
It would again be up to the implementation to decide which and how many of these to maintain as stored data, how to derive one from another if necessary, and how to keep stored values synchronized.
We are suggesting that the same syntax be used for invoking procedures and referencing state variables. This allows for a slight simplification. While both approaches require externally available information to be referenced with method-like syntax, the other approach requires a distinct variable to be declared for all stored data. Our approach would allow Husband or Wife to be declared as stored data, without requiring new variables to be declared. If the implementation changes, these could be redefined as procedures without affecting external interfaces.
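Python's `property` offers one way to see the effect of a uniform syntax. In this sketch (the classes `PersonV1` and `PersonV2`, the function `describe`, and the fixed `TODAY` are hypothetical), Age starts out as stored data and is later redefined as a procedure, while the caller's code stays the same:

```python
from datetime import date

TODAY = date(2000, 6, 1)   # fixed "current date" so the example is reproducible

class PersonV1:
    """First implementation: Age is stored data."""
    def __init__(self, name, age):
        self.name = name
        self.age = age

class PersonV2:
    """Later implementation: Age is a procedure, still referenced as `p.age`."""
    def __init__(self, name, born):
        self.name = name
        self.born = born

    @property
    def age(self):
        b = self.born
        return TODAY.year - b.year - ((TODAY.month, TODAY.day) < (b.month, b.day))

def describe(p):
    # The caller cannot tell whether `age` is stored or computed.
    return f"{p.name} is {p.age}"
```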
The states of many objects would be interlocked with each other by networks of relationships. In effect, state might be shared among objects.
We could turn our mental model of state inside out. Instead of state being in an object, we can say that the information system as a whole has a state in which objects are embedded. The state of any object is some subset of the whole state, but the states of individual objects often overlap. The image starts to look more like overlapping envelopes in a graph.
There's another subtle difference. In the first view, that chunk of storable data gave us a comfortable place to anchor our notion of what and where the object is. That discrete identifiable thing has faded out of the second view. All we have left are the handles of objects. If d is a handle for Dick, we can tell the system to remember a value for Birthdate(d), and we can ask about Age(d). But where is the object itself?
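A small Python sketch of this second view follows. It assumes a handful of hypothetical module-level functions (`new_person`, `set_birthdate`, `age`, `marry`, `divorce`, `married`); the state lives in the system as a whole, and an object appears only as a handle:

```python
from datetime import date
from itertools import count

_next_handle = count(1)
birthdate = {}       # Birthdate(d): handle -> date
marriages = set()    # each marriage recorded once, owned by neither spouse

def new_person():
    """Return a fresh handle; no per-object record is created."""
    return next(_next_handle)

def set_birthdate(h, born):
    birthdate[h] = born

def age(h, today):
    born = birthdate[h]
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

def marry(a, b):
    marriages.add(frozenset((a, b)))

def divorce(a, b):
    marriages.discard(frozenset((a, b)))

def married(a, b):
    return frozenset((a, b)) in marriages

d, j = new_person(), new_person()
set_birthdate(d, date(1960, 5, 1))
marry(d, j)
```

Here one can "just ask" whether two people are married, in either order, and "just announce" a divorce, without deciding whose object holds the fact.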
Does it matter? Will the system behave any differently?
At this level of abstraction, we don't say where an object is, or what it looks like. We sometimes see images on screens and printouts, and we sometimes transmit an image, but these images are not the things that exist in the system, though it is sometimes convenient to think so. We can often get different sorts of images as manifestations of the same object.
The illusion of objects being compact sets of state variables could be maintained for programming purposes. The system can support declarations asserting that Name, Wife, and Age are variables for the object representing Dick, and Name, Husband, and Age are variables for the object representing Jane. The objects don't necessarily have to be stored that way.
We might not even say what an object "contains", though there are ways to make objects behave as though they contained things, and that's also a convenient way to think about objects, sometimes.
What constitutes behavior?
When we ask the system to perform operations on objects, several things might happen. The operation might never terminate, until we lose patience and abort it. We might get some error indications, as constructs. We might get some results back, again as constructs. And there might be some alteration to the information in the system (updates, or side effects).
How do we know that information has been altered? The same way we know anything else: via observed behaviors. An update has the effect of changing the consequences of some operations. Changing a birthdate changes the result of an inquiry about age.
If we're visualizing the state of the system as a directed labelled graph, then functions serve as labels on edges that link their arguments and results. Side effects (updates) can be visualized as adding or deleting nodes or edges.
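The graph picture can be sketched in a few lines of Python. The representation as a set of `(source, label, target)` triples and the function names `assert_fact`, `retract_fact`, and `lookup` are assumptions made for illustration:

```python
edges = set()   # the system state: a set of (source, label, target) triples

def assert_fact(src, label, dst):
    """An update that adds an edge to the graph."""
    edges.add((src, label, dst))

def retract_fact(src, label, dst):
    """An update that deletes an edge from the graph."""
    edges.discard((src, label, dst))

def lookup(label, src):
    """Apply the function named `label` to `src` by following matching edges."""
    return {d for (s, l, d) in edges if s == src and l == label}

assert_fact("Dick", "Wife", "Jane")
before = lookup("Wife", "Dick")
retract_fact("Dick", "Wife", "Jane")
after = lookup("Wife", "Dick")
```

The update is observable only through the changed result of `lookup`, which matches the point that alterations are known by their effect on later operations.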
There's a lot more to be said about how behaviors are specified; we'll do that later.
Objects are things to which operations can be applied. This can be described as sending messages to objects, or as applying functions to objects.
For single objects, the two are equivalent. When multiple objects are involved, the message metaphor requires one to be designated the recipient. The function metaphor does not; if we want to know Dick and Jane's children, we don't have to decide whether to ask Dick or Jane, we just ask.
This parallels the two approaches to state, and again involves an inversion of concepts. In message sending, an omniscient message handler routes the message to the appropriate object, which executes an appropriate procedure in response. (More likely, a central copy of the procedure is executed, e.g., at the "factory".) In the function call metaphor, the omniscient facility initiates execution of a procedure appropriate to the arguments of the call.
We could write Dick.Age, sending the Age message to Dick, or Age(Dick), applying the Age function to Dick. Does it really make a difference?
Messages can take arguments. If we felt like designating one of the arguments of the function call as a "recipient", do we have any more than a syntactic difference? For example, if we always designated the first argument of a function call as the recipient, we have a correspondence between Dick.Married(Jane) and Married(Dick,Jane).
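The correspondence can be illustrated in Python, where a method call and a plain function call differ mainly in syntax. The `Person` class, the `spouse` attribute, and the sample objects are hypothetical:

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.spouse = None

    def married(self, other):
        """Message style: the receiver is singled out, as in dick.married(jane)."""
        return self.spouse is other

def married(a, b):
    """Function style: married(dick, jane), with the first argument as recipient."""
    return a.married(b)

dick, jane = Person("Dick"), Person("Jane")
dick.spouse, jane.spouse = jane, dick
```

With these definitions `dick.married(jane)` and `married(dick, jane)` give the same answer, so in this sketch the choice of recipient is a matter of notation.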
Suppose that it took a long and complicated procedure to calculate age, so that we don't want to pre-calculate it for everybody, but once it was calculated we'd like to save it for future reference. We might maintain a SavedAge property, as well as AgeExpires to tell us when it must be recalculated, i.e., the next birthday. Age(x) might then be defined as
if Today >= AgeExpires(x) then
    SavedAge(x) <- CalculateAge(x);
    AgeExpires(x) <- NextBirthday(x);
return SavedAge(x);
The properties SavedAge, AgeExpires, CalculateAge, and NextBirthday are not part of the "semantics" of being a person. They are introduced solely to implement a particular approach to maintaining the ages of people. When some genius discovers a simplified technique for calculating people's ages, we'd like to be free to change the implementation without impacting anybody else who used the ages of people. We do that by saying that these four new properties are hidden, or encapsulated, or private, i.e., not part of the external interface to a person object, hence they can't be used by others. Only the Age property is exposed, and that's the only way anybody else can refer to the age of a person.
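A hedged Python rendering of this scheme is sketched below. The leading underscores mark the four helper properties as private by convention only (Python does not enforce this). The `calculations` counter, the explicit `today` parameter, and the treatment of a February 29 birthday as March 1 in non-leap years are assumptions added for the example:

```python
from datetime import date

class Person:
    def __init__(self, born):
        self._born = born
        self._saved_age = None      # SavedAge
        self._age_expires = None    # AgeExpires
        self.calculations = 0       # counts recalculations, for demonstration only

    def _calculate_age(self, today):            # CalculateAge
        self.calculations += 1
        b = self._born
        return today.year - b.year - ((today.month, today.day) < (b.month, b.day))

    def _next_birthday(self, today):            # NextBirthday
        b = self._born
        for year in (today.year, today.year + 1):
            try:
                candidate = date(year, b.month, b.day)
            except ValueError:                  # Feb 29 in a non-leap year
                candidate = date(year, 3, 1)
            if candidate > today:
                return candidate

    def age(self, today):
        """The only exposed property; recalculates once the saved value expires."""
        if self._age_expires is None or today >= self._age_expires:
            self._saved_age = self._calculate_age(today)
            self._age_expires = self._next_birthday(today)
        return self._saved_age
```

Callers use only `age`; the saved value and its expiry date could be dropped or replaced later without any change on their side.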
Note that the private properties consist of both assertable functions (usually mapped to stored data) and algorithmic functions, i.e., procedures. In the general case, we do not have any direct correspondences between assertable functions, stored data and hidden properties.