Implementation Hiding

William Kent
Database Technology Department
Hewlett-Packard Laboratories
Palo Alto, California

Nov. 1993

SQL3 Discussion Paper
ANSI Number: X3H2-93-368R1
November 17, 1993

References:

[1] X3H2-93-359R/ISO DBL MUN-003, (ISO-ANSI Working Draft) Database Language SQL (SQL3), Jim Melton (ed.), August 1993.

[2] X3H2-93-076, "SQL3 OO Underlying Assumptions", by William Kent.

[3] X3H2-93-109, "Identity and Equality", by William Kent and Amelia Carlson.

[4] X3H2-93-234, "POINTS TO and CONTAINS", May 1 1993, by David Beech, Boris Burshteyn and Phil Shaw.

[5] X3H2-93-384R, "Extents for object ADTs", Sept 10 1993, by John Bellemore, Tim Nguyen, Gray Clossman and Phil Shaw.

1 DISCUSSION

Papers [4,5] mention "the tradition of SQL as being more of an 'end-user' language than a 'system programmer' language". Such a language depends on, among other things, the object-oriented principle of implementation hiding. The present paper seeks consensus on the extent to which this principle guides the development of SQL3. The material is presented as a proposed addition to the concepts described in [1]. Committee discussions of the draft proposed below will serve to clarify areas of consensus. Some of the concepts mentioned herein may be reflected in current language facilities, while others might constitute language opportunities.

2 ACTION PROPOSED

Add the following new subclause to Clause 4 (Concepts) in [1].

4.x.1 Implementation hiding

To enhance portability and reusability, applications should be insulated from the effects of changing implementations as much as possible. As reflected in the definition of "abstract data type (ADT)" in Subclause 3.1.3a, object orientation promotes such insulation via "the separation of the interface of the type from its implementation". The definition of "implementation (of an ADT)" in Subclause 3.1.3u goes on to say that "Stored data together with the data structures and code that implement the behavior of an ADT is its implementation."

"Implementation" can be interpreted here as those specifications which can be altered without altering the correctness of an application. Changing such specifications might require recompilation of the application, but no reprogramming.

This proposition has the following implications for a schema and a data definition language:

Implementation specifications should be clearly distinguishable from other specifications. Interface and implementation should be distinguishable in type definitions.
There should be a mechanism for altering implementation specifications.
It should be possible to show alternative implementations in the same schema.
Information that users need to know about behaviors should be available without dependence on implementation specifications, e.g., via formal specification techniques and/or pre- and post-conditions.

While it is certainly desirable for a user to have confidence in the correctness of an implementation, it is difficult to guarantee such correctness. Any assurances of correctness should not compromise these principles.

There is an important question of how an implementation is chosen when alternative implementations are available, especially for newly-created objects. Ideally, the choice should be made outside the application, perhaps by some defaulting or context-dependent mechanism. For example, the local copy of the schema might only specify one implementation. Alternatively, the implementation might be chosen on the basis of which machine or operating system the application is running on, or it might depend on which application invoked the given application.

To the extent that an application does choose the implementation and/or it makes decisions based on the implementation, the portability and reusability of the application are compromised.
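The idea of choosing an implementation outside the application can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `ListStack`, `LinkedStack`, `_SCHEMA`, `new_stack`, and `application` are hypothetical and do not come from [1], and a dictionary keyed on the operating system name stands in for "the local copy of the schema".

```python
import platform


class ListStack:
    """One implementation: the elements are kept in a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class LinkedStack:
    """An alternative implementation: the elements are kept in linked cells."""

    def __init__(self):
        self._head = None  # private pointer to the top cell

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item


# Hypothetical stand-in for the local schema: it alone decides which
# implementation is used on which operating system.
_SCHEMA = {"Linux": LinkedStack}
_DEFAULT = ListStack


def new_stack(system=None):
    """Create a stack; the implementation is selected by context, not by the caller."""
    system = platform.system() if system is None else system
    return _SCHEMA.get(system, _DEFAULT)()


def application(stack):
    """Application logic written only against push and pop."""
    stack.push(1)
    stack.push(2)
    return stack.pop() + stack.pop()
```

Here `application` never names an implementation, so changing the entries of `_SCHEMA` requires no change to the application code.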

Precise interpretation of the general principle of implementation hiding can be guided by corollaries such as the following:

4.x.2 OID formats

OID formats are a matter of implementation, and not part of the language standard, except perhaps at a low level comparable to standardizing the exact bit pattern representation of character strings. Whether any useful information happens to be embedded in the OID rather than in a separate table should be left to the implementers.

[Acceptance of this corollary should be accompanied by a change to the definition of object identifier type in Subclause 4.9, removing any reference to type information being contained in an OID.]

4.x.3 Private data in constructors

The essential purpose of private attributes is to describe data that is internal to the implementation and should not be exposed to users. Applications should in no way depend on private attributes. By extension, an application should be insensitive even to changes in the configuration of such private data. In particular, an application should not initialize private data.

Typical examples of private data (in non-relational terms) include such things as:

Sizes of buffers and caches.
Status of buffered/cached data.
Pointer to the head of a queue or stack, or copy of the top value.
Internal links between elements of a list.

Internal representations of data visible to applications should be initialized in terms of the visible data. Thus if age is visible while birthday is hidden, then the application should initialize age, not birthday. To the objection that an age cannot really be inverted to obtain a birthday, the counter-argument is that this is not a realistic example.
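The age and birthday case might be sketched in Python as follows. The class `Person`, the helper `_years_before`, and the `today` parameter are hypothetical names introduced for illustration. Since an age does not determine a unique birthday, the sketch simply assumes the birthday falls on the day of initialization; that assumption is confined to the implementation and is invisible to the application.

```python
from datetime import date
from typing import Optional


def _years_before(day: date, years: int) -> date:
    """Return the date `years` years before `day` (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


class Person:
    def __init__(self, age: int, today: Optional[date] = None) -> None:
        # The application supplies only the visible value (age).
        today = today or date.today()
        # Hidden representation, derived from the visible value.
        # Assumption: the birthday falls on the day of initialization.
        self._birthday = _years_before(today, age)

    def get_age(self, today: Optional[date] = None) -> int:
        """Observer: the age in whole years as of `today`."""
        today = today or date.today()
        years = today.year - self._birthday.year
        if (today.month, today.day) < (self._birthday.month, self._birthday.day):
            years -= 1
        return years
```

An implementation that stored the age directly, or a birth year only, could replace this one without any change to code that calls `Person(age)` and `get_age()`.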

4.x.4 Names of attributes

Applications don't have any need to refer to the names of attributes. They only need to use the retrieval and update operations (observers and mutators). There is no logical difference between an attribute name and its observer operation. Thus it is sufficient to specify that a mutator updates an observer, such as

Set_Age UPDATES Get_Age

Move UPDATES Locate

without specifying an attribute name. Note that naming conventions do not require the attribute name to be part of the mutator or observer operation names.

4.x.5 Separating interface and implementation specifications

Suppose an interface for circular objects is defined with members for Center, Radius, and Diameter. It should be possible for different implementations to store either the radius or the diameter, or both. Updates to either can be propagated into updates to whichever ones are stored. An application should be able to function with a single interface to any of these implementations, and even survive a change of implementations with at most a recompilation.

It would be appropriate for the interface to document relevant semantic constraints, such as the fact that the diameter is twice the radius. However, implementation characteristics, such as which attributes are stored and which are virtual, should not be part of the interface specification.
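A minimal Python sketch of the circle example follows. The names `Circle`, `RadiusCircle`, `DiameterCircle`, and `double_size` are hypothetical, and Python's `Protocol` is used only as a convenient stand-in for an interface specification; none of this is SQL3 syntax.

```python
from typing import Protocol, Tuple


class Circle(Protocol):
    """Interface only. Documented constraint: diameter == 2 * radius."""

    def get_center(self) -> Tuple[float, float]: ...
    def get_radius(self) -> float: ...
    def set_radius(self, radius: float) -> None: ...
    def get_diameter(self) -> float: ...
    def set_diameter(self, diameter: float) -> None: ...


class RadiusCircle:
    """Implementation that stores the radius; the diameter is virtual."""

    def __init__(self, center, radius):
        self._center = center
        self._radius = radius

    def get_center(self):
        return self._center

    def get_radius(self):
        return self._radius

    def set_radius(self, radius):
        self._radius = radius

    def get_diameter(self):
        return 2 * self._radius

    def set_diameter(self, diameter):
        self._radius = diameter / 2


class DiameterCircle:
    """Implementation that stores the diameter; the radius is virtual."""

    def __init__(self, center, radius):
        self._center = center
        self._diameter = 2 * radius

    def get_center(self):
        return self._center

    def get_radius(self):
        return self._diameter / 2

    def set_radius(self, radius):
        self._diameter = 2 * radius

    def get_diameter(self):
        return self._diameter

    def set_diameter(self, diameter):
        self._diameter = diameter


def double_size(circle: Circle) -> None:
    """Application code: it cannot tell which attribute is stored."""
    circle.set_radius(circle.get_radius() * 2)
```

Which attribute is stored appears only in the two implementation classes; the interface records just the operations and the constraint relating them.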

As another example (illustrated in section 1.6 of [3]), geometric points should be defined as abstract points which might be implemented by storing either polar or rectangular coordinates. The properties of a point include x and y coordinates and also a magnitude and angle. It should be possible to designate a point using either polar or rectangular coordinates (i.e., two designators), independently of how the point is stored in any particular implementation.
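The two designators for a point can be sketched in Python in the same spirit. The class and method names (`RectPoint`, `PolarPoint`, `from_xy`, `from_polar`) are hypothetical and are not taken from [3]; angles are assumed to be in radians.

```python
import math


class RectPoint:
    """Implementation that stores rectangular coordinates."""

    def __init__(self, x, y):
        self._x, self._y = x, y

    @classmethod
    def from_xy(cls, x, y):
        return cls(x, y)

    @classmethod
    def from_polar(cls, magnitude, angle):
        return cls(magnitude * math.cos(angle), magnitude * math.sin(angle))

    def x(self):
        return self._x

    def y(self):
        return self._y

    def magnitude(self):
        return math.hypot(self._x, self._y)

    def angle(self):
        return math.atan2(self._y, self._x)


class PolarPoint:
    """Implementation that stores polar coordinates."""

    def __init__(self, magnitude, angle):
        self._magnitude, self._angle = magnitude, angle

    @classmethod
    def from_xy(cls, x, y):
        return cls(math.hypot(x, y), math.atan2(y, x))

    @classmethod
    def from_polar(cls, magnitude, angle):
        return cls(magnitude, angle)

    def x(self):
        return self._magnitude * math.cos(self._angle)

    def y(self):
        return self._magnitude * math.sin(self._angle)

    def magnitude(self):
        return self._magnitude

    def angle(self):
        return self._angle
```

Both classes accept either designator and answer all four observers, so an application can designate a point as `from_xy(3.0, 4.0)` or `from_polar(5.0, angle)` without knowing which pair of coordinates is stored.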

4.x.6 Organization of sub- and supertables, or sub- and supertype extent tables

There are many ways to configure the "real" tables which implement the semantics of sub- and supertables, as well as the extent tables for sub- and supertypes. It should be possible for an application to operate with any of these configurations, and to even allow the underlying configurations to be changed without impacting the logic of the application (recompilation might be required, but not reprogramming).

We can illustrate some possible configurations using two example types/tables: Person(Name,Birthplace) and Student(Major,Credits). Let :pam be a person who is not a student and :sam be a student, hence also a person.

All instances of a base type/table and its subtypes/tables maintained in a single table, effectively containing columns corresponding to attributes of all types in the type family:

PERSON
--------------------------------------------------------------
| oid  |  Name  | Birthplace | Student |   Major   | Credits |
|============================================================|
| :pam | Pamela | Pittsburgh | N       |           |         |
| :sam | Samuel | Sacramento | Y       | Sociology | 77      |
--------------------------------------------------------------

Vertical partitioning by type attributes. The PERSON table contains columns for attributes defined for Person, the STUDENT table contains columns for attributes defined for Student, etc. An object has an entry in each table corresponding to a type of the object:


PERSON                           STUDENT
------------------------------   ------------------------------
| oid  |  Name  | Birthplace |   | oid  |   Major   | Credits |
|============================|   |============================|
| :pam | Pamela | Pittsburgh |   | :sam | Sociology | 77      |
| :sam | Samuel | Sacramento |   ------------------------------
------------------------------

Horizontal partitioning by subtype. The PERSON table contains all values for direct instances of Person, the STUDENT table contains all values for direct instances of Student, etc. (This is a bad design if it becomes possible to be a direct instance of Student and Teacher without being an instance of Intern.)

PERSON
------------------------------
| oid  |  Name  | Birthplace |
|============================|
| :pam | Pamela | Pittsburgh |
------------------------------

STUDENT
----------------------------------------------------
| oid  |  Name  | Birthplace |   Major   | Credits |
|==================================================|
| :sam | Samuel | Sacramento | Sociology | 77      |
----------------------------------------------------

Horizontal partitioning between transient and persistent instances. (The distinction could be established, for example, by an additional parameter to the constructor routine.)

PERSON
------------------------------
| oid  |  Name  | Birthplace |
|============================|
| :pam | Pamela | Pittsburgh |
------------------------------

STUDENT
----------------------------------------------------
| oid  |  Name  | Birthplace |   Major   | Credits |
|==================================================|
| :sam | Samuel | Sacramento | Sociology | 77      |
----------------------------------------------------

STUDENT_TEMP
----------------------------------------------------
| oid  |  Name  | Birthplace |   Major   | Credits |
|==================================================|
| :tom | Thomas | Toledo     | Tennis    | 22      |
----------------------------------------------------
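As a rough illustration of how an application can remain indifferent to these configurations, the following Python sketch implements the first two (a single table, and vertical partitioning) behind the same operations. The class names, the method names, and the use of dictionaries as tables are assumptions made for this sketch only.

```python
class SingleTableStore:
    """All Person and Student attributes kept in one PERSON table."""

    def __init__(self):
        self._person = {}  # oid -> row containing every column

    def add_person(self, oid, name, birthplace):
        self._person[oid] = {"Name": name, "Birthplace": birthplace,
                             "Student": "N", "Major": None, "Credits": None}

    def add_student(self, oid, name, birthplace, major, credits):
        self._person[oid] = {"Name": name, "Birthplace": birthplace,
                             "Student": "Y", "Major": major, "Credits": credits}

    def persons(self):
        return sorted(self._person)

    def get(self, oid, attribute):
        return self._person[oid][attribute]


class VerticalStore:
    """PERSON and STUDENT tables, each holding its own type's attributes."""

    def __init__(self):
        self._person = {}   # oid -> Person attributes
        self._student = {}  # oid -> Student attributes

    def add_person(self, oid, name, birthplace):
        self._person[oid] = {"Name": name, "Birthplace": birthplace}

    def add_student(self, oid, name, birthplace, major, credits):
        self.add_person(oid, name, birthplace)
        self._student[oid] = {"Major": major, "Credits": credits}

    def persons(self):
        return sorted(self._person)

    def get(self, oid, attribute):
        if attribute in ("Major", "Credits"):
            # Non-students have no STUDENT row, so the value is absent (None).
            return self._student.get(oid, {}).get(attribute)
        return self._person[oid][attribute]


def report(store):
    """Application logic: the name and major of every person, by oid order."""
    return [(store.get(oid, "Name"), store.get(oid, "Major"))
            for oid in store.persons()]
```

The function `report` gives the same result for either store, which is the sense in which a change of configuration requires no reprogramming. The Student flag column of the single-table layout is deliberately not exposed through the shared operations, since it exists in only one configuration.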

4.x.7 Private tables for type extents

As an extension of the previous point, it should be optionally possible to restrict access to extent tables for types so that they can only be manipulated from within the bodies of constructor, destructor, observer, and mutator routines. It would thus be possible to configure the schema so that applications do not directly access such tables. Changes in implementation would be reflected in changes within these routines, without otherwise affecting applications exploiting this option.

4.x.8 Where object data is stored

An object may be mentioned in many places in many tables and/or variables, under the same or different types. It should not matter to an application which place, or how many places, actually store the data about the object. Such a definition should be part of the implementation specification, which may in the future even provide for replica management.