RDBMS != Object Store

  It's true that an RDBMS doesn't map well to the object-oriented ideology. That's because an RDBMS does not store objects, or anything like them.

The object-oriented ideology as instantiated in C++ and Java is founded upon breaking data into objects, bearers of identity, which belong to classes, bearers of structure and behavior. (C++ and Java make little account of metaclasses, which are used in more dynamic object systems such as Python's class system and Common Lisp's CLOS. Templates are not metaclasses.) Objects have identity, so they can be equated; they are the unique bearers of attributes about themselves; and each object's structure is dictated by the class to which it belongs.

  When object-oriented partisans look at a database, they see its relvars (or table headers) as bearers of structure and think of classes, and its tuples (or rows) as bearers of identity and think of objects. They see a database as a place to store objects persistently.

  But this is not what an RDBMS does. An RDBMS isn't an object store; its relvars are not classes and its tuples are not objects. So what is an RDBMS? What is "relational" anyhow? Relational databases are founded upon relational mathematics, which is what you get when you cross set theory with predicate calculus.

  Set theory is the branch of math that deals with collections of elements which behave according to formal axioms. Set theory lets you say, for instance, that if you have a non-null set R and a non-null set S, that you can construct a set R*S of all the possible pairs of elements from R and S.

  Predicate calculus is the branch of logic that deals with quantified statements about entities. It lets you formalize logical arguments such as the syllogism: All men are mortal; Socrates is a man; therefore, Socrates is mortal. Predicate calculus deals with generalizations and instantiations of those generalizations.

  What do we get by combining set theory and predicate calculus? We get a system that allows us to operate upon sets of tuples of values satisfying predicates. A relation holds tuples of values which make some predicate true. For instance, consider the predicate "Person x owes me y dollars." Tuples which satisfy this predicate will be pairs (x,y) for which the sentence is true. For instance, if Fred owes me 40 dollars, (Fred, 40) satisfies the predicate. It could thus be a tuple in the relation described by the predicate -- the one relating people's names to how much they owe me.

  With the relational algebra (or an RDBMS) we can do operations upon this relation and others. We could, for instance, select a result set of all those people who owe me more than 50 dollars -- or join this result set with those people's addresses. Whatever result set we ask for will be calculated from the facts in the database. We might get back this result set:

(Barney, 75, 40 Elm Road)
(Megan, 60, 9 High Street)

  Now, are the elements of this result set objects in the object-oriented sense? They are not. They do not have identity. The tuple about Barney is not Barney himself, or even a machine representation of him. It doesn't uniquely store attributes of Barney -- after all, we created it by joining tables which also contain such attributes. It is not even, truly, a fact about Barney exclusively -- for it is also a fact about the number 75, and about the address 40 Elm Road. It isn't an object; it's a tuple value, and values do not have identity as objects do.

  Moreover, note that by joining, we can construct new relations from old ones. Thus, not only are tuples not objects, but relvars are not classes. After all, in OO we do not create new classes by joining, but by inheritance or encapsulation of old ones -- and creating a new class does not cause it to be instantiated into known-correct objects.

So what does this matter to OO people faced with RDBMS as a data store? To make the two match up, they must needs throw away some of the properties of each. An object system where objects don't have identity is not very good at being an object system, since you can't tell if two objects are the same -- only that they have the same values. And a RDBMS where you cannot construct new relations on the fly by joining is an infertile place to store facts, one which cannot be used to derive original facts from them.

  In summary: There is an insoluble tension, or impedance mismatch, between OO and RDBMS. OO is about classes with structure and objects with identity. RDBMS is about relations bearing tuples satisfying predicates. OO lets you pass around structure, data, and behavior by handing someone a pointer. RDBMS lets you derive new facts (in new structures) from known facts.

  These are not the same kind of thing, and there exists no natural mapping from one to the other. Anyone who tells you otherwise is selling something. If you want an object store, get an object store; if you want to store facts and derive other facts from them, an object store will not help you -- you need an RDBMS.

posted on 2005-04-19 22:06  寒若辰  阅读(591)  评论(0编辑  收藏  举报

导航