By Peter Bell

Objects, Tables and ORMs – part deux

The other day I looked at inheritance strategies for ORMs. Today I want to look at some of the other times where there isn't necessarily a 1:1 relationship between objects and database tables . . .

Joining Tables
When you have a many:many relationship, you are often going to want to have a joining table for persisting the relationship between your objects. For example, if you have a product catalog with Products that can be associated to multiple Product Categories, you're probably going to want to have a Product_to_ProductCategory table. It is important that any ORM be capable of handling joining tables for many:many relationships and for systems where the ORM actually generates the database tables (OK, that is more than just an ORM - creating schemas is not typically a job for an ORM), it should be able to create joining tables whenever many:many relationships are defined. An ORM should also be smart enough to remove associations from objects that are deleted, so if you delete a Product or a Category, all of the references to that product or category should also usually be removed from the joining table.

Joining Tables with Extended Properties
Most ORMs can handle simple joining tables which are just comprised of two foreign keys for associating (say) Products to Product Categories. Things get a little more interesting when you have to model joining tables with extended properties.

Imagine that you have an e-commerce store with a simple attribute based system comprised of Products, Attributes and Attribute Options. A t-shirt (Product) might have an Attribute (size) with Attribute Options of small, medium and large. Often you will want to support Product to Attribute Option specific properties. For example, a large t-shirt might weigh more or cost more than a small one and might have a different product image. Often such additional properties for Attribute Options would be stored in a Product_to_AttributeOption table. Another example would be if you had products associated to multiple categories with the display order for each product stored in the joining table (so one product might be displayed at the top of one category and at the bottom of another). The question then, is how best to model this class of problems.

Perhaps the simplest solution is just to model the joining table with extended properties as a separate object. The only thing then is that you need to have a good mechanism in the ORM for writing extended three object join queries as you would want to return a product together with all of its attribute options and all of the extended properties of those attribute options using a single generated query to avoid getting into an n+1 query problem where returning a product with fifteen attribute options required sixteen separate queries.

To date I've always just hand written the queries to return a product together with all of its attributes and attribute options including the extended properties joining the product to its attribute options, but I'd really like to come up with a simple algorithm for generating this class of queries in the general case. Any thought/ideas would be much appreciated!

Virtual Objects
Another class of problem is that of “virtual objects”. Objects that make sense as a separate discrete element in the object model but that really don't need their own table. Imagine a simple e-commerce system where you want to store billing and shipping address information for an order within the Order table. It may well make sense to have an Address object, but you might want to be able to say that in the care of Orders, address information should be stored in the Order table. This wouldn't be a property of the Address object, but of the particular Billing and Shipping address objects composed within an Order object.

It would be great to be able to say that an Order object has-one BillingAddress object and has-one ShippingAddress and to use some kind of convention to say that they should be persisted to the Order table automatically. Anyone written something like this in the general case? Any thoughts appreciated. . .

Copying Objects in Composing Table
To return to the idea of Address objects composed within an Order object, sometimes it might be nice to store the addresses in both an Address table AND in an Order table. The Address table would provide for an address book (a common e-commerce feature), but by also persisting a copy of the address information directly within the Order object, you know there will be a permanent record of the address that an order was actually sent to – even if the user later updates their address book.

Again, I'm still thinking through the best way to implement and to describe these. Any thoughts on language or terminology that might be appropriate? Any ideas on implementation in the general case?

Non Persisted Objects
Finally, I just wanted to mention that there might well be a use case for being able to describe objects that are not persisted at all. While it is not a concern of an ORM, I have a system for describing my business objects, so it may be important I have the ability to use that system to describe objects that are never persisted at all.

Can you think of any other use cases where there isn't a 1:1 correlation between objects and database tables? How do you handle the classes of problems outlined above? Any input appreciated!

Comments
If you want to see how other's handle these issues, I would strongly suggest you look at the code for Datamapper (http://www.datamapper.org/). It is an extremely fast and full featured ORM for ruby. The code is real easy to follow and commented well so even if you don't know ruby, you should be able to decipher it. They've addressed many of your questions in writing it. Check it out. Wouldn't it be great to have an ORM like this for CF, wink wink ;)
# Posted By tony petruzzi | 4/3/08 9:47 AM
Hi Tony,

Firstly, thanks for pointing me to Data Mapper. I hadn't been keeping up with Ruby ORMs, so it was good to see what it was doing. I agree that it covers many issues really well, but interestingly while it covered an approach to joining tables, I didn't see anything about the idea of "virtual" objects being composed into their parents tables or anything like that.

Very cool though . . .
# Posted By Peter Bell | 4/3/08 1:21 PM
> Can you think of any other use cases where there
> isn't a 1:1 correlation between objects and database tables?

They seem to happen to me all the time. My favorite is the idea of auditing....almost every DBA I know puts the equivalent of "createdDate," "createdByUserId," "updatedDate," and "updatedByUserId" on most tables.

I don't have corresponding props on my object...doing so is repeating the save attributes and behavior across your model, which should be a good clue to refactory. I've typically got Widget has-a AuditInformation has-a Creator (User) and has-a Updater (user)....so I do need something to map my objects to a differing relational schema.

This is very similar to your idea of "virtual tables," and one Hibernate recognizes as a first class idea (a component rather than a parent / child relationship).

Join tables:

Join tables are OK for simple things, but as you showed, they're inherently dangerous. A lot of the folks in the Hibernate camp figured this out quickly (it's pointed out repeatedly in Java Persistence with Hibernate, the main book on the framework).

The danger lies in the fact that systems will often quickly require not just storage of the relationship (which products have which options, e.g. a T-Shirt has Size options) but also modelling of the relationship itself (when a product is related to options, what are the attributes of that relationship?). I take an OO approach to this, rather than trying to think in tables: a Product has-many OptionRelationships, and an OptionRelationship has-a Option and has-many OptionValues. Hibernate handles this fine, but for pure CF, thinking about it in terms of objects first makes the SQL I'll have to write seem a lot simpler to me.

Overall:

I'm curious as to why a series on "Objects, Tables, and ORMs" is always about tables? Shouldn't it be about models, then how to map those models to tables? Many of the issues you're raising have been solved quite nicely by Hibernate, and if you're looking to roll your own, I'd suggest reading the book I referenced earlier as it talks about not just Hib implementation of their solutions but the theory behind the solution from an OO (rather than relational) perpective.
# Posted By Joe Rinehart | 4/3/08 3:05 PM
Good grief I can't type today:

"doing so is repeating the save attributes and behavior across your model, which should be a good clue to refactory"

should be:

"doing so is repeating the SAME attributes and behavior across your model, which should be a good clue to REFACTOR"
# Posted By Joe Rinehart | 4/3/08 3:07 PM
Hi Joe,

Thanks for the comments! The series is specifically about handling the issues with mapping Objects to Tables using ORMs, so that's why it is all about tables - or specifically about coming up with an understanding of some of the ways objects and tables can interrelate that will need to be captured in the language for describing objects. You're right about Hibernate. I got a book about two years ago which was great, but I really need to take the time to catch up with their implementation as it really is the reference ORM.

As far as rolling my own, I do need to roll my own language. Whether I then write the implementation from scratch or write an adaptor to map my concepts into a given underlying implementation will depend on the options available in the language. I've got to say that Hibernate is a pretty good reason to consider moving my object model into Java where it would still be available from CF, Groovy, Jython and JRuby, but I quite like keeping my model in CF at the moment. I'll get my language to a point where it covers enough of my use cases and then I'll come back to best implementation approach . . .

I have a basic ORM/data mapper that works quite nicely for a set of use cases, but I never got round to wrting something to handle multi-stage joins without n+1 queries which is why I was wondering if there was an elegant way of handling has-many relationships without having to model the joining information as a separate object, but I think I'm going to have to bite the bullet and either implement n-object joins in a single query and/or choose a pre-built system that supports that out of the gate (like Hibernate).

Could you remind me of the book you "referenced earlier"?! Sounds like it's time for a trip to Amazon again . . .
# Posted By Peter Bell | 4/3/08 3:21 PM
Peter,

FWIW, we're going down the java domain model w/ Hibernate route. Not across the board, but where it makes sense.

The big thing on my ORM wish list is the ability to have an ORM managed class that has no properties (and is therefore never persisted) but just acts as one end of a persistent association.

This would be useful in two scenarios:

1. An "Everything", or "System" object at the top of a composition tree into which every other object in the system is conceptually composed would then be able to take advantage of the awesome power of collection mapping, enabling me to manage all objects in a persistence agnostic manner. Usually this kind of umbrella object looks like a service, has no intrinsic properties and is not mapped - and therefore has to be persistence aware in just about every operation. The moment we can make it visible to the ORM, all its associations can be managed by the ORM and dealing with "all widgets", "all whatsits" etc. becomes vastly simpler.

2. A class at the bottom of a composition (or inheritance - the argument is the same) tree that exists only to provide extra associations. This is the case when extending or decorating a framework class in order to add associations but not adding any new properties. No properties = no table = not visible to the ORM - therefore once again we need to explicitly manage the persistence of all assocated collections.

We can cover both these cases by adding an artifical table and persisting some artificial properties. It's a bit of a hack, though, and especially in the first case very conceptually muddy - what if someone adds a second record to the "Everything" table?
# Posted By Jaime Metcher | 4/3/08 6:42 PM
BlogCFC was created by Raymond Camden. This blog is running version 5.005.