By Peter Bell

Objects, Tables and ORMs

One of the quickest tests of an ORM is to look at how flexible it is when mapping objects to tables. Here are some of the possible requirements to consider . . .

One Table Per Object
The simplest starting point is to assume that there is one database table per business object. For extremely simple applications, this may sometimes hold true (and for more complex apps you may be able to hack around this limitation for a while), but there are a range of other cases that a mature ORM needs to at least decide whether or not to support.

Implementing Inheritance
One of the first issues usually raised is how to support inheritance. If an AdminUser sub-classes the User object or a Client and Vendor both extend Company, how do you persist the information about these kinds of business objects? There are three common approaches: one table per abstract class (single table inheritance), one table per concrete (instantiable) class and one table per class.

One table per abstract class
With single table inheritance (one table per abstract class), you create a single table for each class hierarchy, so you would put both Clients and Vendors into a Company table and you'd either have a single “type” field to distinguish whether a company was a Client or a Vendor or (my preference) a boolean field for each type – allowing for quicker searches and for a single company to be both a Client and a Vendor without having to enter two separate records.

As long as you don't have too many custom fields per subclass or too many records per subclass, this is a good default strategy. It starts to fall apart if you have millions of Vendors but only hundreds of Clients and are querying Clients more often than Vendors or if you have a really large number of custom persisted attributes for each subclass, but it's a pretty good starting point for most use cases.

One table per concrete class
With one table per concrete class, every instantiable class gets its own table, so the Vendors would go in a Vendor table and the Clients would go into a Client table. This handles the problems of wasted columns and large number of one object type slowing down searches on other object types, but it only works well if the subclasses are exclusive sets. If a company can be both a Vendor and a Client, with this solution you'd need to have two records – one in the Vendor table and another in the Client table and you'd have to make sure that changes to one record propegated to the other.

One table per class
The most normalized approach is to create one table per class (abstract or concrete). In this case there would be a Company table containing all of the persistable attributes for all company types and then Vendor and Client tables containing the extended persistable attributes for Vendors and Clients respectively. A Vendor would be returned by joining the data in the Company and Vendor tables.

While this is an elegant approach that solves the problem of wasted columns which you get with single table inheritance, as your object model gets more complex, both generating the code to handle this and managing the performance of the resultant queries becomes a problem. Because of this, single table inheritance is often a good default choice for ORMs to implement.

Inheritance is just one of the cases where there is not necessarily a 1:1 relationship between objects and database tables, but I think that is enough for a single posting. Look for more on this in the next day or two . . .

How do you handle persisting of inheritance in your object models? Any thoughts appreciated . . .

Comments
Are you trying to single out any one ORM or is this in general?
# Posted By Andrew Powell | 4/2/08 4:06 PM
so what do this mean? are you trying to say that the ORMs available for CF are crap (you be correct)? i don't get why you would post something like this without disclosing what ORM triggered this thought.
# Posted By tony petruzzi | 4/2/08 4:10 PM
@Andy/Tony, Honestly, the ORM this is aimed at is a proprietary CF ORM that is integrated into a larger proprietary framework. The developer is me and the goal is to make sure that my (currently rather lame) support for cases where there isn't a 1:1 relationship between business objects and tables improves!

Sorry, should have made that REALLY clear. I personally believe Reactor and Transfer (and any other CF ORM's I missed) rock. However, in terms of features I tend to use Hibernate as my best of breed benchmark (although I haven't been thinking about ORM for a while, so I'm probably a version or two behind on Hibernate features). I'm honestly not sure whether any or all of these elements are supported in Transfer or Reactor as neither quite fits my use case, but knowing Doug I'm sure Reactor does everything he needs it to do, and knowing Mark I know that he is committed to making Transfer *the* POSS ORM solution in the CF space and I am convinced he'll succeed in that.

Should have made it way clearer that I was talking about patterns in the abstract as part of my design process. Sorry guys!
# Posted By Peter Bell | 4/2/08 4:22 PM
I know Hibernate gives you flexibility in how you persist your sub-classes. Personally, I tend to favour the One table per class approach. It's easier for me to follow in my Hibernate mappings... And yes, I do tend to prefer mapping files to annotations.

Speaking of that... wouldn't it be cool if a CF ORM let you do something like annotations (comments maybe?) that defined your persistence?
# Posted By Andrew Powell | 4/2/08 4:28 PM
@Andy, I've been playing with annoting cfproperty tags to put metadata in beans. I'm not that in love with the outcome though. Because I'm focused on systems that are primarily defined in terms of metadata I'm finding that it makes more sense for me to keep the metadata in external syntaxes like a cms or xml files. I do think annotations are great for use cases where you are primarily using a given 3gl for developing your apps and you just want to be able to document your intent more efficiently by adding annotations to classes and methods.

Nice thing is at least CF supports annotations - Chris Scott wrote something cool about it last year on his blog . . .
# Posted By Peter Bell | 4/2/08 4:33 PM
These thoughts are not just useful for ORM, but also talking about DB persistence in general.
# Posted By Sammy Larbi | 4/2/08 5:03 PM
@Sam, agreed!
# Posted By Peter Bell | 4/2/08 5:10 PM
@peter,

thanks for clarifying.

my personal opinion is that the one class per table is the best approach. i like the idea that all the validation and businesslogic for a single record in the table is contained within a single class.

that being said it does fall short in some situations. Most notability is when you have a one to many releationship between two tables and need to perform validation and business logic on the mulitple children of the parent class.

case in point. i'm currently doing a project where we have a price adjustment table that has multiple pricing teirs. the business logic calls for prices not overlapping and making sure that only one tier be marked as unlimited or as a ceiling. trying to put this kind of business logic the one table per class model fails miserably.

in order to accomplish the validation i had to create a class that loads both the parent and multiple children. using a class like this i can validate the parent and the multiple children using the validation build into their respected classes, however now i can create a validation method that accounts for all the business logic and validation that spans across the various children. hopefully this makes sense.

the one table per class works about 95% of the time and for most projects, however it does fall short when you try to do something outside the box.

transfer is the only orm i've seen that takes something like this into account right off the bat, enabling you to form collections using arrays and structure (depending on which you want). transfer really is one hell of a product. i guess my biggest gripe with it is the way it supports null values and the fact that it doesn't support loose typing (duck typing). those feeling aside i can easily seeing transfer become the "linq" for coldfusion.
# Posted By tony petruzzi | 4/2/08 7:07 PM
i would vote for the one table per class scenario. this is the most normalized approach, and i can't imagine having any performance issues until you exceed much more data than referenced. as long as you have proper db management, you should be able to handle many 100's of millions of data with near instaneous response. granted, you might need to invest more into db management into the creation of your persistence (ie mysql:sharding, oracle:partitioning, etc).
# Posted By shag | 4/3/08 12:22 AM
BlogCFC was created by Raymond Camden. This blog is running version 5.005.