By Peter Bell

What I Need From My ORM/Data Access Layer

The ideal ORM is dependent on your exact use case. From Transfer and Reactor to SQL Alchemy and Hibernate, there are many ORM systems available of varying complexity. Here's what I need from my ORM system. I'd appreciate any input on features I haven't mentioned along with a sense of the problem they solve. This spec works for me today, but I'm sure there are plenty of new use cases just lurking out of site that I haven't thought through yet :-> . . .

Column Aliasing
I want to be able to vary object property names independently from database column names so if I have a column called First_Name_104_VARCHAR100_Type3b (do DBA's have nothing better to DO with their time?!), I can just alias it as FirstName.

Left Outer Joins
If I want to display a list of users and each user has-one Boss, I want to include Boss.Title in the property name list and have it take care of the left outer join so I don't get n+1 queries and don't have to write the SQL by hand.

Aggregate Functions
If the same list of users also needs to include the number of auctions they have won, I want to be able to describe an aggregate property of a has-many relationship to generate a subquery in the select (I know there can be performance issues, but a join doesn't work in the general case for this where there may be more than one aggregate on the same joined table - cartesian products and other evilness ensues). So I need to be able to describe that a user has-many Auctions and be able to add a filter (only won auctions) and aggregate (in this case count) to describe the NumberofAuctionsWon aggregate property of the User object that I can then include in any "get" operation.

Get Associated
I need to be able to call User.getAssociated("Addresses") and to be able to filter, order, page and set the property name list for the returned IBO in case I just need home addresses or trouble tickets that aren't resolved yet, paged, 20 records per page and just returning the title and ID.

Cascading Deletes
If an object is composed as opposed to associated (which I define for these purposes by whether or not the related objects should continue to exist after their related object is deleted), I need automated cascading deletes of all composed objects n-levels down.

Inheritance
Of the three approaches, one table per concrete class doesn't require any magic - customers and vendors are in their own separate tables. One table per abstract class (class hierarchy) just requires support for an ObjectProperty and ObjectPropertyValue so you can distinguish customers if Customer = 1 or if CompanyType = 'customer' (independent bit settings per object type are usually the most performant and flexible approach as they index well and allow a company to be both a vendor and a customer). If you want one table per class, ideally you'd add the ability to set the table that each persisted property is in and to join the tables automatically, but you can sometimes get away by just using has-one associations between the table - depends how "clean" you want to be and whether you actually use this approach or not.

Calculated Properties
I need to be able to "select" properties that don't exist in the database at all. If I ask for Age, it should be smart enough to get me dateofBirth. I should also be able to order by calculated properties. Some calculated properties can be ordered by in the db (for instance, Age and Date of Birth are easy to figure out), others can't which means you can't solve the generalized problem for n-records as you have to handle the ordering by retrieving the entire recordset, calculating by the calculated property and then sorting the data set by the new calculated value which is fine with 50 or 100 records but not with 10,000 or 100,000 unless you have a serious caching system. For now I only have a partial solution and will work on the cashing system when I have a need for it. Similarly for filtering by calculated properties, I haven't had a need yet, but it could also be solved using a cache, a little clever calculation and a query of query.

Joining Tables
For many-to-many relationships, a join table is usually (although not always) used to implement. It is pretty straightforward to handle if the joining table has no properties of its own. With its own properties (perhaps a display order for the products in a given category), it becomes a little trickier. Expect a posting on that tomorrow morning as I try to upgrade my ORM to handle that without having to treat the "smart" join table as a separate object which can hurt from a performance (and elegance) perspective.

What Else?
I see a lot of people doing things like deeply saving a bean including its children, but that really doesn't map very well to the way I usually do things. Would love to hear a little more from people who do that about the kinds of use cases so I can understand where my approaches will break down.

Anything else missing (excluding caching/performance which I agree are important but aren't a focus for my current use cases)?

As always, thoughts appreciated!

Comments
Here's an idea:

Look at the ORMs available for ColdFusion (Transfer, Reactor, ICEGen, ARF), see what they offer in terms of the list of requirements you need. If you see one that offers more than the others, but not all, contribute to the project and add those features.

Writing an ORM isn't easy (I wrote ICEGen) and it's hard to fit everything into it. The big problem though is that ColdFusion is sorely lacking a solid ORM. We need more people contributing to projects, like an ORM, to give the language something beneficial.
# Posted By Tony Petruzzi | 8/25/07 10:08 AM
I think at least for calculated properties you can define a column as a calculated column in the db table and take care of it that way. At least then it's one less thing you have to worry about - just tell the ORM backend to pay attention and not try to update calculated columns, and everything else works the same.

Cascading deletes can be similar in that when you define your object model / metadata include an option for cascadeDeleteOnComposedObject=true or whatever, and then in generating the db schema you can take care of it there. Of course that doesn't work if you prefer soft deletes, but you could easily generate a trigger that does the appropriate update if that is the case.

Lately I've been using the DB to take care of more DB things (and unfortunately some non-DB things) so that's where this is coming from. I think the overall code required to implement this would be a bit easier to write and comprehend if you went that direction with it.

A potential problem here though is you are more likely to start getting into vendor-specific code for the DBs you want to support.
# Posted By Sammy Larbi | 8/25/07 11:35 AM
@Tony,

I already have an ORM that does almost everything above. I need something that I can control myself and port to other languages (such as Ruby) as I need to. What I was trying to do was start a discussion of what features (especially the ones I missed) are important and why, so that all of the ORM authors can decide if any of the features are of interest to them.
# Posted By Peter Bell | 8/25/07 11:37 AM
@Sam, Putting calculated columns in the db doesn't work in the general case. If we're taking about FullName = FirstName & LastName or Age from DoB, that's fine, but what if the calculation of price depends on a discount that requires a call to a real time API for checking a consumer credit rating? I'd hate to try to do that in T-SQL!!!

I think there are benefits to moving some calculated properties to the DB, but on the whole for now I prefer the simplicity of all of the custom code I require being in a single language (ColdFusion) rather than a combination of CF and SQL. All that said, I could see me moving more to the db over time - as you mentioned - it can just make multiple db platform support a little tricker.
# Posted By Peter Bell | 8/25/07 11:41 AM
@Peter,

If you already have an ORM that does everything you need, then release it to RIAForge.
# Posted By Tony Petruzzi | 8/25/07 1:09 PM
@Tony, I have released a bunch of things to RIA Forge. I blog, present and help out on mailing lists. The ORM is part of an in-house proprietary framework I use to earn a living so I can afford to do all of the free stuff. Why not ask Joshua Cyr to give away Savvy or Adobe to give away all versions of CF?

We all have to make a living from something unless we want to fix cars for income and code purely for love!

Doesn't mean that we can't all still share ideas and learn from each other.
# Posted By Peter Bell | 8/25/07 2:01 PM
@peter,

My bad. Didn't know that it was an proprietary project. Sorry about that.
# Posted By tony petruzzi | 8/25/07 3:19 PM
@Tony, NP - should have made it clearer upfront! Apologies if I over-reacted!
# Posted By Peter Bell | 8/25/07 3:35 PM
BlogCFC was created by Raymond Camden. This blog is running version 5.005.