What Should a Persistence Framework Do?
Write My SQL
I don't want to have to write my select, insert, update and delete statements. I just want to provide a little metadata describing how my objects are persisted and then I want my persistence framework to create the appropriate SQL (whether through pre-generation, runtime generation or runtime code synthesis).
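A minimal sketch of the idea in Python, assuming a hypothetical metadata format (a dict with a table name, a key column and a list of persisted columns). The names USER_META, select_sql and so on are made up for illustration, and a real framework would also have to deal with types, quoting and SQL dialects.

```python
# Hypothetical metadata describing how a User is persisted.
USER_META = {
    "table": "users",
    "key": "id",
    "columns": ["first_name", "last_name", "email"],
}

def select_sql(meta):
    """Build a parameterized SELECT by primary key."""
    cols = ", ".join([meta["key"]] + meta["columns"])
    return f"SELECT {cols} FROM {meta['table']} WHERE {meta['key']} = ?"

def insert_sql(meta):
    """Build a parameterized INSERT for the non-key columns."""
    cols = ", ".join(meta["columns"])
    marks = ", ".join("?" for _ in meta["columns"])
    return f"INSERT INTO {meta['table']} ({cols}) VALUES ({marks})"

def update_sql(meta):
    """Build a parameterized UPDATE by primary key."""
    sets = ", ".join(f"{c} = ?" for c in meta["columns"])
    return f"UPDATE {meta['table']} SET {sets} WHERE {meta['key']} = ?"

def delete_sql(meta):
    """Build a parameterized DELETE by primary key."""
    return f"DELETE FROM {meta['table']} WHERE {meta['key']} = ?"
```

The same metadata drives all four statements, so adding a column means changing one list rather than four hand-written queries.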
Create Joins Automatically for Has-one Relationships
If a User has-one Boss, I'd like to be able to ask for FirstName, LastName and Boss.Title and have my persistence framework generate the appropriate left-outer-join code for me.
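Something along these lines would do it. This is only a sketch: it assumes a hypothetical metadata format in which the Boss is another row in the same users table, reached through a boss_id column, and the function name select_with_joins is invented for the example.

```python
# Hypothetical metadata: a User has-one Boss, stored as users.boss_id -> users.id.
USER_META = {
    "table": "users",
    "has_one": {
        "Boss": {"table": "users", "foreign_key": "boss_id", "key": "id"},
    },
}

def select_with_joins(meta, properties):
    """Turn dotted property names into columns plus the left outer joins they need."""
    columns, joins = [], []
    for prop in properties:
        if "." in prop:
            rel_name, column = prop.split(".", 1)
            rel = meta["has_one"][rel_name]
            alias = rel_name.lower()
            join = (
                f"LEFT OUTER JOIN {rel['table']} {alias} "
                f"ON {meta['table']}.{rel['foreign_key']} = {alias}.{rel['key']}"
            )
            if join not in joins:  # join each relationship only once
                joins.append(join)
            columns.append(f"{alias}.{column} AS {rel_name}_{column}")
        else:
            columns.append(f"{meta['table']}.{prop}")
    sql = f"SELECT {', '.join(columns)} FROM {meta['table']}"
    return " ".join([sql] + joins)
```

Asking for ["FirstName", "LastName", "Boss.Title"] yields a single statement with one left outer join, so users without a boss still appear in the result.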
Aggregate Subqueries
If a User has-many Auctions and I want a list of FirstName, LastName and AuctionsTotalWon, I'd like to provide a little bit of metadata describing the has-many aggregate operation (AuctionsTotalWon = Count(AuctionsWon) or TotalAuctionSpend = Sum(AuctionsWon.Price)) and have the persistence framework generate the appropriate SQL to return that in a single statement.
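One way to get this in a single statement is a correlated subquery per aggregate. The sketch below assumes a hypothetical AGGREGATES metadata dict and invented table and column names (users, auctions_won, user_id, price), and uses Python's built-in sqlite3 module only so that the generated SQL can actually be run.

```python
import sqlite3

# Hypothetical metadata for the has-many aggregates on User.
AGGREGATES = {
    "AuctionsTotalWon": {"func": "COUNT", "column": "*",
                         "table": "auctions_won", "foreign_key": "user_id"},
    "TotalAuctionSpend": {"func": "SUM", "column": "price",
                          "table": "auctions_won", "foreign_key": "user_id"},
}

def select_with_aggregates(table, key, columns, aggregate_names):
    """Build one SELECT with a correlated subquery for each requested aggregate."""
    parts = [f"{table}.{c}" for c in columns]
    for name in aggregate_names:
        agg = AGGREGATES[name]
        parts.append(
            f"(SELECT {agg['func']}({agg['column']}) FROM {agg['table']} "
            f"WHERE {agg['table']}.{agg['foreign_key']} = {table}.{key}) AS {name}"
        )
    return f"SELECT {', '.join(parts)} FROM {table} ORDER BY {table}.{key}"

def demo():
    """Run the generated SQL against a tiny in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE auctions_won (id INTEGER PRIMARY KEY, user_id INTEGER, price REAL);
        INSERT INTO users VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
        INSERT INTO auctions_won VALUES (1, 1, 10.0), (2, 1, 15.5);
    """)
    sql = select_with_aggregates(
        "users", "id", ["first_name", "last_name"],
        ["AuctionsTotalWon", "TotalAuctionSpend"],
    )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

Note that SUM over no rows comes back as NULL rather than 0, so a user with no auctions gets a count of 0 but an empty spend; a framework might wrap the sum in COALESCE if that matters.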
Get Associated
There should be a facility for me to ask an object to getAssociated(RelationshipName) and get back an IBO loaded with the associated object (has-one) or 0..* objects (has-many), so I can easily call Order.getAssociated("OrderItem") or User.getAssociated("Address"). Ideally it would support specializing the filter and overriding the default order by, so that if I want to get the addresses for a user ordered by three different properties, I don't have to define three different relationships between users and their addresses.
Support Paging
I need all list queries to be able to accept a page number and number of records per page so that I can easily get paged lists.
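As a rough sketch, paging can be bolted onto any generated list query by translating the page number into an offset. This assumes a database that supports LIMIT ... OFFSET (SQLite, MySQL and PostgreSQL do; others need different syntax), and the function name paged is made up for the example.

```python
def paged(sql, page, per_page):
    """Append LIMIT/OFFSET to a list query; pages are numbered from 1."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must both be at least 1")
    offset = (page - 1) * per_page
    # Return the SQL plus its bind parameters rather than interpolating values.
    return f"{sql} LIMIT ? OFFSET ?", (per_page, offset)
```

The underlying query needs a stable ORDER BY, otherwise the same record can show up on two different pages.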
Allow for Versioning
I build a lot of CMSs that support versioning and soft deletes, so I need to be able to describe my versioning and delete rules in one place and have them applied to all of the objects implementing that versioning/deletion strategy.
Cascading Deletes
If I delete an object, any composed (as opposed to associated) sub-objects or joins should also be deleted automatically. For example, if an Order has-many OrderItem (composed), deleting the Order would delete the related OrderItems. If a User has-many Roles (associated - a security role may be shared by more than one User), the Roles wouldn't be deleted with the User. In the case of a many-to-many (Product many-to-many Category), if I delete a Product, it should delete the Product and the ProductToCategory joining records, but not the Category.
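The rule above can be sketched as a small function that walks relationship metadata and emits delete statements for composed children and join rows only. The RELATIONSHIPS format, the "kind" values and the table and column names are all hypothetical, and the sketch assumes every table uses an id primary key.

```python
# Hypothetical relationship metadata, keyed by parent table.
RELATIONSHIPS = {
    "orders": [
        {"table": "order_items", "foreign_key": "order_id", "kind": "composed"},
    ],
    "products": [
        {"table": "product_to_category", "foreign_key": "product_id", "kind": "join"},
    ],
    "users": [
        # Associated objects are shared, so they are never deleted with the parent.
        {"table": "roles", "kind": "associated"},
    ],
}

def delete_statements(table, key_value):
    """Return (sql, params) pairs: dependent rows first, then the object itself."""
    statements = []
    for rel in RELATIONSHIPS.get(table, []):
        if rel["kind"] in ("composed", "join"):
            statements.append(
                (f"DELETE FROM {rel['table']} WHERE {rel['foreign_key']} = ?", (key_value,))
            )
    statements.append((f"DELETE FROM {table} WHERE id = ?", (key_value,)))
    return statements
```

Deleting children before the parent keeps the statements valid even when the database enforces foreign keys; running them inside one transaction would keep the delete all-or-nothing.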
Nice to Have's
In addition to the above items which for me are requirements, there are a number of other things that might be nice from a persistence framework:
Handle Column Aliases
This isn't a biggie for me, but if you're working with legacy db's it might be nice to be able to refer to the FNAMEVCHAR150 column as FirstName instead.
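A minimal sketch, assuming a hypothetical alias map from property names to legacy column names (LNAMEVCHAR150 is invented here to go alongside the FNAMEVCHAR150 example):

```python
# Hypothetical alias map: property name -> physical column name.
ALIASES = {"FirstName": "FNAMEVCHAR150", "LastName": "LNAMEVCHAR150"}

def select_aliased(table, properties, aliases=ALIASES):
    """Select physical columns but expose them under their friendly names."""
    def column_expr(prop):
        physical = aliases.get(prop)
        return f"{physical} AS {prop}" if physical else prop
    return f"SELECT {', '.join(column_expr(p) for p in properties)} FROM {table}"
```

Properties without an alias pass through unchanged, so the map only needs entries for the awkward columns.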
Support Inheritance
The ideal system would support single table per class hierarchy, table per class (joins) and table per concrete class inheritance strategies. At the very least it needs to support single table inheritance with a boolean field for each concrete class so a record can be of more than one type (a company might be both a vendor and a customer).
Value Objects
Value objects (those that are defined only by their values and that don't have an identity per se) are often best composed into other tables. If an Order has a BillingAddress which is a value object (it's a simple commerce system without an address book feature and you don't want it to have a lifespan or editing profile different from that of the order), you want to be able to persist the BillingAddress within the Order table.
Provide Caching
Some kind of caching mechanism for improving performance would be useful.
Not So Sure's
There are also some features that I've not had much of a need for to date.
Cascading Saves
I don't often find myself persisting an object graph. Usually I'm saving an object (its direct properties) or adding, editing or deleting one or more related objects. The only time I would find a cascading save useful is if I had value objects that I wanted to compose within the same row as their parent (e.g. BillingAddress within Order in a simple commerce system with no address book feature).
What else do you want out of a persistence framework? What do you find in such frameworks that you don't need? Input appreciated.



Take a step back and look at the low level of adoption of object databases and even technologies such as object serialization in Java. The deadliest aspect of the "impedance mismatch" between object systems and relational databases is really a difference between the kind of data structures that can live in memory in an application and the kind of persistent data structures that last a long time on disk and will need to be gradually migrated over time to reflect changing business realities.
The strategy that works best in the long term is to start with the design of the persistent data structures and work outward: so it's more of a matter of extending the domain of SQL into object systems rather than extending object systems into the database.
Minimizing configuration is important. I've developed a "passive record" in PHP that configures itself by introspecting the database. When I'm doing rapid development projects, I just don't have time to maintain multiple artifacts to describe the database structure. The experience of working with symfony plug-ins, in particular, shows how ORM configuration files don't scale.
I've been thinking about Microsoft's experience with LINQ, which uses some really clever ideas to integrate SQL queries with languages such as C# and VB. LINQ is successful because it's ducked the hard questions which may not be answerable: it supports a simple 1-1 mapping between object structures and database structures and does it with flair. On the other hand, developers are cold on the Entities framework which is trying to tackle the general ORM problem -- ultimately because that problem doesn't have a solution, or rather, beyond a certain level of complexity, ORM systems have a way of introducing more problems than they solve.
I also deal with RAD development of systems, but do so by describing the domain model and using that (with some annotations) to generate the db schema as well as the code to interact with it. I find the object model way more useful for developing non-trivial applications than starting with a schema.
I've seen Hibernate used in some pretty large scale systems. How specifically do you find that it doesn't scale?
Have you checked out my Groovy/Hibernate project for CF? http://www.barneyb.com/barneyblog/projects/cf-groo...? It handles every item on your list, and while it doesn't let you persist CFCs, if you squint your eyes, Groovy classes sure look like CFCs written in CFSCRIPT. Plus they don't have a lot of the syntactic problems that CFML has.
Regarding Paul's comment of eliminating duplication, Hibernate supports that very well, I think. You have your entities, you have your mapping, and you have your schema. If they're the same, you only need entities, and Hibernate will figure out the mappings and even build your schema for you. So you only need use what you actually need.
I like that Hibernate/nHibernate is available in most languages as eventually I need to support n-3gl's with my solution, but I'll reserve the decision on implementation until I've locked my API and then I might do a quick spike both ways to see what's quickest/easiest.
I have definitely been watching your Groovy/Hibernate work and will no doubt be pinging you with questions about that as it looks great and Groovy was on my short list anyway! Interestingly I'm not seeing the adoption of Groovy in the Java world that I'd expected (and the cool kids are already onto Scala :-) ) but I like the promise of what it offers, so I'll definitely give your stuff a play!
I've created a homebrewed CF ORM that does database introspection to generate objects on the fly, with all the column names from the associated db table as properties all set up. It also uses Peter's awesome IBO methodology to step around CF's object instantiation performance penalty.
The huge bonus is that I can drop this ORM into any project that already has a db established, and get up and running creating, manipulating and persisting objects right away. And if I add a column to a table, I just need to refresh the application and I have the property available to me without any additional changes.
Peter's approach would be great if you were starting from scratch in a project, i.e. no database existed yet, but I'm not often in that situation.
I have to agree with Peter on this. I like to import the important metadata from the database and then add richness to it. I have a generator that I've been perfecting along the way, making it smarter with each revision so that it can generate more of the work. At the moment, the most time-consuming task is selecting validation types (server and client side) and a few other config items that I like to have in place.
@Peter, I'm glad to see I have a similar approach to what you described in the last comment. I remember asking you on IM about storing the metadata in the db, and it has been a great help!
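The introspection approach described above was built in CF, and none of its code appears here. As a rough illustration of the general idea only, the Python sketch below uses SQLite's PRAGMA table_info to discover columns at load time, so a newly added column appears without any mapping changes. The function names and the users table are assumptions made for the example.

```python
import sqlite3

def introspect_columns(conn, table):
    """Read column names from the live schema.

    PRAGMA table_info returns one row per column:
    (cid, name, type, notnull, dflt_value, pk).
    """
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def load(conn, table, key_value, key="id"):
    """Load one record as a dict whose keys are whatever columns exist right now."""
    columns = introspect_columns(conn, table)
    row = conn.execute(
        f"SELECT {', '.join(columns)} FROM {table} WHERE {key} = ?", (key_value,)
    ).fetchone()
    return dict(zip(columns, row)) if row else None
```

Because the table name is interpolated into the SQL, it should come from trusted configuration rather than user input.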
The configuration management problems involved are a bear: to add one column to a table I had to fork the project for a plug-in, and I have to follow a complex and error-prone procedure to make sure that the database, the models and everything else stay synchronized.
I'm obsessive about having the ability to maintain multiple development, test and production servers (Requirement 0). To support that, I need to be able to migrate the system between versions by a procedure that's almost entirely automated. I generally do that with Ruby-influenced migration files: these are generally SQL scripts, but they can be programs written in another language if something can't be easily expressed in SQL.
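A minimal sketch of that style of migration runner, in Python with the built-in sqlite3 module. The schema_version table, the MIGRATIONS list and the example statements are all assumptions made for illustration, and each migration here is a single SQL statement rather than a full script or program.

```python
import sqlite3

# Hypothetical ordered migrations: (version number, SQL statement).
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)"),
    (2, "ALTER TABLE users ADD COLUMN last_name TEXT"),
]

def migrate(conn, migrations=MIGRATIONS):
    """Apply every migration newer than the recorded version, in order."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version, sql in sorted(migrations):
        if version > current:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied
```

Running migrate against a development, test or production database brings each one to the same version, and running it a second time applies nothing.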
I see configuration management as the most basic technology underlying a web system. I find that open-source and commercial web applications and frameworks are often sorely deficient in that area. For instance, I really should be able to make a clone of my WordPress blog at another URL in just two minutes, just by copying the files, copying the database, and changing
1) the database name
2) the location of files on the disk
3) the root URL
Instead, WordPress encodes full URLs throughout the database, so you need to do a set of undocumented operations on the database to do this kind of migration.
If any web framework makes it hard to do CM the way I want, it can jump in the lake for all I care.