By Peter Bell

Returning data: Beans vs. Queries

A very common approach to persistent data access and management in ColdFusion is to use a (data table) Gateway for bulk operations and a DAO for single record operations. A corollary is that the DAO usually returns a loaded bean whereas a Gateway usually returns a query.

There is a problem with that – I don’t want to have to write all of my views twice!

Have you ever come across a use case for multiple record editing forms? Say a user wants to update pricing on 10 products or change the subscription status of 20 users? One solution is to have a default form and then to wrap it with an iterator so that you can display n-instances (as well as a field name list property so it can be used to display as few or as many of the fields as the user wants to edit at runtime just by editing the list). But if the data for a single form comes from a bean and the data for a collection of records comes from a query, you need to have two different forms or write some translation code to handle this case.

Equally, you might have a simple view that you want to be able to apply an iterator to if you want to display multiple records. Again, this mainly affects dynamic, default views – anything that is customized for a particular page (such as the layout of a product in a nice category list) will probably be optimized for 1 or n record display. Again, a field name list could drive which fields were shown to a particular user allowing for credible reuse of the views between single record cases (with most of the fields being displayed) and multiple record cases (where only a handful of fields might be displayed for each record).

It seems to me that it might be nice if the model returned data in a consistent manner so you didn't have to worry about whether the number of records was 1 or 0..n. At first I was tempted to get my DAOs to return queries just like my Gateways. Then I decided to use objects for DAOs, which tempts me to use objects for my bulk operations as well.

This is pretty common in the Java world, but instantiating a collection of objects can be performance limiting (this isn't premature optimization - this can make any application CRAWL) and it is an awful lot of work just to display a table of standard, non-editable data. Object pooling/recycling is one possible solution, but I think I’m just going to throw an iterator into my business objects so they can relate to 1..n records.

[edit] I guess I need to make it clearer that I'm not talking about creating a collection of objects with an iterator but creating a single object (so there is only the instantiation cost for 1 object per request) which will act as a flyweight, providing access to a query recordset using getters to support calculated values. The only instantiation cost will be a single object per request which doesn't seem unreasonable to me. [/edit]
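As a minimal sketch of that idea, assuming the query arrives as a list of row dictionaries: one object holds the whole recordset plus a cursor, and its getters read from whichever row the cursor is on. This is Python rather than CFML, and `UserRecords`, `get_first_name`, `has_next` and `next` are hypothetical names used only for illustration.

```python
class UserRecords:
    """One object that fronts a whole recordset (flyweight-style)."""

    def __init__(self, rows):
        # rows: list of dicts standing in for a query recordset (assumption)
        self._rows = rows
        self._index = 0  # cursor into the recordset

    def get_first_name(self):
        # The getter reads from the current row; no per-row object is created
        return self._rows[self._index]["FirstName"]

    def has_next(self):
        return self._index < len(self._rows) - 1

    def next(self):
        # Advance the cursor; stay put once the last row is reached
        if self.has_next():
            self._index += 1


# One instantiation, however many rows come back
users = UserRecords([{"FirstName": "Ann"}, {"FirstName": "Bob"}])
names = [users.get_first_name()]
while users.has_next():
    users.next()
    names.append(users.get_first_name())
```

A single-record view would use the same getter on a one-row recordset, so the display code would not need to know which case it is in.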

This should also provide some additional benefits. Often there is not a 1..1 mapping between fields in the database and the attributes of the object. The classic example is that you might store a user's date of birth but provide access to an Age property. How do you display the age of 20 users in a query returned from a Gateway? You’d have to duplicate the code you already have in your business object's getter. This doesn’t seem an elegant approach.
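To make the Age example concrete, here is a rough Python sketch (not CFML) in which the age rule lives in exactly one getter and serves both a single record and a list. The row shape, the `Users` class and the `age_on` helper are all assumptions made for illustration.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Whole years elapsed between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


class Users:
    """Wraps 1..n rows; the calculated attribute is defined once."""

    def __init__(self, rows):
        self._rows = rows  # list of dicts standing in for a recordset

    def get_age(self, row_number, today):
        # The only place the age calculation is written
        return age_on(self._rows[row_number]["DateOfBirth"], today)

    def count(self):
        return len(self._rows)


rows = [
    {"FirstName": "Ann", "DateOfBirth": date(1980, 7, 12)},
    {"FirstName": "Bob", "DateOfBirth": date(1990, 1, 1)},
]
today = date(2006, 7, 11)

many = Users(rows)
ages = [many.get_age(i, today) for i in range(many.count())]

one = Users(rows[:1])  # the single-record case reuses the same getter
ann_age = one.get_age(0, today)
```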

So, I’m going to play with using a business object with an iterator so the same class can handle access and mutation of 1..n records. Conceptually I am not sure that the idea of an object being a collection of objects is "right", but it seems to me to offer some engineering benefits, so I’m going to give it a try and see if I crash and burn (add me to your feeds list to find out!).

Comments
I should say that the other solution to the Age of 20 users is to put it into your DB encapsulated in a stored procedure. That was one of the solutions I considered. For a shop that is tied into a single DB and has a great SQL person that is definitely an option, but I'm going to keep my business rules out of the DB except when I really need them to be there for performance purposes.

Best Wishes,
Peter
# Posted By Peter Bell | 7/11/06 10:40 AM
This is exactly what has bothered me about the DAO/Gateway dichotomy that has been espoused by the CF OO-crowd. The problems are exactly as you described.

In Java, you almost always return a collection of objects - it will scale well, even into 1000s of objects. You would drop down to the raw recordset if performance became a real issue. All the Java ORM wrappers always return objects, for example.

I've taken the opposite approach - everything returns a query. This allows us to scale up to display 100s of records at a time and use the same data type on search results as on 'details' pages.

For methods like 'age' we have a Utils-type CFC to which we can pass date of birth and get back an age.

Glad I'm not the only one, anyway....
# Posted By Tony Blake | 7/11/06 12:27 PM
Hi Tony,

Thanks for the comment! Initially I was going to return everything as a query, but as I'm leaning OO I figured I should try it the OO way!

The truth is that for small team development, I'm not sure that it matters much whether you have a set of functions and structs or clean OO classes encapsulating data and the methods that operate on the data. The only real benefit of OO that I see (without a compiler) is that it FORCES you to write more loosely coupled code, because if your attributes are private, it is impossible to modify them all over the place. There isn't really any magic that couldn't be done in a pure procedural language - it's just less likely you'd DO it in a procedural language.

I do want to point out though that unlike in Java, I'm looking to return a single object with an iterator (flyweight pattern) so I only have to instantiate a single object but have all of the benefits of encapsulation of my getters and setters.

Best Wishes,
Peter
# Posted By Peter Bell | 7/11/06 12:48 PM
One approach I've used before is to have all Beans provide a method which returns the Bean's state as a one-row query. It's not really much different to returning the properties as a struct, and can be done easily via the getMetaData() introspection facility. The only gotcha to beware of is that you then HAVE to expose properties with <cfproperty> or getMetaData() doesn't pick them up.

However, provided you follow that rule, you can implement a getPropertiesAsQuery() method in a base GenericBean class, that all Beans then extend.
# Posted By Al Davidson | 7/11/06 1:02 PM
I've wrestled with this issue myself and after taking a .Net class I came to realize what I was taught in there really made sense.

Your DAO and business rules should apply to a single record. Basically the way I do it now is to have all my CRUD in a DAO and pass beans back and forth to the DAOs. This makes it very easy to add columns to the underlying tables and reflect them within my application. Also this makes server-side error checking and handling a breeze.

Now as for retrieving 20 records at a time and displaying them, I use a query. I'm not going to sit there and create 20 beans, throw them into a structure and then loop over a structure, when a query is less code and easier to handle. In the .Net class we were shown this exact way, use a query and a DataGrid. I guess if you really wanted to you can create a CFC that stores all these "multi" queries and call them from there.
# Posted By Tony Petruzzi | 7/11/06 1:03 PM
Hi Al, That is definitely an approach, but I'm not sure it covers the case of calculated properties where the getters are not just returning an attribute but are actually calculating it.

Tony, this is actually the approach I can't make sense of. Firstly, what about the use case where I have a view or form that should work with both 1 and n records? If I use a bean for one and a query for the other, I'll need two sets of display code or a translator like Al suggested. I just want to use #user.getFirstName()# throughout my view templates irrespective of whether I'm displaying and returning 1 or n records.

Equally, how does your query handle calculated attributes? Let's say you want to display FirstName and Age in a table for 20 records. You already have a getAge() in a bean that gets the calculated age based on the DateofBirth field (you're not usually going to store age in a database). If you use a query to access the 20 records you're going to have to duplicate the getAge() code somewhere - either in your SQL or some place else. I don't like the duplication of code.

I do agree with you that you shouldn't create 20 objects. You just create a single object with an iterator that allows itself to expose all 20 rows of data with only the cost of instantiating a single object.

Make sense?

Best Wishes,
Peter
# Posted By Peter Bell | 7/11/06 1:32 PM
Tony is spot on in his evaluation - why create all those objects when you're only going to use a few fields and throw the rest away? And you indicated this in your initial post (performance).

Of course, you can have the best of both worlds - you can have a gateway proxy and use AOP to convert a query to an array of objects if you need to, avoiding the need to duplicate code in views...
# Posted By Sean Corfield | 7/12/06 1:10 AM
OK, your blog is SERIOUSLY broken!!

The repeated posts are because it kept telling me my captcha was wrong but clearly it's posting the comments anyway.
# Posted By Sean Corfield | 7/12/06 1:13 AM
Hi Sean,

Bizarre - I haven't been having any problems with the captchas (apart from not being able to get them right half the time). But all WHAT objects?

My plan is to create a single object which exposes the 20 or 50 or however many records using a flyweight pattern. So I'm only creating a single object (per request) - no need to ever create an array of objects.

I get the benefit of getters so that I can use calculated as well as persistent fields without any duplication of code, and I can use a single set of views for single and n-record displays (all use the #object.getProperty()# notation rather than some using that and others using #Property# notation).

Am I missing something here? Two people seem to think it's a bad approach but I'm not understanding why!!!

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 6:06 AM
Hi Sean,

OK, got rid of the duplicate comments.

I should also clarify, I'm not necessarily going to fully load the objects (and certainly not their associated objects). I'm planning an AttributeNameList parameter to the getbyFilter() method and the get() method which will allow you to take advantage of the syntax of the getters and the power of the calculated fields, while allowing you to only load the fields you need into the object. So, if you have a user with 40 fields, you will be able to just load firstname, lastname and dateofbirth if all you want to display is firstname, lastname and age (calculated from dateofbirth).

That way the object will require no more db traffic than a query and will better handle invalid field names: if you try to display ffirstname (note typo) or a field that hasn't been loaded (say address1), it won't crash and burn but will display on the page and/or log or notify with an appropriate message such as "ffirstname not valid attribute" or "address1 not loaded". And is it really so bad to create a single object per request? Sounds like premature optimization to me!!!
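A rough Python sketch of the partial-loading idea, under stated assumptions: `load_rows` and `get` are hypothetical helpers, the rows are plain dictionaries, and filtering an in-memory row stands in for what would really be a narrower SELECT. It only illustrates the attribute name list and the two messages quoted above.

```python
def load_rows(full_rows, attribute_names):
    """Keep only the requested attributes from each row.

    In a real DAO this would be done by selecting fewer columns;
    trimming dictionaries here is just a stand-in.
    """
    return [{name: row[name] for name in attribute_names} for row in full_rows]


def get(row, name, valid_names):
    """Return the value, or a message instead of raising an error."""
    if name not in valid_names:
        return f"{name} not valid attribute"
    if name not in row:
        return f"{name} not loaded"
    return row[name]


VALID = {"firstname", "lastname", "dateofbirth", "address1"}
full = [
    {
        "firstname": "Ann",
        "lastname": "Lee",
        "dateofbirth": "1980-07-12",
        "address1": "1 Main St",
    }
]

# Load three of the four fields, as in the firstname/lastname/age example
rows = load_rows(full, ["firstname", "lastname", "dateofbirth"])

loaded = get(rows[0], "firstname", VALID)
typo = get(rows[0], "ffirstname", VALID)
missing = get(rows[0], "address1", VALID)
```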

Feel like I'm missing something based on comments, but still not quite understanding what!!!

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 6:19 AM
Not sure if I fully understand either, but when I worked with S.A.M. on my personal project I used generic get and set functions and they worked great. From your last comment, Peter, I would find something like that extremely useful; that way I don't have to duplicate a function to return a smaller amount of fields from the users table.
# Posted By Javier Julio | 7/12/06 1:40 PM
Re: Captcha - I get an error saying the captcha text was invalid when I Post, even tho' the text is correct and the comment *does* Post...

Re: the flyweight: interesting approach... for a traditional master/detail app, the master (list) page is always going to iterate over all of the records so I don't actually think you'll save anything. You'll still pay the cost of creating the objects - you're only saving setting some data. However, if you're only displaying 20 records at a time, that's not a huge overhead I guess.

However!

I do have serious concerns about the integrity of objects in your model: you're allowing objects to be created that aren't guaranteed to be "valid" because they don't contain a complete set of business data. To me, that just "smells" bad. Hard to build concrete arguments against but my gut says it's the wrong way to do things :)
# Posted By Sean Corfield | 7/12/06 8:51 PM
I've weighed in on this before, basically I think of it like this:

A recordset is an aggregation of 1:N objects.

We use it to display an "object listing".

It doesn't bother me that my "list" view knows how to interact with the recordset. My "list" view is 99 times out of 100 completely different code than my "display an instance of this object" view. That's the typical reason people squirm about the DAO/Bean/Gateway approach and I think in the real world it's not *that* big a deal.
# Posted By Dave Ross | 7/12/06 9:08 PM
Hi Sean,

Sorry about captcha error - I can't replicate but will keep an eye on it.

Re: the flyweight, I think there is still a misunderstanding. I have created a business object with a loadQuery() method that takes a recordset and saves it to a structure of structures - autonumbered within the business object's variables scope. I set variables.Iterator = 1.

If you get("FirstName"), it checks that FirstName is an exposable attribute. If it is, it looks to see if it has a custom getter. If not it just returns variables.structure[variables.iterator][FirstName]. next() increments variables.iterator (if it is less than variables.recordcount), isLast() returns 1 if variables.iterator = variables.recordcount.

That's it. Very little code, only a single object instantiation, and all of the benefits of support for getters and setters. And to boot, this is all in the baseObject, so you just throw a little metadata (which will be generated) and any custom getters or setters into the entity specific bean and you're done!
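Roughly, and in Python rather than CFML, the mechanism described above might look like the sketch below. The method names mirror loadQuery(), get(), next() and isLast(); the `exposable` tuple, the `get_<Name>` naming convention for custom getters, and raising an error for a non-exposable attribute are all assumptions made for illustration.

```python
class BaseObject:
    exposable = ()  # attribute names a caller may read (set by the subclass)

    def load_query(self, rows):
        # Store rows keyed 1..n, mirroring an autonumbered structure of structures
        self._rows = {i: dict(row) for i, row in enumerate(rows, start=1)}
        self._record_count = len(self._rows)
        self._iterator = 1

    def get(self, name):
        if name not in self.exposable:
            # Assumption: the description does not say what happens here
            raise AttributeError(f"{name} is not an exposable attribute")
        custom = getattr(self, "get_" + name, None)
        if custom is not None:
            return custom()  # a custom getter wins over the stored value
        return self._rows[self._iterator][name]

    def next(self):
        # Only advance while there is another record
        if self._iterator < self._record_count:
            self._iterator += 1

    def is_last(self):
        return self._iterator == self._record_count


class User(BaseObject):
    exposable = ("FirstName", "LastName", "FullName")

    def get_FullName(self):
        # Custom getter for a calculated attribute
        return self.get("FirstName") + " " + self.get("LastName")


users = User()  # the single instantiation
users.load_query(
    [
        {"FirstName": "Ann", "LastName": "Lee"},
        {"FirstName": "Bob", "LastName": "Ray"},
    ]
)
full_names = []
while True:
    full_names.append(users.get("FullName"))
    if users.is_last():
        break
    users.next()
```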

I'm really not coming across any weaknesses and am getting great results with almost no code. Waiting for the dark side to reveal itself :->

As for the partially loaded objects, I've got to say it smells just fine from here. I'm not going to partially load a user or a shopping cart but if I'm going to list 20 article titles on a page or the prices of 40 products I want to be able to use getters for calculated properties but am not going to fully load 40 records with junk I don't need. I just don't see when I'm going to take a list of 40 products to display on a screen and then suddenly decide on the way to the screen to add them to a cart or do something else that needs the fully loaded objects any more than you're going to loop through a query in your display code and then call a getrelated method against one of the properties of the query. We're talking about transitory request scoped eye candy - just with support for private data and calculated fields using the same code as for when we're using those objects more heavily. Also, because of the generic getter facade, even if you call a method that is invalid, it'll catch that and handle it gracefully in the generic getter rather than you having to put error checking code in all of your callers. Going back to base principles this one smells pretty good to me, but we'll see what results a little more mileage brings out!

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 9:11 PM
Hi Dave,

Thanks for the post! Hope all goes well?

I guess what makes my use case a little unusual is that I'm trying to create something that will generate almost anything with almost no manual coding. I have come across a number of use cases for the same display for 1 and n records. The most obvious example is dynamic, general forms and (to a lesser extent) views where you use a field list to drive the subset of the display code to exercise (cfloop containing cfswitch) and where you use the same code to display a page to edit 12 prices or a page to edit all of the properties of a single product. Equally, the same display code will display everything about a user or first and last names for 10 users (form reuse is more common than view reuse). For the price point I play at people are more than happy to use such code for most of their back end admin screens.

The big problem I have with recordsets is the lack of getters. My classic example: you have a users table with FirstName, LastName and DateofBirth. I want to display a list of users with FullName and Age (the object attributes NEQ the record properties). Equally, when I view a single user I want to be able to see FullName and Age. The only way to avoid duplication would be to put all of my calculations into the db which I'd rather not do (and some calculations that are easy in CF are much harder to do in SQL). This may seem minor, but for automatically generating complex use cases with lots of calculated fields I'd like to be able to just generate the getters and use them for 1 or 0..n record interactions.

Just tryin' to keep it DRY!

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 9:22 PM
You said:

"while allowing you to only load the fields you need into the object."

That sounded like you were planning to create objects and then only partially populate them (bad).

If all you're really doing is providing a getter-based iterator wrapper to a cfquery, that's OK (although I wouldn't bother - I don't see you're getting any benefit).

Perhaps you're not explaining your plan clearly enough for me to understand? :)
# Posted By Sean Corfield | 7/12/06 9:31 PM
Hi Sean,

I'm clearly not explaining my point well at all! I'm going to try another post and will send you the link!

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 9:34 PM
OK guys,

Here's a new post - perhaps this makes the position a little clearer?

http://www.pbell.com/index.cfm/2006/7/12/Objects-C...

Best Wishes,
Peter
# Posted By Peter Bell | 7/12/06 10:11 PM