By Peter Bell

Approaches to Validation and Allowing Invalid Objects (responding to Jared part 2)

In Jared’s recent great posting, he talked about how he had argued in the past that beans shouldn't have a this.validate() method. I just wanted to take a little more time to look at approaches to validating objects and architecting these kinds of solutions.

A Real World Use Case
Often you want to take user input (usually from the form scope), validate it and then (if valid) persist it. For example, I want to add a new article to my website, so I fill out a form. It is then validated on the server (even if you have client-side validation you need server-side validation in case the user has JavaScript turned off or is using a plug-in to play with their form submission values). If it is invalid, the form is redisplayed, pre-populated with the content I had entered along with an error message. If it is valid, the new article is saved to the database and I’m redirected to a new screen (perhaps an article list screen) along with a success message.

Anyone who has written a bunch of web apps has probably written such a system a hundred times over. The question is, how do you want to architect this? If you’re implementing an MVC architecture, your controller is going to want to take the form scoped variables and pass them (probably passing the form scope containing all of the variables) to your model. But what should your model do?

At one extreme you call ArticleService – an application scoped Singleton. It has a save method which you pass a struct containing a collection of values including the user input. So you call variables.ArticleService.save(form) in your controller (assuming you’ve auto-wired ArticleService using ColdSpring or LightWire). ArticleService.save() then validates the values and passes them to a “DAO” that saves the values to a database (I put quotes around DAO as part of the definition of a DAO is that you pass it a loaded object, not a set of properties).

At the other extreme, your controller somehow gets a transient Article business object (either directly from an object factory or via an ArticleService.getArticle() method), loops through all of the Article setters or uses an Article.load() method and lets the Article business object take things from there.

When you break it down there are quite a few design decisions to consider here. The first is whether you need an ArticleService at all.

Do we NEED Service Methods?
Service methods aren’t necessary for building web applications. Go look at most Smalltalk apps or the majority of Ruby and Python apps and you’ll see that there are whole populations of web programmers delivering well-designed Object Oriented web applications without using service classes at all. So, clearly we don’t NEED service classes to write good OO apps.

Typically the "right" OO approach is to look at collections of one domain object as another domain object. For instance, in the latest Head First Object Oriented Analysis and Design book (great book, but learn the basics of OO programming first – they suggest Head First Java as a good starting point), they have the example of a Guitar object in a guitar store. They model the collections of guitars using an Inventory domain object, which can return collections of guitars, etc.

The other possible approach is to simply call a transient business object (which does raise the question of where you get it from – there are a number of interesting possible answers to this) that represents the domain object – either singularly or in collections.

However, if you look in the Java world you will see large numbers of enterprise applications using service layers, so there is nothing wrong with service classes – as long as you don’t go overboard with them. I personally find the concept of an ArticleService a useful way of interacting with collections of articles and encapsulating a number of article responsibilities that I don’t love putting in the Article object, but I freely admit that as I get more OO experience (especially from the Smalltalk infected – the Kent Becks and Martin Fowlers and the like) I may change my mind on this.

Loading the Business Object
The next design decision is how to load the business object. We have a struct (the form scope) containing a number of properties some of which need to be loaded into the business object. This can be handled by the Controller, the ArticleService or the Article business object. There is no one right answer, but there are some interesting implications of the different choices.

The Controller could certainly load the business object calling the necessary setter methods, but how would the controller do this? It either needs to know the list of properties to be loaded, it needs to ask the Article business object for properties to be loaded or it needs to use reflection to load all properties that match.

Asking the Article what properties to load isn’t really a great approach based on the “tell don’t ask” principle. If you don’t know what properties to load, just pass them all to the object and tell it to go load the right properties itself. Reflection works in one of two ways: you can either loop through all of the Article methods looking for ones beginning with “set” and compare them to the names of the properties in the struct (matching ArticleTitle to setTitle() and so on), or loop through all of the properties in the struct and look for the matching setter using a StructKeyExists() check on the Article methods.

The other approach is to provide the Controller with a list of properties to set for the Article. At first this doesn’t seem very DRY, as you’re repeating a subset of the settable property list for the Article business object in both the Controller and the Article, but it does provide an added benefit: you can limit the properties that a given form can edit. Unless the Controller KNOWS that this particular form only allows you to edit ArticleTitle and ArticleDescription, a hacker could add a form.ArticleStatus field and use an edit form to automatically publish a new article by setting its status to approved – skipping the usual approval process. Because of this I DO actually tell each form controller the list of object attributes it is allowed to edit. It is a little less DRY, but a little more robust.
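To make the whitelist idea concrete, here is a minimal sketch. It is written in Python rather than ColdFusion purely for illustration, and the names EDITABLE_FIELDS and filter_submitted_fields are hypothetical, not part of any framework mentioned here. It assumes the form arrives as a simple dictionary of field names to values.

```python
# Hypothetical per-form whitelist; the field names come from the example above.
EDITABLE_FIELDS = {"ArticleTitle", "ArticleDescription"}


def filter_submitted_fields(form, allowed):
    """Return only the submitted values that this form is allowed to edit."""
    return {name: value for name, value in form.items() if name in allowed}


# A tampered submission that tries to sneak in a status change.
submitted = {
    "ArticleTitle": "Validation approaches",
    "ArticleDescription": "Notes on validating business objects",
    "ArticleStatus": "approved",
}

# Only the whitelisted fields survive; ArticleStatus is silently dropped.
safe_values = filter_submitted_fields(submitted, EDITABLE_FIELDS)
```

Whether the dropped field should be ignored silently or logged as a possible tampering attempt is a separate design choice that this sketch leaves open.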

Of course, just because we want the Controller to limit the properties to be edited doesn’t mean it has to load the object. It could quite easily loop through the list of attributes to save, pull each out of the form scope and then pass them (individually or within a struct/hash table) to either ArticleService.save(PropertyStruct) or even Article.load(PropertyStruct). So, who should load the bean?

On the whole I like to look at the service layer as an API to the model – for both HTML controllers and other consumers like web services, Flex and AJAX. So I tend to prefer the controller just calling ArticleService.save(). I have an ArticleService singleton and it seems to work out quite well for me. Anyone else doing anything different here?

The question then is how does the ArticleService go about saving? Personally I am moving towards it just calling THIS.getNewArticle() and then looping through the passed struct of properties, calling each of the associated setters so any business logic for the setters can be nicely encapsulated in the business object (I use my IBO with generic getters and setters so this doesn’t involve too much extra typing when creating business objects with lots of behaviorless setters).
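As a rough illustration of that flow (again in Python rather than ColdFusion, and not the actual IBO code), the service might look something like the sketch below. The Article class, its two properties and the set_ naming convention are all assumptions made for the example.

```python
class Article:
    """Hypothetical transient business object with plain setters."""

    def __init__(self):
        self.title = None
        self.description = None

    def set_title(self, value):
        # Any business logic for a property lives in its setter.
        self.title = value.strip()

    def set_description(self, value):
        self.description = value


class ArticleService:
    """Hypothetical service acting as the API to the model."""

    def get_new_article(self):
        return Article()

    def save(self, properties):
        article = self.get_new_article()
        for name, value in properties.items():
            # Look for a matching setter; properties without one are skipped.
            setter = getattr(article, "set_" + name, None)
            if callable(setter):
                setter(value)
        # Validation and persistence would follow here; both are omitted.
        return article
```

Because the loop only calls setters that exist, a stray property in the struct is ignored rather than blindly written onto the object.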

Who Validates What?
The next question is the key to Jared’s original suggestion in the past that maybe business objects shouldn’t have a validate method because they should never BE invalid. The question is what is the purpose of a setter? Is it to validate an input and return a validation error if the value is invalid, or is it to set a value (which may or may not be valid) for the object’s validation routine to later determine whether it is valid or not? FYI, there is an interesting old thread around this topic at TSS.

The first point I’ll make is that however we answer this question, we still need an Article.isValid() method. Why? Because a set of attributes can individually be valid but as a collection be invalid – 10004 is a valid ZIP code – but not if you live in Canada. This actually shows that an isValid() method IS required by business objects in the general case even if you implement setter based validation.
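Here is a small sketch of that kind of object-level rule, in Python for illustration only. The Address class, the two-letter country codes and the simplified regular expressions are assumptions for the example, not real-world postal validation.

```python
import re

# Deliberately simplified formats, for illustration only.
US_ZIP = re.compile(r"^\d{5}$")
CA_POSTAL = re.compile(r"^[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d$")


class Address:
    """Hypothetical business object holding two related attributes."""

    def __init__(self, country, postal_code):
        self.country = country
        self.postal_code = postal_code

    def is_valid(self):
        # Object-level rule: the acceptable postal code depends on the country,
        # so neither attribute can be judged entirely on its own.
        if self.country == "US":
            return bool(US_ZIP.match(self.postal_code))
        if self.country == "CA":
            return bool(CA_POSTAL.match(self.postal_code))
        return False
```

With these assumptions, Address("US", "10004").is_valid() is true while Address("CA", "10004").is_valid() is false, even though "10004" on its own is a well-formed ZIP code.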

I don’t have “the answer” to this question. In practice I like to use dumb setters to capture the input (valid or otherwise) and then separate validation to validate the values. This works particularly well for me as I use custom data types with pre-defined validations associated with them so in practice I never have a need for custom setters (I define my attribute level validations at a custom data type level so I don’t have to keep writing a ZIP code validation script for all of my objects that contain ZIP codes). It also works well as it allows me to use this same business object to populate the data values in the form if I have to redisplay it so I have the option of showing my users the invalid data they typed without having to write any extra code to do that.
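The in-house framework itself isn't shown here, so the following is only a guess at the general shape of the idea, sketched in Python. DATA_TYPES, BusinessObject and Subscriber are hypothetical names, and the two validators are deliberately simplistic.

```python
import re

# Hypothetical registry of reusable data types, each with one validation rule.
DATA_TYPES = {
    "zip": lambda value: re.fullmatch(r"\d{5}", value) is not None,
    "required_text": lambda value: value.strip() != "",
}


class BusinessObject:
    """Base object with generic 'dumb' getters and setters."""

    # Maps attribute name to data type name; subclasses declare this.
    attribute_types = {}

    def __init__(self):
        self.values = {}

    def set(self, name, value):
        # Dumb setter: store whatever was typed, valid or not.
        self.values[name] = value

    def get(self, name):
        return self.values.get(name, "")

    def validate(self):
        """Return a dict of attribute name to error message (empty if valid)."""
        errors = {}
        for name, type_name in self.attribute_types.items():
            if not DATA_TYPES[type_name](self.get(name)):
                errors[name] = "Invalid " + type_name
        return errors


class Subscriber(BusinessObject):
    attribute_types = {"name": "required_text", "zip": "zip"}
```

A Subscriber given the ZIP "1000A" would report an error for that attribute from validate(), while get("zip") still returns "1000A", so the form can be redisplayed with exactly what the user typed.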

OK, this posting is getting way too long, but I look forward to learning what other people think about this and how they architect such solutions.

Comments
"The first point I’ll make is that however we answer this question, we still need an Article.isValid() method. Why? Because a set of attributes can individually be valid but as a collection be invalid – 10004 is a valid ZIP code – but not if you live in Canada. This actually shows that an isValid() method IS required by business objects in the general case even if you implement setter based validation."

That depends entirely on the extent to which your validation validates... what if your setZip() method actually takes care of validating not just the fact that the string is in the right format, but that it's appropriate for the context in which the value is being used? If your setter makes sure that not only is the format and length and so on correct but that it's an appropriate zip code for the rest of the address, you don't need any other validation.

That's my whole point... if we're going to provide rich and valuable functionality in our objects, then shouldn't they do the whole job? Delegation can come into play here, with a setter actually asking a foreign entity to make sure that the zip code is good (why can't setPostalCode() actually call a singleton that handles making sure a zip code is valid??)

I don't know how I architect these systems, because I've never taken this approach to application architecture before... but it's got my juices flowing and I'm excited to see exactly what comes from it. I'm hoping to leave the "dumb bean" in the past and really architect my applications well in the future. We shall see what comes of this...

J
# Posted By Jared Rypka-Hauer | 12/29/06 1:30 AM
Hi Jared,

In my experience it is easier to have two levels of validation - attribute specific and object level. If you don't, you get into a world of hurt where you need to load up the country before you can load the ZIP and now your controller or service needs to know the right ORDER to load the attributes in. For complex beans I find this just makes things too fragile when there are lots of complex dependencies. Also, sometimes it is not clear which attribute to put certain validations in. Let's say the sum of attribute 1, attribute 2 and attribute 3 must be between 7 and 12. Who the heck is responsible for that rule?

In practice I've found distinguishing between attribute and object validations ends up providing more elegant solutions, but it is always worth questioning/playing with everything - you never know what you might come up with!
# Posted By Peter Bell | 12/29/06 1:43 AM
"However, if you look in the Java world you will see large numbers of enterprise applications using service layers, so there is nothing wrong with service classes – as long as you don’t go overboard with them."

Typical Java web applications shouldn't be used as blueprint OO applications imho. What they do is enable a bunch of code monkeys to fill in procedural holes. Even if the design makes sense from an economic point of view (cheaper programmers and much more comprehensible than a bunch of JSP pages), it is not OO in the classical sense. (see Arthur J Riel "Object-oriented Design Heuristics" - "be alarmed if you see a lot of objects named ...Manager")

i love your discussion though.
# Posted By john miles | 12/29/06 6:54 AM
Hi John,

Glad you like the discussion!!! I agree 100% that Java isn't where you should look for the best OO code - it is Smalltalk - that is where most of the people who wrote the books on Java got their start. Ruby is the closest runner up. That said, pure OO is hard and slow. If you've read Evans on Domain Driven Design he keeps focusing on taking the time to really "get" the domain which is lovely if you have that much time. I've found a service layer to be a useful pragmatic approach to writing maintainable code. After all, a lot of huge enterprise apps have been written in Java and maintainability is more important than purity :-> Good point though!
# Posted By Peter Bell | 12/29/06 8:28 AM
"This works particularly well for me as I use custom data types with pre-defined validations associated with them so in practice I never have a need for custom setters (I define my attribute level validations at a custom data type level so I don’t have to keep writing a ZIP code validation script for all of my objects that contain ZIP codes). It also works well as it allows me to use this same business object to populate the data values in the form if I have to redisplay it so I have the option of showing my users the invalid data they typed without having to write any extra code to do that."

Peter - I'd be interested in seeing an example of this if you could throw one together?
# Posted By Brian | 7/26/07 1:10 PM
Hi Brian,

Unfortunately, this is pretty much at the center of my in-house framework so it depends on so many different pieces of code (in a good way) that there isn't an easy way to throw together an example - sorry!
# Posted By Peter Bell | 7/27/07 5:01 PM
Hi Peter. Good articles in your blog by the way and look forward to Scotch On The Rocks 2008!

You said...

"(I define my attribute level validations at a custom data type level so I don’t have to keep writing a ZIP code validation script for all of my objects that contain ZIP codes)."

I'm new to OO so just getting my head around it.

I am refactoring an application where I have back-end users with addresses, website subscribers with addresses and also companies with addresses. I'm leaning towards composition by composing my User, Subscriber and Company beans with an Address bean that validates itself.

Is this what you're getting at with your comment or are you suggesting a different approach (ie some Utilities or Misc UDF class that has email validators, zip code validators etc ??)

Alan
# Posted By Alan Livie | 9/4/07 8:46 AM
Hi Alan,

Glad you're enjoying! In general terms, the classic OO solution to the problem you're describing would indeed be to add an Address bean which would be capable of self validation and that's what I'd recommend doing. You can then decide when implementing the Address bean whether or not to call a separate utility library to do the heavy lifting of the validations or not. I find there are enough similarities in validation code that it makes sense to have a parameterizable Validation bean which objects like Address and User can call to implement their validation code, but to the outside world it should look like the Address and User both validate themselves, and if you don't need the added complexity of the validation bean, don't use it - you can always refactor later.
# Posted By Peter Bell | 9/4/07 9:19 AM