By Peter Bell

Generating 80% of your applications: Step 1 - Implementing Custom Data Types in ColdFusion

I love the idea of scaffolding. The more routine work that the computer can do for me (especially in terms of CRUD), the better. But the biggest complaint from people who have used scaffolding is that they usually have to replace it for production because the generated code is too naive.

If you look at most of the reasons you have to replace scaffolding, it comes down to this: you know that a field is a phone number or an age rather than a string or a date/time (which is all the database knows), and you typically want to apply more specialized form fields, display rules, transformations and validations. The solution is to improve scaffolding with Django-style custom data types rather than Ruby on Rails-style database metadata.

So, why not have a system that just allows you to describe custom data types in terms of their form fields, display rules, transformations and validations and then simply define all of your object properties using those custom data types? That is exactly what I have been doing for the last few years in my application generator, and I’m currently in the process of trying to upgrade the way this works to make it a little more OO.

There are five problems you need to solve to implement custom data types for HTML-based applications: form field display, formatted display, form to value transformations, data transformations and validations.

Form Field Display
Each data type needs to be associated with the appropriate HTML to display when you want to add or edit the field. Some will be a simple text box or another standard HTML form field. Others might be a drop-down JavaScript calendar or a WYSIWYG editor. Still others will be multiple form fields, such as a US phone number made up of three text boxes or a date/time with a calendar for the date and drop-downs for hours, minutes and AM/PM. There are plenty of arguments to be made for the “right” kind of fields to use for different data types, but imagine that, if you changed your mind, you only had to change one library “calendarHTML()” method to upgrade the calendar for all of your date/times across all of your applications! (You could of course have multiple types of date/times, each using different displays or validation rules.)
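As a rough sketch of the idea (in Python rather than CFML, purely for illustration), form field display can be a small library of rendering functions keyed by data type. All of the names here (text_box_html, us_phone_html, FORM_FIELDS, form_field_html) are hypothetical and not taken from my generator.

```python
from html import escape


def text_box_html(name, value="", size=20):
    """Render one text box. Changing this changes every text input."""
    return (f'<input type="text" name="{escape(name)}" '
            f'value="{escape(value)}" size="{size}">')


def us_phone_html(name, value=""):
    """Render a US phone number as three text boxes (area, prefix, line)."""
    digits = value.ljust(10)  # pad so the slices below work for empty values
    parts = [("area", digits[0:3], 3),
             ("prefix", digits[3:6], 3),
             ("line", digits[6:10], 4)]
    return " ".join(text_box_html(f"{name}_{suffix}", part.strip(), size)
                    for suffix, part, size in parts)


# Hypothetical registry: data type name -> function that renders its field(s).
FORM_FIELDS = {"Text": text_box_html, "USPhone": us_phone_html}


def form_field_html(data_type, name, value=""):
    """Look up the renderer for a data type and produce its form HTML."""
    return FORM_FIELDS[data_type](name, value)
```

Swapping in a different phone widget would then mean editing us_phone_html once, with every form that uses the USPhone type picking up the change.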

Formatted Display
Often the way you want to store and manipulate a property and the way you want to display it are two different things. For example, I sometimes find it useful to keep a phone number as a string of digits but to display it formatted with brackets or hyphens or whatever. This is not a matter of HTML formatting, but rather of adding or removing characters (such as displaying a credit card number as X’s followed by the last four digits), and it is therefore presentation technology agnostic. It seems to me that custom data types should be capable of supporting one or even multiple formatted display options.
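To make that concrete, here is a minimal sketch of two formatted displays (in Python for illustration, not CFML). The function names and the "brackets"/"hyphens" style names are assumptions made up for the example.

```python
import re


def format_us_phone(digits, style="brackets"):
    """Format a stored 10-digit string for display; no HTML involved."""
    if not re.fullmatch(r"[0-9]{10}", digits):
        return digits  # leave anything unexpected untouched
    area, prefix, line = digits[:3], digits[3:6], digits[6:]
    if style == "hyphens":
        return f"{area}-{prefix}-{line}"
    return f"({area}) {prefix}-{line}"


def mask_card_number(number):
    """Show a card number as X's followed by its last four characters."""
    return "X" * max(len(number) - 4, 0) + number[-4:]
```

Because nothing here emits markup, the same functions could feed an HTML page, a Flex client or a web service consumer.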

Form to Value Transformations
Anyone sick of concatenating three single phone number text boxes into a single string manually? What about catching check boxes that don’t exist? What about concatenating date and time values into a single well formatted date/time? Wouldn’t it be great if each custom data type could handle turning its form fields into a single value which could then be transformed and validated based on business rules?
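A minimal sketch of what those form to value transformations might look like, written in Python for illustration and assuming the submitted form arrives as a plain dictionary. The field naming scheme (name_area, name_date and so on) and the YYYY-MM-DD date format are assumptions for the example only.

```python
from datetime import datetime


def phone_from_form(form, name):
    """Join the three phone text boxes into one string."""
    return "".join(form.get(f"{name}_{part}", "").strip()
                   for part in ("area", "prefix", "line"))


def checkbox_from_form(form, name):
    """An unchecked box is simply absent from the form, so test for presence."""
    return name in form


def datetime_from_form(form, name):
    """Combine a date field and hour/minute/AM-PM drop-downs into a datetime."""
    text = "{} {}:{} {}".format(form[f"{name}_date"], form[f"{name}_hour"],
                                form[f"{name}_minute"], form[f"{name}_ampm"])
    return datetime.strptime(text, "%Y-%m-%d %I:%M %p")
```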

Data Transformations
Form to value transformations are potentially display technology specific. Data transformations are not. I often like to be able to transform numerics or bits or phone numbers using regular expressions to remove any invalid characters. Such transformations are display technology agnostic.
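For example, two such transformations might look like this (a Python sketch for illustration; the function names and the list of values treated as "true" are assumptions):

```python
import re


def digits_only(value):
    """Strip everything except the digits 0-9, e.g. for phone numbers."""
    return re.sub(r"[^0-9]", "", value)


def to_bit(value):
    """Normalize common truthy inputs to 1 and everything else to 0."""
    return 1 if str(value).strip().lower() in {"1", "true", "yes", "on"} else 0
```

Neither function knows or cares whether the value came from an HTML form, a Flex client or an import script.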

Validations
While it is not the only type of validation required, it is important for custom data types to handle property specific validations. For instance, age needs to be not only well formatted but also within a reasonable range (or even within a specific age range if minors cannot access your site or if your insurance offering is for the under 35s only).
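A minimal sketch of the age example, in Python for illustration. The default range of 0 to 120 and the error message wording are assumptions; a real data type would presumably make both configurable.

```python
def validate_age(value, minimum=0, maximum=120):
    """Return a list of error messages; an empty list means the age is valid."""
    text = str(value).strip()
    # Well-formedness first: only plain ASCII digits count as a whole number.
    if not (text.isascii() and text.isdigit()):
        return ["Age must be a whole number."]
    age = int(text)
    # Then the business rule: the age has to fall within the allowed range.
    if age < minimum or age > maximum:
        return [f"Age must be between {minimum} and {maximum}."]
    return []
```

The under 35s insurance offering would then just be validate_age(value, maximum=34), and a site that excludes minors would pass minimum=18.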

So, form field display, formatted display, form to value transformations, data transformations and validations are the five types of code that can be profitably associated with custom data types. But where should that code go?

What Goes Where?
One of the things that makes the implementation of custom data types problematic is that they are a concern that cuts across Model, View and Controller. I find the easiest way of untangling this is to imagine having both an HTML and a Flex front end (which requires a leap of imagination as I’ve never even fired up the Flex IDE, so bear with me!).

Creating form field HTML is clearly an HTML view function that would be unnecessary in a Flex application, and it is used to generate HTML to display on the screen. So form field display is part of the View.

Formatted displays (as I describe them above – not including HTML formatting) are (to me) part of the model as they are display technology independent. You could argue that they are part of the view as they handle display formatting issues, but as I might want to centralize the ability to (say) format phone numbers with brackets before sending them to either a Flex app, a .NET web service consumer or an HTML page, I think it is a more pragmatic choice to allow this kind of formatting (no HTML allowed!) within the model.

Form to value transformations (as I am defining them) are again limited to HTML applications and seem to me to be part of the controller whose job is to handle and abstract the URL, form and (possibly) session scopes and to provide well parameterized calls to the model.

Data transformations and validations are again display technology and even controller agnostic and it seems to me that they should reside firmly within the model.

So, the next question then is how to architect these features as a maintainable, reusable, scalable (in terms of complexity) set of classes.

Approaches to the Architecture
One approach to the architecture would be to divide it functionally into classes handling each of the five concerns. Another approach would be to have a singleton for each custom data type. Neither approach is ideal. The singleton per custom data type is more OO, but means that HTML display code and model validations would all be in one place, so the classes wouldn’t be very cohesive. The functional division works better from a cohesiveness perspective, but leaves you with five different classes that need editing every time you want to add a new custom data type, which leaves a real possibility of forgetting to add one of the five methods a new custom data type requires.

Inheritance Preferred
One other thing to throw into the mix is that it would be great if the architecture supported inheritance. I can absolutely see a Date of Birth custom data type extending the Date custom data type so it can take advantage of the base date validations for well formedness while still being able to add its own validations for a subset of dates (a date of birth before say 1870 is probably invalid for applications relating to living people).
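Here is a small sketch of that Date of Birth example using ordinary class inheritance (Python for illustration; in CFML this would be the “extends” attribute on the component). The class names, the YYYY-MM-DD format and the error messages are assumptions.

```python
from datetime import date, datetime


class DateType:
    """Base date type: only checks that the value is a well-formed date."""

    def parse(self, value):
        return datetime.strptime(value, "%Y-%m-%d").date()

    def validate(self, value):
        try:
            self.parse(value)
        except ValueError:
            return ["Not a valid date (expected YYYY-MM-DD)."]
        return []


class DateOfBirthType(DateType):
    """Reuses the base well-formedness check, then adds its own range rule."""

    earliest = date(1870, 1, 1)

    def validate(self, value):
        errors = super().validate(value)
        if errors:
            return errors
        if self.parse(value) < self.earliest:
            return ["Date of birth is too far in the past."]
        return []
```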

This suggests that a “singleton per custom data type” might make sense as it would allow us to take advantage of the “extends” attribute rather than having to write our own pseudo-inheritance engine.

Model, View Controller . . .
I’m a big proponent of MVC architectures with a clean separation of these concerns, but custom data types have me stumped. They are fundamentally a concept that cuts across the Model (e.g. validations), View (e.g. form HTML display) and Controller (e.g. form to value) and sprinkling pieces of custom data types across three different areas doesn’t seem to be in the best interests of maintainability (which is the main underlying driver for OO design patterns).

. . . Or Something Different
As a provisional design choice, I’m going to add a fourth directory to my com (components) directory called “datatype”. It seems to me that the benefits of being able to encapsulate the complexity of custom data types (and thus substantially simplify the generation of model, view and controller code) outweigh the downsides of having to communicate a non-standard approach.

I am going to create one singleton per custom data type with an inheritance tree. To avoid having both validations and HTML display code in a single class file (which to me is extremely un-cohesive), I’m going to use composition, with each custom data type just making parameterized calls against a library (transformations, validations, etc.). That should improve reusability of algorithms, cut down on the size of the code base and make each class a little more cohesive, in its own way. So, I’ll have a library of reusable transformations, validations, form fields and the like that each custom data type can use to handle its various requirements.
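Roughly, the combination of inheritance and composition might look like the sketch below (Python for illustration, not the actual CFML). The library class names echo the ones I plan to use, but every method name and the 0 to 120 age range are hypothetical.

```python
import re


class DataTypeValidation:
    """Library of reusable, parameterized validation rules."""

    def is_integer(self, value):
        return re.fullmatch(r"-?[0-9]+", value) is not None

    def in_range(self, value, low, high):
        return low <= int(value) <= high


class DataTypeTransformation:
    """Library of reusable transformations."""

    def strip_non_numeric(self, value):
        return re.sub(r"[^0-9-]", "", value)


class IntType:
    """A data type holds no algorithms of its own; it calls the libraries."""

    def __init__(self, validation, transformation):
        self.validation = validation
        self.transformation = transformation

    def transform(self, value):
        return self.transformation.strip_non_numeric(value)

    def is_valid(self, value):
        return self.validation.is_integer(value)


class AgeType(IntType):
    """Inherits the integer check and adds a range rule via the library."""

    def is_valid(self, value):
        return super().is_valid(value) and self.validation.in_range(value, 0, 120)
```

Each data type class stays tiny: it only decides which library calls to make and with which parameters.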

I’m going to put all of the data type specific “libraries” into com.datatype.lib and that will leave me with a partial directory structure that might look something like:

.com.datatype.Text.cfc
.com.datatype.Int.cfc
.com.datatype.Age.cfc
.com.datatype.ContentArea.cfc
etc.
.com.datatype.lib.DataTypeValidation.cfc
.com.datatype.lib.DataTypeTransformation.cfc
.com.datatype.lib.DataTypeFormHTML.cfc
.com.datatype.lib.DataTypeDisplayFormat.cfc
.com.datatype.lib.DataTypeFormtoValue.cfc

* I know the “DataType” part of the names is redundant, but a few extra characters is a small price to pay (for me) to have completely unambiguous class names and to know exactly what class file I’m working on in Eclipse even when I can’t see the directory it is in.

I know I am going to end up with a lot of very small singletons here and I may just refactor this to a set of generated case statements within the DataTypeService, but if I did so that change would be contained within the DataTypeService, and I have a feeling that the large number of simple singletons might not be such a problem – especially as I get the ability to inherit which seems to fit the problem domain very well.

So, any thoughts? While I am reluctant to come up with this “fourth option”, I think that the benefit of encapsulating the concept of custom data types outweighs the additional documentation required for a non-standard approach. Just to mitigate the risk, I think I’m going to encapsulate the probable variations in how I implement this behind a DataTypeService singleton so you can call DataTypeService.IsValid("Age",ThisAge) and the like. That way any changes in the implementation will be abstracted from the rest of the application, and if this ends up being “one redirection too far” from a performance perspective, I can always simplify the architecture later on. In particular, once I get back to generating the OO code, I can always refactor the templates to make the system more performant if required.
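The service idea can be sketched like this (Python for illustration; the stand-in data type classes and the registry dictionary are assumptions, and only the IsValid-style call comes from the plan above):

```python
class AgeType:
    """Stand-in data type: a whole number between 0 and 120."""

    def is_valid(self, value):
        text = str(value)
        return text.isascii() and text.isdigit() and 0 <= int(text) <= 120


class TextType:
    """Stand-in data type: any string is acceptable."""

    def is_valid(self, value):
        return isinstance(value, str)


class DataTypeService:
    """Single entry point that hides how data types are implemented."""

    def __init__(self):
        # One shared instance per data type, looked up by name.
        self._types = {"Age": AgeType(), "Text": TextType()}

    def is_valid(self, type_name, value):
        return self._types[type_name].is_valid(value)
```

Callers only ever write service.is_valid("Age", this_age), so replacing the per-type objects with generated case statements later would not touch the rest of the application.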

What do you think of the approach?!

Comments
excellent post!

in the RIFE (http://rifers.org) community, we do something similar through constraints on the business objects; I made a post related to this in response to something Aral posted a few weeks ago (http://eokyere.blogspot.com/2006/10/as-dry-as-it-c...)

The idea is that you keep your business objects as clean as possible, and implement some framework-specific interfaces that let your framework have a way of defining useful meta-data (constraints) that can be propagated through the different layers of your system, and based on context be able to do certain things.

for instance, the RIFE constraints allow the framework to generate your db relations, generate your html forms, do content transformations, do validation and send meaningful customizable errors back to you (where any). with the exception of error message properties and db generation, everything else happens at runtime, so there's no throw-away (scaffolding) code to deal with.

once again, excellent post.

cheers,
-- eokyere :)
# Posted By Emmanuel Okyere | 11/1/06 9:30 AM
Hi Emmanuel,

Many thanks for the link - looks like a very interesting framework. Will definitely spend some time checking this out!
# Posted By Peter Bell | 11/2/06 10:23 AM