By Peter Bell

Working with Metadata

I think I finally realized on Monday night that my application generator was going well. I am doing a pretty big rewrite of the generator to fix a couple of design decisions I don't love, and I was trying to get a project done on the new code base. I had just completely changed my metadata model, so I had broken pretty much every line of code, and with time running out I realized that I was never going to get the project done before my all-day class on Tuesday. I took my metadata description for the project, entered it into the previous version of my framework, and within 15 minutes I had a rich admin system with roles and authentication up and working for a bunch of custom objects.

I feel like I've solved the core problem of being able to generate rich applications using metadata (actually, I interpret the metadata using a framework rather than generating code right now, but the outcome is pretty much the same), but I don't feel like my system is flexible enough to handle big changes in the structure of my metadata or my approach to consuming it. The application knows much more than I'd like it to about how my meta model is structured and that's something I really want to fix . . .

One of the biggest choices you have to make when metaprogramming is how to store, handle and interact with your metadata. There are a number of approaches. In my case I need a pretty flexible solution. I could get metadata programmatically (say through db or class file introspection), via XML files, as user input from a form-based content management system, via a tab-delimited file, etc. I could also potentially need to merge metadata from multiple sources so that (for example) a user can take db-introspected metadata and extend it with custom annotations in such a way that re-introspecting the database wouldn't destroy any of the extended metadata.
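One way to picture the merge requirement is to keep the introspected metadata and the hand-written annotations in separate stores and combine them on demand, so that re-running introspection only ever replaces the first store. The following is a minimal Python sketch under that assumption; the function name, property names and attribute keys are all made up for illustration and are not taken from the framework.

```python
from copy import deepcopy

def merge_metadata(introspected, annotations):
    """Layer hand-written annotations over introspected metadata.

    Both arguments map a property name to a dict of attributes.
    Neither input is modified; a new merged dict is returned.
    """
    merged = deepcopy(introspected)
    for prop_name, extra in annotations.items():
        # Annotations win on conflict; properties unknown to the
        # introspection pass are simply added.
        merged.setdefault(prop_name, {}).update(deepcopy(extra))
    return merged

# Hypothetical result of a database introspection pass.
introspected = {
    "first_name": {"sql_type": "varchar", "size": 50},
    "birth_date": {"sql_type": "date"},
}
# Hypothetical annotations, maintained by hand and stored separately.
annotations = {
    "first_name": {"title": "First Name"},
    "birth_date": {"title": "Date of Birth", "custom_type": "date_picker"},
}
metadata = merge_metadata(introspected, annotations)

# Re-introspecting replaces only the introspected layer;
# the annotations are applied again on top of it.
reintrospected = {
    "first_name": {"sql_type": "varchar", "size": 100},
    "birth_date": {"sql_type": "date"},
}
refreshed = merge_metadata(reintrospected, annotations)
```

Because the annotations are never written back into the introspected layer, a column change picked up by re-introspection (the size going from 50 to 100 here) shows up in the merged result without losing the custom titles.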

I don't know if it is meaningful, useful or possible to solve this for the general case for any grammar, so I'm going to start by looking at a way to solve the problem for the fairly constrained classes of grammars that I find myself working with.

When I describe how an application should work, there are features which have attributes (name, title, package, etc.) and a collection of actions (a feature is a controller and an action is analogous to a method within the controller) each of which has its own type of attributes (a list action obviously requires different parameters than a form process action).

When I describe the business model, I have business objects which also have attributes (name, title, db table name, ID property name, etc.) and which have collections of class methods (deleteExpiredCarts(), validateUser(), etc.), instance methods (getAge(), getFullName(), etc.), relationships and properties. Class methods are simple elements with attributes. I don't model instance methods (I just hard-code them in the beans, as I don't have a good DSL for describing the kind of imperative coding they require and am not sure it would make sense to develop one). Relationships are also simple elements with attributes.

So, to this point we have the idea of a base element which has n-attributes and contains one or more collections of elements which also have n-attributes. In the case of properties, things get a little more interesting. All properties have a certain set of attributes (title, name, column name, custom data type, SQL data type, etc.), but each custom data type can also have a collection of attributes that are unique to that custom data type. A custom data type for a simple string is probably going to include optional attributes for overriding the size of the text box used to display it, whereas a custom data type that includes a drop-down for displaying a list of values is going to need attributes to let you customize whether or not it is a multiple select and, if so, what the size is, what the value list is, and so on.
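To make the shape concrete, here is a rough Python sketch of a base element with free-form attributes and named collections of child elements. The class name, field names and the sample attribute keys are assumptions chosen for illustration, not the framework's actual structures.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """A base element: n attributes plus named collections of child elements."""
    kind: str
    attributes: dict = field(default_factory=dict)
    collections: dict = field(default_factory=dict)  # collection name -> list of Element

    def add(self, collection, child):
        """Append a child to the named collection, creating it if needed."""
        self.collections.setdefault(collection, []).append(child)
        return child

# A hypothetical business object with two properties and one class method.
user = Element("business_object", {"name": "User", "table": "tbl_user"})
user.add("properties", Element("property", {
    "name": "first_name", "custom_type": "string", "size": 30,
}))
user.add("properties", Element("property", {
    "name": "roles", "custom_type": "select",
    "multiple": True, "value_list": "admin,editor",
}))
user.add("class_methods", Element("class_method", {"name": "validateUser"}))
```

Note that the two properties share some attribute keys (name, custom_type) and differ in others (size versus multiple and value_list), which is exactly the per-data-type variation described above.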

There are two approaches to handling this. From an XML perspective, the "right" solution would be to create a set of sub-elements that can be nested within the Property element (one for each custom data type). The benefit of this approach is that a Schema could be written to validate all of the attributes of each of the different elements, which would let you use any XML tool to richly validate the metadata and would provide code hinting and the like were you to want to manually create or edit the XML. The weakness is that it requires a more flexible grammar, which makes generalized processing of and interaction with the grammar more difficult. The other approach would be to just allow a knowledgeable user to put whatever additional attributes they needed into each property tag. I like the simplicity of that grammar, but I already know as I type this that it is a non-starter, so at the very least my constrained grammar is going to have to support the concept of an element having attributes plus n-collections of elements, with those elements supporting both attributes and probably n-collections of elements within them. It will also have to support the idea of links between models (the list of possible objects for a list action would have to pull the list of object titles from the business object model).
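Whichever grammar is chosen, the check a Schema would perform boils down to "which attributes are legal for which custom data type". As a hedged illustration of that check done in code rather than in a Schema, the Python sketch below uses a hypothetical rule table; the type names and attribute names are assumptions based on the examples in the text.

```python
# Hypothetical rules: attributes every property shares,
# plus the extra attributes each custom data type allows.
COMMON_ATTRIBUTES = {"name", "title", "column_name", "custom_type", "sql_type"}
TYPE_ATTRIBUTES = {
    "string": {"size"},
    "select": {"multiple", "size", "value_list"},
}

def validate_property(attributes):
    """Return a list of problems found in one property's attributes."""
    custom_type = attributes.get("custom_type")
    if custom_type not in TYPE_ATTRIBUTES:
        return [f"unknown custom_type: {custom_type!r}"]
    allowed = COMMON_ATTRIBUTES | TYPE_ATTRIBUTES[custom_type]
    return [
        f"attribute {key!r} is not valid for custom_type {custom_type!r}"
        for key in sorted(set(attributes) - allowed)
    ]

ok = validate_property({"name": "roles", "custom_type": "select", "multiple": True})
bad = validate_property({"name": "first_name", "custom_type": "string", "value_list": "a,b"})
```

Here `ok` comes back empty, while `bad` reports that `value_list` does not belong on a string property. The trade-off is the one described above: the grammar stays flat, but the validation rules live in code instead of in something a generic XML tool can read.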

I have already made some simplifications compared to (for instance) the set of valid grammars in an XML document. I have removed the distinction between XML text and attributes. This is something I can live with, as for my use case I can quite happily express an element's text content as an attribute value instead. I get the distinction between the two in the XML world, but it is not one that I need for my use case.
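As an illustration of that simplification, the following Python sketch uses the standard library's xml.etree.ElementTree to fold element text into an attribute. The tag names and the attribute name "text" are hypothetical; they are not the ones used in the actual metadata files.

```python
import xml.etree.ElementTree as ET

def text_to_attribute(element, attribute_name="text"):
    """Recursively move non-blank element text into an attribute."""
    text = (element.text or "").strip()
    if text:
        element.set(attribute_name, text)
    element.text = None
    for child in element:
        child.tail = None  # drop whitespace between sibling elements
        text_to_attribute(child, attribute_name)
    return element

# Hypothetical input that stores a help message as element text.
root = ET.fromstring(
    '<property name="bio"><help>Shown on the profile page</help></property>'
)
text_to_attribute(root)
print(ET.tostring(root, encoding="unicode"))
```

After the call, the help element carries its message in a `text` attribute and has no text content of its own, so every element in the tree can be treated uniformly as "a name plus attributes".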

So, the simplest collection of grammars that will allow me to appropriately express my applications is n-models, where each model is composed of a collection of one or more elements, each of which can have attributes and n-collections of elements (where each collection supports n-elements representing types of things: has-one and has-many could be types of relationships, persisted and calculated could be types of properties, and so on), which can also have attributes and n-collections of sub-elements.
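A small in-memory picture of that grammar might look like the Python sketch below: a dictionary of models, each holding named elements, each element holding attributes and typed collections. It also follows one link between models, using the list action example mentioned earlier. The layout, key names and sample objects are assumptions for illustration only.

```python
# Hypothetical layout: model name -> element name -> element.
models = {
    "business": {
        "User": {
            "attributes": {"title": "Site Users"},
            "collections": {
                "relationships": [
                    {"type": "has-many", "attributes": {"name": "carts", "object": "Cart"}},
                    {"type": "has-one", "attributes": {"name": "profile", "object": "Profile"}},
                ],
                "properties": [
                    {"type": "persisted", "attributes": {"name": "birth_date"}},
                    {"type": "calculated", "attributes": {"name": "age"}},
                ],
            },
        },
        "Cart": {"attributes": {"title": "Shopping Carts"}, "collections": {}},
    },
    "features": {
        "UserAdmin": {
            "attributes": {"title": "Manage Users"},
            "collections": {
                "actions": [{"type": "list", "attributes": {"object": "User"}}],
            },
        },
    },
}

def elements_of_type(element, collection, type_name):
    """Return the members of one collection that have the given type."""
    members = element["collections"].get(collection, [])
    return [member for member in members if member["type"] == type_name]

def linked_title(all_models, action):
    """Follow an action's 'object' attribute into the business model."""
    target = all_models["business"][action["attributes"]["object"]]
    return target["attributes"]["title"]

user = models["business"]["User"]
calculated = elements_of_type(user, "properties", "calculated")
list_action = models["features"]["UserAdmin"]["collections"]["actions"][0]
title = linked_title(models, list_action)
```

The type lives on each member of a collection rather than on the collection itself, so has-one and has-many relationships can sit side by side and still be filtered apart when needed.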

I'll obviously need to decide whether to support just three levels of depth or n-levels (if supporting n levels using recursion is just as easy, it might be nice to have the flexibility; we'll see). Right now, though, I'm a lot more interested in how I'll store and work with this data, and I think that should be another post.
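On the three-levels-versus-n-levels question, a recursive traversal is one reason n levels may cost little extra: the same few lines handle any depth. The Python sketch below assumes a hypothetical dictionary layout where each element may carry a "collections" key; none of the names come from the framework.

```python
def walk(element, depth=0):
    """Yield (depth, element) for an element and everything nested beneath it."""
    yield depth, element
    for members in element.get("collections", {}).values():
        for child in members:
            yield from walk(child, depth + 1)

def max_depth(element):
    """Return the depth of the deepest nested element (the root is depth 0)."""
    return max(depth for depth, _ in walk(element))

# A hypothetical feature with actions, one of which has its own parameters.
feature = {
    "name": "UserAdmin",
    "collections": {
        "actions": [
            {"name": "list", "collections": {"parameters": [{"name": "page_size"}]}},
            {"name": "edit"},
        ],
    },
}

names = [element["name"] for _, element in walk(feature)]
deepest = max_depth(feature)
```

A fixed three-level design would need one loop per level; with recursion, a consumer that only cares about three levels can simply ignore anything deeper.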
