Metadata for Code Generation
This posting looks at different approaches to working with metadata . . .
Sources of Metadata
The first question is where your metadata will come from. You might get a quick win by introspecting a database schema or your class files to generate metadata, but generally if you want to generate a substantial part of your application, you're going to want to make the models first-class citizens and edit them directly.
As such, the model is where you'll be doing most of your "programming", whether by creating XML files, using a content management system or using a graphical editor like those in oAW (Eclipse EMF based), MetaEdit+, Microsoft's DSL Tools, or (if you're a real masochist) UML, which generates XMI (based on the rather ponderous MOF meta-grammar).
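To make the "quick win" concrete, here is a minimal sketch of introspecting a database schema into plain metadata. It uses Python and SQLite purely for illustration; the `introspect_schema` function, the `user` table and the dictionary layout are all assumptions made for this example, not anything CFGen provides.

```python
import sqlite3


def introspect_schema(conn):
    """Build a simple metadata dict: {table_name: [column descriptions]}."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        if table.startswith("sqlite_"):
            continue  # skip SQLite's own internal tables
        columns = []
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            columns.append(
                {
                    "name": name,
                    "type": col_type,
                    "required": bool(notnull),
                    "primary_key": bool(pk),
                }
            )
        model[table] = columns
    return model


# Hypothetical schema, created in memory just for the example
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT NOT NULL, nickname TEXT)"
)
metadata = introspect_schema(conn)
```

This gets you column names, types and nullability for free, but nothing richer (validation rules, relationships you care about, UI hints), which is why a hand-edited model tends to win for anything substantial.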
Concrete Syntax - Editing
So, the source of metadata is probably going to be you. The question then is what you're going to be editing. You could use a text editor for editing a freeform language (if so, check out Xtext in oAW for generating Eclipse plug-ins for editing your DSLs with constraint checking, syntax highlighting and code completion; it's pretty cool). You might also use a graphical editor that'll either spit out some form of XML or that'll have tooling for generating code (which you can either use for true code generation or just to serialize the model to XML as an input to a separate code generator). You could also use some kind of content management system for managing the metadata. Or of course, you could just define an XSD for your language and use an XML editor to edit your DSL.
Concrete Syntax - Storage
The concept of "projections" is really important. There is no necessity to use the same concrete syntax to edit, store and process your DSLs. For example, you might edit your DSL using a visual editor, store it in a database, and then process it as an XML file or an in-memory object model representation. So remember that the projections you choose for editing, reporting, storage, processing and transformations can be completely different as long as you can automatically transform between the various representations of your underlying statements.
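As a rough illustration of projections, the sketch below keeps one underlying model and moves it between three representations: an in-memory object model, an XML string, and flat rows of the sort you might insert into a database table. It is written in Python only for brevity; the `Entity` and `Property` classes, the XML element names and the helper functions are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Property:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    properties: list = field(default_factory=list)


def to_xml(entity):
    """Project the object model into an XML string."""
    root = ET.Element("entity", name=entity.name)
    for prop in entity.properties:
        ET.SubElement(root, "property", name=prop.name, type=prop.type)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Project an XML string back into the object model."""
    root = ET.fromstring(text)
    props = [Property(p.get("name"), p.get("type")) for p in root.findall("property")]
    return Entity(root.get("name"), props)


def to_rows(entity):
    """Project the object model into flat (entity, property, type) rows."""
    return [(entity.name, p.name, p.type) for p in entity.properties]


user = Entity("User", [Property("email", "string"), Property("age", "integer")])
xml_text = to_xml(user)
round_tripped = from_xml(xml_text)
rows = to_rows(round_tripped)
```

The point is the round trip: as long as `from_xml(to_xml(model))` gives back an equal model, it doesn't matter which projection you happen to be editing or storing.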
In the case of storage, if you have large numbers of statements, a database can make for efficient re-use of the statements. Otherwise, XML is often an easy structure for storing metadata: it can be checked in to source control, transformed using XSLT, and it's widely readable and processable.
Concrete Syntax - Processing
Once you've decided on syntaxes for editing and storing your model, the next question is what you're going to pass into your generator for processing. For some simple use cases, you might just query a database and use cfoutput tags in CFGen to provide the necessary metadata. Most often, though, you're going to want to be able to walk your model graph to some extent, so you'll want to pass in either an XML representation of your model or an object model made up of objects that represent the model.
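Here is a minimal sketch of what "walking the model graph" can look like when the generator is handed XML. It is Python rather than ColdFusion, and the model format (`entity`, `property`, `hasMany`), the `generate_stubs` function and the output shape are all invented for the example; this is not how CFGen works.

```python
import xml.etree.ElementTree as ET

# Hypothetical model: two entities and one relationship between them
MODEL_XML = """
<model>
  <entity name="User">
    <property name="email" type="string"/>
    <hasMany entity="Order"/>
  </entity>
  <entity name="Order">
    <property name="total" type="decimal"/>
  </entity>
</model>
"""


def generate_stubs(model_xml):
    """Walk every entity, its properties and its relationships, emitting stub text."""
    root = ET.fromstring(model_xml)
    # Index entities by name so relationship targets can be checked during the walk
    entities = {e.get("name"): e for e in root.findall("entity")}
    lines = []
    for name, entity in entities.items():
        lines.append(f"class {name}:")
        for prop in entity.findall("property"):
            lines.append(f"    {prop.get('name')}: {prop.get('type')}")
        for rel in entity.findall("hasMany"):
            target = rel.get("entity")
            if target not in entities:
                raise ValueError(f"{name} refers to unknown entity {target}")
            lines.append(f"    {target.lower()}s: list[{target}]")
        lines.append("")
    return "\n".join(lines)


output = generate_stubs(MODEL_XML)
```

A flat query result is fine for the properties of one entity, but the `hasMany` step is where you need the graph: the generator has to follow a reference from one entity to another, which is much easier against XML or an object model.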
What about CFGen?
So, the question then is what kind of metadata CFGen should support. On the one hand, I want to make CFGen extremely flexible. On the other hand, I want to provide some tooling for getting people started quickly with common metadata sources (XML files and database queries) and provide both an object model and an XML representation of XML files automagically (using CF8's oMM()).
Any of the above making sense to people? What kind of representation do you use to store metadata?


