By Peter Bell

Playing with Stub Pages

A stub page is a simple file that sets some page-specific properties and then includes a front-controller-style index.cfm. So, if you go to about-us/history.html (with ColdFusion enabled to process .html files), the stub file might look something like:

<cfscript>
   Page.Title = "History";
   Page.Name = "history";
   Page.SectionTitle = "About Us";
   Page.SectionName = "about-us";
   Page.FilePath = "about-us/history";
   Page.Feature = "PageDisplay";
   // etc. . . .
</cfscript>
<cfinclude template="/index.cfm">

Originally, page stubs were just a way to avoid problems I was having getting URL rewriting to work on IIS back in 2001/2002, but now as I revisit them, I see them as a potentially interesting way of decoupling the process of generating URLs from the process of consuming them . . .

When you just want one page for each record in your database, there is no real reason to use page stubs (other than as a backup if you can't get ISAPI_Rewrite to work!). Just use URL rewriting to turn /a/b/c.html into index.cfm?filepath=a/b/c and then Page = PageService.getPagebyFilePath(filepath).
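
On the ColdFusion side, the rewriting approach might look something like the sketch below. It assumes the rewrite rule has already produced the filepath query parameter and that a PageService instance has been stored in the application scope at startup. The character check and the 404 response are illustrative additions, not part of the approach described above.

```cfm
<!--- index.cfm (sketch): resolve a rewritten URL to a page record --->

<!--- The rewrite rule is assumed to have turned /a/b/c.html into index.cfm?filepath=a/b/c --->
<cfparam name="URL.filepath" type="string" default="">

<!--- Illustrative guard: accept only letters, digits, slashes, underscores, dots and hyphens --->
<cfif NOT REFind("^[A-Za-z0-9/_.-]+$", URL.filepath)>
    <cfheader statuscode="404" statustext="Not Found">
    <cfabort>
</cfif>

<!--- application.PageService is an assumed location for the service object --->
<cfset Page = application.PageService.getPagebyFilePath(URL.filepath)>
```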

However, what happens if you want your catalog feature to be able to generate URLs for each category and product, or your events feature to be able to create URLs for every event? As you end up with n features, potentially written by a number of different developers, getting the URL rewriting to map each URL to the appropriate resource could get pretty hairy, and you'd need some kind of lookup for each object supporting filepath-based access. You could use a RESTful approach, but that is a technical convention that doesn't always meet users' need to have the new product available at /new-category/special-subcategory/new-product.name.html rather than just at products/new-product-name.html.

By using page stubs, all you do is write a page stub generator for each feature that requires one (that is, a feature with multiple pages that each need to appear as a different html file; if you only need query string parameters to access a resource in that feature, this is not relevant). The generators all extend a core stub generator, so general conventions like the file extension and the properties and formatting of a page stub are controlled in a single place (BasePageStubPublisher), making them easy to modify as the properties describing a page change over time. You define a page so that certain properties are set by the stub only if they haven't already been set via URL or form (so you can pass a Catalog feature a product ID and a screen type to use, for example). That way many URLs can easily call the same feature but with different default properties (so /new-category/special-subcategory/new-product.name.html becomes equivalent to index.cfm?page=catalog&action=productdetail&CategoryID=12&ProductID=712).
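
As a rough sketch of what a stub generator might do, the function below writes one stub file from a struct of default properties. It is shown as a plain function for brevity; in the design described above this logic would live in BasePageStubPublisher and be inherited by each feature's generator. The function name, its arguments, and the sample property names are all assumptions made for illustration, and the code assumes the target directory already exists.

```cfm
<!--- Sketch only: publishStub and its arguments are hypothetical names --->
<cffunction name="publishStub" returntype="void" output="false">
    <cfargument name="webRoot" type="string" required="true" hint="Absolute path ending in a path separator">
    <cfargument name="filePath" type="string" required="true" hint="For example about-us/history">
    <cfargument name="properties" type="struct" required="true" hint="Default Page properties for this stub">

    <cfset var key = "">
    <cfset var value = "">
    <cfset var lt = Chr(60)>
    <cfset var nl = Chr(10)>
    <!--- Tag names are built by concatenation as a precaution against parser confusion --->
    <cfset var stub = lt & "cfscript>" & nl>
    <!--- The shared convention for the file extension lives in this one place --->
    <cfset var target = arguments.webRoot & arguments.filePath & ".html">

    <cfloop collection="#arguments.properties#" item="key">
        <!--- Double any quotes and pound signs so the value is a valid CFML string literal --->
        <cfset value = Replace(arguments.properties[key], '"', '""', "all")>
        <cfset value = Replace(value, "##", "####", "all")>
        <cfset stub = stub & '   Page.' & key & ' = "' & value & '";' & nl>
    </cfloop>

    <cfset stub = stub & lt & "/cfscript>" & nl & lt & 'cfinclude template="/index.cfm">'>

    <!--- Assumes the directory portion of target already exists --->
    <cffile action="write" file="#target#" output="#stub#">
</cffunction>

<!--- Usage: publish a stub for one product page (sample values only) --->
<cfset props = StructNew()>
<cfset props["Feature"] = "Catalog">
<cfset props["Action"] = "productdetail">
<cfset props["CategoryID"] = "12">
<cfset props["ProductID"] = "712">
<cfset publishStub(ExpandPath("/"), "new-category/special-subcategory/new-product.name", props)>
```

Because every stub is produced by the same function, a change to the stub format only has to be made once, after which the stubs can be regenerated.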

There are definitely downsides to this approach, but I have a bunch of sites using the basic principle (I think I started doing this about five years ago) and I think this is powerful enough to be worth spending an hour or two writing some simple generators for core pages and the catalog feature to see how well it works out.

Thoughts?

Comments
Since you are already generating pages, or their stubs, why not just generate the full thing instead of doing the cfinclude? You could place only the display content you need on that page and not require the mother of all index.cfm controllers.
# Posted By Joshua Cyr | 2/21/07 5:26 PM
Too many variables. The page template could be different for different users, times of day, etc. The content in n content areas is dynamic based on independent sets of rules, and the subset of content displayed could be customized based upon permissions, etc. Having a front controller behind my page metaphor gives me all these nice whiz-bang variation points!
# Posted By Peter Bell | 2/21/07 5:58 PM
Peter -- So would both index.cfm?page=catalog&action=productdetail&CategoryID=12&ProductID=712 or /new-category/special-subcategory/new-product.name.html work as external links?
# Posted By Ron Alexander | 2/21/07 6:05 PM
Depends. If I put the index.cfm in the web root, then yes. If I wanted to lock it down, I'd put it below the web root and it would not be available. But yes, they would work identically as input (URL and form) variables just overload any default page variables.

What do you think? Good or bad :->
# Posted By Peter Bell | 2/21/07 6:27 PM
Obviously I could enhance this to limit the actions and settings available to any given page to avoid URL hacking, but the truth is that permissions are handled by the model anyway. So feel free to try to view /admin/user/admin/password.html (if you implemented a REST-inspired URI schema for, say, an admin feature), but it's not gonna give it to you unless you have the appropriate permissions passed to the model call.
# Posted By Peter Bell | 2/21/07 6:30 PM
For SEO and readability reasons I love this one: /new-category/special-subcategory/new-product.name.html

From my experiments and understanding, this type of URL, index.cfm?page=catalog&action=productdetail&CategoryID=12&ProductID=712, gets hated on by search engines. Plus it is just plain ugly.
# Posted By Ron Alexander | 2/21/07 6:37 PM
Right! The index.cfm?whatever was never going to be exposed in practice (my URL renderers would never create such a URL even if I decided for whatever reason to make it web accessible). Many of my clients need SE optimized URLs, so I needed some way to implement the nice URLs. I still go back and forth between page stubs and URL rewriting with a lookup table (for each URI create an entry with feature, action and properties).

Actually, URL rewriting and a big URI lookup table would be kind of cool. I would then be able to get the various feature publishers to publish to a database containing all of the possible URIs, although for large, highly trafficked sites, the page stubs would probably perform better.

Hmmmm. I think I'll start with page stub generator and eventually could provide both as options . . .
# Posted By Peter Bell | 2/21/07 6:47 PM
I would think then that your stub system should be flexible about whatever gets included in the end, perhaps by picking the right template. I just feel like you are locking yourself in very tight currently. Also, I think you are adding a lot of extra overhead by not serving the page directly. I understand you have a lot of dynamic content, but perhaps there is a hybrid approach that doesn't force all aspects of the page to be figured out and then displayed. Templates, for instance, that then get included?
# Posted By Joshua cyr | 2/21/07 7:04 PM
Hi Joshua,

I guess I see performance and caching as a secondary strategy. I'd rather have a very flexible front controller that can do anything and then add a bunch of caching optimizations (whether to disk or to memory - whether on the same server or to offload static requests) as a layer on top. Once I can do anything, it is fairly easy to do a subset of those things more efficiently.

It is a good point though. Initially the core of this is really just a way to provide a lot of flexibility in terms of URLs (which are the external, unchanging interface to the apps that are built). There is no question that I could provide different versions of the system with higher levels of static file caching. In fact that is a really good point. I'll look at this as the start of a static caching system that could be extended rather than seeing it as fundamentally limited to just setting some properties.

Really good point - many thanks (as always) for the input!

Coooool!
# Posted By Peter Bell | 2/21/07 7:15 PM
The more I listen, the smarter I (hopefully) get :->
# Posted By Peter Bell | 2/21/07 7:15 PM
Peter,

I have actually used this method very recently for a client who wanted search engine friendly URLs. Having an include makes it so much easier come maintenance time.
# Posted By Pragnesh | 2/21/07 7:41 PM
Hi Pragnesh,

Many thanks for the comment!
# Posted By Peter Bell | 2/21/07 8:43 PM
We're currently using page stubs as part of our Fusebox-based CMS and they work great. They're considerably more readable, and are easily generated on the fly when needed.

I'm probably actually about to embark on a massive URL inventory of the site to remove as many explicit fuseaction calls as possible and replace them with named pages that simply set that fuseaction. I'm hoping this will cut down on some of the URL redundancy that we've been noticing as search engines crawl our site (i.e. /collections/index.cfm/fuseaction/Items.ViewImages vs /collections/digitalcollections/searchimages/index.cfm/fuseaction/Items.ViewImages).

As far as Joshua's comments go, am I wrong in thinking that he's suggesting you generate the whole page (content and all) when you generate the stub? For us, this is out of the question, as we would constantly have to regenerate the stubs as functionality changed. An include is a massively intuitive way to call your controller once you've set the basic parameters of what your page is supposed to do.

In our case, all the stub file sets is the ID of the page. That way, the title, heading, section information, etc. can all be pulled as needed from the database. The page ID itself doesn't change, but the other information can, so we don't need to regenerate the stub every time the page is edited. This probably adds a little bit of overhead, but with a little query caching, the site runs relatively fast. A caveat -- if you visit now, it will have less than stellar speed, because we're still trying to optimize the image database. I
# Posted By Toby Reiter | 2/22/07 6:51 PM
Hi Toby,

Thanks for the comment! Sounds like you do almost exactly what we do. For more details on how I'm re-implementing this for LightBase, check out http://www.pbell.com/index.cfm/2007/2/22/URIs-in-L...

I'm also only putting IDs in the stub (actually filepaths, so it wouldn't change much if we went to URL rewriting instead of page stubs), plus a little more info if necessary, such as category and page IDs to jump "deep" into a page. I'm also OK with the performance hit of having to look up the rest of the page data, at least the first time around.

I get where Joshua is coming from and may look at overwriting my page stubs with static files for cases where there isn't anything dynamic on those pages, but usually I have a lot of dynamic stuff happening, so that's for version 1.5!

Nice to hear other people doing the same things!
# Posted By Peter Bell | 2/22/07 7:04 PM
Commonspot, a commercial CMS, uses stub pages.

Stub pages are safe and effective in all page-based languages such as ASP, JSP and PHP. Because they're cheap to compile, they can support very high performance, particularly in non-db or db-optional architectures.
# Posted By Paul Houle | 4/27/07 11:15 AM