How do YOU provide site wide search?
The problem is that as your model gets more complex, it becomes a real pain to implement what (to the client) is just a "simple" text box and search button.
One approach to this is to write a search generator which takes metadata about each object and uses it to automatically generate the display mappings and the like, but that would be a chunk of work and I'm pretty backed up right now.
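To make the search generator idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `SearchMeta` class, the field names, and the URL templates are assumptions for illustration, and the in-memory lists stand in for real tables or Verity collections.

```python
from dataclasses import dataclass

@dataclass
class SearchMeta:
    """Hypothetical per-type metadata: what to search and how to display a hit."""
    type_name: str
    search_fields: list   # fields scanned for the query text
    title_field: str      # field shown as the result title
    url_template: str     # display mapping, e.g. "/products/{id}"

def search_all(query, registry):
    """Search every registered object type and collate hits into one list.

    registry is a list of (SearchMeta, records) pairs; records are dicts.
    Matching is a plain case-insensitive substring test.
    """
    q = query.lower()
    results = []
    for meta, records in registry:
        for rec in records:
            if any(q in str(rec.get(f, "")).lower() for f in meta.search_fields):
                results.append({
                    "type": meta.type_name,
                    "title": rec[meta.title_field],
                    "url": meta.url_template.format(**rec),
                })
    return results

# Sample data standing in for two content types.
PRODUCTS = [{"id": 1, "name": "Oak flooring", "blurb": "Solid oak planks"}]
EMPLOYEES = [{"id": 7, "full_name": "Tom Oakley", "bio": "Sales"}]
REGISTRY = [
    (SearchMeta("product", ["name", "blurb"], "name", "/products/{id}"), PRODUCTS),
    (SearchMeta("employee", ["full_name", "bio"], "full_name", "/staff/{id}"), EMPLOYEES),
]

for hit in search_all("oak", REGISTRY):
    print(hit["type"], hit["title"], hit["url"])
```

Adding a new object type then means adding one metadata entry rather than writing another search and display routine, which is the appeal of the generator approach.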
The other approach I'm toying with is to use either a third-party service or a downloadable script that treats your site as a collection of static pages, spiders them, and provides a simple Google-like listing. The strength of this approach is that it better matches what people expect to see: a way to find "pages" in your site containing the search terms, rather than a way to access content items matching the search terms (content items often lose context when accessed via search).
Just to clarify that last point: what does the "back to category" button do on a product page if the product is in n categories and you reached it through a Verity search? That kind of contextual data creates dilemmas when you use a "search each table with Verity and collate the answers" approach to searching content items rather than just spidering the site.
Of course, there is still a place for an "employee search" or "product search" that would clearly want to use either a db query or Verity. But what do you think of just using a spidering service or product for your site-wide search? And does anyone have good recommendations for solutions costing a few hundred dollars or less that they've implemented successfully in the past?
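For comparison, the spidering approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SITE` dictionary is a made-up stand-in for fetching real pages over HTTP, and the index is a plain word-to-URL map with no ranking, stemming, or robots handling.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

class PageParser(HTMLParser):
    """Collects text, the <title>, and href links from one HTML page."""
    def __init__(self):
        super().__init__()
        self.links, self.text, self.title = [], [], ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        self.text.append(data)

def spider(fetch, start):
    """Breadth-first crawl from start; fetch(url) returns HTML or None."""
    index = defaultdict(set)          # word -> set of URLs containing it
    titles = {}
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()                # flush any buffered trailing text
        titles[url] = parser.title.strip() or url
        for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index, titles

def search(index, titles, query):
    """Return sorted (url, title) pairs for pages containing every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted((url, titles[url]) for url in hits)

# Hypothetical three-page site held in memory instead of fetched over HTTP.
SITE = {
    "/": "<title>Home</title><a href='/products'>Products</a> <a href='/about'>About</a>",
    "/products": "<title>Products</title>Red widgets and blue widgets. <a href='/'>Home</a>",
    "/about": "<title>About</title>We make widgets by hand.",
}
index, titles = spider(SITE.get, "/")
for url, title in search(index, titles, "widgets"):
    print(url, title)
```

Because the results are pages rather than content items, each hit keeps whatever navigation context the page itself provides, which is the point made above about the "back to category" button.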



Google, again, offers a great, simple solution that can be seamlessly integrated into a site.
Very interesting! Experimental right now, so for safety I can't use it for this project, but I'll definitely sign up for a key.
Please let me know if you ever have any sample code snippets you'd be willing to share on implementing this (or if you find any!)!
Interesting - hadn't really looked at it.
Here are some examples of how to do this:
Adobe has an article:
http://www.macromedia.com/devnet/coldfusion/articl...
Seth Duffey summarizes his experiences using this in CF7:
http://www.leavethatthingalone.com/blog/index.cfm/...
Here is a tech note on additional files required and how to use them:
http://www.adobe.com/cfusion/knowledgebase/index.c...
For today, I'm still probably going to use the hosted solution from http://www.freefind.com/ as it is SO easy, but I will definitely check this out some more and try to get it working for future sites ($200/yr for one site is fine - doing that for every project is a little more than I'm willing to spend!).
They offer a free version:
http://www.thunderstone.com/texis/site/pages/webin...
Pricing is a big issue. If I were looking at something free, I would try Nutch, part of the Apache Lucene project, or I might roll my own in Java using the Lucene engine, which gives you tremendous flexibility and scales extremely well.
What in Google do you find lacking? The only real neg I have right now is that I can't create "categories", but other than that, I'm finding it pretty powerful. I especially like the new OneBox features.
The biggest issue I have with Google is that it only stores and displays a few fields of data. That becomes problematic when you are indexing structured content and you want to return a richly formatted display of the structured data.
Other than that, Google just lacks capabilities that enterprise search systems have: automatic clustering of results (using, I think, Bayesian filtering; see clusty.com), faceted navigation (good for large catalog sites; see homedepot.com), personalization, merchandising, and rank-rating (the ability to affect the ranking of results). FAST actually plugs right into our analytics provider, Omniture, and allows you to tune search results based on conversion patterns. Very cool for an ecommerce site.
The cost difference between the Google solution and enterprise players like FAST and Autonomy is substantial, but so are the benefits. We might re-purpose our Google appliances for internal use, where they have a better fit, especially with the OneBox stuff.
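For anyone unfamiliar with the faceted navigation mentioned above, the core idea is small enough to sketch in Python. This is a toy illustration, not how FAST or any other product implements it; the catalog data and function names are made up.

```python
from collections import Counter

def facet_counts(items, facet_fields):
    """For each facet field, count how many items carry each value."""
    return {f: Counter(item[f] for item in items if f in item)
            for f in facet_fields}

def refine(items, **selected):
    """Keep only items matching every selected facet value."""
    return [i for i in items
            if all(i.get(k) == v for k, v in selected.items())]

# Hypothetical catalog records, standing in for a set of search results.
CATALOG = [
    {"name": "Oak plank", "material": "wood", "color": "brown"},
    {"name": "Maple plank", "material": "wood", "color": "tan"},
    {"name": "Slate tile", "material": "stone", "color": "grey"},
]

counts = facet_counts(CATALOG, ["material", "color"])
wood_items = refine(CATALOG, material="wood")
for field, values in counts.items():
    print(field, dict(values))
```

The counts drive the "material: wood (2), stone (1)" style links next to the results, and clicking one applies `refine` and recomputes the counts on the narrowed set.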
Take a look at http://www.picosearch.com
They charge annually (around US$250) and provide a very customizable templating system, automatic re-indexing, and good search statistics.
See http://www.mohawk-flooring.com/ for an example.
- Tom