Integrating Legacy and Web Specific Data
Extending Products
For this particular application, the authoritative source for all product data is in their accounting system. Unfortunately, the accounting system doesn’t provide a mechanism for managing things like a WYSIWYG product description, so the client wants to be able to upload products from the ERP system while managing a handful of product fields using a web based admin system.
Complete or Incremental
The main type of variability is that a given application might want to use a complete or an incremental upload. With a complete upload, all products are deleted and re-imported, with an incremental upload, only additions, edits and deletes are transferred. In general terms an incremental upload is more efficient (especially with large data sets) but many smaller clients don’t have an easy way to report on which products have and haven’t been edited or added in a given period of time in their ERP systems. That is certainly true for this client so they are requesting a complete upload on a regular basis.
As a side note, the one issue with an incremental system is an appropriate mechanism for notifying of deletes. If you have a “soft delete” system (where deletion is really just the setting of a status or a hide bit), it is simply another type of edit, but with hard deletes the best solution I have come up with to date is to support both a delete list where the user provides a comma delimited list of unique identifiers for the records to delete and an ImportAction column in each import file with 0 for save (insert or update), 1 for insert, 2 for update and 3 for delete. Anyone got a better approach to that?
Handling Extended Attributes
The question then is what to do with extended product attributes that are only managed in the web system. One approach is to break them out into a separate joined table and allow your ORM of choice to handle the join, returning the appropriate object (or you could use a db view). Another is to include the extended attributes in the same table. For an incremental upload there is no particular need for an additional table, but with a true “delete and re-create” process, it is a practical necessity.
Just thinking about this for a while it seems to me more elegant to keep the extended data in the main table (assuming that the only reason it is being handled differently is due to limitations in the accounting package and not because it fundamentally is a separate composed object). Of course, implementing this would mean that we’d have to replace a naive drop and re-import with a pseudo-incremental update by importing the new data into a separate table, comparing it to the current data, creating a list of the necessary additions, edits and deletes and then executing them. While this is quite a bit of work and processing, on the whole data is more expensive than processor cycles so I’m going to propose an approach to implementing pseudo-incremental updates which has the benefit of not disturbing extended data and of allowing for auditing and rollback of “complete” uploads by turning them into a set of discrete loggable, auditable operations.
[update] Just come up with quite a nice pattern for implementing this. Will blog about it shortly. Does anyone else have any ideas about this?



Yeah. that's where I'd like to go, but my clients don't have the resources in house to generate such feeds! All they know is what is in the accounting package now - not when changes were made. We both know it isn't that hard to write a logging tool that logs changes, but it is more than they are up for and I don't know the API for their package (or want to learn it), so I just have to have a system capable of taking a full upload and discerning any changes automatically. It isn't actually that bad as long as the number of records isn't crazy and most of my clients have thousands rather than millions of records, so performance isn't too much of an issue.
I'm going to support the two feeds, but also add a layer capable of generating them from a complete update file that I'll just drop in an Import table and query against.
http://www.pbell.com/index.cfm/2006/11/15/Insert-v...
Thanks again for the comments!