By Peter Bell

How Would You Solve the Order:Address Problem?

Let's say you're building a commerce system. An Order has a BillingAddress. You want to support address book functionality where users can re-use, edit and delete addresses, but once an order has been placed, you want to fix the BillingAddress for that order while still allowing it to be editable within the address book. What's the best way to solve the problem? . . .

Broadly there are two approaches - versions and copies in the Parent table.

Composed or Associated?
Often you can look at an object relationship as being either composed or associated. If you delete a product, you're not going to delete the category it is in automatically as categories are just associated to products in most systems, but if you were to delete an order, you'd almost certainly delete the composed order items as they have no meaning without the order they are a part of.

The problem with Order.BillingAddress is that you are doing two things - describing the specific billing address for a given order which in itself is a value object with a lifecycle the same as that of the Order and you're also creating or using a separately associated Address object which has its own independent lifecycle. So, how could you handle this?

Versions
One approach is to create a version of the Address that is tied to the Order and that (once the Order has been saved) can not be edited. So in practice if you had an address book, if I tried to edit an Address that was associated to an Order it would add a new version of the Address and hide the old version from the address book while still keeping the Order associated to the old version of the Address.

Copies
Another approach would be to say that the address book was simply providing initial values for a value object composed within the Order and that you should save a copy of the BillingAddress properties like Order.BillAddress1 in the Order table as well as creating a separate Address object if it was a new billing address. In that way the lifecycles are immediately diverged, all of the BillingAddress information is neatly contained within the Order table and you don't have to add convoluted logic to the management of Addresses to just make sure the Order data doesn't get messed up.

What Would You Do?
I think I'm more tempted by the second approach. You could argue that it is less normalized as you're repeating Address columns within an Order table and you could also argue that it is slightly less efficient from a storage perspective as unless every single Address associated to an Order gets changed, this strategy will include more duplicated data. On the other hand, I'd argue that having the data separated clearly separates the anticipated lifecycles and simplifies the application logic, avoiding us having to have complex versioning concerns within the Address table purely to support the requirements of Orders being preserved over their lifecycle.

What do you think - and why?

Comments
I make copies, and mostly for one simple reason: It's harder to screw up a copy. Versioned addresses make sense in a perfect data world, but developers (myself included) are not perfect. I can think of many ways to accidentally use the wrong address in a version system, and many fewer in a copy system.
# Posted By Sam Curren | 7/14/08 10:44 AM
I use "copies" in this scenario.

"You could argue that it is less normalized..."

Yes, but by design. The point of normalizing a DB is to remove redundancy, but these addresses are NOT redundant data. As you noted, once the order is placed its Billing Address becomes a historical record with its own lifecycle. Even though it may be equivalent to the original Address Book record, changes to one should no longer effect changes to the other, and denormalizing the data has the desired effect.

If you're worried about storage efficiency, just create a new BillingAddress table and use it to store unique billing addresses. Your order history table would then have a FK relationship to this table, not AddressBook, so you're still decoupling billing addresses from address book records.

Whether or not that last step is justified depends a lot on your data. If your customer expects a high volume of repeat customers then maybe its worth it... but if the customer expects most orders to be from new customers then the denormalization won't buy you much storage savings.

In any case, I'd steer clear of versioning an Address table just to support the requirement that billing addresses are historical records. I think you're better off fully decoupling the Address Book from the order system.
# Posted By Seth Petry-Johnson | 7/14/08 11:19 AM
Another vote for a "copy." In the example given of orders and addresses I think of orders and all their related data (addresses, line items, customer accounting rows, etc...) as transactional data for the purpose of creating a paper trail. It's a snapshot of a transaction at some point in time, and a copy is the simplest way to represent this.
# Posted By Clint Miller | 7/14/08 2:04 PM
I just recently had a similar situation building an affiliate system for a client where I had to make a copy of the frequency of payments and commission amount. This was specifically done so that moving forward, any changes in the agreement with an affiliate to increase or decrease payment amount would only affect new customers from the point of changing the related amounts.

The copy method may seem like more data, but I believe that Seth made a valid point that the data related to the order is historical data that's not redundant. It all boils down to preference and how important that historical data is to the application also.
# Posted By Hatem Jaber | 7/15/08 6:28 AM
What I have done in the past is have from date and thru date fields in my address table. I use these dates to indicate when a particular address row was valid. So when joining with a particular order I will use the order date along with the from and thru dates in my query.
# Posted By Kurt Wiersma | 7/15/08 10:55 AM
@All, Thanks for the input. Looks like copy is most common which was what I was leaning towards.

@Kurt, Thanks - very cool idea. Wouldn't have thought of that approach at all!
# Posted By Peter Bell | 7/15/08 11:15 AM
Just want to further endorse Kurt's suggestion. This is a technique which can be widely applied across different domains when you need to know the value of something which changes over time. We use this technique in systems where people's balance changes over time or a salary rate as it allows you to determine what someones rate or salary was at a particular moment in time allowing you to produce a historical snapshot of an account.
# Posted By kola | 7/23/08 8:03 AM
BlogCFC was created by Raymond Camden. This blog is running version 5.005.