Normalizing data tweaks the pleasure centers of my brain. Dopamine and experience with far too many legacy systems that were denormalized for no good reason drove me to the “normalize til it hurts” school.
With that state of mind, I read Jeff Atwood’s recent post on normalizing with only moderate interest. Mostly I thought to myself, yeah, well more people need to normalize more often, not less. It was as if someone had reported to an AA group that a glass of red wine a day might be a healthy source of antioxidants (or whatever).
Naturally, life conspired to slap me upside the head with a real-life exception, and, ironically, it was painful. Here is the normalized version of the data:
A certain organization gives work hour credits to students based on different criteria. There are different kinds of credits, and different credits have different business rules. This is the sort of data that I find most challenging to model in a way that is both flexible and practical. In one way, the data are fixed attributes. There are two types of work hour credits, and five types of technical training credits. Each has a different minimum and maximum value. In some cases the value can only be either the minimum or the maximum. There is also a limit on the total of the five possible technical training credits. Any of these business rules can be overridden by a manager.
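The original table definitions aren't reproduced here, but the description above suggests a shape something like the following. This is only a sketch of the normalized approach; every name, value, and column is my own guess, not the author's actual schema.

```python
import sqlite3

# Hypothetical normalized design: each credit type is a row in a lookup
# table carrying its own business-rule attributes, and student credits
# reference it by foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE credit_type (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    category        TEXT NOT NULL,     -- 'work_hour' or 'technical_training'
    min_value       INTEGER NOT NULL,
    max_value       INTEGER NOT NULL,
    min_or_max_only INTEGER NOT NULL   -- 1 if the value must be exactly min or max
);

CREATE TABLE student_credit (
    student_id      INTEGER NOT NULL,
    credit_type_id  INTEGER NOT NULL REFERENCES credit_type(id),
    value           INTEGER NOT NULL,
    overridden_by   TEXT,              -- manager who overrode the rules, if any
    PRIMARY KEY (student_id, credit_type_id)
);
""")

# Illustrative seed rows; the names and limits are invented.
conn.executemany(
    "INSERT INTO credit_type VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Volunteer hours", "work_hour", 0, 40, 0),
        (2, "Safety course", "technical_training", 2, 8, 1),
    ],
)
```

Note what this design can't express cleanly: the per-type min/max rules fit the lookup table naturally, but the cap on the *total* of the five technical training credits cuts across rows and has to live in application code.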
I normalized because the data are types, of course. Even if the number of types is limited and static, I’ve had too many cases of clients assuring me that there will never be any changes only to ask later for modifications. (Which is fine. I’m totally cool with changing needs. I just hate to have programmed in inflexibility when it would have been just as easy to make it flexible, and less costly for the client, in the beginning. Note the emphasis “just as easy.”)
I chafed a bit at the normalization. I could see it was going to be a pain, but programming for my comfort isn’t the goal, after all. The only reason a client should care whether something is hard to program is that it costs more time and money. I knew optimization wouldn’t be a problem. It was so unlikely as to have essentially no weight in my choice.
All was fine til I started to implement the UI that the client wanted for this data. Gad, it was turning out to be a bother. I’m embarrassed it took me so long to realize that I needed to step back and consider whether the degree of normalization was causing far more trouble than it was worth, but I did. In 5 minutes, I had revised the structures to this:
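The revised structures aren't shown here either, but the denormalized version presumably collapses the type table into fixed columns, one per credit. Again, a sketch with invented names, including a hypothetical 20-hour training cap; the real numbers aren't in the post.

```python
import sqlite3

# Hypothetical denormalized sketch: one flat row per student, one column
# per credit. Column names and the 20-hour cap are illustrative guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_credits (
    student_id      INTEGER PRIMARY KEY,
    work_credit_1   INTEGER,
    work_credit_2   INTEGER,
    tech_training_1 INTEGER,
    tech_training_2 INTEGER,
    tech_training_3 INTEGER,
    tech_training_4 INTEGER,
    tech_training_5 INTEGER,
    -- The cross-credit rule that was awkward in the normalized design
    -- becomes a single table-level constraint here.
    CHECK (COALESCE(tech_training_1, 0) + COALESCE(tech_training_2, 0)
         + COALESCE(tech_training_3, 0) + COALESCE(tech_training_4, 0)
         + COALESCE(tech_training_5, 0) <= 20)
);
""")
```

With one flat row per student, binding the UI is a matter of mapping seven columns to seven fields, instead of pivoting a variable number of type rows into a fixed form. The trade-off is real, though: a manager override would have to bypass that CHECK, so in practice the rule might belong in application code after all.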
Oh, and the UI? That took another 10 minutes. The pain was from spending four unproductive (read: unbillable) hours futzing around with the first design.
1. Yes, sometimes denormalization is useful, and not always because of optimization pressures.
2. It’s easier to denormalize a production system than to normalize one.
3. Normalize til it hurts, but also, always be evaluating the effort versus the benefit gained by normalizing (or denormalizing).
4. It’s not all or nothing. Some denormalization isn’t the same as fully denormalized. Just denormalize the bits that need it.
5. I have a tendency to focus so hard on solving a problem that I frequently forget to ask whether the problem I’m solving is the right problem. Stubbornness is a good characteristic in a programmer, but it’s a liability too.
For the record, I learn lesson 5 every (*@)%^*Q day I write code. Though I do see it earlier in the process now.