Ever since I figured out that what I did to databases was actually called normalization, some of the excitement went away.
It meant that what I was doing wasn’t some genius idea cooked up in my brain but simply the way things should have been done all along.
However, over the years, I’ve come to realize that normalizing is good only up to a point. The normal forms from the 3rd onwards usually aren’t useful for every task and need to be applied on a case-by-case basis.
In my search for new info, I came across this excellent primer article explaining what shards are and what to keep in mind before plunging into DB design using some unorthodox schemes. (The warning here is that it’s a learn-as-you-go world, and there often won’t be any fall-back books to refer to!)
In some ways, we have already had to experiment with this at HurricaneBOSS, where we maintain a separate database shard for each customer’s company. Mostly for privacy … but definitely to also help with managing heavy loads! Good to know that it’s recommended by others too.
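To make the per-company idea concrete, here is a minimal sketch in Python. It is not the actual HurricaneBOSS code; the `CustomerShards` class, the `invoices` table, and the company names are all made up for illustration, and in-memory SQLite databases stand in for real per-company database servers.

```python
import sqlite3


class CustomerShards:
    """Hypothetical router that keeps one database per customer company."""

    def __init__(self):
        self._connections = {}

    def connection_for(self, company_id):
        # Lazily open a separate database the first time a company is seen.
        if company_id not in self._connections:
            # ":memory:" is a stand-in for a per-company database file or server.
            conn = sqlite3.connect(":memory:")
            conn.execute(
                "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)"
            )
            self._connections[company_id] = conn
        return self._connections[company_id]


shards = CustomerShards()
shards.connection_for("acme").execute(
    "INSERT INTO invoices (amount) VALUES (?)", (120.0,)
)
shards.connection_for("globex").execute(
    "INSERT INTO invoices (amount) VALUES (?)", (75.5,)
)

# Each company only ever sees the rows in its own database.
acme_rows = shards.connection_for("acme").execute(
    "SELECT amount FROM invoices"
).fetchall()
globex_rows = shards.connection_for("globex").execute(
    "SELECT amount FROM invoices"
).fetchall()
```

Every query goes through `connection_for`, so one company’s data cannot leak into another company’s results by way of a forgotten WHERE clause, and a heavy customer only loads its own database.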
Read the entire article, titled “An Unorthodox Approach to Database Design: The Coming of the Shard”.
I recommend reading the full article first and then following any of the other links … Good stuff!
9 Sep 2007 at 12:28 am
Personally, I think even 3NF is overkill at certain scales. There have been times when I’ve said screw 1NF, stick the multiple values as an XML/JSON blob into a column if you don’t plan to query on those individual values.
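A rough sketch of the blob-in-a-column idea, using Python’s built-in `sqlite3` and `json` modules. The `users` table and its `phones` column are invented for illustration; the point is only that the list is written and read as a whole and never queried value by value.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# "phones" holds a JSON-encoded list instead of a separate, normalized table.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phones TEXT)")

phones = ["555-0100", "555-0199"]
conn.execute(
    "INSERT INTO users (name, phones) VALUES (?, ?)",
    ("alice", json.dumps(phones)),
)

# One row lookup returns everything; no join against a phone-numbers table.
name, raw = conn.execute(
    "SELECT name, phones FROM users WHERE name = ?", ("alice",)
).fetchone()
loaded_phones = json.loads(raw)
```

The trade-off is the one stated above: this only works while nobody needs to search or index on the individual values inside the blob.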
From my limited knowledge of shards, I think they make sense when you have a very write-intensive system running on a database that sucks at multi-master writes (like MySQL). If you have a read-intensive system, adding more peers/slaves (whatever you call them) should do the trick. As far as privacy goes, I refuse to buy that argument. If you really need table-level isolation, then you just have different tables with the same schema. Using shards for that seems convoluted. One of the main motivations for shards is horizontal partitioning, but at some point you also need a unified data access view. In other words, logically it is still a single entity; the physical separation is incidental.
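Here is a small Python sketch of what I mean by horizontal partitioning behind a unified view. The `ShardedUsers` class, the `users` table, and the choice of CRC32 as the partitioning function are all assumptions for illustration, and in-memory SQLite databases stand in for separate servers.

```python
import sqlite3
import zlib


class ShardedUsers:
    """Hypothetical store: rows are split across shards, callers see one entity."""

    def __init__(self, shard_count):
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for shard in self.shards:
            shard.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT)")

    def _shard_for(self, name):
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        index = zlib.crc32(name.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put(self, name, email):
        # Writes touch exactly one shard, which spreads the write load.
        self._shard_for(name).execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?)", (name, email)
        )

    def get(self, name):
        row = self._shard_for(name).execute(
            "SELECT email FROM users WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None

    def all_names(self):
        # The unified view: fan out to every shard and merge the results.
        names = []
        for shard in self.shards:
            names.extend(row[0] for row in shard.execute("SELECT name FROM users"))
        return sorted(names)


store = ShardedUsers(shard_count=3)
store.put("alice", "alice@example.com")
store.put("bob", "bob@example.com")
store.put("carol", "carol@example.com")
```

Callers only ever use `put`, `get`, and `all_names`, so the fact that the rows live in three physical places stays an implementation detail.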