Thursday, February 01, 2007

Data De-duplication for Primary Storage

Last night, while reading Steve Duplessie's long travelogue, I noticed a very interesting note referring to data de-duplication for primary storage.
We landed on time, and a car was nicely waiting to pick me up to bring me to my first meeting, with Data Domain … I chatted with a bunch of their smart folks theorizing about where other implementations of this technology could really affect change in the world, and found quite a few. What if you could get the performance attributes required by a high percentage of today's applications on a primary store that happened to get 40 to 1 compression rates? Imagine the economic advantages and the consolidation potential.
Again tonight, I came across another note referring to data de-duplication for primary storage, this time in Jon Toigo's "praise the PR agency" post.
What de-duplication doesn’t address is the primary storage issue. The device used for primary storage is not inexpensive and has limited capacity. When you run out of space, you run out of space. You can manually delete and/or move to tape, but this is somewhat time intensive. Or, you can always make a storage vendor’s day by buying more and more hardware. What I believe will be a hot and extremely important technology in 2007 is data compression. The business case for data compression on primary storage is the same as the one used for de-duplication for backup. Compression can cut primary server data to a minimum of one-third its original size. It makes good business sense: by compressing data on primary storage devices you need less hardware less resources to manage, lower power consumption (and power consumption is a big deal), etc., all without a performance hit.
Hmm … it may be just my imagination, but it seems like Data Domain may be planning to enter the primary storage market with a data de-duplication product, and JPR may be doing guerrilla PR for them.

Anyway, I am glad to see that I wasn't the only one theorizing last year about extending data de-duplication beyond backup to other storage sub-segments. See "Where are you being de-duped?":
Some of the near-term applications are going to be in the area of backup, archive, wide area data transfer, data caching, primary storage and enterprise data storage grids. In the end, data de-duplication can be applied anywhere the cost of resources freed by eliminating repeating patterns exceeds the cost of resources required to remove repeating patterns.
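To make the idea of "eliminating repeating patterns" a bit more concrete, here is a minimal sketch of my own; it is not how Data Domain, Storewiz, or any other vendor actually does it. It assumes fixed-size 4 KB chunks and SHA-256 fingerprints, and the names dedupe, restore, dedup_ratio, and worth_deduplicating are hypothetical helpers invented for illustration. The last one simply restates the cost test from the quote above.

```python
import hashlib


def dedupe(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep each distinct chunk once."""
    store = {}   # chunk fingerprint -> chunk bytes (one copy per unique chunk)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first occurrence
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def dedup_ratio(data: bytes, store) -> float:
    """Original size divided by stored chunk bytes (recipe overhead ignored)."""
    stored = sum(len(chunk) for chunk in store.values())
    return len(data) / stored if stored else 1.0


def worth_deduplicating(cost_freed: float, cost_of_dedup: float) -> bool:
    """The test from the quote: freed resources must outweigh the work spent."""
    return cost_freed > cost_of_dedup


if __name__ == "__main__":
    # Illustrative data only: 40 chunks, of which just 2 are distinct.
    data = b"A" * 4096 * 30 + b"B" * 4096 * 10
    store, recipe = dedupe(data)
    assert restore(store, recipe) == data
    print(dedup_ratio(data, store))  # 20.0
```

The ratio here counts only chunk bytes and ignores the fingerprints and recipe a real system must also keep. Hashing and looking up every chunk is exactly the "cost of resources required" side of the test, which is where the performance question for primary storage comes in.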


  1. You should look at Storewiz. They do 3:1 compression for Primary Storage.

  2. Thanks for pointing me in Storewiz's direction. Their website doesn't have enough info. Without knowing the technology behind the 3:1 claims, I am skeptical.

  3. BTW, thanks for the info. JPR is listed as the PR agency for StoreWiz, so they may be pitching that solution to Toigo. I guess that was a wrong inference on my part.