Thursday, April 29, 2010

Why does CORE fail? Part 1 - Response

Steve Kenniston of Storwize made a detailed comment in response to my last post, Why does CORE fail? Part 1. I thought my response to his comment deserved a separate blog post. Frankly, I haven't kept up with developments at Storwize since May 2007, when I last wrote a series of blog posts on Storewiz, so I don't claim any knowledge of the current Storwize solution.
First, I am not so sure that time to 'uncompress' ... is a valid parameter IF all solutions are being compared identically,....
The time to decompress/reconstitute is as important as, if not more important than, the time to compress/dedupe. Compression/deduplication can be managed 'internally' to keep up with the write expectations of applications and users, whether by delaying writes just enough to allow data reduction in-band, by reducing data after writes complete, or by some hybrid approach. But read expectations must be met in-band, so any decompression/reconstitution needs to take place correctly and completely in the expected time. A solution that requires less time to decompress should be rewarded in the same fashion as CORE rewards a solution that requires less time to compress.
... First I think we can all agree that decompression or rehydration is faster than optimization (compression, deduplication). ... the performance of time to 'compress' (I prefer optimize) and then cut the time in half and call this time to rehydrate. Now apply the formula. I would assume that the new CORE value would come out very close as they are now.
I am not so sure that the time to decompress/reconstitute is shorter than the time to compress/dedupe, or that it is 50% of the time to compress/dedupe, as I haven't heard of a solution or seen data yet that supports such a claim. Actually, the relationship may be the reverse, especially for solutions with a large amount of compressed/deduped data and a high data reduction ratio. The only related published data I am aware of shows read speed to be a direct function of the smallest unit used for decompression/reconstitution: the larger the unit size, the higher the read speed.

As I asked in my last post, are time to decompress and time to compress proxies for the time to read from and write to a data reduction solution? If so, CORE could be improved either by including the actual time to read and write (instead of time to decompress or compress) or by including time to decompress/compress as a penalty over normal read/write with a solution that has no data reduction technology. In essence, that penalty is the additional cost, in the form of lower read/write performance, paid in exchange for higher storage efficiency.
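The "penalty over normal read/write" idea can be sketched in a few lines. This is only an illustration and not part of the CORE formula: the function name `io_penalty` and the latency figures are hypothetical, chosen just to show the arithmetic.

```python
def io_penalty(baseline_ms: float, observed_ms: float) -> float:
    """Fractional slowdown of a data reduction solution relative to a
    baseline storage system that has no data reduction technology.

    0.0 means no penalty; 0.2 means reads (or writes) take 20% longer.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (observed_ms - baseline_ms) / baseline_ms


# Hypothetical latencies, for illustration only (not measured data).
read_penalty = io_penalty(baseline_ms=5.0, observed_ms=6.0)
write_penalty = io_penalty(baseline_ms=4.0, observed_ms=4.0)

print(f"read penalty:  {read_penalty:.0%}")
print(f"write penalty: {write_penalty:.0%}")
```

Both inputs are measured from outside the solution, so the same calculation applies to inline and post-processing designs without knowing how either works internally.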
Also, without understanding how the solution works it is very difficult to debate the merits of the value of performance on that solution. ...
If CORE stays with parameters that can be judged externally for a solution, it will be more relevant and valuable than if it tries to incorporate parameters internal to a solution, like time to compress (tc). A CORE based on externally measured parameters, such as reduction ratio, read and write performance, and cost of the solution over a range of storage capacity and time, may produce a better value indicator. Any attempt to include internal mechanisms weakens CORE, because no one has complete information about and understanding of every solution, and the technology and techniques incorporated in such solutions change rapidly.
How can you possibly say that a post process solution that has users: 1) Buy full storage capacity (vs. less capacity with an inline solution) ...... is a good solution? ...
Please read my post again. I never claimed that any one solution is better than another. CORE includes the cost of the solution as a parameter, which supposedly should penalize a solution that includes more storage than other solutions require.
Step out of the vendor shoes for a moment and put yourself in the shoes of the customer. Which would you want?
As a customer, I want a solution that provides additional storage efficiency at reasonable cost, meets my expectations for read and write performance, safeguards my data, and doesn't require additional management overhead. Anything beyond that is the vendor coloring the customer's expectations to fit its solution.

Monday, April 26, 2010

Why does CORE fail? Part 1

Recently, David Vellante at Wikibon wrote in his blog post Dedupe Rates Matter ... Just Not as Much as You Think about his Capacity Optimization Ratio Effectiveness (CORE) value for ranking dedupe/compression/capacity optimization solutions. He also applied CORE to a few dedupe solutions for primary storage.

As I commented on his blog, I noticed right away that the CORE formula is missing an important parameter: time to uncompress/reconstitute (hereafter referred to as time to uncompress) deduped data. It matters because it affects the rate at which applications/users read data from a dedupe solution. As uncompression needs to happen inline for both inline and post-processing solutions, logically there will be no major discrepancy in using time to uncompress and time to read data from a dedupe solution interchangeably.

Is time to compress/dedupe also a proxy for the rate of data written to a dedupe solution?

Another important parameter is the rate of writing data to a dedupe solution, as applications/users have certain expectations of how quickly data must be written to a storage system. David includes time to compress (tc) in his CORE calculation, which I assume serves as a proxy for the rate of data written to a dedupe solution. I may be wrong, as I didn't see an explicit statement about why time to compress/dedupe is important.

In my opinion, he incorrectly assumes that the impact of time to compress/dedupe (hereafter referred to as time to compress) is the same across dedupe solutions, whether inline or post-processing. The time to compress affects the rate of writing data mainly for a dedupe solution that uses inline processing. It has no impact on the rate of writing data for post-processing solutions. So, to have an apples-to-apples comparison, David needs to either use the rate of writing data across all solutions or include time to compress as a penalty only for inline solutions, where it slows down the rate of writing data.

A low time to write data is a requirement of applications/users. Inline solutions meet it by reducing the time to compress as much as possible (possibly at the expense of a lower dedupe ratio). Post-processing solutions meet the same requirement by deferring compression/deduplication until later (possibly at the expense of additional capacity required for storing pre-deduped data).

Including time to compress in CORE calculations without discrimination inaccurately biases CORE toward inline solutions. Just because a solution has a sub-ms time to compress in-band doesn't mean it should be rewarded over a solution with a time to compress of a few ms out-of-band.

Assuming in the CORE calculation that time to compress in inline mode and in post-processing mode are equivalent is flat-out incorrect.

Why is time to compress defined as the time to compress the smallest unit compressed in the solution (e.g. a file, multiple files, or a block)?

Is a dedupe solution that compresses a 16KB block in 0.001ms better than a solution that compresses a 64KB block in 0.003ms? CORE fails right here.

All other factors being equal, a solution that claims 0.001ms for compressing 16KB (the smallest unit for the first solution) will produce a higher CORE value than a solution that claims 0.003ms for compressing 64KB (the smallest unit for the second solution). As currently specified, time to compress, and in turn CORE, doesn't take into consideration the different unit sizes used by different solutions. Is the CORE formula assuming that compressing/deduping in smaller units is better than in larger units?

The smallest unit compressed varies widely across solutions, even by a factor of more than 1000x. For CORE to be of any value, time to compress should be the amount of time it takes to compress a specified storage capacity, normalized across all solutions. Comparing time to compress 16KB units against time to compress 64KB units is like comparing apples to oranges. For 1MB of data, 64 units will need to be compressed in the first case (0.064ms) versus 16 units in the latter case (0.048ms). A CORE that uses time to compress/dedupe without taking the unit size into consideration will incorrectly penalize the second solution.
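The normalization can be sketched as follows. This is a minimal illustration using the two example solutions from this post. The function name `time_to_compress_ms` is hypothetical, and the sketch assumes the per-unit time scales linearly with the number of units, which a real solution may not do.

```python
def time_to_compress_ms(capacity_kb: float, unit_kb: float, ms_per_unit: float) -> float:
    """Time to compress a fixed capacity, given the solution's unit size
    and its claimed time to compress one unit.

    Assumes linear scaling: total time = number of units * time per unit.
    """
    units = capacity_kb / unit_kb
    return units * ms_per_unit


CAPACITY_KB = 1024  # 1MB, the common capacity used for comparison

# Per-unit figures are the two examples from the post.
first = time_to_compress_ms(CAPACITY_KB, unit_kb=16, ms_per_unit=0.001)
second = time_to_compress_ms(CAPACITY_KB, unit_kb=64, ms_per_unit=0.003)

print(f"16KB units: {first:.3f} ms per 1MB")
print(f"64KB units: {second:.3f} ms per 1MB")
print("second solution is faster per MB:", second < first)
```

On a per-unit basis the first solution looks three times faster, but once both are normalized to the same capacity the second solution takes less time.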

In the next post, I will look further into CORE and take apart the CORE formula ...