Wednesday, October 25, 2006

Preserving Wynton Marsalis in Digital World

Last weekend, I saw Wynton Marsalis perform at the Paramount Theatre. During the performance, I started to think about how the legacy of Wynton Marsalis would be preserved in the digital world.

A long time ago, the primary way to preserve music was to write it on music sheets. Over time, some of these sheets were physically destroyed or their ink faded. One way or the other, some of the music was lost. Also, different people interpreted the written music differently (a debatable point, as I am no expert on this topic ... treat it as a view from the outside) because there was no audio or video archive of the original performance. The main challenge in that era was the physical preservation of music sheets.

Things in the music world have changed dramatically since the introduction of audio and video technology. Now, music is primarily preserved using audio, video or both. This shift has allowed more, and more exact, information about a particular piece of music to be preserved, such as the original artist's interpretation of the music and his or her original performance. With digitization, even the making of a particular piece of music can be archived for future generations. This change has also brought new challenges in the preservation of music.

Will we still have access to the music of the original artists a hundred, or even twenty, years from now? The answer is no longer as simple as the physical preservation of music sheets. The digitization of music, along with its benefits, has brought new challenges in the long-term preservation of music:
  • Survivability of the audio and video file formats.

  • Capability of the application to play original audio and video file formats.

  • Availability of original digital rights management process if DRM was applied to the file.

  • Availability of original encryption key management system if file was encrypted.

  • Capability of the operating system to be able to host application and support media and system.

  • Physical survivability of the media on which audio and video are recorded.

  • Physical survivability of the system capable of using the original media.
Come to think of it, the long-term preservation challenges are not exclusive to music; they also apply to all other information that needs to be preserved for future generations. And these issues become a lot more complex to address once the file, application, keys (DRM, encryption), OS, media, system, and storage are physically separated.

Tuesday, October 24, 2006

Google Custom Search for Data Storage

There has been a lot of talk in the blogosphere about Google Custom Search. See Google Co-op Launched, Google Custom Search engine - with ads, Google's new personalized search engine talk of town, Roll your own Google Search.

Using Google Custom Search, I created a custom search for data storage information. The Google custom search box for data storage information is displayed just below the title block on my blog.

Data Storage Specialized Search

The data storage specialized search starts with 154 links and 3 categories: Bloggers, Media, and Vendors. Incorporating relevant storage links into this search engine is a continuing effort, and your help is greatly appreciated.
  • Would you like to include your storage links and categories? Email me with your suggestions for relevant storage links and categories.

  • Would you like to include data storage specialized search on your web page? Email me a request for the code.

  • Would you like to be an active contributor? Email me with your contact information - email address, phone number, location and affiliation. You will be added as a volunteer contributor to the data storage specialized search. The contact information is requested only to filter out suspicious requests.
Join this effort and let's make this the search engine for all things storage.

Monday, October 23, 2006

Funny Side of Storage

A colleague gave me this amusing bumper sticker from Data Domain. Who said storage is boring?

Have you come across anything from the funny side of storage? Share it!

Wednesday, October 18, 2006

Trip to Japan

Continuing with the theme of reasons for light blog activity: for the last few weeks, we have been working out the details of our visit to Japan next month.

Here are the details of our trip:

Nov. 19 - 21 Tomakomai, Sapporo.
Nov. 22 - 23 Kyoto, Osaka.
Nov. 24 - 26 Tokyo.

Even though most visitors to my blog are based in the US, there are some visitors from Japan too. If you are one of those readers from Japan, ping me. And if you are located in any of the cities on our list, we will be happy to say hello in person, time permitting.

I get the impression from a recent Wall Street Journal article, How Demon Wife Became a Media Star And Other Tales of the 'Blook' in Japan, that blogging is huge in Japan.
Blogs are even more popular in Japan than in the U.S. ... An estimated 25 million Japanese -- more than a fifth of the population -- are believed to read blogs.
It will be great to build some connections in the Japanese blogging world. The Pleasures of Finding Things Out! And meeting readers in person all over the world.

Note: If you noticed a Google Map Test post earlier, I was trying to include a Google Map in my blog post. Unfortunately, I couldn't make it work. After fiddling with the API for a few hours, I decided it wasn't worth any more of my time.

Tuesday, October 17, 2006

Surely You're Joking …

You write in your previous post Evolution of Data De-duplication similar to FC SAN,
Data de-duplication is the single most important innovation in data storage since FC SAN.
If you believe in the potential of data de-duplication then why are you not working in this area?
This was the question I asked myself after finishing The Final Frontier for Data De-duplication. Surely, there must be a lot of situations where data de-duplication can be applied if it is to become a pervasive technology. I described some of them in Where are you being De-duped?.

The answer to this question, "I should be working in data de-duplication," along with putting together a plan to realize it, is one of the three reasons for my light blog activity over the last few weeks (the other two will come in future posts). I am spending significant time researching and analyzing different data de-duplication techniques, opportunities and current products. I am sure results from some of this work will also show up on the blog in the near future.

Tuesday, October 10, 2006

Tales of Two Data De-Dupers

Avamar Announces Record Third Quarter
Revenue in the third quarter grew over 165% from the same period in 2005, and 22% from the second quarter.
Data Domain Posts Record Sales in Q3
Third quarter sales for the company were up 40% compared to the previous quarter and up 260% compared to the same quarter in 2005.
These two de-dupers are getting excellent market traction with data de-duplication for data protection. How are the two other de-dupers, Asigra and Diligent, doing?

Such results from the first wave of de-dupers are indicative of the exceptional return on differentiating innovation, something Geoffrey Moore wrote about in his latest book, Dealing With Darwin, a fascinating book. I can't wait to see the second wave of de-dupers and compare their returns on what the book terms neutralizing innovations!

Will Data Domain be the next to file for an IPO?

Thursday, October 05, 2006

Where are you being De-duped?

As I wrote previously (see Evolution of Data De-duplication similar to FC SAN), data de-duplication is presently making a name for itself in the backup and recovery area.

In my opinion, data de-duplication will ultimately be incorporated into data storage infrastructure for three reasons:
  1. The growth in the volume of information that needs to be stored is several times the growth in capacity of data storage media, irrespective of media type (see The Final Frontier for Data De-duplication).

  2. Data de-duplication doesn't require the information context of data to eliminate the repeating patterns. It can also eliminate the repeating patterns beyond a single information container.

  3. There is no reason that data de-duplication should only be applied to reduce the data-at-rest footprint. It can just as effectively reduce traffic volume by eliminating the repeating patterns in data being transferred. iSCSI aficionados take note.
Some of the near-term applications are going to be in the areas of backup, archive, wide area data transfer, data caching, primary storage and enterprise data storage grids. In the end, data de-duplication can be applied anywhere the cost of the resources freed by eliminating repeating patterns exceeds the cost of the resources required to remove them.
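To make reasons 2 and 3 a bit more concrete, here is a minimal Python sketch of the general idea. It is only an illustration under my own assumptions: fixed-size 4 KB chunks, SHA-256 fingerprints, and a hypothetical DedupStore class. It is not how any particular vendor's product works. Notice that the store never looks at what the bytes mean, and that a chunk shared by two different files is kept only once.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes, chosen for illustration


class DedupStore:
    """Toy content-addressed store (hypothetical): each unique chunk is kept once."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}        # SHA-256 hex digest -> chunk bytes
        self.files = {}         # file name -> ordered list of chunk digests
        self.logical_bytes = 0  # total bytes written by callers

    def put(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests
        self.logical_bytes += len(data)

    def get(self, name):
        """Rebuild the original bytes from the file's chunk list."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually kept after de-duplication."""
        return sum(len(chunk) for chunk in self.chunks.values())

    def ratio(self):
        """Logical bytes written divided by bytes actually stored."""
        return self.logical_bytes / self.stored_bytes()


# Two made-up "backups" that share most of their content.
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096

store = DedupStore()
store.put("monday.bak", monday)
store.put("tuesday.bak", tuesday)
print(store.logical_bytes, store.stored_bytes(), store.ratio())
```

The same fingerprint lookup could, in principle, be done before sending a chunk over a network link, which is the data-in-flight case from reason 3. The hashing and the index are also the "cost of resources required to remove repeating patterns" from the paragraph above.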

Are you, or is someone you know, applying data de-duplication to solve unique problems? I would like to hear about it. This is your chance to reach out to my unique and focused readers.

Share your vision and thoughts on data de-duplication with me. I will share it with my readers and give appropriate credit to you.

BTW, you are seeing Microsoft OneNote and a Tablet PC in action with the above doodle. Let me know when it gets annoying!

Wednesday, October 04, 2006

The Final Frontier for Data De-duplication

Bruce asked important questions in his blog entry What happens on the last day?
How many doublings does it take before the cost of storage exceeds the company's annual revenue? I've yet to have anyone give me their plan for 5 years out. What's yours?
Established storage vendors would like customers to keep purchasing storage at the current pace, but that is only wishful thinking. In my opinion, nothing currently available is the answer or plan customers want.
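Bruce's doubling question is easy to play with. The sketch below is only my back-of-the-envelope arithmetic in Python: the function name and the dollar figures are made up, and it assumes revenue stays flat while storage cost doubles.

```python
def doublings_until_exceeds(storage_cost, annual_revenue):
    """Return the smallest n such that storage_cost * 2**n > annual_revenue."""
    if storage_cost <= 0 or annual_revenue <= 0:
        raise ValueError("both amounts must be positive")
    doublings = 0
    cost = storage_cost
    while cost <= annual_revenue:
        cost *= 2
        doublings += 1
    return doublings


# Hypothetical company: $1M spent on storage today, $500M flat annual revenue.
print(doublings_until_exceeds(1_000_000, 500_000_000))
```

With these made-up numbers the answer is 9, since 2**9 = 512. The point is that the count is small even when storage starts out as a tiny fraction of revenue.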

I am currently reading an interesting book, Dealing with Darwin by Geoffrey Moore. The answer to the cost of storage exceeding annual revenue, and the 5 year plan, lie in this statement from the book.
Competition for the scarce resources of customer purchases creates hunger that stimulates innovation
So, it will be innovation that provides the solution and the five year plan for storing information without breaking the bank. And in my opinion, the underlying technology to store more with less will be data de-duplication.

It is a matter of time before data de-duplication becomes a pervasive technology within data storage infrastructure and, ultimately, a key enabling technology in primary storage for storing more data in a smaller footprint.

Tuesday, October 03, 2006

Evolution of Data De-duplication similar to FC SAN

In my previous blog entries, I wrote about data de-duplication (see Pitch something worth talking about!, Coolest Product, Episode 2: Technology). I believe data de-duplication will have a similar impact on the way we store, transfer, protect and manage data as Fibre Channel Storage Area Network (FC SAN) did. Data de-duplication is the single most important innovation in data storage since FC SAN.

It is no coincidence that both technologies initially targeted backup & recovery and then expanded into other areas. Backup & recovery continues to be an important function but a very expensive proposition for businesses.

FC SAN helped consolidate the backup footprint by enabling sharing of expensive backup devices. It also improved backup performance by separating backup traffic from the client network. The volume of stored information is increasing at a rapid pace, and the need to further reduce the backup footprint and improve backup performance has arisen again.

Virtual Tape Libraries (VTL) and Disk to Disk (D2D) backups have already shown significant promise in increasing backup performance. Now, data de-duplication is helping solve these problems with excellent data reduction, reportedly anywhere from 20:1 to 50:1, in backup saveset size, along with a reduced transfer size for offsite copies.
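To put those reported ratios in perspective, here is a quick Python calculation. The 10 TB of backup data is a made-up figure and reduced_size is just an illustrative helper of mine; only the 20:1 and 50:1 ratios come from the reports mentioned above.

```python
def reduced_size(logical_tb, ratio):
    """Size actually stored when logical_tb of data is reduced at ratio:1."""
    return logical_tb / ratio


backup_tb = 10  # hypothetical amount of backup data, in TB
for ratio in (20, 50):
    stored = reduced_size(backup_tb, ratio)
    saved_pct = 100 * (1 - 1 / ratio)
    print(f"{ratio}:1 -> {stored:.2f} TB stored, {saved_pct:.0f}% saved")
```

Going from 20:1 to 50:1 sounds like a big jump, but in savings terms it is the difference between 95% and 98%. Either way, the offsite copy shrinks by the same factor.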

So what's next for data de-duplication?