Wednesday, August 1, 2012

Deduplication for Cloud Storage, Penny Wise and Pound Foolish?

We’re often asked about the deduplication capabilities of our storage connector (storage gateway) and people are initially shocked by our simple response: None at all! Aren’t reducing storage footprint and transfer bandwidth good things? Of course; but as with all good things, they come at a price.

There is no standard process for deduplication. In fact, vendors spend quite a bit of time and money developing and protecting their deduplication techniques. To reduce the data footprint, deduplication algorithms analyze your content, looking for bits of redundancy they can remove. What ends up in cloud storage is a set of blobs that are completely unusable unless you use the same technology to rebuild your content.
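To make this concrete, here is a minimal sketch of one common approach: fixed-size chunking with content hashes. It is not any particular vendor's algorithm. The names (`dedup_store`, `rebuild`, `CHUNK_SIZE`) and the tiny chunk size are assumptions chosen for illustration, and a plain dictionary stands in for the cloud storage bucket.

```python
import hashlib

# Assumption: a tiny chunk size so the effect is visible on a short input.
CHUNK_SIZE = 4


def dedup_store(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the ordered list of chunk hashes (the "recipe")."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # a repeated chunk is not stored again
        recipe.append(key)
    return recipe


def rebuild(recipe: list, store: dict) -> bytes:
    """Reassemble the original content; requires the recipe and the same scheme."""
    return b"".join(store[key] for key in recipe)


# The dict plays the role of the cloud bucket (hypothetical stand-in).
store = {}
original = b"ABCDABCDABCDEFGH"
recipe = dedup_store(original, store)

print(len(original), "bytes in;", sum(len(c) for c in store.values()), "bytes stored")
print(rebuild(recipe, store) == original)
```

What lands in the store is a set of hash-named fragments, not your file. Without the recipe and knowledge of the chunking scheme, another application reading the bucket directly has no way to reassemble the content.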

From a usability perspective, deduplication is effectively the same as encrypting your data with a key only the vendor knows. If you want access to your data, you must use that vendor’s technology to get to it. In certain isolated scenarios, this may be acceptable. However, if cloud storage (public or private) is part of a long-term strategy to solve your data storage problems, this technology lock-in is a significant risk.

The promise of cloud is exciting because the technologies provide scalability and flexibility previously unattainable in IT organizations. Cloud storage not only provides a solution to the massive growth of data we are all experiencing, but also provides the foundation for an entire class of applications being developed to manage, search, and process that data. These applications use cloud storage APIs directly for efficient and scalable data access. Deduplicating the data first can lock you into one technology and out of many others. Switching technologies would require you to move all of your data back out through one system and into another.

Cloud storage connectors are critical today because most legacy systems can’t access cloud storage APIs directly. They make cloud storage accessible in existing environments and allow organizations to easily integrate these storage platforms into today’s workflow. When used properly, storage connectors can provide the bridge to both cloud storage and the next generation of applications. Choosing the right technology will give you the tools to start using cloud storage today, and the flexibility to continue using it with the applications of tomorrow.

  1. I'd be interested to hear your thoughts on storage file systems and platforms for use within the cloud. I have tier 1/2/3-style systems; however, placing them feature-wise is difficult.