As you may know (see the previous posts, "Storage Vendors to Watch: Storewiz. I" and "SVW: Storewiz. Q&A.", and the resulting comments), compression doesn't seem to get much love in the storage industry, with the primary concerns being CPU utilization and performance impact. So how does Storewiz implement compression?
Storewiz has published little about its compression methodology and implementation. Most of the information below comes from Storewiz patent applications, specifically "Method and System for Compression of Files for Storage and Operation on Compressed Files" [US 2006/0184505 A1].
ABSTRACT. A method and system for creating, reading and writing compressed files for use with a file access storage. The compressed data of a raw file are packed into a plurality of compressed units and stored as compressed files. One or more corresponding compressed units may be read and/or updated with no need for restoring the entire file whilst maintaining de-fragmented structure of the compressed file.

The segments of an original file are compressed sequentially, segment by segment, into a series of compression logical units (CLUs). The metadata for each compressed section and its corresponding CLUs are stored in a separate table.
Reading data stored in a compressed file requires first identifying the relevant compressed segment and then the CLUs belonging to that segment. The applicable CLUs are then restored until all the data that needs to be read has been restored.
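To make the read path concrete, here is a minimal Python sketch of the general idea, not Storewiz's actual implementation. The segment size, the CLU size, the table layout, the function names, and the use of zlib are all assumptions chosen for illustration; the patent application does not tie the scheme to these values.

```python
import zlib

SEGMENT_SIZE = 64 * 1024  # assumed raw bytes per segment (illustrative)
CLU_SIZE = 4 * 1024       # assumed maximum bytes per CLU (illustrative)


def compress_file(raw: bytes):
    """Compress raw data segment by segment into fixed-size CLUs.

    Returns (clus, table). table[i] holds (first_clu_index, clu_count)
    for segment i, standing in for the separate metadata table.
    """
    clus = []
    table = []
    for start in range(0, len(raw), SEGMENT_SIZE):
        packed = zlib.compress(raw[start:start + SEGMENT_SIZE])
        first = len(clus)
        # Chop the compressed segment into CLU-sized pieces.
        for i in range(0, len(packed), CLU_SIZE):
            clus.append(packed[i:i + CLU_SIZE])
        table.append((first, len(clus) - first))
    return clus, table


def read_range(clus, table, offset: int, length: int) -> bytes:
    """Return raw bytes [offset, offset + length), restoring only the
    segments that overlap the requested range."""
    if length <= 0:
        return b""
    first_seg = offset // SEGMENT_SIZE
    last_seg = min((offset + length - 1) // SEGMENT_SIZE, len(table) - 1)
    restored = bytearray()
    for seg in range(first_seg, last_seg + 1):
        first, count = table[seg]
        # Reassemble this segment's CLUs and decompress them.
        restored += zlib.decompress(b"".join(clus[first:first + count]))
    skip = offset - first_seg * SEGMENT_SIZE
    return bytes(restored[skip:skip + length])
```

In this sketch a 100-byte read from a 256 KB file restores only one 64 KB segment, which is the kind of random access the patent application describes.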
Updating data stored in a compressed file follows a process similar to reading, but it is a little more complex because the number of CLUs that need to be written after the update can differ from the number of CLUs originally restored.
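The change in CLU count on update can be illustrated with a small, self-contained Python sketch. It is a simplification under stated assumptions: the sizes, the helper names (`pack_segment`, `unpack_segment`, `update_range`), the per-segment list of CLUs, and zlib itself are hypothetical stand-ins, not details taken from the patent application.

```python
import zlib

SEGMENT_SIZE = 64 * 1024  # assumed raw bytes per segment (illustrative)
CLU_SIZE = 4 * 1024       # assumed maximum bytes per CLU (illustrative)


def pack_segment(raw_segment: bytes) -> list:
    """Compress one raw segment and split the result into CLUs."""
    packed = zlib.compress(raw_segment)
    return [packed[i:i + CLU_SIZE] for i in range(0, len(packed), CLU_SIZE)]


def unpack_segment(clus: list) -> bytes:
    """Restore one raw segment from its CLUs."""
    return zlib.decompress(b"".join(clus))


def update_range(segments: list, offset: int, data: bytes) -> int:
    """Overwrite raw bytes starting at offset with data.

    segments[i] is the list of CLUs for segment i. Only the segments
    touched by the write are restored and recompressed. Returns the
    net change in the total number of CLUs.
    """
    delta = 0
    pos = 0
    while pos < len(data):
        seg, start = divmod(offset + pos, SEGMENT_SIZE)
        if seg >= len(segments):
            raise ValueError("update extends past end of file")
        raw = bytearray(unpack_segment(segments[seg]))
        n = min(len(data) - pos, len(raw) - start)
        if n <= 0:
            raise ValueError("update extends past end of file")
        raw[start:start + n] = data[pos:pos + n]
        new_clus = pack_segment(bytes(raw))
        # The recompressed segment may need more or fewer CLUs.
        delta += len(new_clus) - len(segments[seg])
        segments[seg] = new_clus
        pos += n
    return delta
```

If a highly compressible region is overwritten with data that compresses poorly, `update_range` returns a positive number, meaning the segment no longer fits in the CLUs it occupied before. Handling that case without fragmenting the compressed file is presumably where the extra complexity of the write path lies.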
Based on the patent document, what is unique about the Storewiz compression implementation probably comes from:
- Random access to data in compressed stored files
- Operations on the compressed data without decompressing entire file
- Compression/decompression operations transparent to users
- User unawareness of the storage location of the compressed data