Tuesday, June 12, 2007

Where do you focus, Bandwidth or Latency?

Since my first post about Gear6, Gary Orenstein and I have been exchanging emails discussing various aspects of storage caching and Gear6. Recently, he commented in response to my request for pointers on the storage caching market and implementations:
When I find interesting items related to caching I usually post on our blog. The thing is, there really hasn't been anyone promoting network-based caching until Gear6.
With rising interest in flash memory and SSDs, I am finding storage caching quite intriguing. I decided to start from the basics.

What problems does caching solve?

The major benefit of caching is reduced latency, whether the cache sits in the web, the network, the file system, a storage device, the processor, or memory. What is latency? The delay between a request and its response.
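As a toy illustration (not from the post; the function names and the 50 ms delay are invented for the sketch), a cache in front of a slow backend pays the full latency only on a miss, while repeated requests become near-instant local hits:

```python
import time

BACKEND_CALLS = 0
cache = {}

def slow_backend_read(key):
    """Simulate a slow storage device: every read pays a fixed delay."""
    global BACKEND_CALLS
    BACKEND_CALLS += 1
    time.sleep(0.05)  # pretend 50 ms of device latency
    return f"data-for-{key}"

def cached_read(key):
    """Serve from the in-memory cache when possible; fall back to the backend."""
    if key not in cache:
        cache[key] = slow_backend_read(key)  # miss: pay the full latency once
    return cache[key]                        # hit: near-zero latency

# The first read misses (slow); the next nine hit the cache (fast).
for _ in range(10):
    value = cached_read("block-42")

print(BACKEND_CALLS)  # the slow backend was touched only once
```

The bandwidth consumed is the same either way; what the cache buys is response time.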

Bandwidth Bias

One theme that struck me as odd as I started studying caching is how often we suggest more bandwidth as the solution to slow performance, and how little attention we give to the latency side of the problem. What is bandwidth? The amount of data carried from one point to another in a given time.

Even in the iSCSI world, we all hear that 10GbE will be the inflection point, indirectly giving the impression that bandwidth is the bottleneck to iSCSI adoption. What is the real bottleneck in iSCSI? Is it bandwidth or latency?

I guess it sounds more impressive to say, "With 10GbE, bandwidth will increase 10X, so you will be able to push ten times as much data," than to admit that latency will only be cut roughly in half.

From the productivity perspective of users and applications, a predictable and quick response to a request seems considerably more important than the amount of data transferred over a given period. What good does more bandwidth do if the data still has to wait to be processed? A balance between bandwidth and latency needs to be considered when designing solutions.
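To make the tradeoff concrete, a simple and admittedly idealized model puts request time at fixed latency plus size divided by bandwidth. The numbers below (a 4 KB I/O, a 0.5 ms round trip) are illustrative assumptions, not measurements:

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Idealized request time: fixed round-trip latency plus serialization time."""
    return latency_s + size_bytes / bandwidth_bps

KB = 1024
GbE = 1e9 / 8        # 1 GbE expressed in bytes per second
TEN_GbE = 10e9 / 8   # 10 GbE in bytes per second

# A small 4 KB I/O with an assumed 0.5 ms round-trip latency:
t_1g = transfer_time(4 * KB, 0.0005, GbE)
t_10g = transfer_time(4 * KB, 0.0005, TEN_GbE)

# Halving latency instead, still on plain 1 GbE:
t_half_latency = transfer_time(4 * KB, 0.00025, GbE)

# Ten times the bandwidth shrinks only the serialization term; for small
# requests the fixed latency dominates, so cutting latency helps far more.
print(f"1 GbE:               {t_1g * 1e6:.0f} us")
print(f"10 GbE:              {t_10g * 1e6:.0f} us")
print(f"1 GbE, half latency: {t_half_latency * 1e6:.0f} us")
```

For large sequential transfers the size term dominates and the extra bandwidth wins; for small random I/O, latency rules.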

In the end, my impression is that most of us tend to focus too much on bandwidth and too little on latency.


  1. If you're talking about iSCSI, then you're implying a WAN-based interconnect. The problem is really underutilized bandwidth due to TCP overhead (i.e., windowing and acknowledgments); for UDP it would just be the unreliability of delivery.

    As for latency in the WAN, unless you invent a particle accelerator, you're not going to fix anything; you can only mask the issue with technology such as WAFS.

    Latency reduction in an HPC-type LAN (in the datacenter) is what Gear6 does, and it is the basis of their entire argument of attacking the disk latency problem (regardless of controller caching and disk-based queueing technologies like TCQ/NCQ).

    Well, I'd like to see some numbers in two types of environments: heavily read-only environments and highly read-write/transactional environments.

  2. I don't see iSCSI in WAN environments. The 10GbE argument by iSCSI proponents is not about the WAN either. IMO, the iSCSI problem is not so much bandwidth as latency. I have seen and done enough iSCSI installs where the only thing between hosts and storage was a GigE network switch, but I haven't seen any across a WAN.

    I think Gear6's value goes beyond controller caching, as it is able to cache across multiple storage systems and most probably can handle changes in workload across those systems.