Monday, August 28, 2006

Power - The Theme of Data Center Outages

As soon as I finished my last rant (see Infrastructure Failure - Be Paranoid), a reader pointed out a similar incident close to home, at Fisher Plaza in Seattle.

Fisher Plaza is considered a premium data center, housing numerous high-profile clients, with ten different telecom carriers bringing fiber to the buildings. I visit the facility from time to time, as several customers are co-located there.

It seems Fisher Plaza experienced an outage a few weeks ago due to an electrical power equipment failure, knocking the KOMO TV and KOMO 1000 News Radio stations offline (see Unsinkable Data Center Crashes in Seattle). Several similar incidents were also reported in other cities (see Data Center Outages Bring Headaches, Headlines and InterNAPPing?).

The moral of these incidents for customers is very simple:
  • When planning mission-critical operations, look beyond just redundant power supplies, UPS and WAN connections. The facility is as important as your equipment in keeping your operations up. As a presenter at the AFCOM Data Center World Conference commented, "Many data centers just can't handle new technologies coming out" (see Five Predictions: Relocations and Outsourcing).
  • Perform due diligence not only on tenant happiness and satisfaction but also on crisis handling and management.
  • Murphy's Law is alive and kicking, even for well-planned, thought-out, mission-critical activities.

2 comments:

  1. I hear ya

    I used to help with the Web Tools at Exodus Communications and Cable and Wireless.

    Both were web hosting companies.

    We promised 99.99% uptime, which adds up to less than an hour of downtime a year (roughly 53 minutes).

  2. 99.99% uptime claims come with enough caveats to make your head spin! I have yet to see an outage that lasted only a few minutes. The Fisher Plaza downtime lasted 15+ minutes. At 99.99%, that is more than a quarter of a year's downtime quota spent in one incident.

    Customers should be asking about the number of downtime incidents and the duration of each one, and then deciding whether they can live with that.

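For anyone who wants to check the uptime arithmetic discussed in the comments, here is a minimal Python sketch. It assumes a 365-day year and no exclusions for scheduled maintenance (real SLAs often carve those out), and the function names are made up for illustration. It converts an uptime percentage into a yearly downtime allowance and shows how much of that allowance a single outage uses.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year, no leap day


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year allowed by a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * MINUTES_PER_YEAR


def budget_used(outage_minutes: float, uptime_percent: float) -> float:
    """Fraction of the yearly downtime allowance consumed by one outage."""
    budget = downtime_budget_minutes(uptime_percent)
    if budget == 0:
        raise ValueError("100% uptime leaves no downtime allowance")
    return outage_minutes / budget


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% uptime allows {downtime_budget_minutes(pct):.2f} min/year")
    # A 15-minute outage measured against a 99.99% promise
    print(f"15-minute outage uses {budget_used(15, 99.99):.0%} of the yearly allowance")
```

Under these assumptions, 99.99% works out to a bit under 53 minutes a year, and it takes "five nines" (99.999%) to get down to a few minutes. The total alone says nothing about how many incidents there were or how long each lasted, which is why those are the questions worth asking.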