Wednesday, August 16, 2006

Infrastructure Failure – Be Paranoid

Just after writing my previous blog post, Data Center Power Consumption and Heat Generation, I came across an interesting blog post with 400+ comments. It details the failure of hosting infrastructure at DreamHost (see Anatomy of a(n ongoing) Disaster..). In my opinion, it is recommended reading for everyone who manages or designs IT infrastructure for a living.

Here are excerpts from the post with my commentary and takeaways.
Ironically, all the recent disasters stem somewhat from us attempting to take some proactive steps to head off any sort of future power outages like the kind we experienced last year.
Instead of narrowly focusing on preventing something from happening again, use the event as a wake-up call. Assess your environment for other potential risks and develop a comprehensive plan to address them. Also, be aware of new problems that may arise while you are solving another problem. I like to use the chess analogy: the further ahead you anticipate moves, the better your chances of prevailing.
We're now basically 95% of their data center.
Consider how important you are to your vendor and leverage your position to negotiate a better deal.
The Garland Building is supposed to be an excellent place for data centers. There are more than a dozen in the building. Companies like iPowerWeb, Media Temple, BroadSpire, and even MySpace (now the most popular website in the whole US!) are in there.
Do your own due diligence even if you think the vendor has already passed the due diligence of a larger, well-known company. The vendor's failure may be critical to your business and a drop in the bucket for these "other" companies. It is not uncommon for vendors to offer sweet deals to attract high-profile companies.
Around last June though, the building informed all its data center tenants that they had essentially run out of power!
Don't wait for the other shoe to drop before taking action. Be paranoid.
After months of searching and negotiating with Alchemy, we still had to get Switch and Data to allow us to put a cross-connect in from their data center over to their competitors down the hall.
Finding an alternate data center down the hall may seem like a quick and easy fix for the existing power problem. But such short-sighted point solutions fail to address other lingering issues such as Disaster Recovery / Business Continuity. I guess the company's plan is to wait until problems occur before addressing them.

No incumbent makes a competitor's entry easy and painless. Get specific deliverables while you have the leverage. Be prepared to work alone, and have a Plan B that assumes total non-cooperation from the vendor.
