Publisher's Outlook

Amazon EC2 Outage: What the Experts Tell Us

By Rich Tehrani, CEO, Technology Marketing Corporation  |  May 01, 2011

This article originally appeared in the May 2011 issue of Customer Interaction Solutions

The recent outage of Amazon’s EC2 service affected a number of leading-edge companies, which, in some cases, used the fact that they housed little to no infrastructure on their premises as a selling point to investors. Certainly, the cloud computing market is in what we could call a post-evangelism phase, where there seems to be universal agreement that the cloud has a role in most organizations – at least to help with some tasks – if not all. We know the concept of hosted solutions isn’t new – many companies outsource payroll, for example, or tax preparation.

But, when the largest company providing cloud infrastructure has a major outage of one of its availability zones lasting days, it’s time to sit back and reflect on the challenges of moving wholesale to the cloud without thinking the concept through.

To get a sense of what some of the major players in the space think, I reached out to a number of people for feedback. Here are some of the responses – as I get more, I will be updating this piece at tehrani.com.

Thomas Howe, of consulting firm Embrase, produces an event with TMC called Cloud Communications Expo. When I asked him whether companies should relocate to the cloud, he had this to say:

“Well, in general, they need to run in the cloud. The question isn’t is the cloud safe; the question is, is the cloud safer than what I can do? For nearly all companies, Amazon wins that battle.”

When asked how to limit the damage from such outages, he suggested working with more than one vendor, such as Amazon and Rackspace.

I reached out to John Engates, CTO at Rackspace Hosting, who said, “Companies should do their due diligence with regard to the promises made by their cloud providers and the SLAs in place to back them up. Transparency is key. If the provider is not transparent with its architecture, it’s impossible to fully understand if what you’re building is going to be resilient and highly available on top of that provider.”

He echoed Howe’s sentiment, explaining that cloud outages get major media attention, but outages happen all the time in corporate data centers. He further likened a cloud outage to an airline crash: people will continue to fly after a disaster, but a thorough investigation generally results in many lessons learned for the entire industry. In a corporate data center, he said, that kind of investigation happens in isolation, if at all.

Another important point Engates made is that resources can and should be considered disposable. Rather than protecting a single server with high-availability components, as you would see in a typical data center, you should build a group or cluster of servers to handle the job. He says the companies that used this approach in a multi-data center or multi-cloud manner generally took the Amazon outage in stride.
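
For readers who think in code, here is a minimal sketch of that idea. It is an illustration only, not how Amazon, Rackspace or any company mentioned here actually builds things. The class names, server names and location labels (such as "provider-a/zone-1") are all hypothetical. The point is simply that when identical, disposable servers are spread across locations, losing one location removes some capacity but does not stop the service.

```python
class Server:
    """One disposable server. Names and locations are hypothetical labels."""

    def __init__(self, name, location):
        self.name = name
        self.location = location  # e.g. "provider-a/zone-1" (made up)
        self.healthy = True


class Pool:
    """A group of interchangeable servers spread across locations."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index where the next round-robin search starts

    def fail_location(self, location):
        """Simulate an outage: every server in that location goes down."""
        for server in self.servers:
            if server.location == location:
                server.healthy = False

    def pick(self):
        """Return the next healthy server, round-robin; raise if none remain."""
        count = len(self.servers)
        for offset in range(count):
            index = (self._next + offset) % count
            server = self.servers[index]
            if server.healthy:
                self._next = (index + 1) % count
                return server
        raise RuntimeError("no healthy servers left in the pool")


# Three identical servers in three different (hypothetical) locations.
pool = Pool([
    Server("web-1", "provider-a/zone-1"),
    Server("web-2", "provider-a/zone-2"),
    Server("web-3", "provider-b/zone-1"),
])

# One location goes dark; requests are still served by the other two.
pool.fail_location("provider-a/zone-1")
survivor = pool.pick()
```

A pool whose servers all sit in one location would raise an error after the same simulated outage, which is roughly the position a single-zone deployment would be in.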

I also reached out to Joe Staples, CMO of contact center solutions provider Interactive Intelligence, which offers hosted services as well as premises-based hardware and software. Joe agreed that cloud environments are generally more secure and reliable than data centers.

When asked how customers can limit damage from such outages, Staples responded, “Develop a solid business continuity plan. Look for alternate providers that can deliver basic services in case of a primary outage. Ask the question, how long could we be without this service? If it is a critical application, customers should spend the added money to ensure they have a solid business continuity plan to keep them up and running.”
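
The question of how long a business could go without a service can be turned into rough arithmetic. The sketch below is a back-of-the-envelope illustration, and the percentages in it are hypothetical examples rather than figures from any provider's SLA. It converts an availability percentage into hours of permitted downtime per year, and estimates the combined availability of two providers under the strong assumption that they fail independently of each other.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime that fit within an availability percentage."""
    return (1 - availability_pct / 100) * period_hours


def combined_availability_pct(first_pct, second_pct):
    """Availability of two providers used as alternates.

    Assumes the two fail independently, which is an idealization:
    the service is down only when both are down at the same time.
    """
    both_down = (1 - first_pct / 100) * (1 - second_pct / 100)
    return (1 - both_down) * 100


# Hypothetical figures: one provider at 99.95%, an alternate at 99.9%.
single = allowed_downtime_hours(99.95)
pair = allowed_downtime_hours(combined_availability_pct(99.95, 99.9))
```

With these made-up numbers, a single provider at 99.95 percent leaves room for a little over four hours of downtime a year, while the pair, if truly independent, leaves room for well under a minute. Real providers can share failure causes, so the second figure is best read as an upper bound on the benefit.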

The general consensus here is that putting an application in the cloud will generally make it more secure than it would be in a typical data center, but if you want it to be more resilient, you need to architect a cloud-based solution that can withstand outages in specific data center locations, or even across an entire cloud vendor.

As more and more companies begin to seriously consider the move to cloud-based services, the timing of this outage could even be considered fortunate in hindsight.


Rich Tehrani is CEO of TMC. In addition, he is the Chairman of the world’s best-attended communications conference, INTERNET TELEPHONY Conference & EXPO (ITEXPO). He is also the author of his own communications and technology blog.

Edited by Jennifer Russell