TMCnet Feature
July 30, 2012

Netflix Makes Chaos Monkey an Open Source Cloud-Crasher

By Steve Anderson, Contributing TMCnet Writer

It may seem a bit counter-intuitive for a company to have a tool designed to deliberately cause apps to fail, but as sages and pundits have said for centuries, failure is often a tool to make things stronger. That's just what Netflix was thinking when it made Chaos Monkey, its primary app-crashing tool, open source so that other cloud services can make themselves stronger, more resilient, and better overall.



Chaos Monkey, named for its ability to cause chaos and, as a result, failure in applications running on Amazon Web Services, is designed to help developers see not only what goes into a crash but also how to fix their apps so that such crashes don't happen in the first place. Chaos Monkey, according to Netflix, "randomly disables production instances to make sure it can survive common types of failure without any customer impact". Setting Chaos Monkey loose, meanwhile, gives engineers the chance to develop automated recovery processes, which in turn allow failures to go largely unnoticed as the system recovers on its own without outside prodding.
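For readers who want a feel for the idea, here is a toy sketch in Python of the loop described above: one instance is knocked out at random, and an automated recovery step restores capacity. This is not Netflix's code and does not touch any real cloud API. The function names, the instance IDs, and the "replacement-N" naming are all made up for illustration, and the recovery step is only a stand-in for whatever a real system would do.

```python
import random


def pick_victim(instances, rng):
    """Return one instance name chosen at random, or None if the group is empty."""
    # Sorting first makes the choice reproducible for a seeded rng.
    return rng.choice(sorted(instances)) if instances else None


def chaos_round(instances, desired_size, rng):
    """Disable one random instance, then run a toy recovery step.

    `instances` is a set of (hypothetical) instance names and is modified
    in place. Returns the name of the instance that was disabled.
    """
    victim = pick_victim(instances, rng)
    if victim is not None:
        instances.discard(victim)  # simulate the instance failing

    # Hypothetical automated recovery: add replacements until the group
    # is back at its desired size, with no human involved.
    next_id = 0
    while len(instances) < desired_size:
        name = f"replacement-{next_id}"
        if name not in instances:
            instances.add(name)
        next_id += 1
    return victim


if __name__ == "__main__":
    group = {"i-a", "i-b", "i-c"}  # made-up instance IDs
    rng = random.Random(2012)
    killed = chaos_round(group, desired_size=3, rng=rng)
    print("disabled:", killed)
    print("group after recovery:", sorted(group))
```

The point of the sketch is the shape of the exercise: if the group is back at full size after every round, a random failure should pass without customers noticing.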

In a recent blog post, Netflix detailed its plans to release not only Chaos Monkey but also several other tools geared toward helping companies improve their cloud offerings. According to Netflix, the next release will likely be Janitor Monkey, a system that helps keep costs down by keeping environments clean and running efficiently.

With instance failure as common as it is in cloud applications, it's not surprising that an organization like Netflix would want ways to make its systems resilient. Considering the kind of service Netflix offers, it needs to be up and running at all hours of the day and night, across multiple time zones, countries and even continents, and tools like Chaos Monkey and Janitor Monkey help it do just that.

Indeed, the entire cloud needs that kind of resilience, that kind of self-repair capability and that kind of always-on functionality. There will always be failures in the cloud, but the more that companies can do to let the system right itself, the better off the whole concept will be for users and providers alike.



Edited by Brooke Neuman