From the post:
Netflix has moved on from just releasing the tools it uses to test the resilience of the cloud services that power the video streaming company, and has now open sourced a library that it uses to engineer in that resilience. Hystrix is an Apache 2 licensed library which Netflix engineers have been developing over the course of 2012 and which has been adopted by many teams within the company. It is designed to manage how distributed services interact and give more tolerance to latency within those connections and the inevitable failures that can occur.
The library isolates access points between services and then stops any failures from cascading between those access points. Hystrix uses a Command pattern to execute or queue Command objects and evaluate whether the circuit to the service for which the command is destined for is in operation. This may not be the case where what Hystrix calls a circuit breaker has triggered leaving the circuit “open”. Circuit breakers can be placed into a system to make it easier to trigger a coordinated failover. The library also checks for other issues which may prevent the execution of the command.
Does your distributed TM have the resilience of Netflix?
Is that the new “normal” for resilience?
The post goes on to say that a dashboard is forthcoming to monitor Hystrix.