Articles tagged "Software resilience"

Short Introduction to This Paper

This paper introduces how Etsy uses "GameDay" exercises to build confidence in how its systems behave. Specifically, it discusses 1) why the exercises are run in the production environment, 2) how to do fault injection during a GameDay exercise, 3) the business justification, and 4) a case study, along with limitations and fears.

Highlights of This Paper

  • An introduction to provisioning a server or cloud instance from zero to production
  • An explanation of why many complex systems are largely intractable
  • The GameDay exercise pattern, which shows how fault injection is carried out at a real company


Short Introduction to This Paper

This paper aims to analyze and improve how software handles unanticipated exceptions. The first objective is to set up contracts about exception handling and a way to assess them automatically. The second is to improve the resilience capabilities of software by transforming the source code. The authors devise an algorithm, called short-circuit testing, which injects exceptions during test suite execution to simulate unanticipated errors. It is a fault-injection technique dedicated to exception handling. The algorithm collects data that is used to verify two formal contracts capturing two resilience properties w.r.t. exceptions: the source-independence and pure-resilience contracts. The authors then propose a code modification technique, called “catch-stretching”, which makes error-recovery code (in the form of catch blocks) more resilient.

Highlights of This Paper

  • This work shows that it is possible to reason about software resilience by injecting exceptions during test suite execution
  • Definition of two contracts for exception handling: source independence contract, pure resilience contract
  • An algorithm and four predicates to verify whether a try-catch satisfies those contracts
  • A source code transformation to improve the resilience against exceptions
  • An empirical evaluation on 9 open-source applications, with one test suite each, showing that resilient try-catch blocks exist in practice

Key Information


  • Source-independent: A try-catch is source-independent if the catch block proceeds equivalently, whatever the source of the caught exception is in the try block
  • Pure Resilience: A try-catch is purely resilient if the system state is equivalent at the end of the try-catch execution whether or not an exception occurs in the try block
  • Short-circuit testing consists of dynamically injecting exceptions during the test suite execution in order to analyze the resilience of try-catch blocks
  • Catch Stretching: Replacing the type of the caught exceptions so that catch blocks catch more exceptions than before. For instance, replacing catch(FileNotFoundException e) with catch(IOException e). The extreme of catch stretching is to parametrize the catch with the most generic exception type (e.g. Throwable in Java, Exception in .NET)
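
The ideas above can be illustrated with a small Java sketch. It is not the authors' tool: the class, the injection hook shortCircuit(), the readPrimary/readBackup helpers and the passes() oracle are all hypothetical names made up for this example. It assumes that an exception is injected at the very start of the try block and that a simple equality check plays the role of the test suite.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class CatchStretchingSketch {

    /** Hypothetical injection hook: if non-null, thrown at the start of the try block. */
    static Exception toInject = null;

    /** Simulates short-circuit testing by throwing the configured exception, if any. */
    static void shortCircuit() throws IOException {
        if (toInject instanceof IOException) throw (IOException) toInject;
        if (toInject instanceof RuntimeException) throw (RuntimeException) toInject;
    }

    /** Hypothetical primary data source. */
    static String readPrimary(String key) throws IOException {
        return "v:" + key;
    }

    /** Hypothetical backup source that yields the same value as the primary. */
    static String readBackup(String key) {
        return "v:" + key;
    }

    /** Original code: the catch block only handles FileNotFoundException. */
    public static String lookupOriginal(String key) throws IOException {
        try {
            shortCircuit();
            return readPrimary(key);
        } catch (FileNotFoundException e) {
            return readBackup(key);
        }
    }

    /** After catch stretching: the same recovery code now handles any IOException. */
    public static String lookupStretched(String key) {
        try {
            shortCircuit();
            return readPrimary(key);
        } catch (IOException e) {
            return readBackup(key);
        }
    }

    /** Stand-in for a test suite: does the lookup still return the expected value? */
    public static boolean passes(Exception injected, boolean stretched) {
        toInject = injected;
        try {
            String result = stretched ? lookupStretched("k") : lookupOriginal("k");
            return "v:k".equals(result);
        } catch (Exception e) {
            return false; // the injected exception escaped the try-catch
        } finally {
            toInject = null; // always remove the injection afterwards
        }
    }

    public static void main(String[] args) {
        System.out.println(passes(null, false));                           // true
        System.out.println(passes(new FileNotFoundException("x"), false)); // true
        System.out.println(passes(new IOException("x"), false));           // false
        System.out.println(passes(new IOException("x"), true));            // true
    }
}
```

In this toy setup the original try-catch behaves as purely resilient for the exception type it already catches, since the observable result is the same with or without the injected FileNotFoundException. The stretched version keeps that property when a more general IOException is injected, which is the kind of evidence that would justify the stretching.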

Relevant Future Works

  • Further exploring how to improve the resilience of software applications: the scope of try blocks can be automatically adapted while still satisfying the test suite
  • The purely resilient catch blocks could probably be used elsewhere because they have a real recovery power
  • The resilience oracle need not be only a test suite; it could also be, for example, metamorphic relations or production traces
  • Automated refactoring of the relevant test suite

Questions

  • How should catch stretching be done when a try has multiple catch blocks? Also, the original test suites may not be sufficient to verify the new catch blocks

URL

Exception Handling Analysis and Transformation Using Fault Injection - Study of Resilience Against Unanticipated Exceptions