AdaStress: Adaptive Stress Testing and Interpretable Categorization for Safety-Critical Systems
2019-05-21T19:30:32Z (GMT) by
This thesis considers tools and techniques for the design-time validation of cyber-physical systems, where a software system interacts with or controls a physical system over time. We focus on safety-critical systems that may be fully or partially autonomous. The goal of the design-time validation is to identify and diagnose potential failures during system development so that issues can be addressed before they can manifest in operation. However, finding and analyzing failure scenarios in cyber-physical systems can be very challenging due to the size and complexity of the system, interactions with large environments, operation over time, black box and hidden states, rarity of failures, heterogeneous variable types, and difficulty in diagnosing failures. This thesis presents AdaStress, a set of design-time validation tools for finding and analyzing the most likely failure scenarios of a safety-critical system. We present adaptive stress testing (AST), a framework for simulation-based stress testing to find the most likely path to a failure event. The key innovation in AST is to frame the search for the most likely failure scenarios as a sequential decision-making problem and then use reinforcement learning algorithms to adaptively search the scenarios. To handle systems with hidden state, we present an algorithm for AST, based on Monte Carlo tree search and pseudorandom seeds, that can be applied to test systems where the state is not fully observable. Furthermore, we present differential adaptive stress testing (DAST), an extension to AST. DAST compares the failure behavior of two systems. Specifically, DAST finds the most likely scenarios where a failure occurs with the system under test but not with a baseline system. This type of differential analysis is useful, for example, when choosing between two candidate systems or in regression testing.
Lastly, grammar-based decision tree (GBDT) learning is an algorithm for automatically categorizing failure events based on their most relevant patterns. The algorithm combines a context-free grammar, temporal logic, and decision tree to produce categorizations with human-interpretable explanations. We demonstrate AdaStress on two cyber-physical systems within aerospace. The first application analyzes prototypes of the next-generation Airborne Collision Avoidance System (ACAS X) in simulated aircraft encounters. We find, categorize, and analyze the most likely scenarios of near mid-air collisions (NMACs). We also perform differential studies comparing ACAS X to the existing Traffic Alert and Collision Avoidance System (TCAS). Our results give confidence that ACAS X offers a safety benefit over TCAS. The second application analyzes a prototype trajectory planning system for a small unmanned aircraft navigating through a three-dimensional maze. We find and analyze the most likely collision scenarios and planning failures. Our analysis identifies a variety of potential safety issues that include algorithmic robustness issues, emergent behaviors from interacting systems, and implementation bugs.
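To illustrate the seed-based formulation of AST described above, the following is a minimal sketch and not the thesis implementation. It assumes a hypothetical black-box simulator, ToyWalk, whose only interface is a step driven by a pseudorandom seed that returns the log-probability of the sampled transition and a failure flag. Plain random search over seed sequences stands in for the Monte Carlo tree search and reinforcement learning algorithms used in the thesis. All names, parameters, and the failure condition are assumptions made for illustration.

```python
import math
import random


class ToyWalk:
    """Hypothetical black-box simulator: a 1-D Gaussian random walk
    that 'fails' when the position reaches a threshold."""

    def __init__(self, horizon=10, threshold=3.0):
        self.horizon = horizon
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.x = 0.0  # hidden state: the search never reads it
        self.t = 0

    def step(self, seed):
        # All stochasticity in this step is determined by the seed.
        rng = random.Random(seed)
        dx = rng.gauss(0.0, 1.0)
        self.x += dx
        self.t += 1
        # Log-density of the sampled disturbance under N(0, 1).
        log_p = -0.5 * dx * dx - 0.5 * math.log(2.0 * math.pi)
        failed = self.x >= self.threshold
        done = failed or self.t >= self.horizon
        return log_p, failed, done


def rollout(sim, seeds):
    """Replay a seed sequence from the initial state.
    Returns (total log-likelihood, failure flag, seeds actually used)."""
    sim.reset()
    total, used = 0.0, []
    for seed in seeds:
        log_p, failed, done = sim.step(seed)
        total += log_p
        used.append(seed)
        if done:
            return total, failed, used
    return total, False, used


def random_search(sim, n_rollouts=2000, master_seed=0):
    """Keep the failing seed sequence with the highest log-likelihood.
    Random search is a stand-in for MCTS or RL, chosen for brevity."""
    master = random.Random(master_seed)
    best = None
    for _ in range(n_rollouts):
        seeds = [master.randrange(2**31) for _ in range(sim.horizon)]
        total, failed, used = rollout(sim, seeds)
        if failed and (best is None or total > best[0]):
            best = (total, used)
    return best


if __name__ == "__main__":
    result = random_search(ToyWalk())
    if result is not None:
        log_likelihood, seeds = result
        print("most likely failure found:", log_likelihood, seeds)
```

Because the search only chooses seeds and observes the returned likelihood and failure flag, it never needs access to the simulator's internal state, and any failure it finds can be reproduced exactly by replaying the stored seed sequence with rollout.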