This thesis considers tools and techniques for the design-time validation of cyber-physical systems, where a software system interacts with or controls a physical system over time. We focus on safety-critical systems that may be fully or partially autonomous. The goal of the design-time validation is to identify and diagnose potential failures during system development so that issues can be addressed before they can manifest in operation. However, finding and analyzing failure scenarios in cyber-physical systems can be very challenging due to the size and complexity of the system, interactions with large environments, operation over time, black box and hidden states, rarity of failures, heterogeneous variable types, and difficulty in diagnosing failures. This thesis presents AdaStress, a set of design-time validation tools for finding and analyzing the most likely failure scenarios of a safety-critical system. We present adaptive stress testing (AST), a framework for simulation-based stress testing to find the most likely path to a failure event. The key innovation in AST is to frame the search for the most likely failure scenarios as a sequential decision-making problem and then use reinforcement learning algorithms to adaptively search the scenarios. To handle systems with hidden state, we present an algorithm for AST, based on Monte Carlo tree search and pseudorandom seeds, that can be applied to test systems where the state is not fully observable. Furthermore, we present differential adaptive stress testing (DAST), an extension to AST. DAST compares the failure behavior of two systems. Specifically, DAST finds the most likely scenarios where a failure occurs with the system under test but not with a baseline system. This type of differential analysis is useful, for example, when choosing between two candidate systems or in regression testing.
Lastly, we present grammar-based decision tree (GBDT) learning, an algorithm for automatically categorizing failure events based on their most relevant patterns. The algorithm combines a context-free grammar, temporal logic, and a decision tree to produce categorizations with human-interpretable explanations. We demonstrate AdaStress on two cyber-physical systems within aerospace. The first application analyzes prototypes of the next-generation Airborne Collision Avoidance System (ACAS X) in simulated aircraft encounters. We find, categorize, and analyze the most likely scenarios of near mid-air collisions (NMACs). We also perform differential studies comparing ACAS X to the existing Traffic Alert and Collision Avoidance System (TCAS). Our results give confidence that ACAS X offers a safety benefit over TCAS. The second application analyzes a prototype trajectory planning system for a small unmanned aircraft navigating through a three-dimensional maze. We find and analyze the most likely collision scenarios and planning failures. Our analysis identifies a variety of potential safety issues that include algorithmic robustness issues, emergent behaviors from interacting systems, and implementation bugs.
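
As an illustration of the seed-based formulation of AST described above, the following minimal sketch treats a simulator as a black box whose only input at each step is a pseudorandom seed. It is not the AdaStress implementation. The toy random-walk system, the names BlackBoxSim, rollout, and random_search, and the penalty constants are all hypothetical, and plain random search stands in for Monte Carlo tree search to keep the example short. The search never reads the simulator state; it only records seeds, so any failure it finds can be replayed exactly.

```python
import math
import random


class BlackBoxSim:
    """Hypothetical toy system: a 1-D random walk whose position is hidden."""

    def __init__(self, horizon=20, threshold=3.0):
        self.horizon = horizon
        self.threshold = threshold
        self.reset()

    def reset(self):
        self._x = 0.0  # hidden state, never exposed to the search
        self._t = 0

    def step(self, seed):
        """Advance one step; all randomness comes from an RNG seeded with `seed`."""
        rng = random.Random(seed)
        w = rng.gauss(0.0, 1.0)  # disturbance sampled inside the simulator
        self._x += w
        self._t += 1
        # Log-density of the sampled disturbance under the standard normal model.
        log_prob = -0.5 * w * w - 0.5 * math.log(2.0 * math.pi)
        failed = abs(self._x) >= self.threshold
        done = failed or self._t >= self.horizon
        miss = max(0.0, self.threshold - abs(self._x))  # distance to failure
        return log_prob, failed, done, miss


def rollout(sim, seeds):
    """Replay a seed sequence; return (total reward, failure flag, seeds used)."""
    sim.reset()
    total, failed, miss, used = 0.0, False, 0.0, []
    for seed in seeds:
        log_prob, failed, done, miss = sim.step(seed)
        total += log_prob
        used.append(seed)
        if done:
            break
    if not failed:
        # Assumed penalty constants: non-failures score far below any failure path.
        total -= 1000.0 + 100.0 * miss
    return total, failed, used


def random_search(sim, iterations=2000, master_seed=0):
    """Stand-in for MCTS: sample seed sequences and keep the highest-reward one."""
    master = random.Random(master_seed)
    best = (-math.inf, False, [])
    for _ in range(iterations):
        seeds = [master.randrange(2**32) for _ in range(sim.horizon)]
        result = rollout(sim, seeds)
        if result[0] > best[0]:
            best = result
    return best
```

Calling random_search(BlackBoxSim()) returns the best total reward, a failure flag, and the seed sequence that produced it; passing that sequence back to rollout reproduces the same trajectory. Maximizing the summed log-probabilities subject to reaching a failure is what steers the search toward the most likely failure path rather than an arbitrary one.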