Dependable Cyber-Physical Systems
Cyber-physical systems (CPS) enable a new class of applications that perceive their surroundings using raw data from sensors, monitor the timing of dynamic processes, and control the physical environment. Because failures and misbehaviors in application domains such as cars, medical devices, and nuclear power plants may cause significant damage to life or property, CPS need to be safe and dependable. A conventional way of improving dependability is to use redundant hardware to replicate the whole (sub)system. Although hardware replication has been widely deployed in conventional mission-critical systems, it is cost-prohibitive for many emerging CPS application domains, and it also limits system flexibility. This dissertation studies the problem of making CPS affordably dependable and develops a system-level framework that manages critical CPS resources, including processors, networks, and sensors. Our framework, called SAFER (System-level Architecture for Failure Evasion in Real-time applications), incorporates configurable software mechanisms and policies to tolerate failures of critical CPS resources while meeting timing constraints. It supports adaptive graceful degradation, the effective use of different sensor modalities, and the fault-tolerance schemes of hot standby, cold standby, and re-execution. SAFER reliably and efficiently allocates tasks and their backups to CPU and sensor resources while satisfying network traffic constraints. It also fuses and (re)configures the sensor data used by tasks to recover from system failures. The SAFER framework aims to guarantee the timeliness of four categories of tasks: (1) tasks with periodic arrivals, (2) tasks with continually varying periods, (3) tasks with parallel threads, and (4) tasks with self-suspensions. We offer schedulability analyses and runtime support for such tasks, both with and without resource failures.
Finally, we evaluate the functionality of the proposed framework on a self-driving car that uses SAFER. We conclude that the framework analytically satisfies timing constraints and operates systems predictably, both with and without resource failures, hence making CPS dependable and timely.