posted on 2010-11-01, 00:00authored byJohann Schumann, Ole J Mengshoel, Ashok Srivastava, Adnan Darwiche
More and more systems (e.g., aircraft, machinery, cars) rely
heavily on software, which performs safety-critical operations. Assuring software safety though traditional V&V has become a tremendous, if not impossible task, given the growing size and complexity of the software. We propose that iSWHM (Integrated SoftWare Health Management) can increase safety and reliability of high-assurance software systems. iSWHM uses advanced techniques from the area of system health management in order to continuously monitor the behavior of the software during operation, quickly detect anomalies and perform automatic and reliable root-cause analysis, while not replacing traditional V&V. Information provided by the iSWHM system can be used for automatic mitigation mechanisms (e.g., recovery, dynamic reconfiguration) or presented to a human operator. iSWHM’s prognostic capabilities will further improve reliability and availability as it provides information
about soon-to-occur failures or looming performance bottlenecks. In this paper, we will discuss challenges and future potential and describe how Bayesian networks (BN) could be used for iSWHM modeling.