Architecture-Based Run-Time Fault Diagnosis

Casanova, Paulo; Schmerl, Bradley; Garlan, David; Abreu, Rui

doi:10.1184/R1/6621152.v1

journal contribution

posted on 2011-09-01, 00:00 authored by Paulo Casanova, Bradley Schmerl, David Garlan, Rui Abreu

An important step in achieving robustness to run-time faults is the ability to detect and repair problems when they arise in a running system. Effective fault detection and repair could be greatly enhanced by run-time fault diagnosis and localization, since it would allow the repair mechanisms to focus adaptation effort on the parts most in need of attention. In this paper we describe an approach to run-time fault diagnosis that combines architectural models with spectrum-based reasoning for multiple fault localization. Spectrum-based reasoning is a lightweight technique that takes a form of trace abstraction and produces a list (ordered by probability) of likely fault candidates. We show how this technique can be combined with architectural models to support run-time diagnosis that can (a) scale to modern distributed software systems; (b) accommodate the use of black-box components and proprietary infrastructure for which one has neither a specification nor source code; and (c) handle inherent uncertainty about the probable cause of a problem even in the face of transient faults and faults that arise only when certain combinations of system components interact.
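To make the idea of spectrum-based reasoning concrete, the sketch below ranks components from a small set of observed transactions. It is illustrative only and is not the paper's algorithm: it substitutes the simpler Ochiai similarity coefficient, which yields a similarity score rather than the probability-ordered, multiple-fault candidates described in the abstract. The component names (`web`, `dispatcher`, `db`), the transactions, and the function `ochiai_ranking` are all hypothetical.

```python
from math import sqrt


def ochiai_ranking(spectra, errors, components):
    """Rank components by Ochiai similarity between involvement and failure.

    spectra[i][j] is 1 if component j took part in transaction i, else 0.
    errors[i] is 1 if transaction i failed, else 0.
    NOTE: Ochiai is a stand-in for illustration, not the paper's method.
    """
    total_failed = sum(errors)
    scores = {}
    for j, name in enumerate(components):
        # Failed transactions that involved this component.
        n11 = sum(1 for row, e in zip(spectra, errors) if row[j] and e)
        # Passed transactions that involved this component.
        n10 = sum(1 for row, e in zip(spectra, errors) if row[j] and not e)
        denom = sqrt(total_failed * (n11 + n10))
        scores[name] = n11 / denom if denom else 0.0
    # Highest score first: most suspicious component at the top.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical architecture-level components and observed transactions.
components = ["web", "dispatcher", "db"]
spectra = [
    [1, 1, 0],  # transaction 1: web + dispatcher
    [1, 0, 1],  # transaction 2: web + db
    [1, 1, 1],  # transaction 3: all three
    [1, 1, 0],  # transaction 4: web + dispatcher
]
errors = [0, 1, 1, 0]  # transactions 2 and 3 failed

ranking = ochiai_ranking(spectra, errors, components)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this made-up data, `db` takes part in every failed transaction and in no passing one, so it ranks first. The input needs only a record of which components took part in each transaction and whether that transaction failed, which is consistent with the abstract's point that neither specifications nor source code are required for black-box components.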

History

Publisher Statement

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-23798-0_29

Date

2011-09-01

Keywords

Autonomic computing; diagnosis; software architecture; run-time

Licence

In Copyright
