Error Detection with Memory Tags
Achieving and maintaining system reliability is an important problem that has become more critical as the use of computers has become more common. Fundamental to improved reliability is the ability to detect errors promptly, before their effects can propagate.
This dissertation proposes methods for using storage tags to detect a broad class of hardware and software errors that might otherwise go undetected. Moreover, the suggested schemes require minimal extensions to the hardware of typical computers. In fact, it is shown that in many situations tags can be added to words of storage without using any extra bits at all.
Although tagging is central to the discussion, the methods used differ radically from those used in traditional tagged architectures. Most notably, no attempt is made to use the tags to control what computations are performed. Instead, the tags are used only to check the consistency of those operations that would be performed anyway in the absence of tagging. By so doing, redundancy already present in typical programs can be harnessed for detecting errors. Furthermore, it becomes possible to check an arbitrary number of assertions using only a small tag of fixed size.
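As an illustration only, the following Python sketch shows one way such consistency checking might look. The abstract does not specify the mechanism, so everything here is an assumption: the 4-bit tag width, the toy nibble-XOR check code standing in for parity or ECC bits a machine might already carry, the idea of folding the tag into those check bits so that no extra storage is needed, and the hypothetical names `make_tag`, `check_bits`, `TaggedMemory`, and `TagMismatch`.

```python
import zlib

TAG_BITS = 4                       # assumed tag width, chosen for illustration
TAG_MASK = (1 << TAG_BITS) - 1


def make_tag(*attributes):
    """Compress any number of expected properties into one fixed-size tag.

    Hypothetical helper: a CRC of the attributes, truncated to TAG_BITS.
    """
    return zlib.crc32(repr(attributes).encode()) & TAG_MASK


def check_bits(data):
    """Toy check code: XOR of the word's 4-bit nibbles.

    Stands in for whatever parity/ECC field the memory already has.
    """
    code = 0
    while data:
        code ^= data & TAG_MASK
        data >>= TAG_BITS
    return code


class TagMismatch(Exception):
    """Raised when a word's stored check field disagrees with the expected tag."""


class TaggedMemory:
    """Word store whose existing check field also carries the tag (assumed scheme)."""

    def __init__(self):
        self._cells = {}           # address -> (data, check field)

    def store(self, address, data, tag):
        # Fold the tag into the check bits instead of storing it separately.
        self._cells[address] = (data, check_bits(data) ^ tag)

    def load(self, address, expected_tag):
        data, stored_check = self._cells[address]
        # The tag never changes what is computed; it only checks consistency.
        if check_bits(data) ^ expected_tag != stored_check:
            raise TagMismatch(f"inconsistent access at address {address:#x}")
        return data


if __name__ == "__main__":
    memory = TaggedMemory()
    tag = make_tag("integer", "owner: frame 3", 0x100)
    memory.store(0x100, 0xBEEF, tag)
    print(hex(memory.load(0x100, tag)))    # consistent access succeeds
    try:
        memory.load(0x100, tag ^ 1)        # caller expects a different tag
    except TagMismatch as err:
        print("detected:", err)
```

In this sketch a load returns exactly the data it would return without tagging, and a hardware fault (a flipped data bit) and a software fault (a caller expecting the wrong tag) surface through the same check. Because many attributes are compressed into a few bits, two different sets of attributes can map to the same tag, so a small tag of this kind would detect inconsistencies with high probability rather than with certainty.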
The dissertation examines various strategies for exploiting the proposed tagging mechanisms; both the positive and negative aspects of each application are considered. Finally, an example is described, showing how tagging might be implemented in a real machine.