Full metadata
Title
InCheck - an integrated recovery methodology for fine-grained soft-error detection schemes
Description
Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution to this challenge should ultimately eliminate the effect of soft errors on the microprocessor. While forward recovery techniques achieve fast recovery by simply voting out the wrong values, they incur the overhead of executing three copies. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead.
In this work I explored the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be built on it. After evaluating the available fine-grained recovery approaches, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques to provide fast error recovery. The proposed technique takes lightweight checkpoints at basic-block granularity and uses them for recovery.
To evaluate the effectiveness of the proposed technique, 10,000 fault injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck recovered from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times.
Date Created
2016
Contributors
- Lokam, Sai Ram Dheeraj (Author)
- Shrivastava, Aviral (Thesis advisor)
- Clark, Lawrence T (Committee member)
- Mubayi, Anuj (Committee member)
- Arizona State University (Publisher)
Topical Subject
Resource Type
Extent
vi, 29 pages : illustrations (some color)
Language
eng
Copyright Statement
In Copyright
Primary Member of
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.40720
Statement of Responsibility
by Sai Ram Dheeraj Lokam
Description Source
Viewed on January 3, 2017
Level of coding
full
Note
thesis
Partial requirement for: M.S., Arizona State University, 2016
bibliography
Includes bibliographical references (pages 27-29)
Field of study: Computer science
System Created
- 2016-12-01 07:01:24
System Modified
- 2021-08-30 01:20:48