Computing Reviews, the leading online review service for computing literature.

Fault Injection for Dependability Validation: A Methodology and Some Applications
Arlat J., Aguera M., Amat L., Crouzet Y., Fabre J., Laprie J., Martins E., Powell D. IEEE Transactions on Software Engineering 16(2): 166-182, 1990. Type: Article

Date Reviewed: Oct 1 1991

The authors have developed a moderately systematic approach for estimating the dependability of fault-tolerant computing systems. They propose injecting signals into the hardware at the integrated circuit pin level and monitoring the system to determine measures such as which faults are detected, which are corrected, and the length of time it takes the system to respond to the faults. The method depends on identifying the faults (F) to be detected by the system, the activations (A) of the system over which the faults might occur, the readouts (R) used to monitor the system’s operation, and the measures (M) derived from the readouts, so the dependability of the system can be analyzed. The authors dub these concerns the FARM set and point out that it may be used to analyze dependability from an axiomatic model of the system (such as a Petri net model), a simulation of the system, or a physical model of the system, including operational prototypes. The method has only been applied to physical models, however.

The authors have designed and built a fault injection system, called MESSALINE, to implement their approach. MESSALINE has been used to test two systems, and the complete results have been reported elsewhere. A summary of the results is included in this paper, providing measures such as the percentage of injected faults that were detected and the latency time for fault correction.

The method described in the paper seems to be a step in the right direction for validating the design of fault-tolerant systems. It cannot yet be called a methodology because the authors fail to give sufficient detail to permit others to apply the same approach; for example, they do not give a systematic means for deriving the activations to be used in testing. In addition, only hardware faults are considered; the authors make no attempt to analyze the behavior of the system with respect to software faults. Finally, no objective technique seems to exist for deciding how complete the coverage is that one obtains from using the approach.

For one not versed in the terminology and methods of hardware fault tolerance, the paper is difficult reading. Terminology, such as fault, failure, and activation, is largely undefined, and the authors have used a style that tends to obscure rather than clarify issues.
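
To make the FARM decomposition described above concrete, the following is a minimal sketch in Python of how faults, activations, readouts, and measures might relate. It is not MESSALINE's design or the authors' implementation: the `Experiment` record, its field names, the `measures` function, and all sample values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Experiment:
    """One injection run (hypothetical record, not from the paper)."""
    fault: str                   # F: the injected fault, e.g. a pin forced to a level
    activation: str              # A: the workload running when the fault was injected
    detected: bool               # R: readout, whether the system flagged the fault
    latency_ms: Optional[float]  # R: readout, response time; None if undetected


def measures(experiments: List[Experiment]) -> Dict[str, Optional[float]]:
    """M: derive summary measures from the readouts of all experiments."""
    if not experiments:
        raise ValueError("at least one experiment is required")
    detected = [e for e in experiments if e.detected]
    latencies = [e.latency_ms for e in detected if e.latency_ms is not None]
    return {
        # Fraction of injected faults that the system detected.
        "detected_fraction": len(detected) / len(experiments),
        # Mean response latency over detected faults, if any were timed.
        "mean_latency_ms": sum(latencies) / len(latencies) if latencies else None,
    }


# Made-up sample data: four injections, three of them detected.
runs = [
    Experiment("pin 3 stuck-at-0", "workload A", True, 2.0),
    Experiment("pin 3 stuck-at-1", "workload A", True, 4.0),
    Experiment("pin 7 stuck-at-0", "workload B", True, 6.0),
    Experiment("pin 7 stuck-at-1", "workload B", False, None),
]
summary = measures(runs)
```

With this invented data, `summary` holds a detected fraction of 0.75 and a mean latency of 4.0 ms. The sketch does nothing to address the reviewer's objections: it offers no way to choose the activations and no way to judge how complete the coverage is.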

Reviewer: David M. Weiss	Review #: CR123788

Control Structure Reliability, Testing, And Fault-Tolerance (B.1.3)

Fault-Tolerance (D.4.5 ...)

Reliability And Testing (B.7.3)
