Computing Reviews, the leading online review service for computing literature.

Self-healing components in robust software architecture for concurrent and distributed systems
Shin M. Science of Computer Programming 57(1): 27-44, 2005. Type: Article

Date Reviewed: Jan 5 2006

Is the concept of self-healing a step forward compared to fault tolerance? This paper claims that self-healing components can be achieved by a process consisting of “anomalous object detection, dynamic reconfiguration before and after object self-healing, repair of sick objects, and testing of repaired objects.” At first glance, this closely resembles the four classical constituent phases of fault tolerance [1]: “(i) error detection; (ii) damage confinement and assessment; (iii) error recovery; and (iv) fault treatment and continued system service.” Since self-healing, in contrast to fault tolerance, does not necessarily guarantee continued service, it might not require all four phases at full strength.

However, the proposed simplification of phase (ii), which only protects the environment against the sick object from the time it is detected until it has been healed, seems to go too far, because potential effects on connected objects are ignored. Replacing phases (iii) and (iv) with a repair and test step for the sick object alone is even more problematic: it includes no error recovery that addresses the corrupted states of other infected objects, and no fault treatment that excludes the possibility that the observed fault will affect the system again.

The paper presents no assumptions or restrictions that justify these simplifications, and it does not state which phenomena (software or hardware faults, transient or permanent faults) the approach addresses, so it is rather unclear what self-healing delivers here. Furthermore, no evaluation results for the approach are given. My answer to the initial question of this review is therefore that this work is not a step forward, even though self-healing, in general, can be.

Reviewer: Holger Giese	Review #: CR132245 (0607-0731)

1)	Fault tolerance: principles and practice. Springer, Wien, Austria, 1990.

Reliability (D.2.4 ... )

Distributed Systems (D.4.7 ... )

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®