Is the concept of self-healing a step forward compared to fault tolerance? This paper claims that self-healing components can be achieved by a process consisting of “anomalous object detection, dynamic reconfiguration before and after object self-healing, repair of sick objects, and testing of repaired objects.” At first glance, this closely resembles the four classical constituent phases of fault tolerance [1]: “(i) error detection; (ii) damage confinement and assessment; (iii) error recovery; and (iv) fault treatment and continued system service.”
Since self-healing, in contrast to fault tolerance, does not necessarily guarantee continued service, it might not require all four phases at full strength. However, the proposed simplification of phase (ii), which only protects the environment against the sick object from the time it is detected until it has been healed, seems to go too far, because it ignores the effects the object may already have had on connected objects. Replacing phases (iii) and (iv) with a repair and test step for the sick object alone is even more problematic: there is no error recovery that addresses the corrupted states of other infected objects, and no fault treatment that prevents the observed fault from affecting the system again. The proposal presents no assumptions or restrictions that would justify these simplifications, and it does not state which phenomena (software or hardware faults, transient or permanent faults) the approach addresses, so it is rather unclear what self-healing delivers here. Furthermore, no evaluation results for the approach are given. My answer to the initial question of this review is therefore that this work is not a step forward, even though self-healing, in general, can be.