Operating Systems (D.4) > Fault-Tolerance (D.4.5...)
1-10 of 107 Reviews
ARMISCOM: self-healing service composition
Vizcarrondo J., Aguilar J., Exposito E., Subias A. Service Oriented Computing and Applications 11(3): 345-365, 2017. Type: Article
The evolution of service-oriented architecture (SOA) and the demands from industry provide an opportunity on the one hand, and requirements on the other, to develop fault-tolerant, robust, and resilient configurations of web services within an SOA e...
Jan 4 2018
Chunks and Tasks: a programming model for parallelization of dynamic algorithms
Rubensson E., Rudberg E. Parallel Computing 40(7): 328-343, 2014. Type: Article
In the paper, a novel parallel programming model, Chunks and Tasks, on top of C++ is presented. In this model, the programmer uses common C++ code to expose parallelism in two respects: data (chunks) and work (tasks). This parallelism is then auto...
Dec 10 2014
Event logs for the analysis of software failures: a rule-based approach
Cinque M., Cotroneo D., Pecchia A. IEEE Transactions on Software Engineering 39(6): 806-821, 2013. Type: Article
A very good description of the challenges of parsing log files begins this paper. The authors explain that ad-hoc logging statements are insufficient for fully understanding and tracking errors in software systems. They propose an abstraction of r...
Sep 5 2013
A resiliency model for high performance infrastructure based on logical encapsulation
Moore J., Kesselman C. HPDC 2012 (Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, Delft, the Netherlands, Jun 18-22, 2012) 283-294, 2012. Type: Proceedings
Heterogeneous, dynamically provisioned distributed systems for high-performance computing are becoming increasingly available to support a diverse range of compute- and storage-intensive tasks. The authors of this paper propose a resiliency model ...
Jul 8 2013
Depot: cloud storage with minimal trust
Mahajan P., Setty S., Lee S., Clement A., Alvisi L., Dahlin M., Walfish M. ACM Transactions on Computer Systems 29(4): 1-38, 2011. Type: Article
Storage service providers (SSPs) are fault-prone black boxes operated by a third party. Prudent clients should avoid strong assumptions about the integrity of data stored remotely and implement some form of end-to-end checks. Based on these premis...
Apr 2 2012
Architecting dependable systems VI (LNCS 5835)
de Lemos R., Fabre J., Gacek C., Gadducci F., ter Beek M. Springer-Verlag, New York, NY, 2009. Type: Divisible Book
A system is defined to be dependable if reliance can be justifiably placed on the service it delivers. It is a definition adopted a few decades ago, originating from the work of the IFIP Working Group 10.4 on Dependable Computing and Fault Toleran...
Feb 17 2011
Stochastic models for fault tolerance: restart, rejuvenation and checkpointing
Wolter K. Springer Publishing Company, Incorporated, New York, NY, 2010. 269 pp. Type: Book (978-3-642112-56-0)
There are only a few books that treat fault-tolerant computer systems at a theoretical level, and this is one of them. Wolter’s textbook presents, in a compact form, three issues that will interest specialists in distributed systems and soft...
Nov 4 2010
Practical and low-overhead masking of failures of TCP-based servers
Zagorodnov D., Marzullo K., Alvisi L., Bressoud T. ACM Transactions on Computer Systems 27(2): 1-39, 2009. Type: Article
An interesting approach to maintaining transmission control protocol (TCP) connections when a server crashes is described in this paper. TCP is usually implemented in the operating system, and the protocol provides reliable communications between ...
Apr 15 2010
The weakest failure detector for wait-free dining under eventual weak exclusion
Sastry S., Pike S., Welch J. SPAA 2009 (Proceedings of the 21st Annual Symposium on Parallelism in Algorithms and Architectures, Calgary, AB, Canada, Aug 11-13, 2009) 111-120, 2009. Type: Proceedings
Sastry, Pike, and Welch study the classic problem of dining philosophers from a new perspective: the equivalence of wait-free dining, under eventual weak exclusion, with the class of eventually perfect failure detectors....
Oct 29 2009
SystemC-based minimum intrusive fault injection technique with improved fault representation
Shafik R., Rosinger P., Al-Hashimi B. IOLTS 2008 (Proceedings of the 2008 14th IEEE International On-Line Testing Symposium, Rhodes, Greece, Jul 7-9, 2008) 99-104, 2008. Type: Proceedings
Fault injection is a popular approach for evaluating how a system behaves when failures occur. Such an approach may be carried out on the delivered system, in a laboratory environment, or during the design phase, enabling the designer to intervene...
Feb 2 2009
Reproduction in whole or in part without permission is prohibited. Copyright © 2000-2022 ThinkLoud, Inc.