Computing Reviews
An evaluation of the cost and performance of scientific workflows on Amazon EC2
Juve G., Deelman E., Berriman G., Berman B., Maechling P.  Journal of Grid Computing 10 (1): 5-21, 2012. Type: Article
Date Reviewed: Sep 23 2021

Cloud computing is becoming popular for workloads that require substantial processing power and handle massive data volumes. The work reported here studies scientific workflows on Amazon Elastic Compute Cloud (EC2). The principal goal is to analyze the runtime performance and cost of three real workflow applications from diverse domains, as well as their resource requirements.

The workflows are “loosely coupled parallel applications” made up of computational tasks linked by data flow and control flow dependencies. Unlike tightly coupled systems, where tasks communicate over the network, workflow tasks typically communicate through files that are created by the source task and sent to (or shared with) the destination task.
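As a rough illustration of tasks that communicate through files, the following minimal sketch derives a valid execution order from the files each task reads and writes. The task names, file names, and helper functions are hypothetical and are not taken from the paper or from any workflow system it uses.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical three-task workflow; names are illustrative only.
TASKS = {
    "project": {"inputs": ["raw.dat"], "outputs": ["projected.dat"]},
    "correct": {"inputs": ["projected.dat"], "outputs": ["corrected.dat"]},
    "combine": {"inputs": ["projected.dat", "corrected.dat"],
                "outputs": ["final.dat"]},
}


def build_dependencies(tasks):
    """Map each task to the set of tasks that produce its input files."""
    producer = {f: name for name, t in tasks.items() for f in t["outputs"]}
    return {
        name: {producer[f] for f in t["inputs"] if f in producer}
        for name, t in tasks.items()
    }


def execution_order(tasks):
    """Return one order in which every task runs after its producers."""
    return list(TopologicalSorter(build_dependencies(tasks)).static_order())


if __name__ == "__main__":
    print(execution_order(TASKS))
```

Input files that no task produces (here, raw.dat) are treated as external data and impose no ordering constraint.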

The applications are from astronomy (Montage), seismology (Broadband), and bioinformatics (Epigenome), and typically run hundreds to thousands of tasks requiring many gigabytes (GB) of reads and writes. These can be deployed fully (or partly) in the cloud; in this case, however, the submit host, which manages the workflow and the worker nodes, was outside the cloud, while storage was inside the cloud. With the submit host outside, it is easier to deploy and set up workflows, and log data will not be lost. It also serves as a permanent base for access and control, and the cost of data transfer is far less. However, being inside the cloud improves performance. A virtual cluster of c1.xlarge instance nodes is used as the execution environment. The findings follow.

There is a performance benefit to the Montage application due to the presence of many small files. Broadband shows the best overall performance, and Epigenome did better on central processing unit (CPU) bound jobs but performed poorly on input/output (I/O) compared to the other two. Looking at “three different cost categories,” that is, resource, storage, and transfer costs, the relation between performance and cost was mostly nonlinear, with diminishing returns in throughput for each additional unit of cost. However, the general pattern was consistent: more resources, more cost. The average reduction in performance with an increase in the number of nodes was attributed to the limited scalability of the applications. Epigenome showed “the least benefit” due to fewer jobs in the workflow, whereas Montage had the greatest benefit due to “having the most jobs and the most traffic between the submit host and the [nodes].” The performance study was largely empirical and did not derive a mathematical relation from the results.
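The three cost categories can be made concrete with a small sketch. Everything here is an assumption for illustration: the function name, the rates, the runtimes, and the rule of billing whole node-hours are hypothetical and are not figures or a model from the paper.

```python
import math


def workflow_cost(nodes, runtime_hours, node_rate,
                  storage_gb_months, storage_rate,
                  transfer_gb, transfer_rate):
    """Sum resource, storage, and transfer costs for one workflow run.

    Assumes each node is billed for whole hours (partial hours round up).
    """
    resource = nodes * math.ceil(runtime_hours) * node_rate
    storage = storage_gb_months * storage_rate
    transfer = transfer_gb * transfer_rate
    return {
        "resource": resource,
        "storage": storage,
        "transfer": transfer,
        "total": resource + storage + transfer,
    }


if __name__ == "__main__":
    # Hypothetical runs: doubling the nodes cuts runtime from 4.0 h to 2.5 h,
    # which is less than half, so the resource cost goes up.
    one_node = workflow_cost(1, 4.0, 0.5, 10, 0.1, 5, 0.1)
    two_nodes = workflow_cost(2, 2.5, 0.5, 10, 0.1, 5, 0.1)
    print(one_node["total"], two_nodes["total"])
```

Under these made-up numbers the second run finishes sooner but costs more, which is one way a cost and performance relation can end up nonlinear when an application does not scale perfectly.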

Over and above being a detailed and exhaustive empirical study, the paper can be an informative tutorial introduction to deploying applications on the cloud. It is recommended for graduate and undergraduate students.

Reviewer:  K R Chowdhary Review #: CR147361 (2202-0027)
  Reviewer Selected
Business (J.1 ... )
Cloud Computing (C.2.4 ... )
Grid computing (C.2.4 ... )
Workflow Management (H.4.1 ... )
