Computing Reviews
Zooming in on wide-area latencies to a global cloud provider
Jin Y., Renganathan S., Ananthanarayanan G., Jiang J., Padmanabhan V., Schroder M., Calder M., Krishnamurthy A.  SIGCOMM 2019 (Proceedings of the ACM Special Interest Group on Data Communication, Beijing, China,  Aug 19-23, 2019) 104-116. 2019. Type: Proceedings
Date Reviewed: Feb 27 2020

The authors measure wide area network (WAN) latency from the viewpoint of a large cloud provider, Azure, by tracking the round-trip time (RTT) of transmission control protocol (TCP) connections. They present a tool, BlameIt, that detects latency faults and diagnoses where in the WAN they occur.

Tracking where the problem is happening in a large WAN is a pressing challenge in networks today. It is difficult to find where and why problems are occurring, such as data not reaching its destination or packets being lost along the way, as the networks grow and become more complex. This paper presents a passive measurement tool to help localize certain problems in a WAN.

The paper first presents a measurement analysis of various aspects of the Azure network. It describes the datasets collected and how the authors use them to deduce (1) the countries in which bad RTT is most commonly recorded, (2) how long these bad connections last, and (3) how they affect clients. It then goes on to present BlameIt. The tool passively records RTT-relevant data to determine where problems occur: on the client side, in the middle, or at the cloud end. A number of patterns are identified; for example, middle-segment problems dominate in India, China, and Brazil. The authors also found that the US has more directly related high RTTs than the rest of the world.

Drawing on measurements of autonomous systems (ASes) and border gateway protocol (BGP) routes, the tool reacts to latency degradation between client and cloud locations by combining passive measurements (TCP handshake RTTs) with selective active measurements (traceroutes) to localize the issue.
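To make the localization idea concrete, the following is a minimal Python sketch of coarse blame assignment from passive RTT samples, followed by the selection of paths that would merit an active traceroute. It is an illustration written for this review, not the authors' algorithm: the `RttSample` fields, the function names, the 100 ms threshold, and the 0.8 majority fraction are all hypothetical choices.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RttSample:
    """One passively observed TCP handshake RTT (all fields are assumed names)."""
    cloud_site: str      # cloud location that served the connection
    middle_path: str     # identifier for the AS-level path in between
    client_prefix: str   # client network the connection came from
    rtt_ms: float


def bad_fraction(samples, key, threshold_ms):
    """Fraction of samples above the RTT threshold, grouped by key(sample)."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for s in samples:
        k = key(s)
        totals[k] += 1
        if s.rtt_ms > threshold_ms:
            bad[k] += 1
    return {k: bad[k] / totals[k] for k in totals}


def assign_blame(samples, threshold_ms=100.0, majority=0.8):
    """Label each bad connection as a cloud, middle, or client problem.

    Assumption: if most connections through a cloud site are bad, blame the
    cloud; otherwise, if most connections over the same middle path are bad,
    blame the middle; otherwise blame the client side.
    """
    cloud_bad = bad_fraction(samples, lambda s: s.cloud_site, threshold_ms)
    middle_bad = bad_fraction(samples, lambda s: s.middle_path, threshold_ms)
    verdicts = {}
    for s in samples:
        if s.rtt_ms <= threshold_ms:
            continue  # healthy connection, nothing to blame
        if cloud_bad[s.cloud_site] >= majority:
            verdict = "cloud"
        elif middle_bad[s.middle_path] >= majority:
            verdict = "middle"
        else:
            verdict = "client"
        verdicts[(s.cloud_site, s.middle_path, s.client_prefix)] = verdict
    return verdicts


def paths_to_traceroute(verdicts):
    """Only paths blamed on the middle segment get an active probe."""
    return sorted({path for (_, path, _), v in verdicts.items() if v == "middle"})
```

The point of the sketch is the division of labor: cheap passive data narrows the fault to a segment, and the more expensive active probes are reserved for the middle-segment cases where the passive data cannot say which network is responsible.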

The paper is easy to read, and it’s exciting to see how Azure measures and determines where bad performance is happening on its network. Other networks rely on tools such as perfSONAR and on loss measurements, and it would be interesting to see how Google Cloud Platform (GCP) and Amazon Web Services (AWS) measure their network performance. This paper is a good read for those working to improve network performance using machine learning.

Reviewer:  Mariam Kiran Review #: CR146910 (2007-0167)
Cloud Computing (C.2.4 ... )
Data Communications (C.2.0 ... )
Network Monitoring (C.2.3 ... )
Reliability, Availability, And Serviceability (C.4 ... )
General (C.2.0 )