This paper presents a resource selection approach for large-scale distributed systems. The approach is decentralized and self-adapting, and aims to remain scalable and resilient as the number of attributes (representing the characteristics of nodes and the requirements of users and applications) changes. The authors also address changes in the overall system composition (nodes opting in and out of the system). As noted by the authors, the paper is an extended version of a 2009 conference paper [1].
The paper is well written and nicely structured. The authors introduce the scalability challenges of the resource selection problem, and the need to adapt dynamically to changes over time in the composition of the system or in the requirements of applications. They then present the system model, describing how they characterize the nodes and the overall system, and describe base resource discovery (without self-adaptation), including the network topology used and query routing. The paper includes interesting discussions on handling updates, reconfiguring the system, and maintaining it in the presence of dynamic changes (churn, failures, and changes to the set of attributes). It also considers how self-adaptation affects or improves system behavior (in terms of workload, delivery, and so on). The presented approach relies on gossip-based protocols, which are also explained in the paper.
Each section includes an evaluation based on simulations or emulated setups (including some that use PeerSim). Contrary to what is claimed in the conclusion, however, there are no experiments involving actual deployments. Some of the simulation setups are based on traces from actual deployments of the Berkeley Open Infrastructure for Network Computing (BOINC) platform, but a trace-driven simulation is not the same thing as an experiment on a real deployment. This is a drawback of the paper, as some of the simulation results are confusing and difficult to interpret. Examples include seemingly erratic routing overhead as the number of attributes in the system varies, and unexplained, seemingly random drops in delivery rates as churn varies (rates even drop to zero in figure 9(a)). Real-world deployment and experimentation would greatly enhance the value of the work presented in this paper.