Internet services based on the notion of a session, such as instant messaging, are growing in popularity. Fault tolerance in such services is achieved through redundancy, by replicating the servers that perform session management. Session control protocols, such as the Session Initiation Protocol (SIP), are transactional protocols. A key performance indicator of such systems is the transaction control time when a request must fail over to several servers in succession.
A proper server selection policy (SSP), which selects the server that will fulfill a client’s session request, is crucial to minimizing the transaction control time. Round robin (RR) is an example of a simple static selection policy that achieves load balancing by selecting servers sequentially in a cycle. This paper presents a novel dynamic SSP, referred to as the maximum availability (MA) SSP, that minimizes the average number of servers attempted until success, thereby minimizing the transaction control time. MA SSP adapts to observed server behavior and has low implementation complexity: each client maintains a simple vector with the status of the servers and selects the server with the longest last known uptime or the shortest last known downtime. The paper describes the integration of MA into Reliable Server Pooling (RSerPool), a fault-tolerant platform defined by the Internet Engineering Task Force (IETF). The policy is also applicable to other platforms, such as clusters, and should be of interest to developers implementing Internet Protocol (IP) based systems and services.
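The contrast between the two policies can be illustrated with a minimal Python sketch. It is one possible reading of the description above, not the paper's implementation. All names (`ServerStatus`, `RoundRobinPolicy`, `MaxAvailabilityPolicy`, `report`, `select`) are hypothetical, and the sketch assumes that the per-client status vector stores, for each server, the last known state and the time at which that state began, that servers last known to be up are preferred, and that the shortest downtime is used only when no server is known to be up.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    """Last status a client observed for one server (hypothetical layout)."""
    up: bool       # last known state of the server
    since: float   # time at which that state was first observed


class RoundRobinPolicy:
    """Static baseline: cycle through the servers, ignoring their status."""

    def __init__(self, n_servers: int) -> None:
        self._n = n_servers
        self._next = 0

    def select(self) -> int:
        chosen = self._next
        self._next = (self._next + 1) % self._n
        return chosen


class MaxAvailabilityPolicy:
    """Dynamic policy sketch: prefer the server with the longest known uptime."""

    def __init__(self, n_servers: int, now: float = 0.0) -> None:
        # Assumption: every server is initially believed to be up.
        self.status = [ServerStatus(up=True, since=now) for _ in range(n_servers)]

    def report(self, server: int, success: bool, now: float) -> None:
        """Update the status vector after an attempt on `server`."""
        if self.status[server].up != success:
            # Record only state changes, so `since` marks the start of the state.
            self.status[server] = ServerStatus(up=success, since=now)

    def select(self, exclude: frozenset = frozenset()) -> int:
        """Pick a server, skipping those already tried in this transaction."""
        candidates = [i for i in range(len(self.status)) if i not in exclude]
        if not candidates:
            raise ValueError("no servers left to try")
        known_up = [i for i in candidates if self.status[i].up]
        if known_up:
            # Longest last known uptime means the earliest `since`.
            return min(known_up, key=lambda i: self.status[i].since)
        # No server known to be up: shortest last known downtime,
        # i.e. the most recent `since` (an assumed tie-in of the two rules).
        return max(candidates, key=lambda i: self.status[i].since)
```

In a failover loop, a client would call `select`, attempt the transaction, call `report` with the outcome, and on failure call `select` again with the failed servers in `exclude`. The number of iterations of that loop corresponds to the number of servers attempted until success.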
The authors also develop an analytical model of the system and derive expressions for its evaluation metrics. An event-driven simulator based on the model is used to show that the proposed MA SSP significantly outperforms RR.