Load balancing and sticky sessions in clustering

4 min read · Aug 13, 2020

As a small recap, what is clustering?

Clustering means having one endpoint shared among a group of identical service nodes to achieve high availability and scalability.

A cluster is a set of service nodes (or servers) that acts as a single service endpoint and completes client tasks by balancing the load among its members. The questions that follow are how to balance that load, and which server or service node should execute the next client request arriving at the shared endpoint.
The answer is that this is the responsibility of the “Load Balancer”: a dedicated application that decides which request is handled by which server or service node in the cluster.

What is Load Balancing?

A load balancer is a software or hardware component that acts as the traffic cop for traffic coming in to the servers. It decides which request should go to which server according to a predefined algorithm, which typically takes each server's status into account. Different load balancers offer different algorithms for making these routing decisions. The goal is to make sure that no server gets overloaded with traffic and that, if one goes down, incoming requests are rerouted to another.
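To make the rerouting idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular load balancer is implemented: the class name SimpleLoadBalancer, the node names, and the mark_down/mark_up methods are all hypothetical, and a real load balancer would detect failures through health checks rather than manual calls.

```python
class SimpleLoadBalancer:
    """Toy load balancer: rotates through servers, skipping ones marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)  # servers currently considered online
        self._next = 0                    # index of the next server to try

    def mark_down(self, server):
        # Hypothetical stand-in for a failed health check.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Hypothetical stand-in for a server passing health checks again.
        if server in self.servers:
            self.healthy.add(server)

    def route(self):
        """Return the next online server, or raise if none is available."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = SimpleLoadBalancer(["node-a", "node-b", "node-c"])
first_pass = [lb.route() for _ in range(3)]     # all three nodes take a turn
lb.mark_down("node-b")                          # simulate node-b going offline
after_failure = [lb.route() for _ in range(4)]  # node-b is skipped from now on
```

Once node-b is marked down, requests keep flowing to the remaining nodes without the client noticing.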

In summary, a load balancer provides the following functions.

Makes sure servers don't get overloaded by routing requests efficiently
Provides high availability by routing requests only to online servers, avoiding downtime
Supports scalability by allowing servers to be added or removed based on the load they receive

There are different load balancing algorithms available, and the algorithms a load balancer supports can be a deciding factor when choosing the right one for your needs.

Some of the available algorithms are:

Round Robin — Requests are distributed across the group of servers sequentially.
Least Connections — A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
Least Time — Sends requests to the server selected by a formula that combines the fastest response time and fewest active connections.
Hash — Distributes requests based on a key you define, such as the client IP address or the request URL.
IP Hash — The IP address of the client is used to determine which server receives the request.
Random with Two Choices — Picks two servers at random and sends the request to the one that is selected by then applying the Least Connections algorithm.
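A few of these algorithms are short enough to sketch in Python. The snippet below is a simplified illustration under some assumptions: the function names and node names are made up for the example, the Least Connections version ignores server capacity (unweighted), and the IP Hash version uses SHA-256 purely as one convenient stable hash.

```python
import hashlib
import random


def round_robin(servers):
    """Yield servers in order, wrapping around forever."""
    i = 0
    while True:
        yield servers[i % len(servers)]
        i += 1


def least_connections(active):
    """Pick the server with the fewest active connections (unweighted).

    `active` maps a server name to its current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server deterministically via a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def random_two_choices(active, rng=random):
    """Sample two distinct servers, then keep the less loaded of the two."""
    a, b = rng.sample(list(active), 2)
    return a if active[a] <= active[b] else b


servers = ["node-a", "node-b", "node-c"]
rr = round_robin(servers)
rr_picks = [next(rr) for _ in range(4)]   # wraps back to node-a on the 4th pick

active = {"node-a": 5, "node-b": 2, "node-c": 7}
lc_pick = least_connections(active)       # node-b has the fewest connections
same_client = {ip_hash("203.0.113.7", servers) for _ in range(5)}
p2c_pick = random_two_choices(active)
```

Note that with IP Hash the same client always lands on the same server as long as the server list does not change, which is why it is sometimes used as a simple form of session persistence.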

What are the benefits of Load balancing?

Scalability: With a load balancer in place, servers or service nodes can be added to or removed from the cluster as the load changes.

Availability: Downtime is reduced because requests are re-routed to the servers that are online.

High efficiency: A set of tasks can be executed in parallel across the servers.

Load balancing has other advantages as well, but the ones above are the major ones.

What are Sticky sessions in Load balancing?

Sometimes, when executing tasks, we need to persist the state of the current user session. As an example, think of a shopping cart application, where we need to manage the state of the user session across all the requests that the user sends to the application.

What happens if a load balancer routes one request to one server/service node in a cluster and sends the next request to another server/service node in the same cluster? Some of the user information will be lost unless there is a mechanism to share that information between the nodes. Such scenarios can cause transaction failures or data loss.

In that case, we need to tell the load balancer to send all subsequent requests from a particular user session to the same server. The technique used to tell the load balancer this is called sticky sessions, or in other words, session persistence.
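As a rough sketch of the idea, the Python snippet below pins each session to the server that handled its first request. Everything here is an assumption made for illustration: the class name StickyLoadBalancer, the session ids, and the in-memory table. Real load balancers usually implement this with a cookie or a similar session identifier, and they also have to handle the pinned server going down, which this sketch does not.

```python
class StickyLoadBalancer:
    """Toy load balancer that pins each session to one server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0      # round-robin counter used for new sessions only
        self._pinned = {}   # session id -> server chosen for that session

    def route(self, session_id):
        server = self._pinned.get(session_id)
        if server is None:
            # First request of this session: choose a server and remember it.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self._pinned[session_id] = server
        return server


sticky = StickyLoadBalancer(["node-a", "node-b"])
cart_requests = [sticky.route("session-1") for _ in range(3)]  # same node each time
other_user = sticky.route("session-2")                         # a new session gets the next node
```

In the shopping cart example, every request carrying session-1 reaches the node that already holds that user's cart.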

Why should a load balancer be able to add or remove servers on demand?

This is important when it comes to the payment model, because most pricing schemes are based on the amount of load being handled, measured either as the number of requests or as the number of servers that cater to the load. The load balancer should therefore be able to add or remove servers based on the demand it gets, which enables the user to pay only for the computing capacity actually used.
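A simple way to picture this is a function that works out how many servers the current load calls for. The sketch below is only an illustration: the function name desired_server_count, the per-server capacity of 100 requests per second, and the minimum and maximum limits are all assumed values, not figures from any real pricing or scaling scheme.

```python
import math


def desired_server_count(requests_per_second, capacity_per_server,
                         min_servers=1, max_servers=10):
    """Return how many servers the current load calls for, within fixed limits."""
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never drop below the minimum or grow past the maximum.
    return max(min_servers, min(max_servers, needed))


quiet = desired_server_count(0, 100)      # idle traffic still keeps the minimum
busy = desired_server_count(250, 100)     # 250 req/s at 100 req/s per server
spike = desired_server_count(5000, 100)   # capped at the configured maximum
```

When the load drops, the count drops with it, so the user stops paying for servers that are no longer needed.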

Hardware vs Software Load balancing

Load balancing can be done in hardware or in software. Which one to choose depends on the nature of your business and of the application that caters to its needs. Usually, hardware-based load balancers use their own proprietary software and processors, which requires more infrastructure and can be costly. Compared to that, software load balancing is easier to set up and less expensive.


Written by Lakshitha Samarasingha