The Free Edition of Progress Kemp LoadMaster is a member of the award-winning LoadMaster family. It runs the same code base as its commercial siblings and can be used freely, even in production environments where the throughput requirements do not exceed 20 Mbps and the services do not directly generate revenue. This makes it a perfect choice for low-traffic websites and applications, DevSecOps testing environments, technical training or any other non-commercial deployment needs that you may have.
In this blog, we’ll provide a high-level description of what load balancing is and how it works. Then we’ll highlight the restrictions placed on Free LoadMaster to differentiate it from the commercially licensed editions.
What Is Load Balancing?
At a high level, load balancing is the process of distributing incoming client requests to a pool of application servers. This is done via intelligent algorithms and network traffic monitoring to ensure that client requests get evenly distributed across all servers in the pool. Load balancers also constantly monitor the health and status of each server in the pool to prevent requests from being sent to servers that are too busy or offline. By doing so, the load balancing process helps maximize application availability and minimize downtime.
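The core idea — rotate requests across a pool while skipping servers that fail their health checks — can be sketched in a few lines. This is a minimal illustration, not LoadMaster's implementation; the server names and health flags are invented for the example.

```python
import itertools

# Hypothetical static pool; in a real load balancer the "healthy" flag
# would be updated continuously by health-check probes.
servers = {
    "app-1": {"healthy": True},
    "app-2": {"healthy": True},
    "app-3": {"healthy": False},  # failed its most recent health check
}

_rotation = itertools.cycle(sorted(servers))

def pick_server() -> str:
    """Rotate through the pool, skipping any server marked unhealthy."""
    for _ in range(len(servers)):
        name = next(_rotation)
        if servers[name]["healthy"]:
            return name
    raise RuntimeError("no healthy servers available")
```

With the pool above, successive calls alternate between app-1 and app-2; app-3 is never selected until its health flag recovers.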
A grocery store checkout analogy helps explain what load balancers do at a network traffic level. When the number of shoppers in a store is low, having a few checkouts open suffices. However, as the evening rush hour arrives and more people start shopping, the number of people who want to check out increases. If only a few checkouts are operating, long queues build up. Opening additional checkouts adds capacity and allows for an efficient flow of people through the store.
In the same way, a load balancer spreads client requests over the available servers. If required, load balancers can automatically add virtual machines or new cloud instances to handle increased demand, and take them offline again when demand drops.
We can extend the checkout analogy to highlight how load balancing deals with a server issue. If someone drops a bottle of tomato juice at a checkout, then that single checkout will need to be closed. People queuing there will be routed to other available checkouts until the cleaning gets completed and the closed checkout reopens. Similarly, if a server or service is unavailable, the load balancers will redirect client traffic to other available servers.
Modern load balancers, including all LoadMaster instances, can also inspect network traffic and route it to servers based on the data type. For example, network video streaming requests can get routed to servers optimized to stream video files.
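Content-based routing can be pictured as a lookup from a request attribute to a pool tuned for that workload. The sketch below routes by file extension; the pool names, extensions, and checksum choice are assumptions made for the example, not LoadMaster configuration.

```python
import zlib

# Hypothetical pools: video assets go to streaming-optimized servers,
# everything else to general-purpose web servers.
POOLS = {
    "video":   ["stream-1", "stream-2"],
    "default": ["web-1", "web-2"],
}

def route(path: str) -> str:
    pool = "video" if path.endswith((".mp4", ".m3u8")) else "default"
    members = POOLS[pool]
    # A stable checksum keeps a given path on the same pool member.
    return members[zlib.crc32(path.encode()) % len(members)]
```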
How Do Load Balancers Work?
A load balancer is responsible for distributing requests for applications or services among multiple servers. There are various techniques and algorithms that load balancers use for this purpose, depending on the type of service or application. Below is a list of the methods that all LoadMaster instances can use to load balance access requests to a server pool:
DNS Round Robin - A simple method for load balancing servers or providing fault tolerance. Multiple servers are configured with the same services and unique IP addresses but share the same internet domain name. A DNS server maintains the list of IP addresses associated with that domain name and returns them in a rotating sequence as name-resolution requests arrive: the first request gets the IP address of server 1, the second request gets the IP address of server 2, and so on. This spreads the workload across multiple servers, improving the overall performance of the system.
Weighted Round Robin - Improves on DNS Round Robin by accounting for differences in server capacity. The network administrator assigns each server in the pool a static numerical weight based on its processing capacity: the most powerful server might be weighted 100, while a server with half as much capacity gets a weight of 50. The server weighted 100 then receives twice as many requests as the server weighted 50, and four times as many as a server weighted 25. This ensures that the most capable servers in the pool handle a proportionally higher share of requests than less powerful ones.
Least Connection - The Least Connection method differs from DNS Round Robin or Weighted Round Robin in that it considers the current server load when distributing requests. Instead of simply rotating requests among servers, it sends the current request to the server that is currently servicing the least number of active connections.
Weighted Least Connection - This method assigns a numerical weight to each server, as in Weighted Round Robin, and combines it with the active-connection count when distributing sessions. If two servers have the same number of active connections, the server with the higher weight receives the new request. If the servers with the fewest connections also share the same weight, the weight plays no part in deciding where to route the incoming request.
Agent-Based Adaptive Load Balancing - This uses an agent installed on each server in the pool that reports the current load on the server to the load balancer management software. This real-time current load information gets used when deciding which server can handle a request. The decision gets made in conjunction with other techniques such as Weighted Round Robin and Weighted Least Connection.
Chained Failover (Fixed Weighted) - In this method, a predetermined order of servers is configured in a chain. All requests go to the first server in the chain. When it can’t accept any more requests, the next server in the chain gets sent all new requests until it can’t handle more, then the third server receives all requests and so on for the length of the server chain.
Weighted Response Time - Uses the response time from a server health check to determine the server responding fastest at any particular time. The next client access request then goes to that server. This ensures that any servers under heavy load, which will respond more slowly, are not sent new requests. This allows the load to even out on the available servers in the pool over time.
Source IP Hash - Uses an algorithm that combines the client and server IP addresses to generate a unique hash key, which is then used to allocate the client to a particular server. Because the same key can be regenerated if the session breaks, this technique ensures that a client reconnects to the same server it was using previously. Load balancers can use this to deliver session affinity for applications with shopping carts, so that items placed in a cart are still there when a dropped connection is re-established.
SDN Adaptive - SDN Adaptive (Software Defined Networking Adaptive) combines knowledge of upper networking layers with information about the network’s state at lower layers. Information about data from Layers 4 and 7 and information about the network from Layers 2 and 3 are combined when deciding how to route service requests. This means information about the status of the servers, the status of the applications running on the servers, the health of the network infrastructure connecting the servers, and the level of congestion on the network all play a part in the load balancer's decision-making.
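To make the Weighted Round Robin method above concrete, here is a small sketch using the example weights from the text (100, 50 and 25). The server names are hypothetical, and real implementations typically interleave servers more smoothly; this version simply expands each server into the rotation in proportion to its weight.

```python
import itertools
from functools import reduce
from math import gcd

# Example weights from the text; names are illustrative only.
weights = {"large": 100, "medium": 50, "small": 25}

def weighted_rotation(weights: dict):
    # Divide by the greatest common divisor so the repeating cycle is
    # short: 100:50:25 reduces to 4:2:1, a cycle of seven slots.
    g = reduce(gcd, weights.values())
    order = [name for name, w in weights.items() for _ in range(w // g)]
    return itertools.cycle(order)

rotation = weighted_rotation(weights)
```

Over any seven consecutive picks, the large server receives four requests, the medium server two, and the small server one — the 100:50:25 ratio from the description.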
The methods outlined above are not used in isolation but in combination to determine the best server for each new request. The current request load on the load balancer often determines which method gets used: a simple method suffices when the load is light, while under heavy load the more sophisticated methods keep requests evenly distributed despite network and service stress.
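The Source IP Hash method described above can also be sketched briefly. Hashing the client address together with the service's virtual IP deterministically maps each client to one pool member, so a reconnecting client lands on the same server. The addresses below are documentation-range examples, and SHA-256 is an arbitrary choice for the sketch — real load balancers may use other hash functions.

```python
import hashlib

# Example pool and virtual IP (all documentation-range addresses).
POOL = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
VIP = "203.0.113.10"

def pick_by_source(client_ip: str) -> str:
    # Same client + same VIP -> same digest -> same pool member,
    # which is what gives this method its session affinity.
    digest = hashlib.sha256(f"{client_ip}:{VIP}".encode()).digest()
    return POOL[int.from_bytes(digest[:8], "big") % len(POOL)]
```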
The load balancer pool and services shared can be located in a single data center or geographically distributed over multiple data centers using Global Server Load Balancing (GSLB). Free LoadMaster supports GSLB, meaning organizations can use it to replicate complex multi-location deployments in development and user acceptance testing (UAT) environments.
Other Functionality Load Balancers Often Deliver
In addition to the core load balancing functionality delivered via the techniques outlined above, LoadMaster provides a broad set of additional features, which are described on the LoadMaster Feature List page. All of these features are available in Free LoadMaster (bearing in mind the restrictions outlined previously).
Restrictions on Free LoadMaster
As Free LoadMaster is a peer member of the LoadMaster family of products, almost everything true of LoadMaster also applies to Free LoadMaster. The "almost" in the previous sentence highlights that some restrictions do apply. These restrictions do not change the code in Free LoadMaster or how you configure it. Instead, they target specific capabilities and functionality more applicable to larger deployments requiring a commercial license, such as the 20 Mbps throughput cap noted earlier.
Note that when you purchase a commercial license for LoadMaster and apply it to a Free LoadMaster instance, these restrictions are removed. There is no need to tear down your Free LoadMaster deployment and reconfigure a new commercial LoadMaster instance from scratch.
Try Free LoadMaster Today
Don’t just take our word for it. Try Free LoadMaster for yourself today by downloading it for deployment in your virtual infrastructure, or deploy it in the Cloud from the Azure Marketplace or AWS Marketplace.