Bread and butter of Operations role: Load Balancing Scheduling Algorithms

Round Robin

Essentially, this is a simple mechanism in which the load balancer answers content access requests on a rotating basis: the first request is directed to the first available content server by handing out its IP address, the second request to the second server's IP address, and so on.
As soon as a server's IP address has been handed out, it moves to the back of the list of available IP addresses; it then gradually works its way back to the top of the list and becomes available again.
How often it returns to the top depends on the number of available servers in the round robin cluster being used.
A good way to think of this is as server allocation in a continuous loop.
With this method incoming requests are distributed sequentially across the server farm (cluster), i.e. the available servers.
If this method is selected, all the servers assigned to a Virtual Service should have similar resource capacity and host identical applications.
Choose round robin if all servers have the same or similar performance and are running the same load.
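
The rotating list described above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer is implemented; the class name RoundRobinBalancer, the method next_server and the IP addresses are made up for the example.

```python
from collections import deque


class RoundRobinBalancer:
    """Hands out server addresses in a continuous loop (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # The left end of the deque is the "top of the list".
        self._queue = deque(servers)

    def next_server(self):
        server = self._queue[0]
        # Move the address just handed out to the back of the list.
        self._queue.rotate(-1)
        return server


balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

With three servers, each address returns to the top of the list after every third request, which is the dependence on cluster size mentioned above.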

Weighted Round Robin

This method addresses the weakness of simple round robin, which treats every server as equal: incoming requests are still distributed across the cluster in a sequential manner, but a static “weighting” that can be pre-assigned per server is taken into account.
The administrator expresses the relative capacity of each available server by assigning it a weight.
The most powerful server, A, for example, is given a weight of 100, whilst a much less powerful server, B, is given a weight of 50.
This means that Server A receives two requests for every one that Server B receives: Server A gets two consecutive requests before Server B gets its first one, and so on.
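
As a rough sketch of the 100 versus 50 example, the Python below reduces the weights by their greatest common divisor and repeats each server that many times per cycle. The function name weighted_round_robin and the dictionary layout are assumptions made for illustration; real load balancers may interleave the servers differently while keeping the same 2:1 ratio.

```python
from functools import reduce
from itertools import cycle
from math import gcd


def weighted_round_robin(weights):
    """Return an endless iterator over server names, honoring static weights.

    `weights` maps a server name to a positive integer weight (illustrative).
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty map of positive integers")
    # 100 and 50 reduce to 2 and 1, so one cycle is A, A, B.
    divisor = reduce(gcd, weights.values())
    schedule = [
        name
        for name, weight in weights.items()
        for _ in range(weight // divisor)
    ]
    return cycle(schedule)


picker = weighted_round_robin({"A": 100, "B": 50})
first_six = [next(picker) for _ in range(6)]
print(first_six)  # ['A', 'A', 'B', 'A', 'A', 'B']
```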

Least Connection

Neither round robin method takes into account how many connections each server is maintaining at a given time.
It could therefore happen that Server B is overloaded, although it receives fewer connections than Server A, because the users of this server maintain their connections longer.
This means that the connections, and thus the load for the server, accumulate.
This potential problem can be avoided with the "least connections" method:

Requests are distributed on the basis of the connections that every server is currently maintaining.
The server in the cluster with the fewest active connections automatically receives the next request.
Basically, the same principle applies here as for the simple round robin: the servers assigned to a Virtual Service should ideally have similar resource capacities.
Please note that in configurations with low traffic rates, the traffic will not balance out and the first server will be preferred.
This is because when all the servers have the same number of active connections, the tie is resolved in favor of the first server.
Until the traffic reaches a level where the first server continually has active connections, the first server will always be selected.
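
The selection rule and the low-traffic behavior can be illustrated with a small Python sketch. The class LeastConnectionBalancer and its acquire and release methods are hypothetical names; the sketch assumes the balancer is told when a connection opens and when it closes, and that ties go to the server listed first.

```python
class LeastConnectionBalancer:
    """Tracks active connections per server (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # Dicts keep insertion order, so the first server stays first.
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server among those with the lowest count,
        # which is why the first server is preferred on a tie.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Called when a client closes its connection.
        if self._active[server] > 0:
            self._active[server] -= 1


balancer = LeastConnectionBalancer(["A", "B"])

# Low traffic: each connection closes before the next request arrives,
# so every request lands on the first server.
low_traffic = []
for _ in range(3):
    server = balancer.acquire()
    low_traffic.append(server)
    balancer.release(server)
print(low_traffic)  # ['A', 'A', 'A']

# Overlapping connections: B is chosen while A is still busy.
overlapping = [balancer.acquire(), balancer.acquire()]
print(overlapping)  # ['A', 'B']
```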

Fixed Weighted

When the Real Servers are given different weight values, only the Real Server with the highest weight is used.
However, if the highest-weight server fails, the Real Server with the next highest weight becomes available to serve clients.
The weight for each Real Server should therefore be assigned according to its priority among the Real Servers.
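
A minimal Python sketch of this priority behavior follows. It assumes that some health check, not shown here, supplies the set of Real Servers that are currently up; the function pick_fixed_weighted and the server names are invented for the example.

```python
def pick_fixed_weighted(weights, healthy):
    """Return the healthy server with the highest weight, or None.

    `weights` maps server name to weight; `healthy` is the set of servers
    that a (hypothetical) health check currently reports as up.
    """
    candidates = [server for server in weights if server in healthy]
    if not candidates:
        return None
    return max(candidates, key=weights.get)


weights = {"primary": 100, "standby": 50, "last-resort": 10}

# All servers up: every request goes to the highest-weight server.
print(pick_fixed_weighted(weights, {"primary", "standby", "last-resort"}))  # primary

# The primary has failed: the next highest weight takes over.
print(pick_fixed_weighted(weights, {"standby", "last-resort"}))  # standby
```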
