
Load Balancing for Web Server

Submitted By: Umesh Kakkad (99IT204)

Project Guide: Mr. Ketan Kotecha

Submitted To: Department of Information Technology, G.H.Patel College of Engineering & Technology

Few Statistics


Google.com receives about 30 million hits a day.
The eCommerce portal Amazon.com has thousands of transactions in progress at any given instant.
Hotmail keeps track of about 2 million mail sessions at any given instant.

How are these sites able to withstand such a huge number of hits without ever going down?
Do they have supercomputers behind them to accomplish this task? Or is there a cluster of computers working in coordination to do the job?

Features of present web sites


   

High Availability
Minimum Response Time
Content Security without unnecessary overheads
eCommerce

How to Run an Application Faster?




There are 3 ways to improve performance:


Work Harder
Work Smarter
Get Help

Computer Analogy
Work harder: use faster hardware, e.g. reduce the time per instruction (clock cycle).
Work smarter: use optimized algorithms and techniques.
Get help: use multiple computers to solve the problem, i.e. cluster computing, which increases the number of instructions executed per clock cycle.

What is a cluster?


A cluster is a type of parallel or distributed processing system that consists of a collection of interconnected stand-alone (complete) computers working together cooperatively as a single, integrated computing resource. Each of these autonomous machines contributes its resources in the form of:
CPU cycles
Memory
Network Bandwidth
Disk

Motivation for using Clusters




Surveys show that utilization of CPU cycles on desktop workstations is typically <10%.
Performance of workstations and PCs is rapidly improving.
As performance grows, percent utilization will decrease even further.
Organizations are reluctant to buy large supercomputers, due to the large expense and short useful life span.

Cluster in Web Server




The autonomous machines acting as web servers are the workhorses of the cluster.
All requests are distributed evenly among these servers.
An optimal load balancing technique decides which server will handle the next request.

Features of Load Balanced Cluster




Server load should be evenly distributed among the machines.
If any server fails, the failure should be masked by distributing the requests to the remaining servers without bringing down the service.
The design should allow more machines to be added to the cluster without bringing down the service.
Efficient load balancing techniques should be used to distribute the load among the slave nodes.
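The first three features above (even distribution, masking of failures, adding machines without downtime) can be illustrated with a small sketch. This is not the project's code: the class `RoundRobinSelector` and its methods are hypothetical names, and plain round-robin rotation is assumed as the simplest way to spread requests evenly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical round-robin selector that skips nodes marked as failed. */
final class RoundRobinSelector {
    private final List<String> nodes = new ArrayList<String>();
    private final List<Boolean> alive = new ArrayList<Boolean>();
    private int next = 0; // index of the node to try first on the next call

    /** Adds a node at run time; the selector keeps serving while it grows. */
    synchronized void addNode(String name) {
        nodes.add(name);
        alive.add(Boolean.TRUE);
    }

    /** Marks a node as failed or recovered, e.g. after a health check. */
    synchronized void setAlive(String name, boolean isAlive) {
        int i = nodes.indexOf(name);
        if (i >= 0) {
            alive.set(i, Boolean.valueOf(isAlive));
        }
    }

    /** Returns the next live node in rotation, or null if no node is alive. */
    synchronized String nextNode() {
        for (int tried = 0; tried < nodes.size(); tried++) {
            int i = (next + tried) % nodes.size();
            if (alive.get(i).booleanValue()) {
                next = (i + 1) % nodes.size();
                return nodes.get(i);
            }
        }
        return null;
    }
}
```

A failed node is simply passed over, so clients see no interruption as long as at least one node stays alive. How failures are detected (the call to `setAlive`) is left open here.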

Existing Load Balancing techniques for Web Server


   

HTTP-redirection approach
Client-side approach
Server-side DNS approach
Server-side single-IP-image approach
Application Gateway System

Goals of Project
 

  

High availability
Design an efficient load balancing algorithm
Increase performance
Scalability
Efficient monitoring of system performance and an adequate logging mechanism

Balancing Algorithms
  

Round Robin Algorithm
Based on Probability Distribution
Proposed Algorithm (Dynamic Feedback Algorithm)

Dynamic Feedback Algorithm




This algorithm calculates a Balancing Metric from several system parameters:
CPU utilization
Memory utilization
Bandwidth utilization
The Balancing Metric is used to determine which slave node will handle the next request.

Calculating balancing metric for Dynamic Feedback algorithm


Let the current sampled values be:
CPU utilization: cpu %
Memory utilization: mem %
Bandwidth utilization: bw %

The weighted CPU utilization, wd_cpu, is defined as

wd_cpu = (wd_cpu(old) + cpu * RF_CPU) / (1 + RF_CPU)

where RF_CPU is the Relative Factor, i.e. it determines how much effect the newly sampled value of CPU utilization has on the weighted CPU utilization and, in turn, on the balancing metric.
Similarly, wd_mem and wd_bw are the weighted memory and bandwidth utilization, and RF_MEM and RF_BW are the Relative Factors for memory and bandwidth.
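A minimal sketch of this update, reading the formula as wd = (wd(old) + sample * RF) / (1 + RF). The class name `WeightedUtilization` and its methods are hypothetical, and the same helper is assumed to serve CPU, memory and bandwidth alike.

```java
/** Hypothetical helper for the weighted-utilization update of one parameter. */
final class WeightedUtilization {
    private final double relativeFactor; // RF_CPU, RF_MEM or RF_BW
    private double weighted;             // wd_cpu, wd_mem or wd_bw, in percent

    WeightedUtilization(double relativeFactor, double initialPercent) {
        if (relativeFactor < 0.0) {
            throw new IllegalArgumentException("relative factor must be >= 0");
        }
        this.relativeFactor = relativeFactor;
        this.weighted = initialPercent;
    }

    /** Folds a new sample in: wd = (wd(old) + sample * RF) / (1 + RF). */
    double update(double samplePercent) {
        weighted = (weighted + samplePercent * relativeFactor)
                 / (1.0 + relativeFactor);
        return weighted;
    }

    /** Current weighted utilization in percent. */
    double value() {
        return weighted;
    }
}
```

For example, with an old weighted value of 20 % and a new sample of 60 %, a Relative Factor of 1 gives 40 % (the plain average), while a Relative Factor of 3 gives 50 %. A larger factor makes the metric follow recent samples more closely, and a factor of 0 ignores new samples entirely.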

Calculating Balancing Metric


WF_CPU denotes the Weight Factor of CPU utilization in calculating the metric value. Similarly, WF_MEM and WF_BW are the Weight Factors for memory and bandwidth utilization. The Balancing Metric is then given by

Metric = (WF_CPU * wd_cpu + WF_MEM * wd_mem + WF_BW * wd_bw) / (WF_CPU + WF_MEM + WF_BW)
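The metric is a weighted average of the three weighted utilizations. The sketch below computes it and picks a node from a set of metrics. The names `BalancingMetric`, `metric` and `leastLoaded` are hypothetical, and choosing the node with the lowest metric is an assumption: the slides say only that the metric determines which slave node gets the next request.

```java
/** Hypothetical helper: balancing metric and selection of a slave node. */
final class BalancingMetric {
    private BalancingMetric() {
    }

    /**
     * Metric = (WF_CPU * wd_cpu + WF_MEM * wd_mem + WF_BW * wd_bw)
     *          / (WF_CPU + WF_MEM + WF_BW)
     */
    static double metric(double wdCpu, double wdMem, double wdBw,
                         double wfCpu, double wfMem, double wfBw) {
        double sum = wfCpu + wfMem + wfBw;
        if (sum <= 0.0) {
            throw new IllegalArgumentException(
                    "weight factors must sum to a positive value");
        }
        return (wfCpu * wdCpu + wfMem * wdMem + wfBw * wdBw) / sum;
    }

    /**
     * Returns the index of the smallest metric, or -1 for an empty array.
     * Assumption: a lower metric means a less loaded node.
     */
    static int leastLoaded(double[] metrics) {
        int best = -1;
        for (int i = 0; i < metrics.length; i++) {
            if (best < 0 || metrics[i] < metrics[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

With weighted utilizations of 50 % (CPU), 20 % (memory) and 10 % (bandwidth) and Weight Factors of 2, 1 and 1, the metric is (100 + 20 + 10) / 4 = 32.5. Dividing by the sum of the Weight Factors keeps the metric on the same 0 to 100 scale as the utilizations, whatever factors are chosen.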

Important Consideration for Dynamic Feedback Algorithm




 

Optimizing Relative and Weight Factors
Sampling Delay
Optimizing Number of Slaves

Architecture of the Load Balancing System

Implementation Issues


 

Request Forwarding
  1. Network Address Translation (NAT)
  2. IP Tunneling
  3. Distributed Packet Rewriting (DPR)
  4. Request forwarding and rewriting response
Site of Implementation
  1. DNS Server
  2. Router
Single vs. Multiple Master (Load Balancer)
Scalability

Working of System

Various Modules of the System




Load Balancing Algorithms


  1. Round Robin Algorithm
  2. Random (Based on Gaussian distribution)
  3. Dynamic Feedback Algorithm

  

HTTP request redirection
Graphical User Interface
Performance monitoring and sampling of system parameters
Agents at slave nodes

Performance Panel

Status Window of a particular Slave node

Development Tools and Programming Language Used


   

Java (JDK 1.4)
Visual C++ 6.0
XML version 1.0
Java Web Server 2.0

Limitations


All responses to requests are served through the load balancer itself, which adds the unnecessary overhead of rewriting each response at the load balancer.
The system has been designed at the application layer, using request forwarding and response rewriting.
The system is designed with only a single load balancer, which can become a bottleneck; moreover, if the load balancer fails, the system as a whole fails.

Future Enhancements


Direct response to the client by the slave node, without rewriting of the response at the load balancer.
The system can be implemented at a lower layer of the TCP/IP suite using NAT or IP tunneling.
Using multiple load balancers.

Back-up Slides

Network Address Translation (NAT)

IP Tunneling

Distributed Packet Rewriting (DPR)

Main Panel

Status Panel

Log Panel

Configuration Panel

Clustering of Computers for Collective Computing: Trends

Cluster Computing


A collection of autonomous computers contributing their resources to accomplish a particular computing task. Resources can be in the form of:
CPU cycles
Memory
Network Bandwidth
Disk
