What Is Load Balancing?
A load balancer is a device or software component that sits in front of a pool of servers and distributes incoming client requests across those servers according to a defined algorithm. Without load balancing, all traffic hits a single server — once that server is saturated, response times degrade and the application fails for users.
Load balancing solves three core problems simultaneously. Scalability: you can add more servers to the pool as demand grows without changing anything on the client side. Availability: if a server fails health checks, the load balancer stops routing to it and spreads traffic to the remaining healthy servers. Performance: requests are spread so each server operates within its optimal capacity, keeping response times low.
Load balancers perform health checks on pool members — periodically sending requests (TCP connection attempts, HTTP GET requests, or custom scripts) to verify each server is up and responding correctly. A server that fails a configurable number of consecutive health checks is marked down and removed from rotation until it recovers.
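The mark-down/mark-up behavior described above can be sketched as a small state machine. This is an illustrative sketch, not any vendor's implementation; the thresholds (`fall=3`, `rise=2`) are assumed values mirroring common defaults:

```python
class PoolMember:
    """Tracks one backend server's health from consecutive check results."""

    def __init__(self, name, fall=3, rise=2):
        self.name = name
        self.fall = fall      # consecutive failures before marking down
        self.rise = rise      # consecutive successes before marking back up
        self.healthy = True
        self._streak = 0      # positive = run of passes, negative = run of failures

    def record(self, check_passed):
        if check_passed:
            self._streak = self._streak + 1 if self._streak >= 0 else 1
            if not self.healthy and self._streak >= self.rise:
                self.healthy = True    # server re-enters rotation
        else:
            self._streak = self._streak - 1 if self._streak <= 0 else -1
            if self.healthy and -self._streak >= self.fall:
                self.healthy = False   # server removed from rotation
        return self.healthy
```

Requiring several consecutive successes before re-adding a server prevents "flapping" — a marginal server bouncing in and out of the pool on every other check.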
Load Balancing Algorithms
The algorithm determines which server receives each new connection. Different algorithms suit different workload types — no single algorithm is best for every scenario.
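As a concrete illustration, here is a minimal sketch of two widely used algorithms — round robin and least connections. The class and method names are invented for this example, not taken from any product:

```python
import itertools

class RoundRobin:
    """Cycle through servers in fixed order — suits uniform servers and
    short, similar-cost requests."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new connection to the server with the fewest active
    connections — suits workloads where request durations vary widely."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a connection closes, so counts reflect live load.
        self.active[server] -= 1
```

Round robin ignores how busy each server is; least connections tracks live load, which is why it handles long-lived connections (downloads, streaming) more gracefully.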
Layer 4 vs Layer 7 Load Balancing
The distinction between Layer 4 and Layer 7 is one of the most common exam topics. The key difference is what information the load balancer can see and act on: an L4 load balancer sees only IP addresses and TCP/UDP port numbers, while an L7 load balancer inspects the full application payload — URLs, HTTP headers, and cookies.
A single L7 load balancer can route /api/* requests to the API server pool, /static/* requests to a CDN origin pool, and /admin/* requests to a restricted management pool — all based on the URL path in the HTTP request. An L4 load balancer cannot do this because it never sees the URL.
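The path-based routing described above can be sketched as a prefix lookup. The route table and pool names are hypothetical, chosen to match the example:

```python
# Hypothetical L7 routing table: longest matching URL-path prefix wins.
ROUTES = {
    "/api/": "api-pool",
    "/static/": "cdn-origin-pool",
    "/admin/": "management-pool",
}

def route(path, default_pool="web-pool"):
    """Pick a backend pool from the HTTP request path — information that
    only an L7 load balancer, which parses the HTTP request, can see."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    return default_pool
```

Checking longer prefixes first means a more specific rule (say `/api/admin/`) would win over a general one — the same precedence logic real L7 proxies apply.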
Session Persistence (Sticky Sessions)
Many web applications store session state locally on the server — user login tokens, shopping cart data, form progress. If a user's subsequent requests are routed to a different server, that server has no knowledge of their session and the user appears logged out or loses their cart.
Session persistence (also called sticky sessions or session affinity) solves this by ensuring all requests from the same client always go to the same server. There are two main persistence methods, plus an application-level alternative that avoids the problem entirely:
| Method | How It Works | Pros | Cons |
|---|---|---|---|
| Cookie-based | Load balancer inserts a cookie identifying which server handled the first request. Subsequent requests include the cookie, so the LB always routes to the same server. | Works behind NAT; precise per-user targeting | Requires L7 LB; cookies can be cleared by user |
| Source IP hash | Client's source IP is hashed to a consistent server. Same IP always maps to the same server. | No application changes needed; works at L4 | NAT breaks it — all users behind one NAT IP hit the same server; poor distribution |
| Application-managed | Session state is stored in a shared database or cache (Redis, Memcached) accessible to all servers. Any server can handle any request. | True scalability; no single-server dependency | Requires application architecture changes; shared cache is a new component to manage |
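The source-IP-hash method from the table can be sketched in a few lines. The function name is illustrative; real load balancers implement this in their routing engine:

```python
import hashlib

def pick_by_source_ip(client_ip, servers):
    """Hash the client's source IP to a stable server index.
    The same IP always maps to the same server — no cookie required."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```

This also makes the NAT weakness visible: every user behind one corporate NAT shares one source IP, so the hash sends them all to a single server.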
The exam may present a scenario where users are randomly losing their sessions. This is the classic symptom of a stateful application running behind a load balancer without session persistence configured. The fix is either: (1) enable sticky sessions on the load balancer, or (2) refactor the application to use a shared session store.
Also remember: if a sticky server fails, that client's session is lost regardless of persistence settings — the session data only existed on that server.
High Availability — Active/Active vs Active/Passive
The load balancer itself can become a single point of failure: if it goes down, all traffic is lost. High availability (HA) configurations therefore deploy load balancers in pairs. In an active/passive pair, one unit handles all traffic while the standby monitors it (typically via VRRP heartbeats) and takes over the virtual IP if the active unit fails. In an active/active pair, both units handle traffic simultaneously — better hardware utilization, but each unit must be sized to carry the full load alone if its partner fails.
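Failover in an active/passive pair can be sketched as heartbeat counting. This is a simplified model of VRRP-style behavior, not the actual protocol (real VRRP uses multicast advertisements and priorities); the `dead_after` threshold is an assumption:

```python
class StandbyLoadBalancer:
    """Passive unit: promotes itself (claims the shared VIP) after
    missing a configured number of heartbeats from the active unit."""

    def __init__(self, dead_after=3):
        self.dead_after = dead_after   # missed heartbeats before takeover
        self.missed = 0
        self.owns_vip = False

    def heartbeat_received(self):
        self.missed = 0                # peer is alive; stay passive

    def heartbeat_missed(self):
        self.missed += 1
        if self.missed >= self.dead_after:
            self.owns_vip = True       # failover: start answering on the VIP
```

Clients never notice the switch because they connect to the VIP, not to either physical load balancer.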
Load Balancer Health Checks
Health checks are what enable automatic failover. The load balancer continuously probes each server in the pool at a configurable interval. Common health check types:
| Health Check Type | What It Tests | Use When |
|---|---|---|
| TCP check | Opens a TCP connection to the server's port. If the connection succeeds, the server is considered up. | Non-HTTP services; quick L4-level liveness check |
| HTTP/HTTPS check | Sends an HTTP GET request to a specific path (e.g. /health) and expects a 200 OK response. | Web applications; ensures the app is actually responding, not just the TCP stack |
| Custom script | Runs a script that performs a deeper check — e.g. verifies database connectivity, checks disk space, or tests a critical business function. | Critical applications where basic HTTP response isn't sufficient evidence of health |
| SSL/TLS check | Performs an HTTPS health check and validates the certificate. Useful for detecting expired certificates before they cause user-facing errors. | HTTPS endpoints; certificate monitoring |
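The TCP check from the table above can be sketched with Python's standard `socket` module. The timeout value is an assumed default:

```python
import socket

def tcp_check(host, port, timeout=2.0):
    """L4 liveness check: the server counts as 'up' if the TCP
    three-way handshake completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note what this does and does not prove: a successful handshake shows the TCP stack is listening, but the application behind it could still be hung — which is exactly why HTTP-level checks exist.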
SSL Termination and SSL Passthrough
HTTPS traffic requires careful handling at the load balancer. Two approaches exist, each with different security and performance implications.
SSL Termination
The load balancer decrypts HTTPS traffic, inspects the plain HTTP content (enabling L7 routing, cookie injection, WAF), and then forwards to backend servers either as plain HTTP or re-encrypted HTTPS. This offloads CPU-intensive TLS handshake processing from backend servers. The traffic between load balancer and backend servers is unencrypted — only acceptable if that network segment is trusted (e.g., private VPC).
Also called: SSL offloading. The load balancer holds the TLS certificate and private key.
SSL Passthrough
The load balancer passes encrypted traffic straight to backend servers without decrypting it. Backend servers each hold the TLS certificate and handle decryption themselves. This provides end-to-end encryption and hides traffic from the load balancer — but the LB cannot do L7 content inspection, cookie insertion, or WAF filtering. This is an L4-only operation.
Key Terms
VIP (Virtual IP address) — the IP address clients connect to; floats between load balancers in HA configurations.
Pool / server farm — the group of backend servers the load balancer distributes traffic to.
VRRP — Virtual Router Redundancy Protocol; used between load balancer pairs to manage the VIP.
ADC (Application Delivery Controller) — an enterprise load balancer with advanced features: SSL termination, WAF, compression, caching (e.g. F5 BIG-IP, Citrix ADC).
Exam Scenarios