Elastic Load Balancing (ELB)
General Info
- Scaling
- Scale IN & OUT automatically
- Highly Available within a region
- Integrates with autoscaling
- VPC Integration
- Works with VPCs to route traffic INTERNALLY between APP tiers
- Supports integrated CERT MGMT (Certificate Manager) & SSL TERMINATION
- Listeners
- Process that checks for connection requests (e.g., requests arriving through a CNAME that points to the DNS name of the LB)
- Each LB must have 1+ listeners configured
- Protocols at 2 OSI levels
- LAYER 4 (TRANSPORT) = TCP/SSL
- LAYER 7 (APP) = HTTP/HTTPS
Types
Internet-Facing LBs |
- Takes requests from clients over the Internet & distributes them to EC2
- Has a public DNS name
- Enables traffic encryption between
- CLIENTS (which initiate the HTTPS session) ↔ LB
- LB ↔ backend INSTANCES
|
Internal LBs |
- In a multi-tier app
- Route traffic to EC2 in VPCs within PRIVATE subnets
|
Application LB (ALB) - HTTP/HTTPS |
- In a security group
- Integrated with WAF
- Authentication: integrates with Cognito and supports OpenID Connect
- Can enable access logging to an S3 bucket
- Redundant across at least two subnets
- Routing: round robin, least outstanding requests
|
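The two ALB routing options above can be sketched in a few lines. This is a minimal illustration, not AWS's implementation: the instance names and the `in_flight` counter dictionary are hypothetical.

```python
from itertools import cycle


def round_robin(targets):
    """Return an endless iterator that hands out targets in rotating order."""
    return cycle(targets)


def least_outstanding(in_flight):
    """Pick the target with the fewest unfinished requests.

    `in_flight` maps a (hypothetical) target name to its current number of
    in-flight requests. Ties go to the first target in insertion order.
    """
    return min(in_flight, key=in_flight.get)


# Round robin ignores load: each target simply takes its turn.
rr = round_robin(["i-a", "i-b", "i-c"])
rr_order = [next(rr) for _ in range(4)]

# Least outstanding requests favours the least busy target.
chosen = least_outstanding({"i-a": 5, "i-b": 2, "i-c": 7})
```

Round robin suits requests of similar cost; least outstanding requests helps when request durations vary widely.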
Network LB (NLB) - TCP/TLS |
- Creates a (read only) network interface in a subnet in each AZ you choose
- Not in a security group: instance security groups must allow traffic from its IP address and from client IP addresses
- Routing: flow hash
|
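Flow-hash routing can be illustrated as hashing a connection's identifying fields and using the result to pick a target, so every packet of one flow lands on the same instance. The sketch below assumes a 5-tuple key and SHA-256 purely for illustration; the actual hash function and exact fields used by the NLB are not stated in these notes.

```python
import hashlib


def flow_hash_target(flow, targets):
    """Map a connection's identifying tuple to one target, deterministically."""
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the targets.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["i-a", "i-b", "i-c"]  # hypothetical instance names
# (protocol, source IP, source port, destination IP, destination port)
flow = ("tcp", "203.0.113.7", 50123, "10.0.1.20", 443)

first = flow_hash_target(flow, targets)
second = flow_hash_target(flow, targets)  # same flow, same target
```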
Classic |
- Routing: round robin (for TCP listeners), least outstanding requests (for HTTP/HTTPS listeners)
|
Security
Security Benefits |
- Takes over encryption from EC2 instances
- When in VPC, supports creation and management of security groups associated with ELB
- Supports end-to-end encryption with TLS certificates managed centrally (not on each instance)
|
Features |
- Server Order Preference option: ELB selects cipher suites based on the server's prioritization instead of the client's
- Supports Perfect Forward Secrecy
- Lets you identify the originating IP address of clients
|
Access Logs |
- IP/Port of client
- IP of instance that processed the request
- Size of request/response
|
Special Configurations
Configuration |
Description |
Idle Connection Timeout |
- For each request a client makes through a LB, the LB maintains 2 connections (1 w/ CLIENT, 1 w/ EC2)
- For each connection, LB manages an IDLE TIMEOUT (default = 60 sec), triggered when NO DATA is sent over the connection
- If an HTTP request does not complete w/in the idle timeout, the LB closes the connection even if data is still being transferred
- Keep-Alive option for EC2 → keep-alive time > idle timeout
- Only for HTTP(S) listeners
- Enable it in web server or EC2 kernel → lower CPU usage
- Allows LB to reuse connections
|
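The idle-timeout rules above reduce to two small checks. A minimal sketch, with timestamps given as plain seconds and function names that are hypothetical:

```python
DEFAULT_IDLE_TIMEOUT = 60  # seconds, the default stated in the notes


def should_close(last_data_at, now, idle_timeout=DEFAULT_IDLE_TIMEOUT):
    """True when no data has been seen for at least `idle_timeout` seconds."""
    return (now - last_data_at) >= idle_timeout


def keep_alive_ok(keep_alive, idle_timeout=DEFAULT_IDLE_TIMEOUT):
    """The backend keep-alive time should be greater than the LB idle timeout."""
    return keep_alive > idle_timeout
```

For example, `keep_alive_ok(75)` passes with the default timeout, while `keep_alive_ok(60)` does not, because the backend could close the connection at the same moment the LB tries to reuse it.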
Cross-Zone LB |
- Enable to ensure traffic is ROUTED EVENLY regardless of AZ
- Lower need to maintain equivalent #EC2 in each AZ
- Greater ability to handle loss of 1+ EC2
|
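A short arithmetic sketch shows why cross-zone load balancing evens out traffic. It assumes that, without cross-zone, incoming traffic is split equally between the enabled AZs and then shared only among the instances of that AZ; the AZ names and instance counts are made up.

```python
from fractions import Fraction


def per_instance_share(instances_per_az, cross_zone):
    """Return, for each AZ, the share of total traffic one instance receives."""
    total = sum(instances_per_az.values())
    if cross_zone:
        # Every registered instance receives an equal share.
        return {az: Fraction(1, total) for az in instances_per_az}
    # Each AZ gets an equal slice, split among only its own instances.
    az_share = Fraction(1, len(instances_per_az))
    return {az: az_share / n for az, n in instances_per_az.items()}


fleet = {"az-a": 2, "az-b": 8}
without = per_instance_share(fleet, cross_zone=False)
with_cz = per_instance_share(fleet, cross_zone=True)
```

With 2 instances in one AZ and 8 in the other, each instance in the small AZ handles 1/4 of all traffic without cross-zone, versus 1/16 in the large AZ; with cross-zone enabled every instance handles 1/10.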
Connection Draining |
- LB stops sending requests to DEREGISTERING/UNHEALTHY EC2
- Specify a max time for LB to keep connections alive before reporting instance as deregistered
- Values
- Default = 300s
- Min = 1s
- Max = 3,600s
|
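The draining limits above can be captured as a small validation helper, together with the rule for when an instance is finally reported as deregistered. This is a sketch only; the function names are hypothetical.

```python
DRAIN_DEFAULT, DRAIN_MIN, DRAIN_MAX = 300, 1, 3600  # seconds, from the notes


def drain_timeout(value=None):
    """Return a valid draining timeout, falling back to the default."""
    if value is None:
        return DRAIN_DEFAULT
    if not DRAIN_MIN <= value <= DRAIN_MAX:
        raise ValueError(f"timeout must be {DRAIN_MIN}-{DRAIN_MAX} seconds")
    return value


def is_deregistered(in_flight, elapsed, timeout):
    """Deregistration completes when requests finish or the timeout expires."""
    return in_flight == 0 or elapsed >= timeout
```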
Proxy Protocol |
- When using TCP or SSL for both FRONT & BACK end connections, LB forwards reqs to EC2 w/o modifying REQUEST HEADERS
- If enabled, a human-readable header is added with info like src/dst IP address & port numbers
- LB should not be behind proxy if proxy protocol is enabled
|
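The human-readable header mentioned above follows the Proxy Protocol version 1 text format: a single line with the address family, source and destination addresses, and source and destination ports. A minimal sketch of building and parsing that line (the helper names and sample addresses are hypothetical):

```python
def build_proxy_v1(src_ip, dst_ip, src_port, dst_port, family="TCP4"):
    """Build the one-line Proxy Protocol v1 header, terminated by CRLF."""
    return f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"


def parse_proxy_v1(line):
    """Parse a Proxy Protocol v1 header line into a dictionary."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError("not a Proxy Protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }


header = build_proxy_v1("203.0.113.7", "10.0.1.20", 50123, 443)
parsed = parse_proxy_v1(header)
```

The backend must read this line before treating the rest of the stream as application data, which is why it has to be configured to expect the header.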
Sticky Sessions (Session Affinity) |
- By default a LB routes each request INDEPENDENTLY to the registered instance with the smallest load
- Enables LB to BIND a USER'S SESSION to a SPECIFIC INSTANCE
- Types
- If APP has Session Cookie → TTL specified by the app
- If APP does NOT have Session Cookies → ELB creates one (AWSELB)
|
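Session affinity can be sketched as a cookie lookup before the normal routing decision. The sketch below is a simplification: the real `AWSELB` cookie value is an opaque encoded string, whereas here it simply holds a hypothetical instance name, and `pick_default` stands in for the LB's normal routing choice.

```python
def route(cookies, instances, pick_default):
    """Return (chosen_instance, cookies_to_set) for one request."""
    bound = cookies.get("AWSELB")
    if bound in instances:
        # The session is already bound to a registered instance: reuse it.
        return bound, {}
    # No valid binding yet: route normally and bind the session.
    chosen = pick_default(instances)
    return chosen, {"AWSELB": chosen}


instances = ["i-a", "i-b"]
first, set_cookies = route({}, instances, pick_default=lambda xs: xs[0])
# The follow-up request carries the cookie, so the default choice is ignored.
again, no_cookies = route(set_cookies, instances, pick_default=lambda xs: xs[1])
```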
Health Checks |
- Test the states of EC2 instances behind the ELB
- States: InService, OutOfService
- Types: ping, connection attempt, page checked periodically
|
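The transition between the two health states can be sketched as a small function over recent check results. The consecutive-success and consecutive-failure thresholds used here are assumptions added for illustration; the notes do not specify them.

```python
def next_state(state, recent_results, healthy_threshold=2, unhealthy_threshold=2):
    """Return the new state given recent check results (newest last).

    `recent_results` is a list of booleans: True for a passed check.
    Thresholds are hypothetical defaults, not values from the notes.
    """
    if state == "InService":
        tail = recent_results[-unhealthy_threshold:]
        if len(tail) == unhealthy_threshold and not any(tail):
            return "OutOfService"
    else:
        tail = recent_results[-healthy_threshold:]
        if len(tail) == healthy_threshold and all(tail):
            return "InService"
    return state
```

Requiring several consecutive results before flipping state avoids removing an instance because of a single slow response.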
Auto Scaling
General Info
Lets you scale EC2 capacity automatically by scaling in/out
Auto Scaling Plans
Maintain Current Instance Levels |
- Maintain a minimum or specified number of running instances at all times
- Performs periodic health checks on running instances within an auto scaling group
- When it finds an unhealthy instance → terminates it & launches a new one
- STEADY STATE WORKLOADS
|
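The "maintain current instance levels" plan amounts to a reconcile step: drop unhealthy instances, then launch replacements until the desired count is met again. A minimal sketch, where the `launch` callback and instance IDs are hypothetical stand-ins for the real service:

```python
from itertools import count


def reconcile(instances, desired, launch):
    """Return the fleet after replacing unhealthy instances.

    `instances` maps an instance ID to True (healthy) or False (unhealthy).
    `launch` is a callable that returns the ID of a newly started instance.
    """
    # Terminate (drop) every unhealthy instance.
    kept = {iid: True for iid, healthy in instances.items() if healthy}
    # Launch replacements until the desired capacity is restored.
    while len(kept) < desired:
        kept[launch()] = True
    return kept


_ids = count(1)


def launch():
    """Hypothetical launcher that just invents a new instance ID."""
    return f"i-new{next(_ids)}"


fleet = {"i-1": True, "i-2": False, "i-3": True}
result = reconcile(fleet, desired=3, launch=launch)
```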
Manual Scaling |
- Specify the change in max/min/desired capacity
- Manages process of creating/terminating instances to maintain the updated capacity
- Manual scaling out for INFREQUENT EVENTS
|
Scheduled Scaling |
- PREDICTABLE SCHEDULE (recurring events)
- Scaling performed as a function of TIME and DATE
|
Dynamic Scaling |
- Define parameters that control process in a scaling policy
- Example: policy that adds more EC2 instances to the web tier when network bandwidth (as measured by CloudWatch) reaches a threshold
|
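The example policy above can be sketched as a threshold check on a metric value. The metric units, threshold, and step size below are illustrative assumptions, and the function is a stand-in for what a CloudWatch alarm plus a scaling policy would do together.

```python
def evaluate_policy(metric_value, threshold, current, step, max_size):
    """Add `step` instances when the metric reaches the threshold.

    The result never exceeds `max_size`; below the threshold nothing changes.
    """
    if metric_value >= threshold:
        return min(current + step, max_size)
    return current


# Hypothetical numbers: bandwidth in Mbps against a threshold of 800.
scaled = evaluate_policy(metric_value=850, threshold=800, current=4, step=2, max_size=10)
unchanged = evaluate_policy(metric_value=300, threshold=800, current=4, step=2, max_size=10)
```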
Auto Scaling Components
- Launch Configuration
- Description
- Template used to create new instances
- Each auto scaling group can have only 1 launch config
- Composed of
- Config name
- AMI + EC2 instance type
- IAM role to associate with created instances
- Optional settings: SGs, instance key pair, block device mapping
- Limits
- 100 launch configurations per region (default)
- View current limits with: aws autoscaling describe-account-limits (this command only reports limits; increases are requested from AWS)
- Auto scaling may trigger limits of other services (e.g., default number of EC2 you can launch within 1 region is 20)
- Auto Scaling Group (ASG)
- Description
- Collection of EC2 instances managed by the auto scaling service
- Each ASG contains config options that control when AS should launch/terminate instances
- Each ASG contains:
- name
- min/max #instances
- desired capacity (optional) → default = min specified
- Instance purchasing options (1 launch config can't reference both On-Demand and Spot)
ON-DEMAND |
default |
SPOT |
used by referencing a MAX BID PRICE in the launch config |
- Scaling Policy
- Description
- Associate CloudWatch alarms & scaling policies with an ASG to adjust AS dynamically
- When threshold is crossed, CW sends alarms to trigger changes to number of EC2 behind an ELB
- Each ASG can contain 1+ policies
- Ways to configure scaling policy
- increase/decrease by specific number of EC2
- target specific number of EC2
- adjust based on a %
- scale by steps
- Best Practice
- Scale OUT quickly
- Scale IN slowly
- So you can respond to bursts without inadvertently terminating EC2 too quickly
- Cooldown
- Period after a scaling activity during which AS suspends further scaling for an ASG
- Costs
- Under hourly billing: if you start an EC2 instance → billed for 1 full hour
- Partial instance hours → billed as full hours
- BOOTSTRAPPING takes time before instance is healthy
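The ways of configuring a scaling policy listed under Scaling Policy (change by a number, target a number, adjust by a percentage) and the cooldown can be sketched together. The adjustment names mirror the EC2 Auto Scaling adjustment types; the truncating percentage rounding and the 300-second cooldown are simplifying assumptions, not values taken from the notes.

```python
def apply_adjustment(current, kind, value, min_size, max_size):
    """Return the new desired capacity, clamped to the group's min/max."""
    if kind == "ChangeInCapacity":
        target = current + value  # add or remove a fixed number
    elif kind == "ExactCapacity":
        target = value  # jump straight to a specific number
    elif kind == "PercentChangeInCapacity":
        # Simplified rounding: truncate toward zero.
        target = current + int(current * value / 100)
    else:
        raise ValueError(f"unknown adjustment type: {kind}")
    return max(min_size, min(max_size, target))


def in_cooldown(last_scaled_at, now, cooldown=300):
    """True while further scaling should be suspended (times in seconds)."""
    return (now - last_scaled_at) < cooldown
```

A longer cooldown for scale-in than for scale-out is one way to follow the "scale out quickly, scale in slowly" practice, and it leaves time for bootstrapping to finish before the next decision.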