
Elastic Load Balancing (ELB)

General Info

  • Scale IN & OUT automatically
  • Highly Available within a region
  • Integrates with autoscaling
VPC Integration
  • Works with VPCs to route traffic INTERNALLY between APP tiers
  • Supports integrated CERT MGMT (Certificate Manager) & SSL TERMINATION
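Listeners
  • A listener is a process that checks for connection requests on a configured protocol and port (e.g., requests arriving via a CNAME that points to the LB's DNS name)
  • Each LB must have 1+ listeners configured
  • Protocols at 2 OSI levels: Layer 4 (TCP/SSL) and Layer 7 (HTTP/HTTPS)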


Internet-Facing LBs
  • Takes requests from clients over the Internet & distributes them to EC2
  • Has a public DNS name
  • Enables traffic encryption between
    • CLIENT (which initiates the HTTPS session) ↔ LB
    • LB ↔ backend INSTANCES
Internal LBs
  • In a multi-tier app
  • Route traffic to EC2 in VPCs within PRIVATE subnets
Application LB (ALB) - HTTP/HTTPS
  • In a security group
  • Integrated with WAF
  • Authentication: integrates with Cognito and supports OpenID Connect
  • Can enable access logging to an S3 bucket
  • Redundant across at least two subnets
  • Routing: round robin, least outstanding requests
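
The two routing algorithms listed above can be illustrated with a small Python sketch. The class names and the `outstanding` counter are hypothetical and only model the idea; this is not how ELB is implemented internally.

```python
from itertools import cycle


class RoundRobin:
    """Hand out targets in a fixed rotation."""

    def __init__(self, targets):
        self._rotation = cycle(targets)

    def pick(self):
        return next(self._rotation)


class LeastOutstandingRequests:
    """Pick the target with the fewest in-flight requests."""

    def __init__(self, targets):
        # Hypothetical bookkeeping: in-flight request count per target.
        self.outstanding = {t: 0 for t in targets}

    def pick(self):
        # Ties resolve to the first target in insertion order.
        target = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[target] += 1
        return target

    def done(self, target):
        # Call when a request finishes so the target's load drops again.
        self.outstanding[target] -= 1
```

Round robin ignores how busy a target is, whereas least outstanding requests favors targets that finish their work quickly, which can help when request durations vary widely.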
Network LB (NLB) - TCP/TLS
  • Creates a (read only) network interface in a subnet in each AZ you choose
  • Not in a security group: instance security groups must allow traffic from its IP address and from client IP addresses
  • Routing: flow hash
Classic LB (CLB) - TCP/SSL/HTTP/HTTPS
  • Routing: round robin (for TCP listeners), least outstanding requests (for HTTP/HTTPS listeners)
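
The flow-hash routing used by the NLB can be sketched as follows. This is a minimal illustration, assuming the hash covers the connection 5-tuple (the real NLB is documented to factor in the TCP sequence number as well); the function name and the use of SHA-256 are assumptions, not AWS internals.

```python
import hashlib


def pick_target(targets, protocol, src_ip, src_port, dst_ip, dst_port):
    """Map a flow's 5-tuple to one target.

    The same flow always hashes to the same target, so every packet of a
    connection reaches the same instance.
    """
    flow = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]
```

Calling `pick_target` twice with identical arguments returns the same target; a different source port (a new connection) may land elsewhere.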


Security Benefits
  • Takes over encryption from EC2 instances
  • When in VPC, supports creation and management of security groups associated with ELB
  • Supports end-to-end encryption with TLS certificates managed centrally (not on each instance)
  • Server Order Preference option: ELB selects cipher suites based on the server's prioritization instead of the client's
  • Supports Perfect Forward Secrecy
  • Lets you identify the originating IP address of clients
Access Logs
  • IP/Port of client
  • IP of instance that processed the request
  • Size of request/response

Special Configurations

Configuration Description
Idle Connection Timeout
  • For each request a client makes through a LB, the LB maintains 2 connections (1 w/ CLIENT, 1 w/ EC2)
  • For each connection, LB manages an IDLE TIMEOUT (default = 60sec), triggered when NO DATA is sent over the connection
  • If an HTTP request does not complete w/in the idle timeout, the LB closes the connection even if data is still being transferred
  • Keep-Alive option for EC2: set keep-alive time > idle timeout
    • Only for HTTP(S) listeners
    • Enable it in the web server or EC2 kernel settings → lower CPU utilization
    • Allows LB to reuse connections
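
The idle-timeout rule and the keep-alive recommendation above can be expressed as two small checks. This is a sketch only: the `Connection` class and both function names are hypothetical, and the 60-second default is the one quoted in these notes.

```python
from dataclasses import dataclass

DEFAULT_IDLE_TIMEOUT_S = 60  # default quoted in the notes above


@dataclass
class Connection:
    last_data_at: float  # time (in seconds) when data last crossed the connection

    def is_idle(self, now, idle_timeout=DEFAULT_IDLE_TIMEOUT_S):
        """True once no data has been sent for at least the idle timeout."""
        return now - self.last_data_at >= idle_timeout


def keep_alive_ok(keep_alive_s, idle_timeout_s=DEFAULT_IDLE_TIMEOUT_S):
    """The instance's keep-alive time should outlast the LB idle timeout."""
    return keep_alive_s > idle_timeout_s
```

If the instance closed its side first (keep-alive shorter than the idle timeout), the LB could try to reuse a connection that no longer exists, which is why the keep-alive value is set higher.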
Cross-Zone LB
  • Enable to ensure traffic is ROUTED EVENLY regardless of AZ
  • Lower need to maintain equivalent #EC2 in each AZ
  • Greater ability to handle loss of 1+ EC2
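
A quick calculation shows why cross-zone load balancing evens out traffic. The sketch below assumes each enabled AZ receives an equal share of incoming requests when cross-zone is off; the function name and AZ labels are hypothetical.

```python
def share_per_instance(instances_per_az, cross_zone):
    """Return the fraction of total traffic each instance receives, keyed by AZ."""
    azs = [az for az, count in instances_per_az.items() if count > 0]
    total = sum(instances_per_az[az] for az in azs)
    if cross_zone:
        # Every instance gets an equal slice regardless of its AZ.
        return {az: 1 / total for az in azs}
    # Each AZ gets an equal slice, split among that AZ's own instances.
    return {az: 1 / len(azs) / instances_per_az[az] for az in azs}
```

With 2 instances in one AZ and 8 in another, each instance in the small AZ handles 25% of traffic without cross-zone (versus 6.25% in the large AZ), and every instance handles 10% with cross-zone enabled.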
Connection Draining
  • LB stops sending requests to DEREGISTERING/UNHEALTHY EC2
  • Specify a max time for LB to keep connections alive before reporting instance as deregistered
  • Values
    • Default = 300s
    • Min = 1s
    • Max = 3,600s
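
The draining limits above translate into a simple validation plus a deadline check. The helper names are hypothetical; only the default, minimum, and maximum come from these notes.

```python
DEFAULT_DRAIN_S = 300
MIN_DRAIN_S = 1
MAX_DRAIN_S = 3600


def validate_drain_timeout(seconds=DEFAULT_DRAIN_S):
    """Return the timeout if it lies within the allowed range, else raise."""
    if not MIN_DRAIN_S <= seconds <= MAX_DRAIN_S:
        raise ValueError(
            f"draining timeout must be {MIN_DRAIN_S}-{MAX_DRAIN_S}s, got {seconds}"
        )
    return seconds


def drain_window_over(deregistered_at, now, timeout_s=DEFAULT_DRAIN_S):
    """True once in-flight connections have had the full draining window."""
    return now - deregistered_at >= timeout_s
```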
Proxy Protocol
  • When using TCP or SSL for both FRONT & BACK end connections, LB forwards reqs to EC2 w/o modifying REQUEST HEADERS
  • If enabled, a human-readable header is added with info like src/dst IP address & port numbers
  • LB should not be behind a proxy server that itself has Proxy Protocol enabled (otherwise the request ends up with two headers)
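
To make the human-readable header concrete, here is a minimal sketch that builds and parses a Proxy Protocol version 1 line of the form `PROXY TCP4 <src ip> <dst ip> <src port> <dst port>\r\n`. The function names are hypothetical, and the parser deliberately handles only the TCP4/TCP6 form, not the `PROXY UNKNOWN` variant.

```python
def build_proxy_v1_header(src_ip, dst_ip, src_port, dst_port, family="TCP4"):
    """Build the single-line header prepended to the forwarded TCP stream."""
    return f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"


def parse_proxy_v1_header(line):
    """Extract the original source/destination from a v1 header line."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a supported Proxy Protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }
```

A backend that reads this line before the application payload recovers the client's address even though the TCP connection itself comes from the LB.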
Sticky Sessions (Session Affinity)
  • By default a LB routes each request INDEPENDENTLY to the registered instance with the smallest load
  • Types
    • If APP has Session Cookie → TTL specified by the app
    • If APP does NOT have Session Cookies → ELB creates one (AWSELB)
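
The effect of an ELB-generated cookie can be modeled as below. This is only a sketch: the `StickyRouter` class, its in-memory session table, and the token format are hypothetical (the real `AWSELB` cookie value is opaque and managed by the load balancer).

```python
class StickyRouter:
    COOKIE = "AWSELB"  # cookie name from the notes above

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0
        self._sessions = {}  # cookie value -> instance (hypothetical table)

    def route(self, cookies):
        """Return (instance, cookies to send back) for one request."""
        token = cookies.get(self.COOKIE)
        if token in self._sessions:
            # Known session: keep sending it to the same instance.
            return self._sessions[token], cookies
        # New session: pick the next instance and issue a cookie for it.
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        token = f"session-{self._next}"
        self._sessions[token] = instance
        return instance, {**cookies, self.COOKIE: token}
```

A client that replays the returned cookie stays on one instance, while a client without the cookie is routed independently.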
Health Checks
  • Test the states of EC2 instances behind the ELB
  • States: InService, OutOfService
  • Types: ping, connection attempt, page checked periodically
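
The transition between the two states above can be sketched as a small state machine. It assumes the common pattern of requiring several consecutive failed (or passed) checks before flipping state; the class name, the default thresholds of 2, and the initial `InService` state are illustrative assumptions.

```python
class HealthTracker:
    def __init__(self, healthy_threshold=2, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.state = "InService"
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Feed one health-check result and return the resulting state."""
        in_service = self.state == "InService"
        if check_passed == in_service:
            # Result agrees with the current state: reset the streak.
            self._streak = 0
            return self.state
        self._streak += 1
        needed = self.unhealthy_threshold if in_service else self.healthy_threshold
        if self._streak >= needed:
            self.state = "OutOfService" if in_service else "InService"
            self._streak = 0
        return self.state
```

Requiring a streak avoids flapping: a single slow response does not pull an instance out of rotation.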

Auto Scaling

General Info

Lets you scale EC2 capacity automatically by scaling in/out

Auto Scaling Plans

Maintain Current Instance Levels
  • Maintain a minimum or specified number of running instances at all times
  • Performs periodic health checks on running instances within an auto scaling group
  • When it finds an unhealthy instance → terminates it & launches a new one
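
The replace-unhealthy behavior above amounts to a reconcile step, sketched here under simplifying assumptions. The `reconcile` function, the `fleet` mapping, and the `launch` callable are hypothetical stand-ins for what the Auto Scaling service does.

```python
def reconcile(fleet, desired, launch):
    """Drop unhealthy instances and launch replacements up to `desired`.

    fleet:  dict mapping instance id -> True if the instance is healthy.
    launch: zero-argument callable returning a new instance id (hypothetical).
    Returns (new_fleet_ids, terminated_ids).
    """
    healthy = [iid for iid, ok in fleet.items() if ok]
    terminated = [iid for iid, ok in fleet.items() if not ok]
    missing = max(0, desired - len(healthy))
    return healthy + [launch() for _ in range(missing)], terminated
```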
Manual Scaling
  • Specify the change in max/min/desired capacity
  • Manages process of creating/terminating instances to maintain the updated capacity
  • Manual scaling out for INFREQUENT EVENTS
Scheduled Scaling
  • PREDICTABLE SCHEDULE (recurring events)
  • Scaling performed as a function of TIME and DATE
Dynamic Scaling
  • Define parameters that control process in a scaling policy
  • Example: policy that adds more EC2 instances to the web tier when network bandwidth (measured by CloudWatch) reaches a threshold

Auto Scaling Components

Launch Configuration
  • Description
    • Template used to create new instances
    • Each auto scaling group can have only 1 launch config at a time
  • Composed of
    • Config name
    • AMI + EC2 instance type
    • IAM role to associate with created instances
    • Optional settings: SGs, instance key pair, block device mapping
  • Limits
    • 100 launch configurations per region
    • Can be viewed with: aws autoscaling describe-account-limits
    • Auto scaling may trigger limits of other services (e.g., default number of EC2 you can launch within 1 region is 20)
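
The same limits can be read programmatically. The sketch below assumes the third-party `boto3` library and configured AWS credentials for `fetch_limits`; `launch_config_headroom` is a hypothetical helper that works on the response dictionary, and the numbers in the usage note are made up.

```python
def fetch_limits():
    """Call DescribeAccountLimits (same data as the CLI command above)."""
    import boto3  # third-party; requires AWS credentials to be configured

    return boto3.client("autoscaling").describe_account_limits()


def launch_config_headroom(limits):
    """How many more launch configurations can be created in this region."""
    return (
        limits["MaxNumberOfLaunchConfigurations"]
        - limits["NumberOfLaunchConfigurations"]
    )
```

For a response reporting a maximum of 100 launch configurations with 37 in use, `launch_config_headroom` returns 63.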
Auto Scaling Group (ASG)
  • Description
    • Collection of EC2 instances managed by the auto scaling service
    • Each ASG contains config options that control when AS should launch/terminate instances
    • Each ASG contains:
      • name
      • min/max #instances
      • desired capacity (optional) → default = min specified
  • Instance Types (1 launch config can't reference both)
    • ON-DEMAND: default
    • SPOT: used by referencing a MAX BID PRICE in the launch config
Scaling Policy
  • Description
    • Associate CloudWatch alarms & scaling policies with an ASG to adjust AS dynamically
    • When threshold is crossed, CW sends alarms to trigger changes to number of EC2 behind an ELB
    • Each ASG can contain 1+ policies
  • Ways to configure scaling policy
    • increase/decrease by specific number of EC2
    • target specific number of EC2
    • adjust based on a %
    • scale by steps
  • Best Practice
    • Scale OUT quickly
    • Scale IN slowly
    • So you can respond to bursts without inadvertently terminating EC2 too quickly
  • Cooldown
    • Period during which AS suspends further scaling activities for an ASG, so earlier changes can take effect
  • Costs
    • If you start an EC2 instance → billed for 1 full hour
    • Partial instance hours → billed as full hours
    • BOOTSTRAPPING takes time before instance is healthy
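
The ways of configuring a scaling policy and the cooldown described above can be sketched as follows. The adjustment-type strings mirror the names used by the Auto Scaling API, but the rounding of percentage changes and the 300-second default cooldown are stated here as assumptions, and both function names are hypothetical.

```python
import math


def apply_adjustment(current, adjustment_type, value, min_size, max_size):
    """Return the new desired capacity, clamped to the ASG's min/max."""
    if adjustment_type == "ChangeInCapacity":
        target = current + value  # add or remove a specific number
    elif adjustment_type == "ExactCapacity":
        target = value  # jump to a specific number
    elif adjustment_type == "PercentChangeInCapacity":
        delta = current * value / 100
        # Assumption: a non-zero change moves capacity by at least 1 instance,
        # and larger changes are truncated toward zero.
        if delta != 0 and abs(delta) < 1:
            delta = math.copysign(1, delta)
        target = current + math.trunc(delta)
    else:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    return max(min_size, min(max_size, target))


def in_cooldown(last_activity_at, now, cooldown_s=300):
    """True while further scaling activities should stay suspended."""
    return now - last_activity_at < cooldown_s
```

Pairing a large scale-out adjustment with a small scale-in adjustment is one way to follow the "scale out quickly, scale in slowly" practice noted above.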