ELB & ASG
- Scalability:
- Vertical Scalability: increasing the size of an instance, e.g. t2.micro -> t2.large (common for non-distributed systems such as a database)
- Horizontal Scalability: increasing the number of instances (common for distributed systems such as web / modern applications)
- High Availability: running the system in at least 2 data centers (AZs) to survive the loss of a data center
Load Balancer: servers that forward traffic to multiple downstream servers, such as EC2 instances
- spread load across multiple downstream instances
- expose a single point of access (DNS) to your application
- seamlessly handle failures of downstream instances
- do regular health checks on instances
- provide SSL termination(HTTPS) for your websites
- enforce stickiness with cookies
- high availability across zones
- separate public traffic from private traffic
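The first few points above (spreading load, health checks, handling failures) can be illustrated with a toy round-robin balancer. This is only a sketch: the class name, the `mark` method and the instance IDs are made up for illustration, and a real ELB uses more sophisticated routing algorithms.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over targets and skips unhealthy ones."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)
        self._next = 0

    def mark(self, target, is_healthy):
        """Record the result of a (hypothetical) health check."""
        if is_healthy:
            self.healthy.add(target)
        else:
            self.healthy.discard(target)

    def pick(self):
        """Return the next healthy target in rotation."""
        for _ in range(len(self.targets)):
            target = self.targets[self._next]
            self._next = (self._next + 1) % len(self.targets)
            if target in self.healthy:
                return target
        raise RuntimeError("no healthy targets")
```

Marking a target unhealthy removes it from rotation until a later check marks it healthy again, so clients never notice the failed instance.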
4 Types of Managed Load Balancers:
- Classic Load Balancer(old generation): HTTP, HTTPS, TCP, SSL(secure TCP)
- Application Load Balancer: HTTP, HTTPS, WebSocket
- Network Load Balancer: TCP, TLS(secure TCP), UDP
- Gateway Load Balancer - GWLB: operates at layer 3 (network layer) - IP Protocol
1. ELB (Elastic Load Balancer)
- an ELB is a managed load balancer: AWS guarantees that it works and takes care of upgrades, maintenance and high availability
- an ELB costs less effort than setting up and maintaining our own load balancer
- ELB is integrated with many AWS offerings / services
- health checks let the load balancer know whether the instances it forwards traffic to are available to reply to requests
- Only the Network Load Balancer provides both a static DNS name and a static IP; the Application Load Balancer provides a static DNS name but NOT a static IP.
2. CLB (Classic Load Balancer)
- supports TCP(layer 4), HTTP & HTTPS(layer 7)
- health checks are TCP or HTTP based
- fixed hostname
3. ALB (Application Load Balancer)
- layer 7 (HTTP)
- load balancing to multiple HTTP applications across machines
- load balancing to multiple applications on the same machine
- support for HTTP/2 and WebSocket
- support Redirect (e.g. from HTTP to HTTPS)
- routing tables to different target groups:
- routing based on path in URL (example.com/users & example.com/posts)
- routing based on hostname in URL (one.example.com & other.example.com)
- routing based on query string, headers (example.com/users?id=123&order=false)
- ALB is a great fit for microservices & container-based applications
- has a port mapping feature to redirect to a dynamic port in ECS
- in comparison, we would need one Classic Load Balancer per application
- Target Groups: ALB can route to multiple Target Groups, and Health Checks are at the Target Group level
- EC2 instance - HTTP
- ECS tasks - HTTP
- Lambda functions
- IP address - must be private IPs
- fixed hostname
- the application servers do not see the client's IP directly; it is passed in request headers:
- client IP -> X-Forwarded-For
- port -> X-Forwarded-Port
- proto -> X-Forwarded-Proto
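A minimal sketch of how an application behind an ALB might read these headers. The function name `client_info` and the plain-dict `headers` argument are assumptions for illustration; a real web framework exposes headers through its own request object.

```python
def client_info(headers):
    """Return (client_ip, port, proto) from the X-Forwarded-* headers."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {key.lower(): value for key, value in headers.items()}

    # X-Forwarded-For may hold a comma-separated chain of addresses;
    # the leftmost entry is the one reported for the original client.
    forwarded_for = normalized.get("x-forwarded-for", "")
    client_ip = forwarded_for.split(",")[0].strip() or None

    port = normalized.get("x-forwarded-port")
    proto = normalized.get("x-forwarded-proto")
    return client_ip, int(port) if port else None, proto


print(client_info({
    "X-Forwarded-For": "203.0.113.7, 10.0.1.5",
    "X-Forwarded-Port": "443",
    "X-Forwarded-Proto": "https",
}))
```

Because a client can send its own X-Forwarded-For value, the leftmost entry should not be trusted for security decisions without further checks.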
4. NLB (Network Load Balancer)
- layer 4 (TCP & UDP)
- handles millions of requests per second with low latency
- has one static IP per AZ and supports assigning an Elastic IP (helpful for whitelisting specific IPs)
- used for extreme performance, TCP or UDP traffic
- not included in the AWS free tier
- Target Groups:
- EC2 Instances
- IP Addresses - must be private IPs
- ALB
5. GWLB (Gateway Load Balancer)
- deploy, scale and manage a fleet of 3rd party network virtual appliances in AWS, like firewalls, intrusion detection systems, etc.
- layer 3 (Network Layer) - IP Packets
- two functions:
- Transparent Network Gateway: single entry/exit for all traffic
- Load Balancer: distributes traffic to your virtual appliances
- uses the GENEVE protocol on port 6081
- Target Groups:
- EC2 Instances
- IP Addresses - must be private IPs
6. Sticky Sessions (Session Affinity)
- stickiness means the same client is always redirected to the same instance behind a load balancer
- cookie-based stickiness works for CLB & ALB (NLB stickiness is based on source IP, without cookies)
- the “cookie” used for stickiness has an expiration date
- use case: make sure the user doesn’t lose session data
- may cause load imbalance across the backend EC2 instances
- Cookie Names:
- Application-based Cookies:
- Custom Cookie
- generated by target
- can include any custom attributes required by the application
- cookie name must be specified individually for each target group
- don’t use AWSALB, AWSALBAPP, AWSALBTG(reserved for use by the ELB)
- Application Cookie:
- generated by the load balancer
- cookie name is AWSALBAPP
- Duration-based Cookies:
- cookie generated by the load balancer
- cookie name is AWSALB for ALB, AWSELB for CLB
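Duration-based stickiness can be sketched as a cookie that pins a client to one target until it expires. Everything here is illustrative: `StickyRouter`, the tuple-shaped cookie and the timestamps are assumptions, whereas real ELB cookies (AWSALB, AWSELB) are opaque values managed by the load balancer.

```python
class StickyRouter:
    """Toy duration-based stickiness: a cookie pins a client to one target."""

    def __init__(self, targets, duration_seconds):
        self.targets = list(targets)
        self.duration = duration_seconds
        self._next = 0

    def route(self, cookie, now):
        """Return (target, cookie); cookie is (target, expires_at) or None."""
        if cookie is not None:
            target, expires_at = cookie
            if now < expires_at and target in self.targets:
                # Cookie still valid: keep sending the client to the same target.
                return target, cookie
        # No cookie, or it expired: pick the next target and issue a new cookie.
        target = self.targets[self._next % len(self.targets)]
        self._next += 1
        return target, (target, now + self.duration)
```

A client holding a valid cookie always lands on the same target, which is also why stickiness can leave some instances busier than others.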
7. Cross-Zone Load Balancing
- with cross-zone load balancing, each load balancer node distributes traffic evenly across all registered instances in all AZs
- 3 types:
- Application Load Balancer: enabled by default (can be disabled at the target group level) & no charges for inter-AZ data
- Network Load Balancer: disabled by default & inter-AZ data is charged if enabled
- Classic Load Balancer: disabled by default & no charges for inter AZ data if enabled
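The effect of cross-zone load balancing is easiest to see with numbers. The sketch below assumes one load balancer node per AZ and client traffic split evenly between those nodes; the function name and AZ labels are made up for illustration.

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return {az: fraction of total traffic received by EACH instance in that AZ}."""
    # Only AZs that actually have registered instances take part.
    azs = [az for az, count in instances_per_az.items() if count > 0]
    total_instances = sum(instances_per_az[az] for az in azs)

    shares = {}
    for az in azs:
        if cross_zone:
            # Every node spreads its traffic over all instances in all AZs.
            shares[az] = 1 / total_instances
        else:
            # Each node keeps its 1/len(azs) of the traffic inside its own AZ.
            shares[az] = (1 / len(azs)) / instances_per_az[az]
    return shares


print(per_instance_share({"az-a": 2, "az-b": 8}, cross_zone=True))
print(per_instance_share({"az-a": 2, "az-b": 8}, cross_zone=False))
```

With 2 instances in one AZ and 8 in the other, every instance gets 10% when cross-zone is on; when it is off, the 2 instances get 25% each while the 8 instances get 6.25% each.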
8. SSL & TLS
- an SSL Certificate allows traffic between your clients and your load balancer to be encrypted in transit(in-flight encryption)
- SSL (Secure Sockets Layer): used to encrypt connections
- TLS (Transport Layer Security): the newer version of SSL
- public SSL certificates are issued by CAs (Certificate Authorities)
- SSL certificates have an expiration date (set by you) and must be renewed
- For Load Balancers:
- CLB: supports only one SSL certificate; multiple hostnames with multiple SSL certificates require multiple CLBs
- ALB & NLB: support multiple listeners with multiple SSL certificates, using SNI to make it work
8.1 SNI (Server Name Indication)
- solves the problem of loading multiple SSL certificates onto one web server (to serve multiple websites)
- requires the client to indicate the hostname of the target server in the initial SSL handshake
- the server will find the correct certificate, or return the default one
- only works for ALB, NLB and CloudFront; does not work for CLB
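The certificate lookup that SNI enables can be sketched as a simple mapping from the hostname sent by the client to a certificate, with a default as fallback. The function name, hostnames and certificate labels are hypothetical; a real load balancer does this inside the TLS handshake.

```python
def select_certificate(server_name, certificates, default):
    """Pick the certificate for the hostname the client sent in the handshake."""
    if server_name is None:
        # Client did not send SNI at all: only the default can be served.
        return default
    # Hostnames are case-insensitive and may carry a trailing dot.
    name = server_name.lower().rstrip(".")
    return certificates.get(name, default)


certs = {
    "www.mycorp.com": "cert-mycorp",
    "domain1.example.com": "cert-domain1",
}
print(select_certificate("www.mycorp.com", certs, "cert-default"))
print(select_certificate("unknown.org", certs, "cert-default"))
```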
9. Connection Draining
- Feature Naming:
- Connection Draining - for CLB
- Deregistration Delay - for ALB & NLB
- time given to complete “in-flight” requests while the instance is de-registering or unhealthy
- the load balancer stops sending new requests to the EC2 instance that is de-registering
- set to a low value if your requests are short
10. Auto Scaling Groups
- Scale Out -> adding instances
- Scale In -> removing instances
- attributes:
- a launch configuration:
- AMI + Instance Type
- EC2 User Data
- EBS Volumes
- Security Groups
- SSH Key Pair
- Min Size / Max Size / Initial Capacity
- Network + Subnet Information
- Load Balancer Information
- Scaling Policies: defines what will trigger a scale in and a scale out
- scaling policies can be on CPU, Network and custom metrics or on a schedule
- ASGs use launch configurations or launch templates; to update an ASG, you must provide a new launch configuration or template
- IAM roles attached to an ASG will get assigned to EC2 instances
- ASGs are free, but you pay for the underlying resources being launched
10.1 Scaling Policies
- Dynamic Policies:
- Target Tracking Scaling: the simplest to set up, e.g. keep the average ASG CPU around 40%
- Simple / Step Scaling: when a CloudWatch alarm is triggered (e.g. CPU > 70%), add 2 units
- Scheduled Actions: scaling based on known usage patterns
- Predictive Scaling: continuously forecast load and schedule scaling ahead
- Good Metrics:
- CPUUtilization: average CPU utilization across your instances
- RequestCountPerTarget: to make sure the number of requests per EC2 instance is stable
- Average Network In / Out: if your application is network bound
- Any Custom Metric: that you push using CloudWatch
- Scaling Cooldown:
- after a scaling activity happens, you’re in a cooldown period (default 300 seconds)
- during the cooldown period, the ASG will not launch or terminate additional instances to allow for metrics to stabilize
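A rough sketch of how target tracking and the cooldown interact. The proportional formula below is an approximation chosen for illustration, not AWS's exact algorithm, and the function names and timestamps are assumptions.

```python
import math


def desired_capacity(current, metric, target, min_size, max_size):
    """Approximate target tracking: scale capacity in proportion to metric / target."""
    raw = math.ceil(current * metric / target)
    # The ASG never goes outside its configured Min Size / Max Size.
    return max(min_size, min(max_size, raw))


def can_scale(now, last_activity, cooldown=300):
    """True once the cooldown (default 300 seconds) since the last scaling activity has passed."""
    return now - last_activity >= cooldown


# 4 instances at 80% average CPU with a 40% target -> about 8 instances.
print(desired_capacity(current=4, metric=80.0, target=40.0, min_size=1, max_size=10))
# Only 200 seconds after the previous scaling activity -> still cooling down.
print(can_scale(now=1000, last_activity=800))
```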
Route 53
- DNS: Domain Name System, which translates human-friendly hostnames into machine IP addresses
- it uses a hierarchical naming structure, e.g. .com, example.com, www.example.com
- Terminologies:
- Domain Registrar: Amazon Route 53, GoDaddy…
- DNS Records: A, AAAA, CNAME, NS…
- Zone File: contains DNS records
- Name Server: resolves DNS queries(Authoritative or Non-Authoritative)
- Root: .
- Top Level Domain(TLD): .com.
- Second Level Domain(SLD): .example.com.
- Sub Domain: .www.example.com.
- Domain Name: api.www.example.com.
- Protocol: http
- Fully Qualified Domain Name (FQDN): api.www.example.com.
- URL: http://api.www.example.com.
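The naming hierarchy can be walked programmatically. This sketch splits a URL into protocol and host name with the standard library, then lists each level of the name from the TLD down; the function names are made up for illustration.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (protocol, host name) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname


def domain_levels(fqdn):
    """List the name at each level, from the TLD down to the full name."""
    labels = fqdn.rstrip(".").split(".")
    # Build successively longer suffixes, each ending in the root dot.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


scheme, host = split_url("http://api.www.example.com.")
print(scheme)
for level in domain_levels(host):
    print(level)
```

For `api.www.example.com.` the levels come out as `com.`, `example.com.`, `www.example.com.` and finally the full name.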
1. Route 53 Overview
- Route 53 is the only AWS service which provides 100% availability SLA
- records define how you want to route traffic for a domain
- each record contains:
- Domain/ Subdomain Name: e.g., example.com
- Record Type: e.g., A or AAAA
- Value: e.g., 12.34.56.78
- Routing Policy: how Route53 responds to queries
- TTL: amount of time the record is cached at DNS resolvers
- supports the following DNS record types:
- A: maps a hostname to IPv4
- AAAA: maps a hostname to IPv6
- CNAME: maps a hostname to another hostname
- the target is a domain name which must have an A or AAAA record
- can’t create a CNAME record for the top node of a DNS namespace like example.com, but can create for www.example.com
- NS: Name Servers for the Hosted Zone, controls how traffic is routed for a domain
- Hosted Zone is a container for records that define how to route traffic to a domain and its subdomains
- Public Hosted Zone: contains records that specify how to route traffic on the Internet (public domain names)
- Private Hosted Zone: contains records that specify how to route traffic within one or more VPCs (private domain names)
- Records TTL (Time To Live): within the TTL, the client uses its cached answer and does not query the DNS server again; a TTL is mandatory for every DNS record except Alias records
- CNAME vs Alias
- CNAME:
- points a hostname to any other hostname (app.mydomain.com -> blabla.anything.com)
- only for non root domain like sth.mydomain.com
- Alias:
- points a hostname to an AWS resource (app.mydomain.com -> blabla.amazonaws.com)
- works for both root domain and non root domain
- free
- native health check
- always of type A/AAAA for AWS resources
- can’t set the TTL
- cannot set an Alias record for an EC2 DNS name
- Health Checks work only for public resources; for a private resource, create a CloudWatch Metric, associate a CloudWatch Alarm with it, then create a Health Check that checks the alarm itself
2. Routing Policy
2.1 Simple
- route traffic to a single resource
- can specify multiple values in the same record, if multiple values are returned, a random one is chosen by the client
- when alias enabled, specify only one AWS resource
- can’t be associated with health checks
2.2 Weighted
- control the percentage of the requests that go to each specific resource
- assign each record to a relative weight:
- traffic = weight for a specific record / sum of all the weights for all records
- weights don’t need to sum up to 100
- DNS records must have the same name and type
- can be associated with health checks
- assign a weight of 0 to a record to stop sending traffic to a resource
- if all records have weight of 0, then all records will be returned equally
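The weight arithmetic above can be written out directly. The function name and record names are illustrative; the all-zero branch follows the rule that records are then returned equally.

```python
def traffic_shares(weights):
    """Return the fraction of traffic each record receives, given relative weights."""
    total = sum(weights.values())
    if total == 0:
        # All weights are 0: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


# Weights do not need to sum to 100: 3 and 1 give 75% and 25%.
print(traffic_shares({"blue": 3, "green": 1}))
# A weight of 0 stops traffic to that record.
print(traffic_shares({"blue": 10, "green": 0}))
```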
2.3 Latency-based
- redirects to the resource that has the lowest latency for the user
- latency is based on traffic between user and AWS regions
- can be associated with health check (has a failover capability)
2.4 Failover
Route 53 directs traffic to the secondary EC2 instance when the primary EC2 instance fails its health check.
2.5 Geolocation
- different from latency-based
- based on user location
- should create a default record in case there’s no match on location
- use cases: website localization, restrict content distribution, load balancing…
- can be associated with health checks
2.6 Geoproximity
- based on both user and resource location
- shift more traffic to resources based on the defined bias
- to change the size of geographic region, specify bias values:
- to expand(1 to 99) - more traffic to resources
- to shrink(-1 to -99) - less traffic to resources
- resources can be:
- AWS resources: specify AWS region
- non-AWS resources: specify latitude and longitude
- must use Route53 Traffic Flow(advanced) to use this feature
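One way to picture the bias is as a shrinking or stretching of the distance to a resource. The sketch below assumes the biased-distance formula as it is described in the Route 53 developer guide (positive bias: distance * (1 - bias/100); negative bias: distance / (1 + bias/100)); treat the formula, the function names and the sample distances as assumptions to verify against the official documentation.

```python
def biased_distance(actual_distance, bias):
    """Distance after applying a geoproximity bias in the range -99..99."""
    if not -99 <= bias <= 99:
        raise ValueError("bias must be between -99 and 99")
    if bias >= 0:
        # Positive bias shrinks the distance, so the resource attracts more traffic.
        return actual_distance * (1 - bias / 100)
    # Negative bias stretches the distance, so the resource attracts less traffic.
    return actual_distance / (1 + bias / 100)


def pick_resource(distances, biases):
    """Choose the resource with the smallest biased distance to the user."""
    return min(
        distances,
        key=lambda name: biased_distance(distances[name], biases.get(name, 0)),
    )


distances = {"us-east-1": 100, "eu-west-1": 150}  # hypothetical distances to one user
print(pick_resource(distances, {}))
print(pick_resource(distances, {"eu-west-1": 50}))
```

With no bias the user is routed to the closer resource; a bias of 50 on the farther one halves its effective distance and pulls the user over.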
Traffic Flow
- visual editor to manage complex routing decision trees
- configurations can be saved as Traffic Flow Policy:
- can be applied to different Route 53 Hosted Zone (different domain names)
- supports versioning
2.7 Multi-Value
- use when routing traffic to multiple resources
- Route 53 returns multiple values / resources
- can be associated with health checks (returns only values for healthy resources)
- up to 8 healthy records are returned for each Multi-Value query
- is not a substitute for ELB
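A minimal sketch of a Multi-Value answer: filter out unhealthy records, then return at most 8 of the rest in random order. The record shape (`value`, `healthy`) and the function name are assumptions, and the case where no record is healthy is not modelled here.

```python
import random


def multivalue_answer(records, max_values=8):
    """Return up to max_values values, drawn at random from the healthy records."""
    healthy = [record["value"] for record in records if record["healthy"]]
    return random.sample(healthy, min(max_values, len(healthy)))


records = [
    {"value": "192.0.2.1", "healthy": True},
    {"value": "192.0.2.2", "healthy": False},
    {"value": "192.0.2.3", "healthy": True},
]
print(multivalue_answer(records))
```

The client still chooses among the returned values itself, which is client-side load balancing and the reason this policy does not replace an ELB.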