[AWS] Auto Scaling

Here is the confusing part. AWS provides multiple auto scaling services: notably ASW Auto Scaling and EC2 Auto Scaling.

AWS Auto Scaling lets you configure and manage scaling for your scalable AWS resources through a scaling plan.
EC2 Auto Scaling is an AWS service that automatically increases or decreases the number of on-demand instances based on chosen CloudWatch metrics.

In short, AWS Auto Scaling is an extension to EC2 Auto Scaling and scales a collection of related resources. You need to create a scaling plan, which is a collection of scaling instructions for multiple AWS resources.

You can use EC2 Auto Scaling when you scale only EC2 instances.

AWS Auto Scaling

You should use AWS Auto Scaling to manage scaling for multiple resources across multiple services.

You can include existing EC2 Auto Scaling groups to an AWS auto scaling plan.
To scale resources other than EC2, use the Application Auto Scaling API, which allows you to define scaling policies to scale your AWS resources automatically.

Uses Cases of AWS Auto Scaling
- Auto scaling EC2 instances in an auto scale group
- Spot Fleet requests
- ECS services
- Aurora read replicas
- DynamoDB tables and global secondary indexes

EC2 Auto Scaling

Auto Scaling Components

You need to create an auto scaling group to specify what to scale, such as web servers or application servers.
An auto scaling group uses the scaling options to determine how to scale based on the specified conditions (a dynamic scale) or on the schedule.
An auto scaling group uses a launch template (or configuration) to launch a new EC2 instance.

Launch Templates

A Launch Template is an instruction on how to create a new instance.
- AMI, Instance type, Storage, Key pair, IAM role, User data, Purchase option, Security groups
It includes Tagging and more advance instance/purchasing options.
Launch Templates cannot be edited after creation.
- A new configuration should be created with a new version.
Advanced Settings
- Termination Protection:
  - Once it is enabled, an instance cannot be terminated using Console, API, or CLI until the setting is disabled.
- Instance Auto-recovery

Auto Scaling Groups

Auto Scaling Group is a logical grouping of EC2 instances for scaling and management.

You can configure the AGS (Auto Scaling Group) like this:

Launch Template
- Pick your Launch Template
- Select the version
Networking and Purchasing options
- Select the VPC and Subnets
  - can be configured to use multi-AZs to improve high availability.
- Instance Purchase Options
  - Instance distribution (% On-demand & % Spot) – Spot Fleet
- Override the instance type requirements (from the launch template)
  - vCPUs, Memory, …
Load Balancing and Health Check
- Attach the load balancer if needed
- VPC Lattice Integration Options
  - VPC Lattice facilitates communications between AWS services and helps you connect and manage your applications across compute services in AWS.
- Select the Health Check Type
  - EC2 health check is always enabled
  - Optionally, you can enable ELB health check, or VPC Lattice health check.
  - Health Check Grace period: delays the first health check until the instance finishes initialization. (300 seconds by default)
- When the health check fails, ASG terminates the instance and launch a new instance.
  - Note that ASG does not keep the unhealthy instance.
Monitoring
- Enable “group metrics collection” within CloudWatch
- You can enable “default instance warmup” time.
  - The amount of time that CloudWatch metrics for new instances do not contribute to the group’s aggregated instance metrics, as their usage data is not reliable yet.
Auto Scaling Group Size
- Set the initial size of ASG – Desired Capcity
Scaling Options
- Scaling Limits
  - Min Capacity
  - Max Capacity
- Automatic Scaling
  - No scaling policies (Manual)
    - The ASG remains at the initial size.
  - Target tracking scaling policy
    - based on the metrics (such as CPU Usage, Network In/Out, ALB Request Count)
Notification
- Send notifications to SNS topics

Metrics

The following metrics can be used to scale resources:

CPUUtilization
RequestCountPerTarget
Average Network In/Out

EC2 Auto Scaling Options

Once you created the ASG, you can fine-tune the scaling policies.

Manual Scaling

It is the most basic option, and you update the options manually
minimum
- The lowest number of EC2 instances that are running
- At least 2 for high availability
maximum
- The highest number of EC2 instances that are running
- You will never have more instances than this number.
desired
- The number of instances you want right now.

Automatic Scaling – Dynamic Scaling Policy

	Target tracking scaling	Step scaling	Simple scaling
Feature	the target value for a specific metric	a set of scaling adjustments, known as step adjustments	a single scale adjustment
Triggers	Metric type & Target Value	Cloud Watch Alarms	Cloud Watch Alarm
Actions		Steps as a set of actions: Add, Remove, or Set to	A single action: Add, Remove, or Set to
Example	Keep the average CUP utilization at 70%	When a CloudWatch alarm (CPU < 30%) triggers, set to 1 instance When a CloudWatch alarm (CPU > 30%) triggers, add 1 instance When a CloudWatch alarm (CPU > 70%) triggers, add 2 instances	When a CloudWatch alarm (CPU > 70%) triggers, add 1 instance
Cooldown Period	NOT applied	NOT applied	Applied

Dynamic Scaling Policy Types

[Note] The cooldown period (300 seconds default) is a setting that ensures not to launch or terminate additional instances during the cooldown period when the scaling action happens. It applies to Simple scaling but not for target tracking, step, or scheduled scaling.

Automatic Scaling – Predictive Scaling Policy

It looks at historic traffic patterns and forecasts them into the future to schedule changes.

Scheduled Scaling

The scaling is done automatically based on specified time and date.
You create a scheduled action, which performs a scaling action at specified times. To create a scheduled scaling action, you specify the start time when the scaling action should take effect, and the new minimum, maximum, and desired sizes for the scaling action.
Recurrence
- Once, Cron, or Every (5 mins, 30 mins, …)
It is useful when the workload is predictable.

Auto Scaling Health Checks

Health Checks identify any instances that are not healthy
- EC2 status checks (default)
- ELB health checks
- Custom health checks
Unhealthy instances are terminated and new instances are created based on the health checks.
Auto Scaling can send SNS notifications when scaling occurs.

Achieving Highly Available and Fault-tolerant Architecture

Deploy instances in different AZs
To achieve fault-tolerance, you need to provision redundant resources, which entails an extra cost.

EC2 Auto Scaling lifecycle hooks

Lifecycle hooks are used to perform custom actions by pausing instances when an Auto Scaling Group launches or terminates them.

Lifecycle
- autoscaling:EC2_Instance_Launching: before an instance is in service
  1. Pending
  2. Pending:Wait
  3. Pending:Proceed
  4. InService
- autoscaling:EC2_Instance_Terminating: before an instance is terminated
  1. Terminating
  2. Terminating:Wait
  3. Terminating:Proceed
  4. Terminated
When an instance is paused, it remains in a wait state until the lifecycle action is completed.
By default, an action is completed when the timeout period ends. But you can complete the action with complete-lifecycle-action command.
Lifecycle hooks can be used to install or configure software on instances before they start receiving traffic or do some cleanup actions before instances are terminated.
You can use EventBridge rules to receive the event.

{
  "version": "0",
  "id": "468fe059-f4b7-445f-bb22-2a271b94974d",
  "detail-type": "EC2 Instance-terminate Lifecycle Action",
  "source": "aws.autoscaling",
  "account": "123456789012",
  "time": "2015-12-22T18:43:48Z",
  "region": "us-east-1",
  "resources": ["arn:aws:autoscaling:<region>:<account>:autoScalingGroup:..."],
  "detail": {
    "LifecycleActionToken": "630aa23f-48eb-45e7-aba6-799ea6093a0f",
    "AutoScalingGroupName": "sampleASG",
    "LifecycleHookName": "SampleLifecycleHook-6789",
    "EC2InstanceId": "i-12345678",
    "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING"
  }
}

EC2 Auto Scaling SNS Notification

ASG can send a notification via SNS for the following events:

autoscaling:EC2_INSTANCE_LAUNCH
autoscaling:EC2_INSTANCE_LAUNCH_ERROR
autoscaling:EC2_INSTANCE_TERMINATE
autoscaling:EC2_INSTANCE_TERMINATE_ERROR

EC2 Auto Scaling Termination Policies

You can determine which instance is terminated first when the ASG scales down or the instances are refreshed. You can combine policies with the evaluation order.

Default Termination
1. Select an AZ with the largest number of instances
2. Terminate the instance with the oldest Launch Template
3. If the instances are still in the same launch table, terminate the instance that is closest to the next billing hour
AllocationStrategy
- e.x) lowest price for Spot Instances
OldestLaunchTemplate
CloestToNextInstanceHour
NewestInstance
OldestInstance
Custom – backed by a Lambda function

EC2 Auto Scale Warm Pools

You can reduce the latency of launching new instances using ASG warm pools.

Instances are pre-initialized.
After creating instances in the warm pool, you can move the instance to the stopped or hibernated state for cost saving.
Settings
- Minimum Warm Pool size
- Max Prepared Capacity (MAX capacity of ASG by default)
- Warm Pool instance State
  - Running
  - Stopped
  - Hibernated

	Running	Stopped	Hibernated
Scale out	Fastest – immediate	Slow – need to be started	Medium – need to be waked, applications are already in memory
Cost	High – paying all the time	Low – paying the attached volume only	Low – paying the attached volume only (might need a bigger volume)

Instance Reuse Policy
- When the scale-in event occurs, the active instance might be moved back to the warm pool.

Warm pool Lifecycle Hooks
1. Warmed:Pending
2. Warmed:Pending:Wait
3. Warmed:Pending:Proceed
4. Warmed:Running, Warmed:Stopped, Warmed:Hibernated

Task – Recreate all EC2 Instances with a new AMI

Create a new Launch Template with a new AMI
Set the minimum healthy percentage (such as 60%)
Call the API StartInstanceRefresh
Old instances are destroyed and new instances are created.

[AWS] Auto Scaling

AWS Auto Scaling