Here is the confusing part. AWS provides multiple auto scaling services, notably AWS Auto Scaling and EC2 Auto Scaling.
- AWS Auto Scaling lets you configure and manage scaling for your scalable AWS resources through a scaling plan.
- EC2 Auto Scaling is an AWS service that automatically increases or decreases the number of EC2 instances based on chosen CloudWatch metrics.
In short, AWS Auto Scaling is an extension to EC2 Auto Scaling and scales a collection of related resources. You need to create a scaling plan, which is a collection of scaling instructions for multiple AWS resources.
- You can use EC2 Auto Scaling when you scale only EC2 instances.
AWS Auto Scaling
- You should use AWS Auto Scaling to manage scaling for multiple resources across multiple services.
- You can include existing EC2 Auto Scaling groups in an AWS Auto Scaling plan.
- To scale resources other than EC2, use the Application Auto Scaling API, which allows you to define scaling policies to scale your AWS resources automatically.
- Use Cases of AWS Auto Scaling
- Auto scaling EC2 instances in an Auto Scaling group
- Spot Fleet requests
- ECS services
- Aurora read replicas
- DynamoDB tables and global secondary indexes
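To make the Application Auto Scaling route for non-EC2 resources a little more concrete, here is a minimal sketch that only builds the keyword arguments accepted by the `register_scalable_target` call of boto3's `application-autoscaling` client. It does not contact AWS. The helper name, the table name, and the capacity bounds are illustrative assumptions, not values from these notes.

```python
def dynamodb_read_target(table_name: str, min_rcu: int, max_rcu: int) -> dict:
    """Build kwargs for Application Auto Scaling's register_scalable_target.

    The table name and capacity limits are hypothetical example inputs.
    """
    if min_rcu > max_rcu:
        raise ValueError("min_rcu must not exceed max_rcu")
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "MinCapacity": min_rcu,
        "MaxCapacity": max_rcu,
    }


if __name__ == "__main__":
    kwargs = dynamodb_read_target("orders", 5, 100)
    print(kwargs)
    # With credentials configured, the dict could be passed on like this:
    #   import boto3
    #   boto3.client("application-autoscaling").register_scalable_target(**kwargs)
```

Keeping the request as a plain dict makes it easy to inspect before anything is sent to AWS.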
EC2 Auto Scaling
Auto Scaling Components
- You need to create an auto scaling group to specify what to scale, such as web servers or applications.
- An auto scaling group uses the scaling options to determine how to scale, either based on specified conditions (dynamic scaling) or based on a schedule.
- An auto scaling group uses a launch template (or configuration) to launch a new EC2 instance.
Launch Templates
- A Launch Template is a set of instructions for creating a new instance.
- AMI, Instance type, Storage, Key pair, IAM role, User data, Purchase option, Security groups
- It supports versioning.
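- It includes Tagging and more advanced instance/purchasing options.
- An existing Launch Template version cannot be edited after creation. To change settings, create a new version. (Launch Configurations, by contrast, cannot be edited at all and must be replaced with a new configuration.)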
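The settings listed above map onto the `LaunchTemplateData` structure that EC2's `create_launch_template` call takes. The sketch below only assembles that structure and makes no AWS call. The helper name and all example values (AMI ID, key pair, security group, instance profile) are placeholders chosen for illustration.

```python
import base64


def launch_template_data(ami_id, instance_type, key_name,
                         security_group_ids, instance_profile_name,
                         user_data_script):
    """Assemble a LaunchTemplateData dict; all inputs are caller-supplied."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": list(security_group_ids),
        # The IAM role is attached through an instance profile.
        "IamInstanceProfile": {"Name": instance_profile_name},
        # The launch template API expects user data as base64-encoded text.
        "UserData": base64.b64encode(
            user_data_script.encode("utf-8")
        ).decode("ascii"),
    }


if __name__ == "__main__":
    data = launch_template_data(
        ami_id="ami-00000000000000000",       # placeholder AMI ID
        instance_type="t3.micro",
        key_name="example-key",               # placeholder key pair
        security_group_ids=["sg-00000000"],   # placeholder security group
        instance_profile_name="example-profile",
        user_data_script="#!/bin/bash\necho hello\n",
    )
    print(data)
    # Possible use with boto3 (requires credentials):
    #   boto3.client("ec2").create_launch_template(
    #       LaunchTemplateName="web", LaunchTemplateData=data)
```

Because templates are versioned, a changed dict would normally go to `create_launch_template_version` rather than replacing the template.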
Auto Scaling Groups
An Auto Scaling Group (ASG) is a logical grouping of EC2 instances for scaling and management.
You can configure the ASG like this:
- Launch Template
- Pick your Launch Template and select the version
- Networking and Purchasing options
- Select VPC and Subnets
- Can be configured to span multiple AZs to improve availability.
- Override the instance type requirements
- Load Balancing and Health Check
- Attach the load balancer
- Select the Health Check Type: EC2 or ELB
- Monitoring
- Enable group metrics collection within CloudWatch
- Scaling Policy
- Manual: specify its minimum, maximum, and desired number of EC2 instances.
- Dynamic: based on the metrics such as CPU Usage
- Scheduled
- Notification
- Send notifications to SNS topics
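The configuration steps above correspond roughly to the parameters of the `create_auto_scaling_group` API call. The following sketch builds those parameters as a dict without calling AWS. The function name and every example value (group name, subnet IDs, template name) are assumptions made for illustration.

```python
def asg_request(name, template_name, template_version, subnet_ids,
                min_size, max_size, desired, target_group_arns=()):
    """Build kwargs for create_auto_scaling_group (no AWS call is made)."""
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min and max")
    kwargs = {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": template_name,
            "Version": template_version,  # e.g. "1" or "$Latest"
        },
        # Subnets in different AZs are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "HealthCheckType": "ELB" if target_group_arns else "EC2",
    }
    if target_group_arns:
        kwargs["TargetGroupARNs"] = list(target_group_arns)
        # Give new instances time to boot before ELB checks count.
        kwargs["HealthCheckGracePeriod"] = 300
    return kwargs


if __name__ == "__main__":
    request = asg_request(
        name="web-asg",
        template_name="web",
        template_version="$Latest",
        subnet_ids=["subnet-aaa", "subnet-bbb"],  # placeholder subnet IDs
        min_size=2, max_size=6, desired=2,
    )
    print(request)
    # Possible use: boto3.client("autoscaling").create_auto_scaling_group(**request)
```

Group metrics collection and SNS notifications are configured through separate calls and are left out of this sketch.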
EC2 Auto Scaling Options
- Scale to maintain current instance levels at all times
- Through health checks, ASGs terminate and relaunch instances to maintain the same number of running instances.
- You can set the same value for maximum, minimum, and desired capacity.
- Manual Scaling
- It is the most basic option; you update the values manually.
- minimum
- The lowest number of EC2 instances that are running
- At least 2 for high availability
- maximum
- The highest number of EC2 instances that are running
- You will never have more instances than this number.
- desired
- The number of instances you want right now.
- Scheduled Scaling
- The scaling is done automatically based on specified time and date.
- You create a scheduled action, which performs a scaling action at specified times. To create a scheduled scaling action, you specify the start time when the scaling action should take effect, and the new minimum, maximum, and desired sizes for the scaling action.
- Recurrence: Once, Cron, or Every—.
- It is useful when the workload is predictable.
- Reactive Scaling (Dynamic Scaling)
- Based on demand
- You need to specify parameters or conditions (scaling policies) that control the scaling process. An example is CPU Utilization.
- It is useful when the workload is not predictable.
- Predictive Scaling
- It looks at historic traffic patterns and forecasts them into the future to schedule changes.
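A toy model can help show how minimum, maximum, and desired capacity interact. The sketch below is not the AWS API. It assumes a hypothetical `Group` class in which any capacity requested by a scaling policy is bounded by the current minimum and maximum, and in which a scheduled action replaces all three values at once.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Simplified stand-in for an Auto Scaling group's size settings."""
    minimum: int
    maximum: int
    desired: int

    def clamp(self, requested: int) -> int:
        # Never go below the minimum or above the maximum.
        return max(self.minimum, min(self.maximum, requested))

    def scale_to(self, requested: int) -> int:
        """Apply a requested capacity, bounded by min and max."""
        self.desired = self.clamp(requested)
        return self.desired

    def apply_scheduled_action(self, minimum: int, maximum: int,
                               desired: int) -> None:
        """A scheduled action sets new min, max, and desired sizes."""
        self.minimum, self.maximum = minimum, maximum
        self.desired = self.clamp(desired)


if __name__ == "__main__":
    group = Group(minimum=2, maximum=6, desired=2)
    print(group.scale_to(10))   # request exceeds the maximum
    group.apply_scheduled_action(minimum=4, maximum=12, desired=8)
    print(group)
```

Setting minimum, maximum, and desired to the same number in this model reproduces the "maintain current instance levels" option, since every request clamps to that one value.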
EC2 Auto Scaling Policy Types
- Target tracking scaling: based on the target value for a specific metric, such as keeping the average CPU utilization at a certain %. Cooldown periods are NOT applied.
- Step scaling: based on a set of scaling adjustments, known as step adjustments. Cooldown periods are NOT applied.
- Simple scaling: based on a single scale adjustment. Cooldown periods are applied.
[Note] The cooldown period is a setting that prevents the group from launching or terminating more instances before the previous scaling activity has taken effect. It applies to simple scaling but not to target tracking, step, or scheduled scaling.
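The difference between step scaling and simple scaling with a cooldown can be sketched in plain Python. This is a simplified model under stated assumptions, not AWS code: the function and class names are invented, step bounds are expressed as offsets from the alarm threshold (lower bound inclusive, upper bound exclusive), and time is a plain number of seconds.

```python
def pick_step_adjustment(metric, threshold, steps):
    """Return the capacity change for the step matching the breach size.

    steps is a list of (lower_offset, upper_offset_or_None, adjustment).
    """
    breach = metric - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # no step matched, so no change


class SimpleScaler:
    """Simple scaling: one fixed adjustment, gated by a cooldown."""

    def __init__(self, cooldown_seconds):
        self.cooldown_seconds = cooldown_seconds
        self.last_activity = None

    def try_scale(self, now, adjustment):
        in_cooldown = (
            self.last_activity is not None
            and now - self.last_activity < self.cooldown_seconds
        )
        if in_cooldown:
            return 0  # ignore the alarm until the cooldown has passed
        self.last_activity = now
        return adjustment


if __name__ == "__main__":
    steps = [(0, 10, 1), (10, None, 3)]  # hypothetical step adjustments
    print(pick_step_adjustment(75, 60, steps))
    scaler = SimpleScaler(cooldown_seconds=300)
    print(scaler.try_scale(0, 1), scaler.try_scale(100, 1))
```

In this model a larger breach selects a larger step immediately, while the simple scaler drops any request that arrives inside the cooldown window.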
Auto Scaling Health Checks
- Health Checks identify any instances that are not healthy
- EC2 status checks (default)
- ELB health checks
- Custom health checks
- Unhealthy instances are terminated and recreated based on the health checks.
- Auto Scaling can send SNS notifications when scaling occurs.
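The terminate-and-replace behaviour can be illustrated with a small simulation. It is only a sketch: the function name, the instance IDs, and the `launch` callback are hypothetical, and real health checks involve grace periods and status polling that are ignored here.

```python
from itertools import count


def replace_unhealthy(instances, desired, launch):
    """Drop unhealthy instances and launch replacements up to `desired`.

    instances maps instance ID -> True (healthy) or False (unhealthy).
    launch is a callable returning the ID of a newly started instance.
    Returns (new_instances, terminated_ids).
    """
    terminated = [iid for iid, healthy in instances.items() if not healthy]
    running = {iid: True for iid, healthy in instances.items() if healthy}
    while len(running) < desired:
        running[launch()] = True
    return running, terminated


if __name__ == "__main__":
    ids = count(100)
    fleet = {"i-1": True, "i-2": False, "i-3": True}
    new_fleet, gone = replace_unhealthy(
        fleet, desired=3, launch=lambda: f"i-{next(ids)}"
    )
    print(new_fleet, gone)
```

The group ends up with the same number of running instances as before, which is the point of the health check loop.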
Achieving Highly Available and Fault-tolerant Architecture
- Deploy instances in different AZs
- To achieve fault-tolerance, you need to provision redundant resources, which entails an extra cost.
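The cost of redundancy can be worked out with simple arithmetic. The sketch below assumes instances are spread as evenly as possible across the selected AZs, which is what EC2 Auto Scaling aims for, and then computes how many instances survive the loss of the largest AZ. The function names and zone names are illustrative.

```python
def spread_across_azs(desired, zones):
    """Distribute `desired` instances as evenly as possible over zones."""
    base, extra = divmod(desired, len(zones))
    return {
        zone: base + (1 if index < extra else 0)
        for index, zone in enumerate(zones)
    }


def capacity_after_az_loss(desired, zones):
    """Instances still running if the most populated AZ fails."""
    placement = spread_across_azs(desired, zones)
    return desired - max(placement.values())


if __name__ == "__main__":
    zones = ["az-a", "az-b", "az-c"]  # placeholder zone names
    print(spread_across_azs(6, zones))
    print(capacity_after_az_loss(6, zones))
```

If the workload needs four instances at all times, this model suggests running six across three AZs, so the two extra instances are the price of tolerating one AZ failure.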
Amazon EC2 Auto Scaling lifecycle hooks
Lifecycle hooks are used to perform custom actions by pausing instances as an Auto Scaling group launches or terminates them.
- When an instance is paused, it remains in a wait state until the lifecycle action is completed.
- By default, the wait state ends when the timeout period expires (one hour unless you change it). You can end it earlier by completing the action with the complete-lifecycle-action command.
- Lifecycle hooks can be used to install or configure software on instances before they start receiving traffic or do some cleanup actions before instances are terminated.
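As a sketch of how a launch hook and its completion fit together, the helpers below build the keyword arguments for the `put_lifecycle_hook` and `complete_lifecycle_action` API calls without contacting AWS. The helper names, group name, hook name, and instance ID are assumptions for illustration.

```python
def launch_hook_request(group_name, hook_name, timeout_seconds=300):
    """Kwargs for put_lifecycle_hook: pause instances while they launch."""
    return {
        "AutoScalingGroupName": group_name,
        "LifecycleHookName": hook_name,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        "HeartbeatTimeout": timeout_seconds,
        # Result applied if the timeout expires without a completion call.
        "DefaultResult": "ABANDON",
    }


def complete_request(group_name, hook_name, instance_id, success=True):
    """Kwargs for complete_lifecycle_action, ending the wait state early."""
    return {
        "AutoScalingGroupName": group_name,
        "LifecycleHookName": hook_name,
        "InstanceId": instance_id,
        "LifecycleActionResult": "CONTINUE" if success else "ABANDON",
    }


if __name__ == "__main__":
    hook = launch_hook_request("web-asg", "install-software")
    done = complete_request("web-asg", "install-software", "i-0123456789")
    print(hook)
    print(done)
    # Possible use with boto3 (requires credentials):
    #   client = boto3.client("autoscaling")
    #   client.put_lifecycle_hook(**hook)
    #   client.complete_lifecycle_action(**done)
```

A setup script on the instance would typically send the completion request once software installation has finished, passing `success=False` if it failed.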