Here is the confusing part. AWS provides multiple auto scaling services, notably AWS Auto Scaling and EC2 Auto Scaling.
- AWS Auto Scaling lets you configure and manage scaling for your scalable AWS resources through a scaling plan.
- EC2 Auto Scaling is an AWS service that automatically increases or decreases the number of EC2 instances based on chosen CloudWatch metrics.
In short, AWS Auto Scaling is an extension to EC2 Auto Scaling and scales a collection of related resources. You need to create a scaling plan, which is a collection of scaling instructions for multiple AWS resources.
- You can use EC2 Auto Scaling when you scale only EC2 instances.
AWS Auto Scaling
- You should use AWS Auto Scaling to manage scaling for multiple resources across multiple services.
- You can include existing EC2 Auto Scaling groups in an AWS Auto Scaling plan.
- To scale resources other than EC2, use the Application Auto Scaling API, which allows you to define scaling policies to scale your AWS resources automatically.
- Use Cases of AWS Auto Scaling
- Auto scaling EC2 instances in an Auto Scaling group
- Spot Fleet requests
- ECS services
- Aurora read replicas
- DynamoDB tables and global secondary indexes
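As a rough illustration of the Application Auto Scaling route mentioned above, the sketch below builds the arguments for registering a DynamoDB table's read capacity as a scalable target. The table name and capacity limits are made up for the example. The parameter names follow boto3's `application-autoscaling` client (`register_scalable_target`), and the actual call is left as a comment because it needs AWS credentials.

```python
def dynamodb_read_target(table_name, min_capacity, max_capacity):
    """Build keyword arguments for Application Auto Scaling's RegisterScalableTarget."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


# "orders" is a hypothetical table name used only for illustration.
request = dynamodb_read_target("orders", min_capacity=5, max_capacity=100)
print(request["ResourceId"])

# With credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("application-autoscaling").register_scalable_target(**request)
```

A scaling policy would still have to be attached to the registered target before any scaling happens; this sketch only covers the registration step.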
EC2 Auto Scaling
Auto Scaling Components
- You need to create an auto scaling group to specify what to scale, such as web servers or applications.
- An auto scaling group uses the scaling options to determine how to scale, either based on specified conditions (dynamic scaling) or based on a schedule.
- An auto scaling group uses a launch template (or configuration) to launch a new EC2 instance.
Launch Configurations (Templates)
- A Launch configuration is an instruction on how to create a new instance. (Legacy option)
- AMI, Instance type, Storage, Key pair, IAM role, User data, Purchase option, Security groups
- A Launch Template is an upgraded version of a launch configuration. It adds versioning, tagging, and more advanced instance/purchasing options.
- A Launch Configuration cannot be edited after creation; you create a new configuration instead. A Launch Template version is likewise immutable, but you can create a new version of the same template.
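To make the list of launch settings more concrete, here is a minimal sketch that assembles a few of them into the `LaunchTemplateData` structure accepted by boto3's EC2 `create_launch_template`. The AMI ID, key pair name, security group ID, and user data script are placeholders, not values from these notes. User data is base64-encoded because the launch template API expects it in that form.

```python
import base64


def launch_template_data(ami_id, instance_type, key_name, security_group_ids, user_data_script):
    """Map the launch settings listed above onto LaunchTemplateData keys."""
    encoded = base64.b64encode(user_data_script.encode("utf-8")).decode("ascii")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": list(security_group_ids),
        "UserData": encoded,
    }


# All identifiers below are hypothetical placeholders.
script = "#!/bin/bash\necho hello\n"
data = launch_template_data(
    "ami-0123456789abcdef0", "t3.micro", "demo-key", ["sg-0123456789abcdef0"], script
)
print(sorted(data))

# With credentials configured, a template could be created like this:
#   import boto3
#   boto3.client("ec2").create_launch_template(
#       LaunchTemplateName="web-template", LaunchTemplateData=data)
```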
Auto Scaling Groups
- Auto Scaling Group is a logical grouping of EC2 instances for scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.
- An ASG uses a launch configuration or template to launch instances as it scales out or scales in based on metrics.
- ASGs are often paired with ELB and can span multiple AZs to improve availability.
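The relationship between minimum, maximum, and desired capacity can be sketched as a small request builder. It assumes the parameter names of boto3's `autoscaling` client (`create_auto_scaling_group`); the group name, template name, and subnet IDs are invented for the example. Listing subnets from different AZs is what spreads the group across zones.

```python
def asg_request(name, template_name, subnet_ids, min_size, max_size, desired):
    """Build keyword arguments for CreateAutoScalingGroup."""
    # The desired capacity has to sit between the minimum and the maximum.
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must be between min_size and max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnet IDs are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# Hypothetical names and subnet IDs, one subnet per AZ.
group = asg_request("web-asg", "web-template", ["subnet-aaa111", "subnet-bbb222"], 2, 6, 3)
print(group["VPCZoneIdentifier"])

# With credentials configured:
#   import boto3
#   boto3.client("autoscaling").create_auto_scaling_group(**group)
```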
EC2 Auto Scaling Options
- Scale to maintain current instance levels at all times
- Through health checks, ASGs terminate and relaunch instances to maintain the same number of running instances.
- You can set the same value for maximum, minimum, and desired capacity.
- Scale manually
- It is the most basic option: you update the maximum, minimum, and desired capacity manually.
- Scale based on a schedule
- The scaling is done automatically based on specified time and date.
- You create a scheduled action, which performs a scaling action at specified times. To create a scheduled scaling action, you specify the start time when the scaling action should take effect, and the new minimum, maximum, and desired sizes for the scaling action.
- Recurrence: Once, Cron, or a repeating "Every ..." interval.
- It is useful when the workload is predictable.
- Scale based on demand (Dynamic Scaling)
- You need to specify parameters or conditions (scaling policies) that control the scaling process. An example is CPU utilization.
- It is useful when the workload is not predictable.
[Note] AWS Auto Scaling also provides predictive scaling, which analyzes historical traffic patterns, forecasts future load, and schedules capacity changes ahead of time.
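For the schedule-based option, a scheduled action boils down to a name, a recurrence, and the new sizes. The sketch below assumes the parameter names of boto3's `put_scheduled_update_group_action` and uses a made-up group name and made-up business hours. The recurrence is a five-field cron expression, which is evaluated in UTC unless a time zone is supplied.

```python
def scheduled_action(group, name, recurrence, min_size, max_size, desired):
    """Build keyword arguments for PutScheduledUpdateGroupAction."""
    if len(recurrence.split()) != 5:
        raise ValueError("expected a five-field cron expression")
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must be between min_size and max_size")
    return {
        "AutoScalingGroupName": group,
        "ScheduledActionName": name,
        "Recurrence": recurrence,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


# Hypothetical schedule: scale out on weekday mornings, scale in on weekday evenings.
morning = scheduled_action("web-asg", "weekday-scale-out", "0 8 * * 1-5", 4, 12, 6)
evening = scheduled_action("web-asg", "weekday-scale-in", "0 18 * * 1-5", 1, 4, 2)
print(morning["Recurrence"], "|", evening["Recurrence"])

# With credentials configured:
#   import boto3
#   client = boto3.client("autoscaling")
#   client.put_scheduled_update_group_action(**morning)
#   client.put_scheduled_update_group_action(**evening)
```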
EC2 Auto Scaling Policy Types
- Target tracking scaling: based on the target value for a specific metric, such as keeping the average CPU utilization at a certain %. Cooldown periods are NOT applied.
- Step scaling: based on a set of scaling adjustments, known as step adjustments. Cooldown periods are NOT applied.
- Simple scaling: based on a single scale adjustment. Cooldown periods are applied.
[Note] The cooldown period is a setting that prevents the group from launching or terminating additional instances before the previous scaling activity has taken effect. It applies to simple scaling but not to target tracking, step, or scheduled scaling.
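The difference between target tracking and step scaling is easier to see with numbers. The sketch below is a simplified model, not the service's actual algorithm: it assumes target tracking adjusts capacity roughly in proportion to how far the metric is from the target, and it looks up a step adjustment by how far the metric has breached an alarm threshold. The step dictionaries reuse the key names of the `StepAdjustments` parameter; the thresholds and adjustments are invented.

```python
import math


def target_tracking_capacity(current, metric_value, target_value, min_size, max_size):
    """Proportional estimate of the capacity that brings the metric back to target."""
    desired = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, desired))


def step_adjustment(breach, steps):
    """Return the adjustment of the step whose interval contains the breach size.

    `breach` is the metric value minus the alarm threshold. A missing upper
    bound is treated as positive infinity.
    """
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


# Hypothetical steps for an alarm threshold of 60% CPU.
steps = [
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 10, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 2},
    {"MetricIntervalLowerBound": 20, "ScalingAdjustment": 4},
]

print(target_tracking_capacity(4, 80, 50, 1, 10))  # 4 instances at 80% CPU, target 50%
print(step_adjustment(75 - 60, steps))             # CPU at 75%, threshold 60%
```

In this model, simple scaling corresponds to a single fixed adjustment with no step lookup, followed by a cooldown before the next adjustment is considered.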
Auto Scaling Health Checks
- Health Checks identify any instances that are not healthy
- EC2 status checks (default)
- ELB health checks
- Custom health checks
- Instances that fail a health check are terminated and replaced with new instances.
- Auto Scaling can send SNS notifications when scaling occurs.
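The replace-on-failure behavior can be pictured as a small reconciliation step. This is a toy model with invented instance IDs, not how the service is implemented: given the health status of each instance and the desired capacity, it works out which instances to terminate and how many to launch.

```python
def reconcile(instances, desired_capacity):
    """Return (instance IDs to terminate, number of new instances to launch).

    `instances` maps an instance ID to True (healthy) or False (unhealthy).
    """
    unhealthy = [iid for iid, healthy in instances.items() if not healthy]
    healthy_count = len(instances) - len(unhealthy)
    launches = max(0, desired_capacity - healthy_count)
    return unhealthy, launches


# Hypothetical fleet: one of three instances has failed its health check.
fleet = {"i-aaa": True, "i-bbb": False, "i-ccc": True}
to_terminate, to_launch = reconcile(fleet, desired_capacity=3)
print(to_terminate, to_launch)
```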
Achieving Highly Available and Fault-tolerant Architecture
- Deploy instances in different AZs
- To achieve fault-tolerance, you need to provision redundant resources, which entails an extra cost.
Amazon EC2 Auto Scaling lifecycle hooks
Lifecycle hooks are used to perform custom actions by pausing instances as an Auto Scaling group launches or terminates them.
- When an instance is paused, it remains in a wait state until the lifecycle action is completed.
- The instance stays in the wait state until the timeout period ends (one hour unless configured otherwise) or until you complete the lifecycle action yourself with the complete-lifecycle-action command.
- Lifecycle hooks can be used to install or configure software on instances before they start receiving traffic or do some cleanup actions before instances are terminated.
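A launch hook and its completion call might look like the following sketch. It assumes the parameter names of boto3's `put_lifecycle_hook` and `complete_lifecycle_action`; the group name, hook name, instance ID, and the five-minute timeout are illustrative only.

```python
VALID_RESULTS = ("CONTINUE", "ABANDON")


def launch_hook(group, hook_name, timeout_seconds=300, default_result="ABANDON"):
    """Build keyword arguments for PutLifecycleHook on instance launch."""
    if default_result not in VALID_RESULTS:
        raise ValueError("default_result must be CONTINUE or ABANDON")
    return {
        "AutoScalingGroupName": group,
        "LifecycleHookName": hook_name,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        "HeartbeatTimeout": timeout_seconds,  # how long the instance may wait
        "DefaultResult": default_result,      # what happens if the timeout expires
    }


def completion(group, hook_name, instance_id, result="CONTINUE"):
    """Build keyword arguments for CompleteLifecycleAction."""
    if result not in VALID_RESULTS:
        raise ValueError("result must be CONTINUE or ABANDON")
    return {
        "AutoScalingGroupName": group,
        "LifecycleHookName": hook_name,
        "InstanceId": instance_id,
        "LifecycleActionResult": result,
    }


# Hypothetical names: pause new instances until software installation finishes.
hook = launch_hook("web-asg", "install-software")
done = completion("web-asg", "install-software", "i-0123456789abcdef0")
print(hook["LifecycleTransition"], done["LifecycleActionResult"])

# With credentials configured:
#   import boto3
#   client = boto3.client("autoscaling")
#   client.put_lifecycle_hook(**hook)
#   ... install and configure software on the instance ...
#   client.complete_lifecycle_action(**done)
```

A termination hook would use the `autoscaling:EC2_INSTANCE_TERMINATING` transition instead, which is where cleanup actions would run.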