[AWS] CloudWatch

CloudWatch monitors the performance of AWS services and acts as a repository for their metric data.

Amazon CloudWatch

  • A CloudWatch metric is a set of data points over time (e.g., the CPU utilization of an EC2 instance).
  • Alarms can be configured on metrics, and alarms can take actions.
  • Auto Scaling relies on CloudWatch alarms to trigger the addition or removal of instances.

Data retention

  • one-hour metrics (for 455 days)
  • five-minute metrics (for 63 days)
  • one-minute metrics (for 15 days)
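
The retention tiers above can be read as a lookup: the older a data point is, the coarser the finest period still available for it. Below is a minimal sketch of that lookup, assuming only the three tiers listed here. The names `RETENTION_TIERS` and `finest_available_period` are made up for illustration and are not part of any AWS API.

```python
from datetime import timedelta

# Tiers from the list above: (period of the data points, how long they are kept).
# Ordered from finest to coarsest period.
RETENTION_TIERS = [
    (timedelta(minutes=1), timedelta(days=15)),
    (timedelta(minutes=5), timedelta(days=63)),
    (timedelta(hours=1), timedelta(days=455)),
]


def finest_available_period(age):
    """Return the smallest period still retained for data of this age.

    Returns None when the data is older than every tier (it has been dropped).
    """
    for period, kept_for in RETENTION_TIERS:
        if age <= kept_for:
            return period
    return None


if __name__ == "__main__":
    for days in (10, 30, 100, 500):
        print(days, "days old ->", finest_available_period(timedelta(days=days)))
```

For example, a data point that is 30 days old is no longer available at one-minute resolution, only as part of a five-minute (or coarser) aggregate.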

Monitoring Plans

  • Basic: Data is available in 5-minute periods at no charge.
  • Detailed: Data is available in 1-minute periods with an additional charge.

CloudWatch Components

Namespaces

  • A container for CloudWatch metrics
  • The naming convention for AWS services: AWS/service (e.g., AWS/EC2)

Metrics

  • A time-ordered set of data points
  • Exist only in the region where they are created
  • Cannot be deleted, but old data is aggregated, and data older than 15 months is dropped.

Dimensions

  • A name/value pair that uniquely identifies a metric.
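
One way to picture this: a metric is identified by its namespace, its name, and its set of dimensions, so the same metric name with a different dimension value is a different metric. The sketch below models that identity with a plain tuple. `metric_key` is a hypothetical helper and the instance IDs are invented.

```python
def metric_key(namespace, name, dimensions):
    """Build a hashable identity for a metric.

    dimensions is a dict of name/value pairs; a frozenset makes the
    identity independent of the order the pairs were written in.
    """
    return (namespace, name, frozenset(dimensions.items()))


if __name__ == "__main__":
    a = metric_key("AWS/EC2", "CPUUtilization", {"InstanceId": "i-aaa"})
    b = metric_key("AWS/EC2", "CPUUtilization", {"InstanceId": "i-bbb"})
    print(a == b)  # same name, different dimension value: two distinct metrics
```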

Statistics

  • Aggregated metric data over specified periods of time
  • Minimum, Maximum, Average, Sum, SampleCount …
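
To make the aggregation concrete, here is a small sketch that groups raw data points into fixed periods and computes the five statistics named above for each period. It is a plain-Python illustration of the idea, not how CloudWatch itself is implemented, and the function name `aggregate` is an assumption.

```python
from collections import defaultdict


def aggregate(datapoints, period_seconds):
    """Group (epoch_seconds, value) pairs into periods and summarize each one.

    Returns a dict mapping each period's start time to its statistics.
    """
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        period_start = timestamp - (timestamp % period_seconds)
        buckets[period_start].append(value)

    return {
        start: {
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
            "SampleCount": len(values),
            "Average": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    points = [(0, 10.0), (60, 30.0), (300, 50.0)]
    for start, stats in aggregate(points, 300).items():
        print(start, stats)
```

With a 300-second period, the first two points fall into the same bucket and are summarized together, while the third starts a new period.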

Alarms

An alarm is always in one of three states:

  • INSUFFICIENT_DATA: There is not enough data to determine the state.
  • ALARM: The metric has breached the threshold.
  • OK: The metric is within the defined threshold.

The components of alarms are:

  • Metric: The data points being measured
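  • Threshold: The value the metric is compared against to decide whether it is normal or abnormal
  • Period: The length of time over which the metric is aggregated into one data point. Together with the number of evaluation periods, it determines how long the threshold must be breached before the alarm fires.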
  • Action: What needs to be done when an alarm is triggered
    • SNS Notification
    • EC2 Actions: Stop, terminate, or reboot an EC2 instance
    • Auto Scaling Actions: Execute an Auto Scaling policy
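
The way metric, threshold, and period combine into an alarm state can be sketched as follows. This is a deliberately simplified model: it assumes a "greater than" comparison, one aggregated value per period, and that every one of the last `evaluation_periods` values must breach the threshold. Real CloudWatch alarms offer more options (other comparison operators, configurable handling of missing data), which are not modeled here.

```python
def alarm_state(period_values, threshold, evaluation_periods):
    """Return "OK", "ALARM", or "INSUFFICIENT_DATA" for a simplified alarm.

    period_values holds one aggregated value per period, oldest first.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")

    recent = period_values[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


if __name__ == "__main__":
    # CPU utilization per 5-minute period, threshold 80%, 2 periods required.
    print(alarm_state([50, 85, 90], threshold=80, evaluation_periods=2))
    print(alarm_state([85, 50], threshold=80, evaluation_periods=2))
    print(alarm_state([90], threshold=80, evaluation_periods=2))
```

Requiring several consecutive breaching periods is what keeps a single short spike from triggering an action such as a scale-out.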

[Note] CloudWatch does not collect some EC2 metrics by default. To collect them, install the CloudWatch agent on the instances.

  • Default Metrics: CPU Utilization, Disk Reads/Writes, and Network Utilization (Network In/Out)
  • Custom Metrics with the CloudWatch agent: Memory utilization, disk space usage, and swap usage
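
To give a feel for what a custom metric looks like, the sketch below builds a request shaped like the arguments of CloudWatch's PutMetricData operation for a memory-utilization data point. It only constructs a dictionary and makes no AWS call. The namespace `Custom/System`, the metric name `MemoryUtilization`, and the helper name are all hypothetical choices for illustration; the CloudWatch agent picks its own names.

```python
from datetime import datetime, timezone


def build_memory_metric_request(instance_id, used_percent, timestamp=None):
    """Build keyword arguments shaped like a PutMetricData request."""
    if not 0 <= used_percent <= 100:
        raise ValueError("used_percent must be between 0 and 100")

    return {
        # Hypothetical custom namespace; the AWS/ prefix is reserved for AWS services.
        "Namespace": "Custom/System",
        "MetricData": [
            {
                "MetricName": "MemoryUtilization",  # hypothetical metric name
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Timestamp": timestamp or datetime.now(timezone.utc),
                "Value": float(used_percent),
                "Unit": "Percent",
            }
        ],
    }


if __name__ == "__main__":
    request = build_memory_metric_request("i-0123456789abcdef0", 42.5)
    print(request["Namespace"], request["MetricData"][0]["Value"])
```

Assuming boto3 is installed and credentials are configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_data(**request)`.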

CloudWatch Events and EventBridge

Amazon CloudWatch Events is a feature of CloudWatch and delivers a near real-time stream of system events that describe changes in AWS resources.

  • Events are the changes in the AWS resources.
  • Rules match incoming events and route them to targets.
  • Targets accept events and process them.
  • The main benefit of CloudWatch Events is that it monitors events and takes actions based on rules.
  • A rule can be triggered by an event pattern or by a schedule.
  • The targets can be Lambda functions, Kinesis streams, Step Functions state machines, ECS tasks, SNS topics, or SQS queues. You can also call EC2 APIs such as CreateSnapshot, StopInstances, TerminateInstances, or RebootInstances.
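
The event, rule, and target relationship can be sketched in a few lines. The matcher below is a simplified take on event patterns: each pattern field lists the exact values it accepts, and nested objects are matched recursively. The real pattern language supports more (prefix matching, numeric ranges, and so on), and the function names and target names here are made up for illustration.

```python
def matches(pattern, event):
    """Return True if every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event must have a nested object that matches.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of accepted exact values.
            return False
    return True


def route(rules, event):
    """Return the targets of every rule whose pattern matches the event."""
    targets = []
    for pattern, rule_targets in rules:
        if matches(pattern, event):
            targets.extend(rule_targets)
    return targets


if __name__ == "__main__":
    stopped_rule = (
        {"source": ["aws.ec2"], "detail": {"state": ["stopped", "terminated"]}},
        ["notify-ops-topic"],  # hypothetical target name
    )
    event = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
    }
    print(route([stopped_rule], event))
```

Fields that the pattern does not mention (such as `detail-type` above) are ignored, so a short pattern matches a broad class of events.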

Amazon EventBridge is a serverless event bus service that makes it easy to connect your applications with data from a variety of sources. It helps you to build event-driven architectures that are loosely coupled and distributed.

  • EventBridge delivers a stream of real-time data from your applications or AWS services and routes that data to targets such as AWS Lambda using the routing rules.

CloudWatch Logs

CloudWatch Logs lets you store, monitor, and access logs.

  • CloudWatch Logs accepts log data from AWS services (such as EC2, Lambda, or CloudTrail), from custom applications through the API, and from CloudWatch agents.
  • A metric filter uses pattern matches to analyze logs and create metrics.
  • A log event is a timestamp and a raw message.
  • A log stream is a sequence of log events with the same source.
  • A log group is a container for log streams. It controls retention, monitoring, and access. You can set filters in a group.
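
The pieces above fit together as a small hierarchy: a log group holds log streams, each stream holds log events, and a metric filter turns matching events into a number. The sketch below models this with plain dictionaries and a substring match. Real metric filters use their own pattern syntax, so treat the matching rule, the function name, and the sample data as assumptions.

```python
def count_matching_events(log_group, term):
    """Count log events across all streams whose message contains the term.

    log_group maps a stream name to a list of (timestamp, message) events.
    The returned count is the value a metric filter would publish.
    """
    return sum(
        1
        for events in log_group.values()
        for _timestamp, message in events
        if term in message
    )


if __name__ == "__main__":
    # Hypothetical log group with one stream per instance.
    log_group = {
        "instance-a": [(1, "INFO started"), (2, "ERROR disk full")],
        "instance-b": [(3, "ERROR timeout"), (4, "INFO healthy")],
    }
    print(count_matching_events(log_group, "ERROR"))
```

The resulting count behaves like any other metric, so an alarm can be set on it, for instance to notify when errors appear in the logs.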

CloudWatch Events

CloudWatch Events provides near real-time tracking of changes that happen within an AWS account.

  • Rules match events in the account and deliver them to one or more targets.
  • Rules can be invoked in two ways: by matching an event pattern or on a schedule.
  • Targets can be EC2 instances, Lambda functions, Step Functions state machines, SNS topics, or SQS queues.
  • When an event matches a rule, CloudWatch Events invokes the rule's targets, which lets you respond to operational changes and take corrective action automatically.
