CloudWatch Fundamentals — AWS Solutions Architect Associate Complete Course
Chapter 18: CloudWatch Fundamentals for the AWS Solutions Architect Associate Certification
We've mentioned Amazon CloudWatch in many chapters during this course, and it was always regarding monitoring. For example, in the SQS chapter, we discussed how to scale based on the SQS Queue Length parameter, measured using CloudWatch. Let's see in detail why this service is one of the most important ones from AWS!
- Amazon CloudWatch Metrics
- Amazon CloudWatch Logs
- Amazon CloudWatch Alarms
- Amazon EventBridge (CloudWatch Events)
- AWS CloudTrail
- AWS Config
- Typical CloudWatch Questions
AMAZON CLOUDWATCH METRICS
Amazon CloudWatch is the AWS service for monitoring other AWS services, providing data and insights through logs, metrics, and events. It collects metrics and logs from your AWS services so that you can monitor them, visualize the data, and react automatically to changes in these services.
Metrics are the central part of Amazon CloudWatch, and these are the most important concepts about them:
- Namespaces → Containers for CloudWatch metrics. Metrics in different namespaces are isolated from each other, so metrics from different applications are not accidentally aggregated together.
- Metrics → The core of Amazon CloudWatch. A metric is a time-ordered set of data points published to CloudWatch. Think of it as a variable to monitor; for example, we can measure the CPU usage of each specific EC2 instance. Every data point has a timestamp, and a metric can have dimensions. You can plot metrics in graphs and group them in dashboards. Dashboards are global (even though metrics belong to a specific region), so one dashboard can include charts from different regions.
- Dimensions → They are like attributes of a metric. A dimension is a name/value pair that is part of the metric's identity, and you can have a maximum of 30 dimensions per metric.
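To see how a namespace, a metric name, and dimensions fit together, here is a minimal sketch that builds a dictionary shaped like a CloudWatch PutMetricData request. The namespace "MyApp/Backend", the metric name "CPUUsage", and the instance ID are made-up placeholders, and the helper build_metric_payload is hypothetical; nothing is sent to AWS.

```python
from datetime import datetime, timezone


def build_metric_payload(namespace, metric_name, value, dimensions, unit="Percent"):
    """Build keyword arguments shaped like a PutMetricData request (hypothetical helper)."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": metric_name,
                # Each dimension is a name/value pair that helps identify the metric.
                "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
                "Timestamp": datetime.now(timezone.utc),
                "Value": value,
                "Unit": unit,
            }
        ],
    }


payload = build_metric_payload(
    namespace="MyApp/Backend",  # assumed custom namespace
    metric_name="CPUUsage",  # assumed metric name
    value=72.5,
    dimensions={"InstanceId": "i-0123456789abcdef0"},  # placeholder instance ID
)
print(payload["MetricData"][0]["Dimensions"])
```

If boto3 is installed and credentials are configured, a dictionary like this could be passed as boto3.client("cloudwatch").put_metric_data(**payload).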
AMAZON CLOUDWATCH LOGS
Amazon CloudWatch Logs lets you monitor, store, and access your log files from services like Amazon EC2, Route 53, and others. You can keep the logs from all your applications and services in the same place.
When we talk about Amazon CloudWatch Logs, we need to know these two concepts, especially for the AWS Solutions Architect Associate Certification:
- Log stream → A sequence of log events that share the same source. So, for example, an application composed of three different EC2 instances should have three different log streams.
- Log group → A group of log streams that share the same retention, monitoring, and access control settings. It usually represents an application. So, for example, the three log streams of an application composed of three different EC2 instances should belong to the same log group.
This service also provides other features, like querying your log data (valuable if you need to look for a particular event), monitoring the logs, sending notifications when the error rate exceeds a certain threshold, and retaining or archiving them. You pay for the storage of these logs, so it's a good practice to set a retention policy on each log group so that old log events expire. You can find the cost of storing logs on the Amazon CloudWatch pricing page.
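The idea of monitoring logs for errors can be illustrated with a small local simulation. This is only a sketch of the concept, not how CloudWatch Logs is implemented: the log group is a plain dictionary with one stream per (made-up) EC2 instance, and the log line format, with the HTTP status code as the last field, is an assumption.

```python
from collections import Counter

# Hypothetical log group: one log stream per EC2 instance, each holding raw log lines.
log_group = {
    "i-aaa": ["GET /home 200", "GET /cart 500", "POST /pay 502"],
    "i-bbb": ["GET /home 200", "GET /home 404"],
    "i-ccc": ["GET /cart 503"],
}


def count_server_errors(group):
    """Count 5xx responses per log stream, assuming the status code is the last field."""
    counts = Counter()
    for stream, lines in group.items():
        for line in lines:
            status = line.rsplit(" ", 1)[-1]
            if status.startswith("5"):
                counts[stream] += 1
    return counts


def should_notify(group, threshold):
    """Return True when the total number of 5xx errors in the group exceeds the threshold."""
    return sum(count_server_errors(group).values()) > threshold


print(count_server_errors(log_group))
print(should_notify(log_group, threshold=2))
```

In CloudWatch itself, this counting step is typically done with a metric filter on the log group, and the threshold check with an alarm on the resulting metric.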
AMAZON CLOUDWATCH ALARMS
You can also configure CloudWatch Alarms to send notifications depending on any metric or invoke actions when the alarm changes its state. You can also add alarms to CloudWatch dashboards and monitor them visually. For example, imagine you need to be notified when the CPU usage of an EC2 instance is over 85%. You could do that with CloudWatch alarms.
As mentioned in the EC2 Auto Scaling Groups chapter, CloudWatch alarms are the foundation of that service. We can scale out based on an EC2 metric such as CPU usage: when the alarm on that metric changes state, it triggers the scaling policy.
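As a rough illustration of the 85% CPU example, the following sketch builds a dictionary shaped like a CloudWatch PutMetricAlarm request. The helper build_cpu_alarm, the alarm name, the instance ID, and the SNS topic ARN are hypothetical placeholders, and the period and evaluation settings are just one reasonable choice; nothing is sent to AWS.

```python
def build_cpu_alarm(instance_id, topic_arn, threshold=85.0):
    """Build keyword arguments shaped like a PutMetricAlarm request (hypothetical helper)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds covered by each data point
        "EvaluationPeriods": 3,  # how many recent data points are looked at
        "DatapointsToAlarm": 3,  # how many of them must breach the threshold
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # notified when the alarm enters ALARM
    }


alarm = build_cpu_alarm(
    instance_id="i-0123456789abcdef0",  # placeholder instance ID
    topic_arn="arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic
)
print(alarm["AlarmName"], alarm["Threshold"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as boto3.client("cloudwatch").put_metric_alarm(**alarm).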
A metric alarm has the following possible states:
- OK → The metric or expression is within the defined threshold.
- ALARM → The metric or expression is outside of the specified threshold.
- INSUFFICIENT_DATA → The alarm has just started, the metric is not available, or not enough data is available to determine the alarm state.
How often do we evaluate a metric? This is set by the period, which is the length of time covered by each data point. A period of 5 minutes is common, but it depends on our application's criticality; with high-resolution custom metrics, we can even set it to 10 or 30 seconds.
And when should the state change from OK to ALARM? It depends on the data points to alarm: the number of evaluated data points that must breach the threshold to trigger the alarm. For example, suppose we set the data points to alarm to three. If three data points are over the threshold, the alarm is triggered. In contrast, if only one data point is over the threshold, the alarm isn't triggered, no matter how high that single value is.
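The "data points to alarm" rule can be sketched as a small function. This is a simplified model under stated assumptions: every period has a data point, the comparison is "greater than threshold", and the function name alarm_state is made up. The real service also lets you configure how missing data is treated, which this sketch ignores.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Simplified alarm evaluation: ALARM when enough recent data points breach the threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    window = datapoints[-evaluation_periods:]  # most recent data points
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Three data points over 85: the alarm triggers.
print(alarm_state([90, 88, 91], threshold=85, evaluation_periods=3, datapoints_to_alarm=3))
# A single spike over 85, however high: the alarm stays OK.
print(alarm_state([60, 99, 62], threshold=85, evaluation_periods=3, datapoints_to_alarm=3))
```

Lowering datapoints_to_alarm below evaluation_periods gives an "M out of N" alarm, which tolerates gaps between breaching data points.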
AMAZON EVENTBRIDGE (CLOUDWATCH EVENTS)
Amazon EventBridge, formerly known as CloudWatch Events, delivers a stream of system events that describe changes in AWS resources. It becomes aware of operational changes as they occur and can respond to them by taking action.
You can also use Amazon EventBridge to schedule automated actions that self-trigger at specific times using cron or rate expressions. For example, this cron expression triggers its target, such as a Lambda function, every day at 11:00 pm UTC (the second field is the hour, in 24-hour format):
cron(0 23 * * ? *)
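EventBridge cron expressions have six fields: minutes, hours, day of month, month, day of week, and year, and the hour uses the 24-hour clock in UTC. The following sketch splits an expression into named fields to make that layout explicit. The function parse_schedule is a hypothetical helper that only labels the fields; it does not validate their values.

```python
FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_schedule(expression):
    """Split an EventBridge-style cron(...) expression into its six named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")
    parts = expression[len("cron("):-1].split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


fields = parse_schedule("cron(0 23 * * ? *)")
print(fields["hours"], fields["minutes"])  # hour 23 and minute 0, i.e. 11:00 pm UTC
```

An hours field of 11 would therefore mean 11:00 am UTC, so a daily 11:00 pm schedule needs 23.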
AWS CLOUDTRAIL
AWS CloudTrail monitors and records account activity across your AWS infrastructure, giving you control over storage, analysis, and remediation actions. Imagine someone removes an EC2 instance, and we want to know who did that. In this case, we could check it with AWS CloudTrail. We can also deliver log files to an Amazon S3 bucket for further analysis.
We can also filter the events to make searching easier, for example to show only "delete" operations. In the exam, we may be asked something like: if a resource is deleted, where do we look to find out who did it? The answer is AWS CloudTrail.
There are also two types of events in AWS CloudTrail:
- Data events → Provide visibility into the resource operations performed on or within a resource.
- Management events → Provide visibility into management operations performed in our AWS accounts.
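To make the "who removed the EC2 instance?" scenario concrete, here is a small sketch that searches a list of CloudTrail-style records. The records are invented sample data that only mimic a few fields of real CloudTrail events (eventTime, eventSource, eventName, userIdentity), and the helpers find_events and who_did are hypothetical.

```python
# Invented sample records, loosely shaped like CloudTrail management events.
events = [
    {"eventTime": "2024-01-10T09:15:00Z", "eventSource": "ec2.amazonaws.com",
     "eventName": "RunInstances",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
    {"eventTime": "2024-01-10T11:42:00Z", "eventSource": "ec2.amazonaws.com",
     "eventName": "TerminateInstances",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/bob"}},
    {"eventTime": "2024-01-11T08:03:00Z", "eventSource": "s3.amazonaws.com",
     "eventName": "DeleteBucket",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
]


def find_events(records, name_prefixes):
    """Keep only the records whose event name starts with one of the given prefixes."""
    return [r for r in records if r["eventName"].startswith(tuple(name_prefixes))]


def who_did(records, event_name):
    """Return (time, identity ARN) pairs for every record with the given event name."""
    return [(r["eventTime"], r["userIdentity"]["arn"])
            for r in records if r["eventName"] == event_name]


print(who_did(events, "TerminateInstances"))
print([r["eventName"] for r in find_events(events, ["Delete", "Terminate"])])
```

In a real account, the same kind of lookup is done in the CloudTrail event history or over the log files delivered to S3.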
AWS CONFIG
AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. It records the configuration of your resources and can notify you when one of them changes. It is fundamental to understand that this service does not prevent the changes from happening; it only records them and notifies us.
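The "record and notify, but not prevent" idea can be sketched as a comparison between two configuration snapshots. This is only an illustration: the snapshot dictionaries, their keys, and the function diff_config are made up and do not reflect AWS Config's actual data model.

```python
def diff_config(before, after):
    """Return {setting: (old, new)} for every setting that differs between two snapshots."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            changes[key] = (before.get(key), after.get(key))
    return changes


# Hypothetical snapshots of a security group, taken before and after a change.
snapshot_before = {"group_name": "web-sg", "open_ports": [443]}
snapshot_after = {"group_name": "web-sg", "open_ports": [22, 443]}

changes = diff_config(snapshot_before, snapshot_after)
if changes:
    # The change has already happened; all we can do here is report it.
    print("Configuration changed:", changes)
```

Note that the comparison runs after the fact: by the time the difference is detected, port 22 is already open.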
TYPICAL EXAM QUESTIONS
We want to monitor the logs of an application and receive a notification whenever a specific number of occurrences of certain HTTP status code errors occur. Which tool should we use?
- CloudWatch Logs
- CloudWatch Metrics
- Amazon EventBridge
- CloudTrail Trails
Solution: 1 (CloudWatch Logs). Amazon CloudWatch Logs enables you to monitor, store, and access the log files generated by different AWS services, such as Amazon EC2 or AWS Lambda. You can then retrieve the associated log data from CloudWatch Logs and analyze it, looking for the different HTTP status code errors.
Metrics are data about the performance of your system, for example, the CPU utilization or the NetworkOut of EC2 instances. You would use them together with CloudWatch Logs: a metric filter turns matching log events into a metric that an alarm can watch. However, you still need CloudWatch Logs to collect and interpret the log files.
Finally, using Amazon EventBridge, previously known as CloudWatch Events, you can create rules that respond to events in your AWS resources. Still, it doesn’t directly deal with the monitoring of log files.
How can we be notified by email when an RDS database exceeds certain metric thresholds?
- Create a CloudTrail alarm and configure a notification event to send an SMS.
- Create a CloudWatch alarm and associate an SNS topic with it that sends an email notification.
- Create an Amazon CloudWatch Logs rule that triggers an AWS Lambda function to send emails using AWS SES.
- Set up an RDS alarm to send emails.
Solution: 2 (CloudWatch alarm with an SNS topic). CloudWatch alarms can be configured to send notifications or automatically make changes to the resources you monitor based on the rules you define. A CloudWatch alarm does not send emails by itself; you need to create an SNS topic, subscribe your email address to that topic, and then configure the alarm to notify this SNS topic when it triggers.
Which service provides visibility into user activity by recording actions taken on your account?
- Amazon CloudWatch.
- AWS CloudFormation.
- AWS CloudTrail.
- AWS CloudHSM.
Solution: 3 (AWS CloudTrail). AWS CloudTrail is a service that provides the event history of your AWS account activity. For example, you can see the user activity inside AWS. This is one of the main differences from Amazon CloudWatch, which is primarily a monitoring service for AWS resources and the applications you run on AWS. Using CloudWatch, you can collect and track metrics, collect and monitor log files, set alarms, etc., but it doesn’t record the actions taken on your account.
SUMMING UP
In this chapter, we've seen three different AWS services for monitoring the AWS infrastructure, so let's use an example to understand them fully. Imagine we want to monitor an EC2 instance; let's see what each of these services would tell us about it:
- Amazon CloudWatch → Monitoring incoming metrics, like CPU Usage, Network I/O…
- AWS CloudTrail → Track who made any changes in the EC2 instance, like the person who created it.
- AWS Config → View the configuration of the EC2 security groups, like port rules.