How to set up and configure CloudWatch for an AWS Lightsail instance
Summary: This wiki page demonstrates setting up CloudWatch monitoring for a DokuWiki instance on AWS Lightsail, including CloudWatch agent installation with custom Apache log collection, Vector integration for systemd journal logs, and dashboard/alarm configuration.
Date: 14 July 2025
In my previous page I described how I set up DokuWiki on an AWS Lightsail instance. In this page I will describe how to set up CloudWatch for monitoring the instance.
Overall, the following techniques are used:
- Set up and install the CloudWatch agent
- Use the CloudWatch configuration file to configure the collection of log files and metrics
- Configure the Apache server to log to custom log files for CloudWatch
- Use Vector (by Datadog) to send the systemd journalctl logs to CloudWatch
- Create and configure a CloudWatch dashboard
- Create and configure CloudWatch alarms
Set Up CloudWatch Agent
The CloudWatch agent collects metrics and logs from the instance and sends them to CloudWatch.
Create IAM User
We need an IAM user with the required permissions to allow the CloudWatch agent to send metrics and logs to CloudWatch. Follow these steps:
- In the IAM console, create a new IAM user with the following settings:
- Name: lightsail-cloudwatch-agent
- Policy: CloudWatchAgentServerPolicy
- Programmatic access: Access key & secret access key: See Lastpass - `wiki.getshifting.com - lightsail cloudwatch agent`
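If you prefer the command line over the console, the same user can also be created with the AWS CLI. This is a minimal sketch, assuming you run it with admin credentials already configured:
# Create the IAM user for the CloudWatch agent
aws iam create-user --user-name lightsail-cloudwatch-agent

# Attach the AWS-managed policy that lets the agent write metrics and logs
aws iam attach-user-policy \
  --user-name lightsail-cloudwatch-agent \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy

# Create the access key and secret (store the output in your password manager)
aws iam create-access-key --user-name lightsail-cloudwatch-agent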
Install CloudWatch Agent
Now we can download and install the CloudWatch agent:
# Download the latest CloudWatch agent package
wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo dpkg -i -E ./amazon-cloudwatch-agent.deb

# Set up the credentials so that the CloudWatch agent can write to CloudWatch
sudo aws configure --profile AmazonCloudWatchAgent
AWS Access Key ID [None]: See Lastpass
AWS Secret Access Key [None]: See Lastpass
Default region name [None]:
Default output format [None]:
Configure CloudWatch Agent
Now we can create an initial configuration for the agent using the config wizard. Set the values as shown below where possible; we will change the config file afterwards to include the required metrics and logs.
- Configure the agent with:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
and use the following settings:
- OS: Linux
- Where: EC2
- User: root
- StatsD daemon: yes
- Port: 8125
- Collect interval: 60s
- Aggregation interval: 60s
- CollectD: No
- Host metrics: Yes
- CPU metrics per core: No
- EC2 dimensions: No
- Aggregate EC2 dimensions: No
- High resolution: 60s
- Default metrics: Standard
- Config OK: Yes
- Existing config: No
- Monitor Log files: No
- X-ray traces: No
- SSM parameter store: No
Choose Standard to create a basic setup
Edit the file afterwards as explained here. Open the config file:
sudo vi /opt/aws/amazon-cloudwatch-agent/bin/config.json
{ "agent": { "metrics_collection_interval": 60, "run_as_user": "root" }, "logs": { "logs_collected": { "files": { "collect_list": [ { "file_path": "/opt/bitnami/apache/logs/access_log_cloudwatch", "log_group_name": "apache/access", "log_stream_name": "ApacheAccess", "retention_in_days": 90 }, { "file_path": "/opt/bitnami/apache/logs/error_log_cloudwatch", "log_group_name": "apache/error", "log_stream_name": "ApacheError", "retention_in_days": 90 }, { "file_path": "/var/log/dpkg.log", "log_group_name": "dpkg-logs", "log_stream_name": "dpkg", "retention_in_days": 90 } ] } } }, "metrics": { "metrics_collected": { "cpu": { "measurement": [ "cpu_usage_idle", "cpu_usage_iowait", "cpu_usage_user", "cpu_usage_system", "cpu_usage_active" ], "metrics_collection_interval": 60, "totalcpu": true }, "disk": { "measurement": [ "used_percent" ], "metrics_collection_interval": 60, "resources": [ "*" ] }, "diskio": { "measurement": [ "io_time" ], "metrics_collection_interval": 60, "resources": [ "*" ] }, "mem": { "measurement": [ "mem_used_percent" ], "metrics_collection_interval": 60 }, "statsd": { "metrics_aggregation_interval": 60, "metrics_collection_interval": 60, "service_address": ":8125" }, "swap": { "measurement": [ "swap_used_percent" ], "metrics_collection_interval": 60 }, "processes": { "measurement": [ "total", "idle", "wait", "running", "sleeping", "dead", "zombies" ] } } } }
Note that we also added the /var/log/dpkg.log log file to the configuration, which is used for monitoring package installations and updates.
Now we need to configure the credentials:
sudo vi /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml

# Uncomment and edit the following lines:
[credentials]
    shared_credential_profile = "AmazonCloudWatchAgent"
Set Up Apache Logs
As you can see in the config file above, we will collect the Apache logs from custom log files, which we need to configure. For that we will follow the tutorial from here. We need to change the logging section in the Apache setup. To make the changes clearer, I'll first show the original logging section, and then the new logging section with the changes.
Open the Apache config: sudo vi /opt/bitnami/apache/conf/httpd.conf
Original logging section
#
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
#
ErrorLog "logs/error_log"

#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel warn

<IfModule log_config_module>
    #
    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    #
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common

    <IfModule logio_module>
      # You need to enable mod_logio.c to use %I and %O
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>

    #
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here. Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    #
    CustomLog "logs/access_log" common

    #
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #
    #CustomLog "logs/access_log" combined
</IfModule>
Updated logging section
#
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
#
ErrorLog "/opt/bitnami/apache/logs/error_log_cloudwatch"
ErrorLogFormat "{\"time\":\"%{%usec_frac}t\", \"function\" : \"[%-m:%l]\" , \"process\" : \"[pid%P]\" ,\"message\" : \"%M\"}"

#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel warn

<IfModule log_config_module>
    #
    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    #
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    LogFormat "{ \"time\":\"%{%Y-%m-%d}tT%{%T}t.%{msec_frac}tZ\", \"process\":\"%D\", \"filename\":\"%f\", \"remoteIP\":\"%a\", \"host\":\"%V\", \"request\":\"%U\",\"query\":\"%q\",\"method\":\"%m\", \"status\":\"%>s\", \"userAgent\":\"%{User-agent}i\",\"referer\":\"%{Referer}i\"}" cloudwatch

    <IfModule logio_module>
      # You need to enable mod_logio.c to use %I and %O
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>

    #
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here. Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    #
    CustomLog "/opt/bitnami/apache/logs/access_log_cloudwatch" cloudwatch

    #
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #
    #CustomLog "logs/access_log" combined
</IfModule>
Now we need to restart the Apache server to apply the changes:
sudo /opt/bitnami/ctlscript.sh restart apache
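Before wiring this into CloudWatch, it's worth checking that Apache actually writes JSON lines to the new files. A quick sketch, assuming the wiki answers on its public URL:
# Trigger one request and inspect the last entry of the custom access log
curl -s -o /dev/null https://wiki.getshifting.com
tail -n 1 /opt/bitnami/apache/logs/access_log_cloudwatch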
Start CloudWatch Agent
To start or restart the CloudWatch agent, we can use the following command:
sudo amazon-cloudwatch-agent-ctl -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -a fetch-config -s
To check the status of the CloudWatch agent, we can use the following command:
sudo amazon-cloudwatch-agent-ctl -a status
Troubleshooting the agent
In case something doesn't work, you can check the CloudWatch agent log:
tail -f /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
Create CloudWatch Dashboard
For all the metrics widgets below, you need to add a new widget with data type "Metrics" and widget type "Line". Then go to the widget's source view and add the JSON as shown below.
To add log widgets, go to the CloudWatch console and select the Logs Insights tab. Then select the log group you want to query, which in this case is `apache/access` for the access logs and `apache/error` for the error logs. You can then use the queries below to get insights into the logs. When the query has run, you can click "Add to dashboard" and select the dashboard you want to add the widget to.
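As an alternative to clicking through the console, the whole dashboard can also be managed as code with the AWS CLI. A minimal sketch, assuming a local dashboard.json that wraps each of the widget source objects below in a top-level "widgets" array:
# Create or overwrite the dashboard from a local definition file
aws cloudwatch put-dashboard \
  --dashboard-name GetShiftingDashboard \
  --dashboard-body file://dashboard.json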
Metrics: Processes
{ "metrics": [ [ "CWAgent", "processes_running", "host", "wiki", { "region": "eu-west-1", "label": "Running" } ], [ ".", "processes_sleeping", ".", ".", { "region": "eu-west-1", "label": "Sleeping" } ], [ ".", "processes_dead", ".", ".", { "region": "eu-west-1", "label": "Dead" } ], [ ".", "processes_zombies", ".", ".", { "region": "eu-west-1", "label": "Zombie" } ], [ ".", "processes_total", ".", ".", { "region": "eu-west-1", "label": "Total" } ], [ ".", "processes_idle", ".", ".", { "region": "eu-west-1", "label": "Idle" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "period": 300, "stat": "Average", "title": "wiki.getshifting.com - Processes" }
Metrics: Memory
{ "metrics": [ [ "CWAgent", "mem_used_percent", "host", "wiki", { "label": "Memory usage", "region": "eu-west-1" } ], [ ".", "swap_used_percent", ".", ".", { "label": "Swap usage", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Memory", "period": 300, "stat": "Average" }
Metrics: Disk Usage
{ "metrics": [ [ "CWAgent", "disk_used_percent", "path", "/", "host", "wiki", "device", "nvme0n1p1", "fstype", "ext4", { "label": "Disk Space Usage", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Disk Usage", "period": 300, "stat": "Average" }
Metrics: Disk IO Time
{ "metrics": [ [ "CWAgent", "diskio_io_time", "host", "wiki", "name", "nvme0n1p1", { "label": "Disk IO Time (The amount of time that the disk has had I/O requests queued)", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Disk IO Time", "period": 300, "stat": "Average" }
Metrics: CPU
{ "metrics": [ [ "CWAgent", "cpu_usage_user", "host", "wiki", "cpu", "cpu-total", { "region": "eu-west-1", "label": "User" } ], [ ".", "cpu_usage_system", ".", ".", ".", ".", { "region": "eu-west-1", "label": "System" } ], [ ".", "cpu_usage_iowait", ".", ".", ".", ".", { "region": "eu-west-1", "label": "IO Wait" } ], [ ".", "cpu_usage_idle", ".", ".", ".", ".", { "region": "eu-west-1", "visible": false, "label": "Idle" } ], [ ".", "cpu_usage_active", ".", ".", ".", ".", { "region": "eu-west-1", "label": "Active" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "period": 300, "title": "wiki.getshifting.com - CPU", "stat": "Average" }
Apache Access Logs: UniqueVisits
fields @timestamp, remoteIP, method, status
| filter status = "200" and method = "GET"
| stats count_distinct(remoteIP) as UniqueVisits
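The same query can also be run from the command line, which is handy for scripting. A sketch, assuming a profile with the logs:StartQuery and logs:GetQueryResults permissions and GNU date:
# Run the UniqueVisits query over the last 24 hours
QUERY_ID=$(aws logs start-query \
  --log-group-name apache/access \
  --start-time $(date -d '24 hours ago' +%s) \
  --end-time $(date +%s) \
  --query-string 'fields @timestamp, remoteIP, method, status | filter status = "200" and method = "GET" | stats count_distinct(remoteIP) as UniqueVisits' \
  --output text --query queryId)

# The query runs asynchronously, so poll for the results
aws logs get-query-results --query-id "$QUERY_ID"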
Apache Error Logs: All messages
fields @timestamp, message
| limit 20
Dpkg Logs: All messages
fields @timestamp, message
| limit 10
Journalctl
Traditionally, log files on a Linux system were stored in the `/var/log` directory, but nowadays on systemd-based systems the logs are stored in the systemd journal. You can check cat /var/log/README
for confirmation. To still be able to send these logs to CloudWatch, we'll configure Vector.dev, a tool from Datadog, to send the journalctl entries to CloudWatch.
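Before setting up Vector, you can peek at what it will ship: the journal can already emit its entries as JSON, which is roughly the shape Vector forwards. A quick check:
# Show the last few journal entries in JSON format
journalctl -n 5 -o json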
Set Up IAM User
We need an IAM user with the required permissions to allow the Vector agent to send logs to CloudWatch. Follow these steps:
- In the IAM console, create a new IAM user with the following settings:
- Name: lightsail-vector-agent
- Attach policies directly
- Create policy:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "CloudWatchLogsPermissions", "Effect": "Allow", "Action": [ "logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents", "logs:DescribeLogGroups", "logs:DescribeLogStreams", "logs:ListTagsLogGroup" ], "Resource": "*" } ] }
- Policy name: MinimumCloudwatchPermissions
- Description: Originally drafted for vector with the help of google search AI results
- Click on Create policy
- Programmatic access: Access key & secret access key: See Lastpass - `wiki.getshifting.com - lightsail vector agent`
Now we can create a new credentials file for the Vector agent:
sjoerd@wiki:~$ aws configure --profile VectorAgent
AWS Access Key ID [None]: See Lastpass
AWS Secret Access Key [None]: See Lastpass
Default region name [None]: eu-west-1
Default output format [None]:
Now we need to set permissions so that Vector can read the credentials file:
chmod o+r .aws/credentials
Set Up Vector
As Lightsail uses the dpkg package manager, we can install Vector using the following steps:
curl \
  --proto '=https' \
  --tlsv1.2 -O \
  https://apt.vector.dev/pool/v/ve/vector_0.48.0-1_amd64.deb
sudo dpkg -i vector_0.48.0-1_amd64.deb
- Configure Vector: sudo vi /etc/vector/vector.yaml
cat /etc/vector/vector.yaml | grep -v '^\s*$\|^\s*\#'
sources:
  journald_source:
    type: "journald"

sinks:
  cloudwatch_sink:
    type: "aws_cloudwatch_logs"
    auth:
      credentials_file: "/home/sjoerd/.aws/credentials"
      profile: "VectorAgent"
    inputs:
      - "journald_source"
    compression: "gzip"
    encoding:
      codec: "json"
    region: "eu-west-1"
    group_name: "systemd-journal"
    stream_name: "journalctl"
- Validate the Vector config: sudo vector validate /etc/vector/vector.yaml
- Start Vector: sudo systemctl start vector
- Enable Vector at boot: sudo systemctl enable vector
- Check the status: sudo systemctl status vector
- Check the service logs: sudo journalctl -u vector.service
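Once Vector is running, you can confirm that events actually arrive in CloudWatch. A sketch, assuming AWS CLI v2 (which provides the tail subcommand) and a profile with the logs:FilterLogEvents permission:
# Follow the systemd-journal log group to confirm delivery
aws logs tail systemd-journal --since 15m --follow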
JournalCtl Logs: All messages
fields @timestamp, message
| limit 50
Certificate Manager
We want to monitor the certificate expiration date:
- In the Cloudwatch console, go to the dashboard, and click on Add widget
- Data type: Metrics
- Widget type: Number
- Click next and in the new 'Add metric graph' screen, first make sure the region is set to N. Virginia (us-east-1)
- Click on 'CertificateManager' in the list of services → 'Certificate Metrics'
- You can select all the certificates you want
- Click on 'Create widget' and save the dashboard
You can also set a label and provide a new name. The widget source will then look like this:
{ "metrics": [ [ "AWS/CertificateManager", "DaysToExpiry", "CertificateArn", "arn:aws:acm:us-east-1:410123456772:certificate/175bbc5b-cd9b-45b2-b906-059e12589237", { "region": "us-east-1", "label": "getshifting.com" } ], [ "...", "arn:aws:acm:us-east-1:410123456772:certificate/2598de1a-fea6-40c0-9296-e6cb18ae8a26", { "region": "us-east-1", "label": "wiki.getshifting.com" } ] ], "sparkline": true, "view": "singleValue", "region": "us-east-1", "period": 300, "stat": "Average", "title": "Certificate - DaysToExpire" }
Share CloudWatch Dashboard
- To share a dashboard, from the dashboard, go to Actions → Share dashboard.
- In our case, I want to share it with myself so I don't have to log in to the AWS console every time I want to check the dashboard.
- Select the 'Share your dashboard and require a username and password' option.
- Enter a username and password, and click on 'Share dashboard'.
Afterwards, you need to change the permissions to allow access to the log groups and alarms. Click on the IAM role from the sharing overview and change the policy as below:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "ec2:DescribeTags", "cloudwatch:GetMetricData" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "cloudwatch:GetInsightRuleReport", "cloudwatch:DescribeAlarms", "cloudwatch:GetDashboard" ], "Resource": [ "arn:aws:cloudwatch::410123456772:dashboard/GetShiftingDashboard" ] }, { "Effect": "Allow", "Action": [ "logs:FilterLogEvents", "logs:StartQuery", "logs:StopQuery", "logs:GetLogRecord", "logs:DescribeLogGroups" ], "Resource": [ "arn:aws:logs:eu-west-1:410123456772:log-group:apache/access:*", "arn:aws:logs:eu-west-1:410123456772:log-group:apache/error:*", "arn:aws:logs:eu-west-1:410123456772:log-group:dpkg-logs:*", "arn:aws:logs:eu-west-1:410123456772:log-group:systemd-journal:*" ] }, { "Effect": "Allow", "Action": "cloudwatch:DescribeAlarms", "Resource": "*" } ] }
Set Alarm for Full Root Disk
We want to be notified when the root disk is almost full, so we will create an alarm for that. We will use the CloudWatch agent metrics to monitor the disk usage.
Set Up an SNS Topic
First we need to create an SNS topic to send the alarm notifications to. Follow these steps:
- In the AWS console → go to the SNS service → Topics
- Click on 'Create topic'
- Type: Standard
- Name: Monitoring
- Display name: Monitoring from cloudwatch
- Click on 'Create topic'
Now we need to subscribe to the topic, so we can receive the notifications:
- Click on the topic you just created
- Click on 'Create subscription'
- Protocol: Email
- Endpoint: sjoerd@getshifting.com
- Click on 'Create subscription'
- You will receive an email to confirm the subscription, click on the link in the email to confirm the subscription
If required, you can test the subscription by publishing a test message to the topic, through the 'Publish message' option in the topic details page.
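The same test can be done from the command line. A sketch; substitute the topic ARN shown on the topic details page:
# Publish a test message to the Monitoring topic; subscribers should receive an email
aws sns publish \
  --topic-arn arn:aws:sns:eu-west-1:410123456772:Monitoring \
  --subject "Test notification" \
  --message "Test message from the Monitoring topic"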
Create Alarm for Full Root Disk
We want to create an alarm that will notify us when the root disk is almost full, using the CloudWatch agent metrics to monitor the disk usage. The alarm will be triggered when the disk usage exceeds 75%.
Go to the CloudWatch console and follow these steps (a CLI sketch of the same alarm follows after the list):
- Go to the Alarms tab
- Click on 'Create alarm'
- All metrics → CWAgent → device, fstype, host, path
- Host: wiki
- Click on 'Select metric'
- In the next screen, set the following options:
- Statistic: Average
- Period: 5 minutes
- Conditions: Static
- Threshold type: Greater than
- Threshold value: 75
- Click on 'Next'
- In the next screen, set the following options:
- Notification: In alarm
- Select the SNS topic you created earlier: Monitoring
- Click on 'Next'
- In the next screen, set the following options:
- Name: Wiki - Full Root Disk
- Description: Alarm for full root disk on wiki.getshifting.com
- Click on 'Create alarm'
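For reference, the same alarm can also be created with the CLI. A sketch mirroring the console steps above; the dimensions are assumed to match the disk metric from the dashboard section, and the SNS topic ARN should be replaced with your own:
# Create an alarm on the CloudWatch agent's disk_used_percent metric
aws cloudwatch put-metric-alarm \
  --alarm-name "Wiki - Full Root Disk" \
  --alarm-description "Alarm for full root disk on wiki.getshifting.com" \
  --namespace CWAgent \
  --metric-name disk_used_percent \
  --dimensions Name=path,Value=/ Name=host,Value=wiki Name=device,Value=nvme0n1p1 Name=fstype,Value=ext4 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:eu-west-1:410123456772:Monitoring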
Add the alarm to the dashboard
- Go to the Cloudwatch console and select the dashboard you want to add the alarm to
- Click on 'Add widget'
- Data type: Alarms
- Widget type: Alarm status
- Click Next
- Select the alarm you just created: Wiki - Full Root Disk
- Click on 'Create widget' → Click on 'Add to dashboard'
- Click on 'Save dashboard'
Useful Links
- CloudWatch for Lightsail: