Loki and Grafana for Logs
I wanted to aggregate logs from a few different servers I'm responsible for: some in the cloud, some on my own network. Since it's all personal stuff, I didn't want to pay for Splunk or similar to do the ingestion and log aggregation, so I opted for Grafana Loki as a self-hosted solution.
Components
Grafana Loki is actually a few different components:
- Loki - responsible for ingesting, managing and querying logs
- Grafana - querying/dashboarding for logs (by calling Loki with queries and processing results)
- Promtail - gathers logs from various sources and sends them to Loki
Loki doesn't care what shape the logs are in; it just stores and queries them as strings. It's also architected so that it supports different storage backends, and the example Docker Compose scripts use a mock Amazon S3 blob store.
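To make the "logs are just strings" point concrete, here is a minimal Python sketch of the JSON body that a client such as Promtail could send to Loki's HTTP push endpoint (/loki/api/v1/push). The label names, host name and sample log line are made up for illustration, and the function name build_push_payload is hypothetical; this only builds the payload and does not send it.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serialisable body for POST /loki/api/v1/push.

    labels: dict of stream labels (hypothetical names in the example below).
    lines:  raw log lines; Loki stores them as opaque strings.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, raw log line].
    # Timestamps are offset by 1ns per line so entries stay ordered.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


payload = build_push_payload(
    {"job": "nginx", "host": "example-host"},  # assumed labels, not from the post
    ['203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.4.0"'],
    now_ns=1_700_000_000_000_000_000,
)
print(json.dumps(payload, indent=2))
```

Note that nothing in the payload describes the structure of the log line itself; only the stream labels are indexed.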
Log Patterns
Log patterns are only relevant when querying logs. Loki stores log messages as plain strings, so it's at query time, in the LogQL query you write in Grafana, that you bring meaning to them by using a pattern to extract labels. The patterns below are the ones I use to query various log sources.
Standard Nginx Log Pattern
This is the standard, out-of-the-box Nginx log pattern.
pattern `<ip> - - <_> "<method> <uri> <_>" <status> <size> "<_>" "<agent>"`
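To get a feel for what a pattern like this extracts, here is a rough Python approximation that turns a pattern expression into a regular expression and applies it to a sample line. This is only a sketch of the idea, not how Loki implements the pattern parser, and its edge-case behaviour may differ; the sample log line and the helper names pattern_to_regex and extract_labels are made up for illustration.

```python
import re

# Matches <name> captures and the <_> placeholder in a pattern expression.
TOKEN = re.compile(r"<(_|[A-Za-z_][A-Za-z0-9_]*)>")


def pattern_to_regex(pattern):
    """Approximate a pattern expression with a regex (illustrative only)."""
    parts = []
    pos = 0
    for m in TOKEN.finditer(pattern):
        # Literal text between captures must match exactly.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        # <_> skips text without creating a label; <name> captures a label.
        parts.append(".*?" if name == "_" else f"(?P<{name}>.*?)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts) + "$")


def extract_labels(pattern, line):
    """Return the extracted labels, or an empty dict if the line does not match."""
    match = pattern_to_regex(pattern).match(line)
    return match.groupdict() if match else {}


NGINX_PATTERN = '<ip> - - <_> "<method> <uri> <_>" <status> <size> "<_>" "<agent>"'
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.4.0"'
)

labels = extract_labels(NGINX_PATTERN, SAMPLE_LINE)
print(labels)
```

All extracted values are strings, which is why numeric comparisons and unwrap expressions in queries have to convert them at query time.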
Nginx Proxy Manager Log Format
The nginx.conf for the Nginx Proxy Manager defines a "proxy" format.
pattern `[<_>] <_> <upstream_status> <status> - <request_method> <scheme> <host_header> "<request_uri>" [Client <remote_addr>] [Length <body_bytes_sent>] [Gzip <_>] [Sent-to <_>] "<http_user_agent>" "<_>"`
Amazon CloudFront Logs
See Standard log file format - Amazon CloudFront.
pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
Example Queries
Here are a few example queries, using the log patterns above:
CloudFront: URIs and Status Codes Over Time
sum by (uri) (
  count_over_time(
    {__aws_log_type="s3_cloudfront"} | pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
    | label_format uri=`{{ .cs_method }} {{ .cs_protocol }}://{{ .host_header }}{{ .cs_uri_stem }} ({{ .sc_status }})`
    [$__interval]
  )
)
CloudFront: Airport Codes Over Time
sum by (airport) (
  count_over_time(
    {__aws_log_type="s3_cloudfront"} | pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
    | label_format airport=`{{ regexReplaceAll "([A-Z]+)([0-9]+)-(.*)" .edge_location "$1" }}`
    [$__interval]
  )
)
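The regexReplaceAll call in this query keeps only the leading letters of the edge location. Here is the same regular expression in Python as a quick way to check what it produces; the sample edge location values are illustrative, and the function name airport_code is hypothetical.

```python
import re

# Same expression as in the label_format stage: letters, digits, a hyphen, then the rest.
EDGE_LOCATION = re.compile(r"([A-Z]+)([0-9]+)-(.*)")


def airport_code(edge_location):
    """Replace a matching edge location with its first capture group (the letters)."""
    return EDGE_LOCATION.sub(r"\1", edge_location)


for sample in ["LHR62-C2", "SEA19-C1", "LAX1"]:
    print(sample, "->", airport_code(sample))
```

One thing this shows is that a value without a hyphen does not match the expression and is passed through unchanged, so it would be grouped under its full value rather than the three-letter code.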
CloudFront: Average Time to First Byte Over Time
avg by (host_header) (
  avg_over_time(
    {__aws_log_type="s3_cloudfront"}
    | pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
    | unwrap time_to_first_byte
    [$__auto]
  )
)
CloudFront: Bytes Transferred By Edge Result Type Over Time
sum by (edge_detailed_result_type) (
  sum_over_time(
    {__aws_log_type="s3_cloudfront"} | pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
    | label_format total_bytes=`{{ add .sc_bytes .cs_bytes }}`
    | unwrap total_bytes
    [$__interval]
  )
)
CloudFront: Specific URL Parts Over Time
sum by (post) (
  count_over_time(
    {__aws_log_type="s3_cloudfront"} | pattern `<date> <time> <edge_location> <sc_bytes> <c_ip> <cs_method> <cs_host> <cs_uri_stem> <sc_status> <cs_referer> <cs_user_agent> <cs_uri_query> <cs_cookie> <edge_result_type> <edge_request_id> <host_header> <cs_protocol> <cs_bytes> <time_taken> <forwarded_for> <ssl_protocol> <ssl_cipher> <edge_response_result_type> <cs_protocol_version> <fle_status> <fle_encrypted_fields> <c_port> <time_to_first_byte> <edge_detailed_result_type> <sc_content_type> <sc_content_len> <sc_range_start> <sc_range_end>`
    | host_header = "alanedwardes.com"
    | cs_uri_stem =~ "/blog/posts/(.*)/"
    | label_format post=`{{ .cs_method }} {{ regexReplaceAll "/blog/posts/(.*)/" .cs_uri_stem "${1}" }} ({{ .sc_status }})`
    | sc_status >= 200 and sc_status < 400
    [$__interval]
  )
)
Promtail on Lambda
To allow CloudFront logs to be piped from Amazon S3 into Loki, there is a handy CloudFormation template which uses a Docker container hosted on AWS Lambda. There are a few steps to get this working, however, which are documented below:
Mirroring the Docker Image on a Private Registry
Unfortunately, AWS Lambda functions can only use Docker image URIs hosted on an ECR private registry in the same region as the function, meaning we must mirror the official image. First, pull the latest lambda-promtail image:
docker pull public.ecr.aws/grafana/lambda-promtail:main
Ensure you have created a private ECR repo and authenticated your machine. Then tag the lambda-promtail Docker image with your private ECR repo URI:
docker tag public.ecr.aws/grafana/lambda-promtail:main <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<ecr-repo-name>:<your-tag>
Push the image to your private registry:
docker push <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<ecr-repo-name>:<your-tag>
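If you mirror the image regularly, the three commands above can be generated from one place. Below is a minimal Python sketch that only builds the command lines; the account ID, region, repo name and tag in the example are placeholders, and the function name mirror_commands is hypothetical.

```python
import shlex

SOURCE_IMAGE = "public.ecr.aws/grafana/lambda-promtail:main"


def mirror_commands(account_id, region, repo, tag, source=SOURCE_IMAGE):
    """Return the docker pull/tag/push commands as argument lists."""
    target = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"
    return [
        ["docker", "pull", source],
        ["docker", "tag", source, target],
        ["docker", "push", target],
    ]


# Placeholder values for illustration only.
commands = mirror_commands("123456789012", "eu-west-1", "lambda-promtail", "main")
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run them, each list could be passed to subprocess.run(cmd, check=True), assuming Docker is installed and already logged in to the private registry.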
Enable S3 Bucket Events to EventBridge
Under the "Properties" tab for the S3 bucket in the AWS console, there is an option to send events to Amazon EventBridge. This needs to be enabled.
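The same setting can probably be applied from a script instead of the console. Here is a minimal sketch using boto3's put_bucket_notification_configuration call, assuming boto3 is installed and AWS credentials are configured; the bucket name is a placeholder and the helper names are hypothetical. Be aware that this call replaces the bucket's whole notification configuration, so any existing notifications would need to be included in the request as well.

```python
def eventbridge_notification_args(bucket):
    """Build the arguments that turn on EventBridge delivery for a bucket."""
    return {
        "Bucket": bucket,
        # An empty EventBridgeConfiguration block enables delivery to EventBridge.
        "NotificationConfiguration": {"EventBridgeConfiguration": {}},
    }


def enable_eventbridge(bucket):
    """Apply the configuration; needs boto3 and AWS credentials (not run here)."""
    import boto3  # third-party dependency, imported lazily

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(**eventbridge_notification_args(bucket))


# Placeholder bucket name for illustration only.
args = eventbridge_notification_args("my-cloudfront-logs-bucket")
print(args)
```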