Analyzing Cloudflare Logs with AWS Athena
As with many Cloudflare features, you can enable the Logpush service with the click of a button. Logpush sends your HTTP request logs to your cloud storage provider every 5 minutes.
If you are using AWS S3 for your storage, you can then utilize Athena to analyze your logs.
Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
So with a little setup and some simple SQL, you can analyze your Cloudflare logs.
To begin querying, however, you need to create an external table in Athena that matches the format of your Cloudflare logs, which are JSON with a newline separating each record.
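To make that record format concrete, here is a minimal Python sketch of reading newline-delimited JSON. The `iter_records` helper and the sample values are hypothetical and made up for illustration; only the field names come from the queries later in this post.

```python
import json


def iter_records(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


# Illustrative sample only; real Logpush records carry many more fields.
sample = (
    '{"ClientRequestProtocol": "HTTP/2", "ClientRequestBytes": 1200}\n'
    '{"ClientRequestProtocol": "HTTP/1.1", "ClientRequestBytes": 800}\n'
)

records = list(iter_records(sample.splitlines()))
print(len(records), records[0]["ClientRequestProtocol"])
```

Each line parses on its own, which is why a JSON SerDe can map one line to one table row.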
Luckily, this is pretty easy to set up in Athena. Here is the DDL for all of the fields currently included in Cloudflare Logpush.
Note: You can customize the fields that Logpush includes, so if you have done so, your list of fields may not match the one below exactly.
CREATE EXTERNAL TABLE cloudflare_logs (
  -- ... the remaining Logpush fields go here, one per line ...
  FirewallMatchesActions ARRAY<string>,
  FirewallMatchesSources ARRAY<string>,
  FirewallMatchesRuleIDs ARRAY<string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://my-cloudflare-logs/'
Of course, change s3://my-cloudflare-logs/ to the bucket that you used when setting up Logpush.
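If you have customized your Logpush fields, hand-editing the DDL gets tedious. One option, sketched here in Python, is to generate it from a list of column names and types. `build_ddl` is a hypothetical helper written for this post, not part of any AWS or Cloudflare tooling, and the column list only includes the three fields shown in the DDL above.

```python
def build_ddl(table, columns, location):
    """Render a CREATE EXTERNAL TABLE statement for JSON logs in S3.

    columns is a list of (name, athena_type) pairs, in the order you want them.
    """
    cols = ",\n".join(f"  {name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"{cols}\n"
        ")\n"
        "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'\n"
        f"LOCATION '{location}'"
    )


# Columns taken from the DDL above; extend this with the fields your job pushes.
columns = [
    ("FirewallMatchesActions", "ARRAY<string>"),
    ("FirewallMatchesSources", "ARRAY<string>"),
    ("FirewallMatchesRuleIDs", "ARRAY<string>"),
]

ddl = build_ddl("cloudflare_logs", columns, "s3://my-cloudflare-logs/")
print(ddl)
```

The helper only formats text, so you would still paste the result into the Athena console (or submit it however you normally run queries).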
And now that we have a table created in Athena, we can analyze our logs in myriad ways.
How about checking to see how many requests you’ve received by the request protocol?
SELECT count(*) AS requests,
       c.clientrequestprotocol
FROM "cloudflare_logs"."cloudflare_logs" c
GROUP BY c.clientrequestprotocol
ORDER BY count(*) DESC
LIMIT 10;

    requests  clientrequestprotocol
1   15063737  HTTP/2
2   6842951   HTTP/1.1
3   4342      HTTP/1.0
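If you want to check the shape of that aggregate before running it against real data, one option is to try it locally on a handful of rows. This sketch uses Python's built-in sqlite3 module, which is not Athena and does not share its full SQL dialect; it only works here because this particular query is plain standard SQL. The sample rows are invented.

```python
import sqlite3

# Hypothetical sample values standing in for parsed log records.
sample_protocols = ["HTTP/2"] * 3 + ["HTTP/1.1"] * 2 + ["HTTP/1.0"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cloudflare_logs (clientrequestprotocol TEXT)")
conn.executemany(
    "INSERT INTO cloudflare_logs (clientrequestprotocol) VALUES (?)",
    [(protocol,) for protocol in sample_protocols],
)

# Same grouping and ordering as the Athena query, minus the database prefix.
rows = conn.execute(
    """
    SELECT count(*) AS requests, clientrequestprotocol
    FROM cloudflare_logs
    GROUP BY clientrequestprotocol
    ORDER BY count(*) DESC
    LIMIT 10
    """
).fetchall()
conn.close()

for requests, protocol in rows:
    print(requests, protocol)
```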
Or maybe, for reasons unknown, you want to see the average client request size in bytes for today, grouped by the Cloudflare edge colo ID.
SELECT edgecoloid,
       avg(clientrequestbytes)
FROM "cloudflare_logs"."cloudflare_logs"
-- Assuming you are using the default date format with Logpush
WHERE date_trunc('day', from_iso8601_timestamp(edgestarttimestamp)) = current_date
GROUP BY edgecoloid
ORDER BY avg(clientrequestbytes) ASC;
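To spell out what that filter and grouping do, here is a rough Python equivalent: parse each timestamp, keep the records that fall on one calendar day, then average the request bytes per colo. The function names and sample records are hypothetical, and the sketch assumes timestamps that carry a UTC offset or a trailing "Z", without fractional seconds.

```python
from collections import defaultdict
from datetime import date, datetime, timezone


def utc_day(timestamp):
    """Parse an ISO 8601 timestamp and return its UTC calendar date."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit offset first.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).date()


def avg_bytes_by_colo(records, day):
    """Average ClientRequestBytes per EdgeColoID for records on the given day."""
    totals = defaultdict(lambda: [0, 0])  # colo id -> [byte sum, request count]
    for rec in records:
        if utc_day(rec["EdgeStartTimestamp"]) == day:
            bucket = totals[rec["EdgeColoID"]]
            bucket[0] += rec["ClientRequestBytes"]
            bucket[1] += 1
    averages = {colo: total / count for colo, (total, count) in totals.items()}
    # Mirror ORDER BY avg(clientrequestbytes) ASC.
    return sorted(averages.items(), key=lambda item: item[1])


# Invented records for illustration only.
records = [
    {"EdgeStartTimestamp": "2019-08-03T10:15:00Z", "EdgeColoID": 14, "ClientRequestBytes": 1000},
    {"EdgeStartTimestamp": "2019-08-03T23:59:59Z", "EdgeColoID": 14, "ClientRequestBytes": 2000},
    {"EdgeStartTimestamp": "2019-08-03T11:00:00Z", "EdgeColoID": 30, "ClientRequestBytes": 500},
    {"EdgeStartTimestamp": "2019-08-02T11:00:00Z", "EdgeColoID": 30, "ClientRequestBytes": 9000},
]

result = avg_bytes_by_colo(records, date(2019, 8, 3))
print(result)
```

The day is passed in explicitly rather than read from the clock, which keeps the example repeatable; in the Athena query that role is played by current_date.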
As you can imagine, the ways that you can slice and dice your Cloudflare HTTP logs are nearly limitless. Enjoy diving deep on your Cloudflare logs!
Originally published at https://nicholasduffy.com on August 3, 2019.