Analyzing Cloudflare Logs with AWS Athena

As with many Cloudflare features, you can enable the Logpush service with the click of a button. Logpush sends your HTTP request logs to your cloud storage provider every 5 minutes.

If you are using AWS S3 for your storage, you can then use Athena to analyze your logs.

Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.

So with a little setup and some simple SQL, you can analyze your Cloudflare logs.

Table DDL

Before you can begin querying, however, you need to create an external table in Athena that matches the format of your Cloudflare logs, which are JSON with a newline delimiting each record.

Luckily, this is pretty easy to set up in Athena: the DDL just needs a column for each of the fields currently included in Cloudflare Logpush.

Note: You can customize the fields that Logpush includes, so if you have done so, your list of fields may not match the one here exactly.
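As a minimal sketch, a table covering a handful of commonly used fields might look like the following. The table name cloudflare_logs, the choice of columns, and the column types are assumptions for illustration; in particular, the bigint timestamp columns assume your Logpush job uses the unixnano timestamp format (declare them as string if you chose rfc3339). Athena lowercases column names, and the OpenX JSON SerDe matches JSON keys case-insensitively by default, so EdgeColoID in the logs maps to edgecoloid in the table.

```sql
-- Sketch only: a subset of Cloudflare HTTP request fields.
-- Assumes timestamps are pushed as Unix nanoseconds (bigint).
-- Add or remove columns to match the fields your Logpush job sends.
CREATE EXTERNAL TABLE IF NOT EXISTS cloudflare_logs (
  cachecachestatus       string,
  clientip               string,
  clientrequestbytes     int,
  clientrequesthost      string,
  clientrequestmethod    string,
  clientrequestprotocol  string,
  clientrequesturi       string,
  clientrequestuseragent string,
  edgecoloid             int,
  edgeendtimestamp       bigint,
  edgeresponsebytes      bigint,
  edgeresponsestatus     int,
  edgestarttimestamp     bigint,
  rayid                  string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://my-cloudflare-logs/';
```

Because this is an external table, creating or dropping it only changes metadata; the log files in S3 are left untouched.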

Of course, change the table's location, s3://my-cloudflare-logs/, to the bucket you used when setting up Logpush.

Querying

And now that we have a table created in Athena, we can analyze our logs in a myriad of ways.

How about checking to see how many requests you’ve received by the request protocol?
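One way to write that, assuming the table is named cloudflare_logs and includes the ClientRequestProtocol field as a clientrequestprotocol column (both names are assumptions about your setup):

```sql
-- Count requests per client protocol (e.g. HTTP/1.1, HTTP/2),
-- most common first. Table and column names are assumed.
SELECT clientrequestprotocol,
       COUNT(*) AS requests
FROM cloudflare_logs
GROUP BY clientrequestprotocol
ORDER BY requests DESC;
```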

Or maybe, for reasons unknown, you want to see the average client request size in bytes for today, grouped by the Cloudflare edge colo ID.

As you can imagine, the ways that you can slice and dice your Cloudflare HTTP logs are nearly limitless. Enjoy diving deep on your Cloudflare logs!
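A sketch of that query follows. It assumes a cloudflare_logs table with clientrequestbytes, edgecoloid, and edgestarttimestamp columns, and that edgestarttimestamp holds Unix nanoseconds as a bigint; if your timestamps are in another format, the WHERE clause will need adjusting.

```sql
-- Average request size today, per Cloudflare edge colo.
-- Assumes edgestarttimestamp is Unix time in nanoseconds,
-- so dividing by 1e9 yields the seconds from_unixtime expects.
SELECT edgecoloid,
       AVG(clientrequestbytes) AS avg_request_bytes
FROM cloudflare_logs
WHERE date(from_unixtime(edgestarttimestamp / 1e9)) = current_date
GROUP BY edgecoloid
ORDER BY avg_request_bytes DESC;
```

Note that without partitions, a filter like this still scans every log file in the bucket; it limits the rows returned, not the data read.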

Originally published at https://nicholasduffy.com on August 3, 2019.

I write about Python, Go, Node.js, cloud infrastructure, and data. https://pipefail.dev