Wednesday, November 4, 2009

Web log analysis and statistics for Amazon S3 with S3STAT

I've been using Amazon Web Services for several months now. As with anything else I run, I need to know what's going on with my services: what's being downloaded, how often, from where, and so on. In the middle of last month I finally found a service that fills the gap and gives a good view into what's going on with my S3 buckets. S3STAT is a service that takes the detailed server access logs provided by Amazon's CloudFront and Simple Storage Service (S3) and translates them into human-readable statistics, reports and graphs.

Every night they download my access logs, translate them, sort them, and run them through Webalizer. Then they put the processed log files right back into my Amazon S3 bucket for me to view.
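To give a feel for what the "translate" step involves, here is a minimal sketch that rewrites one S3 server access log line into Combined Log Format, the kind of input Webalizer reads. It is not S3STAT's actual code. The function name, the field names in the pattern, and the sample line are made up for illustration, and lines that don't fit the pattern (for example, records with an unquoted request field) are simply skipped.

```python
import re

# Leading fields of an S3 server access log record, in order. Trailing
# fields (version ID and anything added later) are ignored by match().
S3_LOG_PATTERN = re.compile(
    r'(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'"(?P<request>[^"]*)" (?P<status>\S+) (?P<error>\S+) (?P<bytes_sent>\S+) '
    r'(?P<object_size>\S+) (?P<total_time>\S+) (?P<turnaround>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def s3_to_combined(line):
    """Return the line in Combined Log Format, or None if it doesn't parse."""
    match = S3_LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Combined format: host ident user [time] "request" status bytes "referrer" "agent"
    # The S3 requester ID is placed in the "user" slot; ident is always "-".
    return ('{ip} - {requester} [{time}] "{request}" {status} {bytes_sent} '
            '"{referrer}" "{agent}"').format(**fields)


# Made-up sample record (shortened IDs) for illustration only.
sample = ('ownerid mybucket [04/Nov/2009:10:15:30 +0000] 192.0.2.3 - REQ123 '
          'REST.GET.OBJECT photos/cat.jpg "GET /mybucket/photos/cat.jpg HTTP/1.1" '
          '200 - 2048 2048 70 10 "-" "Mozilla/5.0" -')
print(s3_to_combined(sample))
```

The two numbers just before the referrer (70 and 10 in the sample) are the total time and turn-around time in milliseconds, which is where latency statistics would come from.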

S3STAT provides the following benefits:
  • Get Access to your Cloudfront and S3 Web Logs in a format that you can use. S3STAT will set it all up for you automatically.
  • Track your Cloudfront and S3 Usage Statistics through graphical reports generated on a nightly basis.
  • Identify performance bottlenecks caused by slow loading content. S3STAT keeps statistics on S3 processing time and system latency.
  • Consolidate your web usage reports by downloading nightly log files in Common Logfile Format and Combined Logfile Format.
  • Industry Standard web statistics provided by Webalizer, the leading web log analysis and reporting package.
  • They do all this for only $5 a month!
S3STAT provides two ways to process logs.  The first is to give them direct access to my S3 account, but being the paranoid admin that I am, I didn't like this idea.  They even acknowledge this may be an issue: "Don't Trust Us?  If you really don't want to hand over your S3 credentials, it is still possible to use S3STAT in self-managed mode."  I opted for the self-managed mode, although it's a pain in the you-know-what to set up.

Enter CloudBerry Explorer for Amazon S3.  The good folks at CloudBerry Labs have integrated the S3STAT self-managed setup and configuration into CloudBerry Explorer.  In CloudBerry Explorer, just right-click the S3 bucket you want to use with S3STAT and select Properties (you can also get there by right-clicking the bucket and selecting Logging, then Log Settings).

On the CloudFront Logging tab choose Use S3Stat logging.  Click OK.
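If you'd rather script this than click through a GUI, turning on server access logging for a bucket can be sketched roughly as below. This is an assumption-heavy illustration, not a description of what CloudBerry Explorer does under the hood: it uses the boto3 library's `put_bucket_logging` call, the bucket names and prefix are placeholders, and `reader_canonical_id` stands in for whatever account S3STAT asks you to grant read access to (I don't know the exact permissions self-managed mode requires).

```python
def build_logging_status(target_bucket, target_prefix, reader_canonical_id=None):
    """Build the BucketLoggingStatus structure that PutBucketLogging expects."""
    enabled = {"TargetBucket": target_bucket, "TargetPrefix": target_prefix}
    if reader_canonical_id:
        # Hypothetical: let a third party read the delivered log files.
        enabled["TargetGrants"] = [{
            "Grantee": {"Type": "CanonicalUser", "ID": reader_canonical_id},
            "Permission": "READ",
        }]
    return {"LoggingEnabled": enabled}


def enable_logging(source_bucket, status):
    """Apply the logging configuration; needs boto3 and AWS credentials."""
    import boto3  # third-party, imported here so the builder works without it
    boto3.client("s3").put_bucket_logging(
        Bucket=source_bucket, BucketLoggingStatus=status
    )


# Placeholder names for illustration only.
status = build_logging_status("my-log-bucket", "logs/", "EXAMPLE-CANONICAL-ID")
print(status)
# enable_logging("my-content-bucket", status)  # uncomment to apply for real
```

Note that the target bucket also has to allow S3's log delivery to write into it, and buckets configured to disable ACLs will not accept the grant shown here.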

Next, log on to your S3STAT account (make sure you set it up to use self-managed mode).  From your main account page select Add an S3 bucket.  Enter your bucket name and click Verify.

Sit back, relax and wait a couple of days for the stats to accumulate and to be processed by S3STAT.  Once you have some stats you can access them easily through links (one for each bucket) on your S3STAT account page.

This will take you to your stats page, which is actually stored right in your analyzed bucket.

So far I've been fairly pleased with S3STAT, especially considering I haven't paid a dime during the 30-day free trial.  However, I have noticed one issue: on a few of the days I have little to no stats, even though I know I've had traffic.  I'm not sure whether this is a bug in S3STAT or something else.  I'm not a huge fan of the Webalizer interface, but I can deal with it.  Otherwise S3STAT has been great and has saved me a ton of time by not having to set up my own analytics for my S3 buckets.

One other small drawback - At the moment, there is not a way to configure Cloudfront distributions in self-managed mode.  According to S3STAT, "Cloudfront doesn't yet allow you to change the ACL for delivered logfiles, which means we can't read them unless we have your AWS credentials.  Never fear, though. We're working with the Cloudfront team to make this possible."

I definitely recommend giving it a try!


  1. Thanks for mentioning CloudBerry Explorer in your blog!

    Andy, CloudBerry lab

  2. Thanks for the post. I'll check out s3stat. Just starting looking for a stats solution to my amazon s3 and found your post. Aaron