Wednesday, Nov 30, 2011

Don't log to the same bucket in Amazon S3

I’ve been investigating using Amazon’s S3 service to host and serve media files for my existing and potential future podcasts. To that end, I’ve been looking at S3’s logging capabilities to make sure they capture the data I want: date and time of request, IP address, HTTP response, bytes sent, and so on.

For simplicity’s sake, I first tried putting the logs directly in the bucket root. Well, as it turns out, S3 loves to spit out hundreds of log files with only a few lines in them (in my testing at least). This is less than ideal, as I’d love to have a single log file like Apache produces, but it’s something that I can conceivably work with. To avoid cluttering up the root of the bucket, I created a folder within the bucket and sent the logs there. I added a few sample files, issued some sample requests, and went to bed shortly afterward.

I woke up today to find, once again, hundreds of log files. Strangely, the times on the files indicated that they had been written on a regular basis throughout the night. I downloaded them and opened them up to find lines that looked like the following (I’ve omitted some fields for brevity):

[30/Nov/2011:14:26:36 +0000] 10.118.17.24 71A3D26E8993B42A REST.PUT.OBJECT log/2011-11-30-14-26-36-9FFC7B797E7F436B "PUT /my-bucket/log/2011-11-30-14-26-36-9FFC7B797E7F436B HTTP/1.1" 200 - - 386 25 9 "-" "Jakarta Commons-HttpClient/3.0" -

Basically, it’s logging that it wrote a log file. Writing a log file into the bucket is itself a request against that bucket, so it gets logged too, which produces another log file, and another, and so on. Fortunately S3’s logging isn’t real-time, so the loop is slow: it seems to write only a few hundred files per day. But these logs of logs, at least to me, are completely useless.
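If you're already stuck with a pile of these files, the log-of-log entries are easy to spot: the operation is REST.PUT.OBJECT and the key sits under the log folder. Here's a rough Python sketch for filtering them out. It assumes the abbreviated line layout shown above (I omitted some fields, so the regex will not match the full S3 log format as-is), and the names `is_log_delivery` and `real_requests` are just ones I made up for illustration.

```python
import re

# Matches the abbreviated layout of the sample line above (some fields
# omitted), NOT the complete S3 access log format.
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] (?P<ip>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+) "(?P<request>[^"]*)" (?P<status>\S+)'
)


def is_log_delivery(line, log_prefix="log/"):
    """Return True if the line records a log file being written under log_prefix."""
    match = LOG_LINE.search(line)
    if match is None:
        return False  # unrecognized line: don't treat it as a log delivery
    return (match.group("operation") == "REST.PUT.OBJECT"
            and match.group("key").startswith(log_prefix))


def real_requests(lines, log_prefix="log/"):
    """Keep only the lines that are not records of log files being written."""
    return [line for line in lines if not is_log_delivery(line, log_prefix)]
```

Feeding every downloaded line through `real_requests` should leave only the requests for actual content. Note that this only checks the operation and the key prefix, so a genuine upload into the log folder would be filtered out as well.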

Anyway, the solution is to send the logs to a separate bucket, and to leave logging turned off on that second bucket.
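For what it's worth, here's roughly how that setup could be scripted in Python. This is a sketch, not what I actually ran: it assumes `client` is a boto3 S3 client (something like `boto3.client("s3")`), the bucket names are placeholders, and it doesn't cover granting S3 permission to deliver logs into the target bucket, which you'd need to sort out separately.

```python
def logging_status(source_bucket, target_bucket, prefix="log/"):
    """Build the logging configuration, refusing to log a bucket into itself."""
    if source_bucket == target_bucket:
        raise ValueError("log to a separate bucket, or S3 will log its own logs")
    return {
        "LoggingEnabled": {
            "TargetBucket": target_bucket,
            "TargetPrefix": prefix,
        }
    }


def enable_logging(client, source_bucket, target_bucket, prefix="log/"):
    """Send access logs for source_bucket to target_bucket.

    `client` is assumed to be a boto3 S3 client.
    """
    client.put_bucket_logging(
        Bucket=source_bucket,
        BucketLoggingStatus=logging_status(source_bucket, target_bucket, prefix),
    )


def disable_logging(client, bucket):
    """Turn logging off for a bucket; an empty status is how S3 expresses 'off'."""
    client.put_bucket_logging(Bucket=bucket, BucketLoggingStatus={})
```

The idea would be to call `enable_logging(client, "my-bucket", "my-bucket-logs")` and then `disable_logging(client, "my-bucket-logs")`, so that the log bucket never records its own writes.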
