Mike Saffitz

Production Ready MongoDB

Over the past two and a half years, I’ve gone from dabbling with MongoDB in personal projects to using it as the foundation for paid consulting work and to building, scaling, and maintaining MongoDB clusters.

The vast majority of this journey has been with Ruby on Rails as the application layer. Increasingly, however, I’ve been using Node.js as well, along with a healthy assortment of clients written in Java, Python, Ruby, and JavaScript.

In 2011, I was fortunate to be able to attend the 10gen MongoDB Conference here in Seattle, and this year I’m really honored to have been asked to give a lightning talk on my experience running production MongoDB.

As the name implies, the lightning talks are short-format, and in the spirit of the infamous Joel Test, I’ve decided to use the time to present the 6 questions, plus 3 bonus EC2 questions, that I feel best rate how production ready a MongoDB instance is.

Come back later today for the full talk notes and slides…

Optimizing Software

Recently I was asked to help evaluate whether a piece of software developed by an outside development shop had been “optimized or not”.

Enumerating S3 Directory Structures With AWS SDK for Ruby

A few months ago, Amazon released an official Ruby gem for use with Amazon Web Services, and I’ve slowly been moving my projects over to it as the syntax is extremely friendly and powerful.

Today, I hit an unexpected gotcha. When enumerating objects stored in a directory structure using the standard bucket.objects.each ... syntax, both leaf keys and non-leaf keys are returned. This means that if you have a bucket with a single object with a key of directory/object, your block will be called twice: once with an object with key directory/ and once with an object with key directory/object.

Unfortunately, the gem doesn’t include a leaf? method for S3Objects, but there are two easy ways to address this.

First, the Tree classes allow for navigating a bucket as a tree and selecting only the leaf objects. If you go this route, however, you’ll have to walk the tree yourself, enumerating the leaves at each branch, because children only returns the immediate descendants of a given node.
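A minimal sketch of that tree walk might look like the following. The each_leaf helper is my own name, not part of the gem; it only assumes that every node responds to children and that every child responds to leaf?, which I believe matches the gem's tree and node classes in version 1, but check this against the version you have installed.

```ruby
# Hypothetical helper (not part of the gem): recursively yield every
# leaf node found beneath a tree or branch node.
#
# Assumes each node responds to `children` and each child responds to
# `leaf?`; branch children are descended into, leaf children are yielded.
def each_leaf(node, &block)
  node.children.each do |child|
    if child.leaf?
      block.call(child)
    else
      each_leaf(child, &block)
    end
  end
end
```

Against a real bucket, usage would be something like each_leaf(bucket.as_tree) { |leaf| puts leaf.key }, assuming as_tree and key behave as I remember them. Recursion keeps the helper short; for very deep hierarchies an explicit stack would avoid deep call chains.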

Alternatively, in the block enumerating the objects, you can check for a trailing slash on the object key, and when present, skip the object:

bucket.objects.each do |obj|
  next if obj.key.end_with? '/'
  ...
end

I’m curious if anyone has a scenario where enumerating objects should return the branch nodes interspersed with the leaf nodes?