A few months ago, Amazon released an official Ruby gem for use with Amazon Web Services, and I’ve slowly been moving my projects over to it as the syntax is extremely friendly and powerful.
Today, I hit an unexpected gotcha. When enumerating objects stored in a directory structure using the standard bucket.objects.each ... syntax, both leaf keys and non-leaf keys are returned. This means that if you have a bucket with a single object with a key of directory/object, your block will be called twice: once with an object with key directory/ and once with an object with key directory/object. Unfortunately, the gem doesn’t include a leaf? method for S3Objects, but there are two easy ways to address this.
The Tree classes allow for navigating a bucket as a tree and selecting only the leaf objects. If you go this route, however, you’ll have to walk the tree yourself, enumerating the leaves at each branch, because children only returns the immediate descendants of a given node.
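To make the tree walk concrete, here is a minimal sketch of the recursion. It assumes each node responds to leaf? and that branch nodes respond to children, which is how I understand the gem's tree nodes to behave. The Leaf and Branch structs and the collect_leaves helper are hypothetical stand-ins so the example runs without an AWS account.

```ruby
# Hypothetical stand-ins for the gem's tree nodes, so this runs without AWS.
Leaf = Struct.new(:key) do
  def leaf?
    true
  end
end

Branch = Struct.new(:prefix, :children) do
  def leaf?
    false
  end
end

# Depth-first walk: children only yields immediate descendants,
# so we recurse into every branch and collect the leaves we meet.
def collect_leaves(node, found = [])
  node.children.each do |child|
    if child.leaf?
      found << child
    else
      collect_leaves(child, found)
    end
  end
  found
end

root = Branch.new('', [
  Branch.new('directory/', [Leaf.new('directory/object')]),
  Leaf.new('top-level-object')
])

tree_leaf_keys = collect_leaves(root).map(&:key)
puts tree_leaf_keys.inspect
```

With a real bucket, the same helper would presumably be handed the bucket's tree in place of root, but I have only checked the logic against the stand-ins above.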
Alternatively, in the block enumerating the objects, you can check for a trailing slash on the object key, and when present, skip the object:
    bucket.objects.each do |obj|
      next if obj.key.end_with?('/')
      ...
    end
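The trailing-slash check can be tried without touching S3. This is only a sketch: FakeObject is a hypothetical stand-in for an S3Object that exposes nothing but key, and leaf_objects is a helper name I made up for illustration.

```ruby
# Hypothetical stand-in for an S3 object; only the key matters here.
FakeObject = Struct.new(:key)

# Keep only objects whose key does not end in a slash,
# i.e. drop the "directory" placeholders.
def leaf_objects(objects)
  objects.reject { |obj| obj.key.end_with?('/') }
end

objects = [
  FakeObject.new('directory/'),
  FakeObject.new('directory/object')
]

leaf_keys = leaf_objects(objects).map(&:key)
puts leaf_keys.inspect
```

Because the helper only relies on each, reject-style enumeration and key, it should work on anything enumerable whose items respond to key, though it assumes that no real object of yours has a key ending in a slash.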
I’m curious whether anyone has a scenario where enumerating objects should return the branch nodes interspersed with the leaf nodes.