Direct Uploads to AWS S3 with Rails, CarrierWaveDirect, jQuery

Posted on Mar 3, 2014 in Blog

As a web app evolves and grows, new bottlenecks tend to appear as users start to stress the system in new ways. As you optimize the request-response lifecycle by moving tasks to background processes, it may make sense to evaluate the server resources required for file uploading. For one of our apps at Suntoucher, we chose to upload files directly from our users’ browsers to Amazon S3 storage, freeing up server resources that had been tied up for the duration of each file upload. As a point of reference, Heroku recommends implementing direct uploads to S3 if your upload size is greater than 4 MB.

We learned from several good resources on this subject, starting with Ryan Bates’ Railscasts episode on Uploading to Amazon S3. It is an excellent place to start and covers using the CarrierWaveDirect gem along with a custom solution for multiple file uploads.

With the Railscast episode as a starting point, this post will cover some of the gotchas we discovered along the way while integrating CarrierWaveDirect with the jQuery File Upload Plugin to support uploading of multiple files at once.

There is a lot of documentation for the jQuery File Upload Plugin, but I found these pages the most helpful:

Breaking Down the Steps Involved

To keep track of the steps required, I created the diagram below, which outlines the order of steps from the user initiating the upload to the final submission of the form containing the new images.

[Diagram: Direct Uploads with Amazon S3]

Changing the CarrierWave Storage Directory

This is an important point if you are already storing images or files with the CarrierWave gem. The default CarrierWave approach is to include the model id in the path for each uploaded file. With the direct upload approach, we won’t know the model id until after the file has been uploaded to S3. For this reason we had to change the store_dir and the path where our images were stored. CarrierWaveDirect generates a GUID in the S3 form, which becomes part of the key for the file on S3. This GUID/filename string is appended to the store_dir to get the entire path to the image.

CarrierWave uploader default store_dir
CarrierWave uploader store_dir with model id removed for CarrierWaveDirect
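
The two versions might look roughly like the sketch below. To keep it runnable without Rails or CarrierWave, it uses plain Ruby stand-ins: the Photo struct and both uploader classes are hypothetical, and the directory layout is an assumption. In a real app, store_dir would be defined on your CarrierWave::Uploader::Base subclass, which already provides model and mounted_as.

```ruby
# Hypothetical stand-in for an ActiveRecord model with an id.
Photo = Struct.new(:id)

# Stand-in for a generated CarrierWave uploader. A real uploader subclasses
# CarrierWave::Uploader::Base, which supplies `model` and `mounted_as`.
class DefaultPhotoUploader
  attr_reader :model, :mounted_as

  def initialize(model, mounted_as)
    @model = model
    @mounted_as = mounted_as
  end

  # Default style: the directory ends in the model id.
  def store_dir
    "uploads/#{model.class.name.downcase}/#{mounted_as}/#{model.id}"
  end
end

# Direct-upload style: no model id, because the record does not exist yet
# when the browser sends the file to S3.
class DirectPhotoUploader < DefaultPhotoUploader
  def store_dir
    "uploads/#{model.class.name.downcase}/#{mounted_as}"
  end
end
```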

To construct the file path, CarrierWaveDirect appends the GUID and the filename to the store_dir, so GUID/filename.ext is stored in the uploader column of your model. For instance, suppose you have mounted your CarrierWave uploader in your Photo model as :photo, like this:
mount_uploader :photo, PhotoUploader
The photo column of your model in the database will contain something like this: fe5c7d48-1de6-47e7-8ea4-d9449010328f/myphoto.jpg

To make our existing image paths compatible with the new scheme, we updated the filenames for all existing images in the database, prefixing each value in the photo column with the model id so that it looks like this: 100/myphoto.jpg

As a result, a single SQL update made all of our photos accessible after adding CarrierWaveDirect, without needing to change any existing files on S3.
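
Why this works can be checked with a few lines of plain Ruby. This is only an illustration under assumed names: the STORE_DIR value and the three helper methods are hypothetical, and the SQL update itself (which depends on your database) is not shown.

```ruby
# Hypothetical store_dir after the model id has been removed.
STORE_DIR = "uploads/photo/photo"

# Path used before the change: store_dir/<id>/<filename>.
def legacy_path(id, filename)
  File.join(STORE_DIR, id.to_s, filename)
end

# New column value for an existing row: "<id>/<filename>".
def migrated_column_value(id, filename)
  "#{id}/#{filename}"
end

# Path after the change: store_dir joined with whatever the column holds,
# whether that is "<id>/<filename>" or "<GUID>/<filename>".
def current_path(column_value)
  File.join(STORE_DIR, column_value)
end
```

For an existing photo with id 100, current_path(migrated_column_value(100, "myphoto.jpg")) and legacy_path(100, "myphoto.jpg") give the same path, so the file already on S3 is still found.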

Building the S3 Form Data

Amazon S3 supports direct uploads through pre-authorized HTML POST forms. You can learn more about the form fields from these Amazon documents

CarrierWaveDirect generates the required policy and signature through methods on the uploader. In our case we created a Rails controller to respond to an AJAX request for the form data, indicated by step 1 of the main diagram.

In our controller we create a new Photo object, set the success_action_status (see S3 HTML Forms for more details), and render JSON containing everything we need for the form.
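
CarrierWaveDirect computes these values for you, so you would not normally write this yourself. To give a feel for what the JSON contains, here is a minimal sketch that builds the same kind of fields by hand with the Ruby standard library. It assumes the older HMAC-SHA1 style of signed S3 POST policy; newer AWS regions require Signature Version 4, which uses different field names. The method name s3_form_data, the bucket, the credentials, the ACL, and the size limit are all hypothetical.

```ruby
require "base64"
require "json"
require "openssl"
require "securerandom"
require "time"

# Builds the fields for a pre-authorized S3 POST form (legacy HMAC-SHA1 style).
def s3_form_data(bucket:, access_key_id:, secret_access_key:, store_dir:, now: Time.now.utc)
  # S3 itself replaces the literal ${filename} with the uploaded file's name.
  key = "#{store_dir}/#{SecureRandom.uuid}/${filename}"

  # The policy lists what the browser is allowed to send and until when.
  policy_document = {
    "expiration" => (now + 3600).iso8601,
    "conditions" => [
      { "bucket" => bucket },
      ["starts-with", "$key", "#{store_dir}/"],
      { "acl" => "public-read" },                    # assumed ACL
      { "success_action_status" => "201" },
      ["content-length-range", 1, 10 * 1024 * 1024]  # assumed 10 MB limit
    ]
  }

  # The signature is computed over the Base64-encoded policy.
  policy = Base64.strict_encode64(policy_document.to_json)
  signature = Base64.strict_encode64(
    OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha1"), secret_access_key, policy)
  )

  {
    "key" => key,
    "AWSAccessKeyId" => access_key_id,
    "acl" => "public-read",
    "policy" => policy,
    "signature" => signature,
    "success_action_status" => "201"
  }
end
```

Setting success_action_status to 201 asks S3 to answer with an XML document describing the stored object, rather than an empty response or a redirect.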

We initiate the request for form data by overriding the submit function in the jQuery File Upload Plugin. When the request succeeds, the formData property is set to the JSON returned.

Using the fileuploaddone callback, documented here, we have an event handler for the upload response. The XML response from S3 is parsed for the URL (step 4), which is added as the key param in a POST (step 5) to create a new photo in our Rails app. The key is stored in the photo object and the Rails response (step 6) contains the id of our new photo. The new photo id can be added to our form, so the photo can be saved with the associated report.

Background Image Processing

CarrierWaveDirect provides an example of how to implement background processing of your images after they are uploaded to S3. Here we will show exactly how we got it all working. For our photos controller we added a model method, save_and_process_photo, which saves a photo either from the controller or from a background worker, depending on whether the {:now => true} option is present.

We use the CarrierWave method remote_photo_url to trigger a download of the photo from S3 to a local tmp directory on the worker, where it is processed and the new versions uploaded back to S3.

The background worker simply looks up the photo by id and calls save_and_process_photo with :now => true.
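
The control flow can be sketched in plain Ruby as follows. This is not our production code: the in-memory JOB_QUEUE stands in for a real job queue, Photo is a bare class rather than an ActiveRecord model, the S3 URL is made up, and remote_photo_url= is stubbed where CarrierWave would actually download and process the file.

```ruby
# Stand-in for a real job queue: jobs are simply collected in an array.
JOB_QUEUE = []

class Photo
  STORE = {} # stand-in for the photos table, keyed by id

  attr_reader :id, :key, :processed, :source_url

  def initialize(id, key)
    @id = id
    @key = key
    @processed = false
    STORE[id] = self
  end

  def self.find(id)
    STORE.fetch(id)
  end

  # Stub: on a real model, assigning this CarrierWave attribute triggers the
  # download from S3 and the processing of versions.
  def remote_photo_url=(url)
    @source_url = url
    @processed = true
  end

  # With now: true, process immediately; otherwise hand off to the queue.
  def save_and_process_photo(options = {})
    if options[:now]
      # Hypothetical bucket URL built from the key S3 returned.
      self.remote_photo_url = "https://example-bucket.s3.amazonaws.com/uploads/photo/photo/#{key}"
    else
      JOB_QUEUE << [PhotoWorker, id]
    end
    true
  end
end

class PhotoWorker
  # Looks up the photo by id and processes it right away.
  def self.perform(photo_id)
    Photo.find(photo_id).save_and_process_photo(now: true)
  end
end
```

Calling save_and_process_photo with no options from the controller returns immediately and leaves a job on the queue. The slow download and processing only happen later, when the worker calls the same method with now: true.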

We hope this info is a helpful addition to all of the great documentation that is available from Railscasts, CarrierWaveDirect, and the jQuery File Upload Plugin!

Let us know if you have any comments or questions by sending a tweet to @suntouchers.