Information Technology Dark Side

Struggles of a Self-Taught Coder

Uploading Huge Files with Rails

January 27th, 2013 · 7 Comments

I’m working on an application that involves a lot of video content – great big files of 1 GB or more – and we found that uploading those huge files through a multipart form sucked. Duh. I started looking around at various solutions, such as chunked uploaders, and then a thought occurred to me: uploading huge files through the browser SUCKS. It locks up my browser, and to a certain degree, my laptop for the entire upload (if I start an upload, close my laptop and go somewhere, and then reopen it, the upload ain’t gonna work, so I’m stuck carrying my laptop around the house with the screen up like some crazed internet cats junky).

But you know what rocks for moving huge files around? Dropbox! They’ve already figured all this crap out – what to do if the computer gets turned off mid-upload, loses its network connection, etc. And they have a super intuitive user interface: drag the file and wait for it to sync.

Also, when you’re uploading 1.5 GB of video, time isn’t exactly your main concern. You’re totally cool with dragging the file into the Dropbox folder and going to sleep. If it shows up where you want it the next day, yay!

So, here’s what I ended up building. I might redesign the workflow a bit once a real user sees it.

1) The user puts the huge file in a special Dropbox folder that my app has access to.
2) When they are ready to “upload” the huge file, they go to my app, pick the file, pick the model they want it to be associated with (in this case, it’s a LectureSegment), and save.
3) Resque kicks off a worker that copies the file to S3 using CarrierWave.
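
The hand-off from step 2 to step 3 can be sketched roughly like this. It is only an illustration, not the app's actual code: SegmentStub, InlineQueue, RecordingWorker, request_upload, and the "Queued" status are all made-up names so the sketch runs without Rails, Redis, or Resque. InlineQueue simply runs the job right away, which is roughly what Resque does when its inline mode is turned on.

```ruby
# Hypothetical stand-in for the LectureSegment model; only the fields used here.
SegmentStub = Struct.new(:id, :video_dropbox_path, :upload_status)

# Hypothetical stand-in for Resque: runs the job immediately instead of
# pushing it onto a Redis-backed queue.
module InlineQueue
  def self.enqueue(worker_class, *args)
    worker_class.perform(*args)
  end
end

# Hypothetical worker that only records which ids it was asked to process.
class RecordingWorker
  @performed = []

  class << self
    attr_reader :performed
  end

  def self.perform(segment_id)
    @performed << segment_id
  end
end

# Step 2 -> step 3: remember which Dropbox file was picked, mark the segment
# as queued, and hand only the record id to the background job.
def request_upload(segment, dropbox_path, queue: InlineQueue, worker: RecordingWorker)
  segment.video_dropbox_path = dropbox_path
  segment.upload_status = "Queued" # assumed status value, not from the app
  queue.enqueue(worker, segment.id)
  segment
end

segment = SegmentStub.new(42, nil, nil)
request_upload(segment, "/lecture-01.mp4")
```

In the real app the enqueue line would presumably be Resque.enqueue(VideoProcessorWorker, lecture_segment.id). Passing just the id keeps the job payload tiny, and the worker looks the record up again when it runs.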

The code for copying the file is pretty simple, once you get the Dropbox session stuff figured out, which is weirdly convoluted IMO.

require 'dropbox_sdk'
class VideoProcessorWorker
  @queue = :eo_video_queue
  def self.perform(lecture_segment_id)
    log = Logger.new 'log/resque.log'
    lecture_segment = LectureSegment.find(lecture_segment_id)
    dropbox_path = lecture_segment.video_dropbox_path
    dropbox_session = DropboxSession.deserialize(Organization.first.dropbox_session)
    client = DropboxClient.new(dropbox_session, :app_folder)
    log.debug("Got the client for #{dropbox_path}")

    # Use only the file name so a nested Dropbox path doesn't break the temp path.
    tmp_file_name = Rails.root.join('tmp', File.basename(dropbox_path)).to_s
    begin
      # get_file returns the whole file as one string, so it is held in memory.
      # The block form flushes and closes the file before CarrierWave touches it.
      File.open(tmp_file_name, 'wb') { |f| f.write(client.get_file(dropbox_path)) }
      # Reopen for reading; a handle opened with 'wb' cannot be read back.
      File.open(tmp_file_name, 'rb') do |f|
        lecture_segment.video = f
        lecture_segment.save!
      end
      log.debug("File saved to: #{lecture_segment.video.url}")
    ensure
      # Remove the local copy even if the download or the save failed.
      File.delete(tmp_file_name) if File.exist?(tmp_file_name)
    end

    lecture_segment.update_attribute :video_dropbox_path, nil
    lecture_segment.update_attribute :upload_status, "Upload Complete"

    log.debug("Clean up complete")
  end
end

I’m not sure that’s the best way to move the file to S3, but it’s easy to just let CarrierWave worry about it. It’d be cool if I didn’t have to create a local copy of the file. I’d be very interested in opinions on how to make this better. Also, I only just finished early testing of it – I haven’t busted it out on the 1.5 GB file, so it may still not even work!
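
One thing that might help with the really big files: a minimal sketch of copying from an IO-like source to a temp file in chunks, so the whole video never sits in memory as one giant string. This doesn't get rid of the local copy, it just keeps memory use flat. The helper name with_streamed_tempfile is made up, and it assumes you can get something that responds to read (a File, a socket, an HTTP response body). Whether the Dropbox SDK can hand you such an object is not something I've checked.

```ruby
require 'tempfile'
require 'stringio'

# Copies everything from `source` (any object that responds to #read) into a
# Tempfile chunk by chunk, yields the rewound Tempfile, and deletes it once
# the block returns or raises.
def with_streamed_tempfile(source, basename = 'video')
  tmp = Tempfile.new(basename)
  tmp.binmode                  # video data is binary, not text
  IO.copy_stream(source, tmp)  # copies in chunks instead of one big string
  tmp.flush
  tmp.rewind                   # so the caller reads from the start
  yield tmp
ensure
  tmp.close! if tmp            # close and unlink the temp file
end

# Usage with an in-memory source standing in for a real download stream.
fake_download = StringIO.new("pretend this is a video" * 1000)
byte_count = with_streamed_tempfile(fake_download) { |file| file.size }
```

Inside the block you could assign the file to the uploader (lecture_segment.video = file) and save, and the ensure clause takes care of cleanup either way.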

UPDATE
I switched to the Dropbox Chooser, as suggested, and the code for the worker gets even simpler:

require 'open-uri'
class VideoProcessorWorker
  @queue = :eo_video_queue
  def self.perform(lecture_segment_id)
    lecture_segment = LectureSegment.find(lecture_segment_id)
    # video_dropbox_path now holds the link returned by the Dropbox Chooser.
    # CarrierWave's remote_video_url= downloads from that URL and stores the result.
    lecture_segment.remote_video_url = lecture_segment.video_dropbox_path
    lecture_segment.save!

    lecture_segment.update_attribute :video_dropbox_path, nil
    lecture_segment.update_attribute :upload_status, "Upload Complete"

    # Record the stored file name without triggering callbacks or validations.
    lecture_segment.update_column 'web_video', File.basename(lecture_segment.video.path)
  end
end

Tags: Uncategorized

7 responses so far ↓

  • 1 David Baldwin // Jan 27, 2013 at 5:23 pm

    Sounds like a great idea and I would like to hear how this works out for the very large file problem.

    If the videos aren’t super private, I like to just have clients use their YouTube account, mark them as “unlisted” and save the video URL into my app. Then I just embed on the site.

  • 2 David Christiansen // Jan 27, 2013 at 9:24 pm

    It works awesome so far, but I haven’t given it the ultimate test yet. That will happen tomorrow. So far, the background worker is able to transfer a 60MB file in about a second, which is much faster than a browser upload.

  • 3 Joel Meador // Jan 27, 2013 at 11:39 pm

    We did pretty much this exact thing on an app we worked on last year. DB is great for async giant file handling. Browser not so great.

  • 4 David Christiansen // Jan 28, 2013 at 11:21 am

    Just finished the 1 GB test. Took less than 10 minutes. I wasn’t timing it, I kicked it off and ignored it for ten minutes so it could have been even less.

  • 5 David Radcliffe // Jan 28, 2013 at 11:31 am

    Couldn’t you use the dropbox chooser to pick the file so your app doesn’t need to have access to any specific folder? https://www.dropbox.com/developers/chooser

  • 6 David Christiansen // Jan 28, 2013 at 11:52 am

    Oh heck yeah, I should have paid attention to the chooser instead of integrating the same way I have in the past!

  • 7 David Christiansen // Jan 30, 2013 at 3:12 pm

    Updated the blog post to use the chooser…
