I load a lot of data into TroopTrack from CSV files by mapping fields in the CSV file to ActiveRecord attributes. Most of the time I am importing data from my competitors using CSV files they provide (and control the format of). In the past I have tended to write pretty sucky code for this, where I build a hash of values by column position that I then use to update ActiveRecord, like this:
    CSV.parse(@file_to_import, :col_sep => "\t") do |row|
      # Each attribute is tied to a fixed column position in the file.
      household_hash_1 = {
        :name     => row[25],
        :line1    => row[27],
        :line2    => row[28],
        :city     => row[29],
        :state    => row[30],
        :zip_code => row[31]
      }
    end
This code seemed okay when I wrote it, but then I learned that at least one of my competitors would add a new field to the CSV file every couple of months, which meant I had to redo the mapping of the attributes. PFFT.
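To make the fragility concrete, here is a minimal sketch with made-up data (the column names and values are hypothetical, not taken from a real competitor export). When a column is added ahead of the ones you care about, the same hard-coded index quietly returns a different field:

```ruby
require 'csv'

# Hypothetical tab-separated exports: the second one has a new "id" column.
old_export = "name\tcity\nSmith\tProvo\n"
new_export = "id\tname\tcity\n7\tSmith\tProvo\n"

# The same positional mapping applied to both files.
def city_by_index(tsv)
  rows = CSV.parse(tsv, :col_sep => "\t")
  rows[1][1] # second column of the first data row
end

old_city = city_by_index(old_export) # "Provo"
new_city = city_by_index(new_export) # "Smith", which is a name, not a city
```

Nothing raises an error here, which is what makes this kind of mapping painful to maintain.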
Last week I started a project loading data into ActiveRecord from Microsoft SQL Server. There are about a hundred different tables that I need to transform and load into TroopTrack. Needless to say, I wasn’t thrilled about using the approach above. The mere thought of THOUSANDS of lines of code mapping CSV columns to a hash makes me want to vomit.
Method Missing to the Rescue
If you don’t know about method_missing you’re probably not reading my blog, so I’m not going to explain it. If you wandered here somehow with no Ruby or Rails experience and still care, just google “override ruby method missing”.
All the code below does is let me treat a row from a CSV file more like an ActiveRecord object, so that I can write mapping code that is more readable and avoid thousands of lines of that crap above.
In other words, given a row from a CSV file with a header that includes “last_name”, I can write code like this:
row.get_last_name
That’s pretty handy.
    require 'csv'

    class DataGenius
      attr_reader :header, :data_items

      def initialize(filename)
        @data_items = []
        rows = CSV.read([Rails.root, 'db', 'data', 'csv', filename].join('/'))
        rows.each_with_index do |row, i|
          if i == 1
            # The header is the second row in these files.
            # Build a hash of column name => column index.
            @header = Hash[*row.each_with_index.map { |val, idx| [val, idx] }.flatten]
          elsif i > 1
            @data_items << DataItem.new(row, self)
          end
        end
      end

      class DataItem
        attr_reader :row

        def initialize(row, data_genius)
          @row = row
          @data_genius = data_genius
        end

        # Turns get_<column_name> into a lookup by header name.
        def method_missing(meth, *args, &block)
          if meth.to_s =~ /^get_(.+)$/
            @row[@data_genius.header[$1]]
          else
            # You *must* call super if you don't handle the method,
            # otherwise you'll mess up Ruby's method lookup.
            super
          end
        end
      end
    end
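If you want to try the idea outside of Rails, here is a trimmed-down sketch of the same technique. It is not the class above: `RowReader`, `Item`, and the `header_index` argument are names I made up for illustration, it parses a string instead of a file under `Rails.root`, and the sample data is invented. It also falls through to `super` for unknown columns and defines `respond_to_missing?`, which is the usual companion to `method_missing`.

```ruby
require 'csv'

# Hypothetical stand-in for the class above. header_index says which row
# holds the column names (an assumed parameter, not in the original code).
class RowReader
  attr_reader :header, :items

  def initialize(csv_text, header_index = 0)
    rows = CSV.parse(csv_text)
    # Map each column name to its position, e.g. {"last_name" => 1}.
    @header = rows[header_index].each_with_index.to_h
    @items = rows.drop(header_index + 1).map { |row| Item.new(row, @header) }
  end

  class Item
    def initialize(row, header)
      @row = row
      @header = header
    end

    # get_<column> looks the value up by header name; anything else
    # goes to super so Ruby raises the usual NoMethodError.
    def method_missing(meth, *args, &block)
      column = meth.to_s[/\Aget_(.+)\z/, 1]
      return super unless column && @header.key?(column)
      @row[@header[column]]
    end

    # Keeps respond_to? consistent with method_missing.
    def respond_to_missing?(meth, include_private = false)
      column = meth.to_s[/\Aget_(.+)\z/, 1]
      (!column.nil? && @header.key?(column)) || super
    end
  end
end

# Made-up sample with a junk first line, so the header is row index 1.
sample = "exported by some tool\nfirst_name,last_name\nAda,Lovelace\n"
reader = RowReader.new(sample, 1)
person = reader.items.first
last = person.get_last_name # "Lovelace"
```

Because the lookup goes through the header, a new column in the file does not change what `get_last_name` returns.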

1 response so far ↓
1 David Christiansen // Dec 2, 2013 at 9:25 am
I should mention that in these particular CSV files, the header is actually the second row in the file, so if you swipe this code you will need to change that.