wshaffer

Oct. 3rd, 2014

I'm doing a project at work where I've got a bunch of CSV files, each with several thousand lines of data. I need to slice and dice this data in various ways, mostly by pulling out subsets of lines that contain certain strings, counting the number of times certain values occur, and so on.
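
For reference, that kind of slicing and counting might look something like the sketch below. This is not the actual work script: the sample data, the column names, and the helper names `rows_matching` and `count_values` are all made up for illustration.

```ruby
require "csv"

# Hypothetical sample data; the real files and their columns are not shown here.
SAMPLE = <<~DATA
  id,status,owner
  1,open,alice
  2,closed,bob
  3,open,alice
  4,open,carol
DATA

# Parse into an array of hashes keyed by the header row.
rows = CSV.parse(SAMPLE, headers: true).map(&:to_h)

# Subset: keep rows where any field contains the given string.
def rows_matching(rows, needle)
  rows.select { |row| row.values.any? { |value| value.to_s.include?(needle) } }
end

# Count how many times each value occurs in one column.
def count_values(rows, column)
  rows.each_with_object(Hash.new(0)) { |row, counts| counts[row[column]] += 1 }
end

alice_rows = rows_matching(rows, "alice")
status_counts = count_values(rows, "status")
```

For a real file, `CSV.read(path, headers: true)` could stand in for `CSV.parse`, assuming the file has a header row.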

Looking at this data, it became clear that I could either a) become a serious Microsoft Excel power user, or b) put my slowly growing Ruby scripting skills to work. That wasn't much of a contest.

I actually managed to knock together the skeleton of a useful script pretty quickly. Now I'm polishing it up to make it usable and adding a bit of basic error-checking. I encountered two little issues that strike me as the kind of thing that I'm likely to forget about and then encounter again at some point in the future. So, blogging for my own reference, and because it might possibly be useful to some other Ruby newbie.
How do I find the file name extension?
How do I unfreeze my string?
Actually, as I was checking in my most recent changes, it occurred to me that Ruby probably has a class with built-in methods for doing things like handling file name extensions. But reinventing the occasional wheel is educational.

After a bit of research, I found a much better way to do the file name manipulation I was talking about in my previous post.

Basically, it boils down to:

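require "pathname"

# The input path comes from the first command-line argument.
input_file = Pathname.new(ARGV[0])
# Build the new file name: base name without extension + "_counts" + extension.
new_base = (input_file.basename(input_file.extname)).to_s + "_counts" + input_file.extname.to_s
# Put the new file name back in the input file's directory.
output_file = input_file.dirname + Pathname.new(new_base)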


I used Pathname instead of File because the documentation suggests that it's more robust at dealing with the different file path conventions of different OSes.

I'm a little dubious about the dance I had to do there of converting path fragments to strings, concatenating them, and then converting back to a pathname, but trying to concatenate the path fragments directly kept giving me extra / in the path.
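
The extra / most likely comes from `Pathname#+` being a path join rather than a string concatenation: it treats its argument as another path component and puts a separator in front of it. One possible way to skip the string round-trip is `Pathname#sub_ext`, which replaces the extension with whatever string it is given. The sketch below uses a made-up path in place of `ARGV[0]`.

```ruby
require "pathname"

# Hypothetical input path, standing in for ARGV[0].
input_file = Pathname.new("reports/data.csv")

# Pathname#+ joins path components, so a separator appears between them.
joined = input_file.basename(input_file.extname) + "_counts"
# joined is now the path "data/_counts", not "data_counts".

# Pathname#sub_ext replaces the extension with the given string, so passing
# "_counts" plus the original extension adds the suffix before the extension.
output_file = input_file.sub_ext("_counts" + input_file.extname)
# output_file is now the path "reports/data_counts.csv".
```

Because `sub_ext` works on the whole path, the directory part carries over without a separate `dirname` step.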
