I read a profile of Mary Meeker in Wired earlier today. When asked what she’d learned about venture capitalists since becoming one a few years ago, she said they worked harder than she’d thought. I’m feeling the same way about data-driven journalism. How do I explore non-geographical elements of a dataset that’s only available as a KML file? How do I clean up the Excel file I get from a conversion when it’s formatted like a bunch of two-column cards?
Because I have yet to find a civilized solution to the second problem, my first move was to limit my dataset on traffic fatalities in California to just those in San Francisco and then scrape the data out manually, pinpoint by pinpoint.
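For anyone facing the same pinpoint-by-pinpoint chore, here is a minimal Python sketch of how the non-geographical fields in a KML file might be pulled into a flat table with only the standard library. It rests on assumptions: that each pinpoint is a `Placemark` element, and that its attributes live in `ExtendedData`/`Data` elements. Many exports instead bury the attributes in an HTML table inside `description`, which this sketch keeps as raw text and does not parse. The sample document, the field names `city` and `bac_tested`, and the helper names are hypothetical placeholders, not the real dataset.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the field names and values are placeholders.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example A</name>
      <ExtendedData>
        <Data name="city"><value>San Francisco</value></Data>
        <Data name="bac_tested"><value>no</value></Data>
      </ExtendedData>
      <Point><coordinates>-122.4,37.7,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Example B</name>
      <ExtendedData>
        <Data name="city"><value>Oakland</value></Data>
        <Data name="bac_tested"><value>yes</value></Data>
      </ExtendedData>
      <Point><coordinates>-122.2,37.8,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def local(tag):
    """Drop the '{namespace}' prefix so any KML version matches."""
    return tag.rsplit("}", 1)[-1]


def placemark_rows(kml_text):
    """Return one dict per Placemark: name, description, coordinates, Data fields."""
    rows = []
    for el in ET.fromstring(kml_text).iter():
        if local(el.tag) != "Placemark":
            continue
        row = {}
        for child in el.iter():
            tag = local(child.tag)
            text = (child.text or "").strip()
            if tag in ("name", "description", "coordinates") and text:
                row[tag] = text
            elif tag == "Data":
                # Each Data element carries its field name as an attribute
                # and its value in a nested <value> element.
                value = next((v for v in child if local(v.tag) == "value"), None)
                row[child.get("name")] = (
                    (value.text or "").strip() if value is not None else ""
                )
        rows.append(row)
    return rows


def to_csv(rows):
    """Write the rows as CSV text, using every field seen, in first-seen order."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = placemark_rows(SAMPLE_KML)
sf_rows = [r for r in rows if r.get("city") == "San Francisco"]
untested = sum(1 for r in sf_rows if r.get("bac_tested") == "no")
csv_text = to_csv(sf_rows)
```

Once the rows are plain dictionaries, filtering to one city or counting untested cases is a one-liner, and the CSV text opens directly in Excel or any other spreadsheet tool.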
The biggest question the data presented was why there were so many fatal accidents in which the driver’s blood alcohol level wasn’t tested. I’m still working on the why, but here’s the how many. Which is to say, after all of my technical feats, I still have some reporting to do!
I welcome input from more experienced hands: Which tools make the most sense to learn right away? (Something that prettifies better than Excel would be great!) And what do you think of my first efforts? Just don’t forget to be nice. I’m new here!