Google Refine 2.0

21:04 Thu 11 Nov 2010

This post is actually aimed more at my less technical readers than my programmer friends.

Google Refine, formerly Freebase Gridworks, is a data cleanup and transformation tool. These days it seems as if everyone has to work with messy data: lists of addresses, employment rosters, film collections, sports stats, and any amount of public material. Such data is rarely clean, and that’s precisely what makes a tool like Google Refine so useful.

I really wish something like it had been around when I worked at Nimblefish, where working with messy data took up maddening amounts of time. Google Refine would have been nice for the engineers, but even better for the people who couldn’t fall back on programmatic solutions and had to clean it manually.

If you’ve ever worked with a CSV file, or you’ve ever tried to extract meaningful data from any dodgy format, you should watch the intro videos; this is the first one.

2 Responses to “Google Refine 2.0”

  1. gever Says:

    Yeah, and I can’t run it on my Mac… so I have to install ubuntu on an old laptop so that I can install Java 1.6 in order to run Refine? Is anyone running an instance on a publicly accessible server that I can point my browser at?

  2. Jeff Says:

    Hey Gever, sorry you’re running into problems. Most of us Metawebbers who are running Refine run it from Macs so that in and of itself isn’t the problem. If you’re interested, we have a mailing list where you might be able to get help (and documentation and an issue tracker and a bunch more) linked from http://code.google.com/p/google-refine/
