This is the RefinePro knowledge base about OpenRefine. We have built it over the years and keep adding to it. From tutorials and how-tos to handy GREL expressions and links to external resources, you will find here one of the most comprehensive lists of resources for learning OpenRefine.

For comprehensive documentation, refer to the official OpenRefine wiki.

Don't know where to get started? Search for a specific function below, or read our most popular articles from the right-side menu.

31.3.12

Free (and rebuild) the tweets! Export TwapperKeeper archives using Google Refine.

So here's a way you can make a copy of a Twapper Keeper archive and rebuild the data using Google Refine.

30.3.12

Working with Organisation XML files in Google Refine : IATI Support

An example of opening an XML file with Google Refine 2.5.

25.3.12

Looking up Images Trademarked By Companies Using OpenCorporates and Google Refine

Listening to Chris Taggart talking about OpenCorporates at netzwerk recherche conf – data, research, stories, I figured I really should start to have a play… Looking through the example data available from an OpenCorporates company ID via the API, I spotted that registered trademark data was ...

20.3.12

Rejex: the JavaScript regular expression editor

Google Refine supports regular expressions. This online regular expression editor is handy for testing an expression before using it in Google Refine.

15.3.12

LOD2 extension for Grefine · GitHub

LOD2 Google Refine is a version of Google Refine that includes extensions
to help you work with Linked Open Data. With these extensions you can:
- reconcile your data against DBpedia, an RDF file, or a SPARQL endpoint
- extend your reconciled data with data from DBpedia
- export data to RDF
- extract entities from full-text descriptions in your data
- and more...

13.3.12

Using Google Refine and taxonomic databases (EOL, NCBI, uBio, WORMS)

A tutorial on using Google Refine to reconcile data against four taxonomic databases.

11.3.12

Using Google Refine to add administrative geography

I've recently been pulling a list of the 92 top-level football grounds together - as I'm interested to play around with linking this with various aspects of administrative geography and census-type data. It's a niche! So - I compiled a list and the grounds and their addresses via.. Wikipedia. Took ...

9.3.12

Difference between a record and a row

Google Refine makes a clear distinction between a row and a record. We will look at the difference between the two and the advantages of working in records mode.
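To make the distinction concrete, here is a minimal Python sketch (not Google Refine code). It assumes the usual records convention: a row with a non-blank first column starts a new record, and the rows below it with a blank first column belong to that same record. The function name `group_into_records` and the sample data are hypothetical, for illustration only.

```python
from typing import List, Optional

Row = List[Optional[str]]


def is_blank(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as blank cells."""
    return value is None or value.strip() == ""


def group_into_records(rows: List[Row]) -> List[List[Row]]:
    """Group rows into records, assuming the first column is the key column."""
    records: List[List[Row]] = []
    for row in rows:
        # A non-blank key cell starts a new record; so does the very first row.
        if not is_blank(row[0]) or not records:
            records.append([row])
        else:
            records[-1].append(row)
    return records


rows = [
    ["Alice", "alice@example.com"],
    [None, "alice@work.example"],  # same record as the row above
    ["Bob", "bob@example.com"],
]
records = group_into_records(rows)
print(len(rows), "rows,", len(records), "records")
```

In this sample, three rows collapse into two records because the second row has no value in the key column.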

Fill down the right and secure way

The fill down function takes the content of a cell and copies it into the blank cells that follow. It works on row order alone. When you use fill down, Google Refine does not check whether rows belong to different records: if the next row is blank, it is filled with the content of the previous row. Used without care, this can easily corrupt the integrity of your data set. Here is why, and how to avoid it.
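The risk can be illustrated with a short Python sketch. This is not how Google Refine implements fill down. It only mimics the behavior described above and contrasts it with a record-aware variant. It assumes that a non-blank first column marks the start of a record. The function names and the sample data are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[str]]


def is_blank(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as blank cells."""
    return value is None or value.strip() == ""


def fill_down_naive(rows: List[Row], col: int) -> List[Row]:
    """Copy the last non-blank value into blank cells, ignoring records."""
    out = [list(r) for r in rows]
    last: Optional[str] = None
    for row in out:
        if is_blank(row[col]):
            if last is not None:
                row[col] = last
        else:
            last = row[col]
    return out


def fill_down_within_records(rows: List[Row], col: int, key_col: int = 0) -> List[Row]:
    """Fill down, but forget the remembered value whenever a new record starts."""
    out = [list(r) for r in rows]
    last: Optional[str] = None
    for row in out:
        if not is_blank(row[key_col]):
            last = None  # new record: do not carry a value across the boundary
        if is_blank(row[col]):
            if last is not None:
                row[col] = last
        else:
            last = row[col]
    return out


rows = [
    ["Alice", "555-0100", "alice@example.com"],
    [None, None, "alice@work.example"],  # second row of Alice's record
    ["Bob", None, "bob@example.com"],    # Bob has no phone number
]

naive = fill_down_naive(rows, col=1)
safe = fill_down_within_records(rows, col=1)
print(naive[2])  # Bob wrongly inherits Alice's phone number
print(safe[2])   # Bob's phone cell stays blank
```

The naive version gives Bob the phone number of the record above him, which is the kind of silent corruption to watch for. The record-aware version fills only the blank cell inside Alice's record.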

6.3.12

Google Refine extension for linkedgov.org


LinkedGov is a community project to collaboratively clean and make usable data from local authorities and other public bodies.
See the documentation and the code on GitHub.