This is the RefinePro knowledge base about OpenRefine. We have built it up over the years and keep adding to it. From great tutorials and how-tos to handy GREL expressions and links to external resources, you will find here one of the most comprehensive lists of resources for learning OpenRefine.

For comprehensive documentation, refer to the official OpenRefine wiki.

Don't know where to get started? Search for a specific function below, or read our most popular articles from the right-side menu.

28.1.12

Merging Datasets with Common Columns in Google Refine



It's an often encountered situation, but one that can be a pain to address – merging data from two sources around a common column. Here's a way of doing it in Google Refine… Here are a couple of example datasets to import into separate Google Refine projects if you want to play along, both courtesy ...

27.1.12

Fragments: Glueing Different Data Sources Together With Google Refine

I'm working on a new pattern using Google Refine as the hub for a data fusion experiment pulling together data from different sources. I'm not sure how it'll play out in the end, but here are some fragments… Grab Data into Google Refine as CSV from a URL (Proxied Google Spreadsheet Query via Yahoo ...

24.1.12

Chapter 1. Using Google Refine to Clean Messy Data



Google Refine (the program formerly known as Freebase Gridworks) is described by its creators as a "power tool for working with messy data" but could very well be advertised as "remedy for eye fatigue, migraines, depression, and other symptoms of prolonged data-cleaning." Even journalists with ...

23.1.12

Remove all numbers from a string

To remove all digits from a string, the following expression should do the job (from the menu Edit cells > Transform):
replace(value, /\d/, '')

This expression replaces every digit (matched by the regular expression /\d/) with an empty string.
In Google Refine, a regular expression (regex) must start and end with the character /
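
For readers who want to check what the pattern does outside of Refine, here is a minimal sketch of the same digit-stripping in Python. The function name `remove_digits` and the sample string are illustrative and not part of the original tip; the `re.ASCII` flag is an assumption made here so that `\d` matches only the characters 0-9.

```python
import re


def remove_digits(text: str) -> str:
    """Return text with every ASCII digit (0-9) removed."""
    # \d matches a single digit; re.ASCII restricts it to 0-9.
    # re.sub replaces every match, so all digits are dropped.
    return re.sub(r"\d", "", text, flags=re.ASCII)


print(remove_digits("Room 101, Floor 3"))  # prints "Room , Floor "
```

Only the digits are removed: surrounding spaces and punctuation stay in place, so a follow-up trim or whitespace cleanup may still be needed.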

21.1.12

Use Google Refine to Export JSON

Google Refine is great at cleaning large sets of data. But one amazing, under-documented feature is the ability to design and output JSON files. With Google Refine, you can turn a simple spreadsheet into a straightforward JSON dataset or multidimensional array quickly and easily. If you haven't ...

20.1.12

Comparing Columns in Google Refine


A reader (Cosmin Cabulea) writes: "I have two columns (A and B) and want to identify identical cells." I think I misapprehended the point of the question, but it prompted me to create this simple example. In something like Google Spreadsheets, we could use an if statement to set the value of cells in ...

17.1.12

Delete multiple projects at once (using the explorer)

At the end of a large project, you may have created multiple Google Refine projects and want to clear them from the Open Project view. The screenshots below are from Google Refine 2.5. The same process applies to previous versions; only the location of the Browse workspace directory button has changed.

8.1.12

Sliced and diced aid data with google refine


How to prepare, explore, and do some basic cleaning of your data using Google Refine