This is the RefinePro knowledge base about OpenRefine. We have built it over the years and keep adding to it. From great tutorials and how-tos to handy GREL expressions and links to external resources, you will find here one of the most comprehensive lists of resources for learning OpenRefine.

For comprehensive documentation, refer to the official OpenRefine wiki.

Don't know where to get started? Search for a specific function below, or read our most popular articles from the right-side menu.

Showing posts with label remove. Show all posts

24.2.12

Selecting a string within a cell using smartSplit

The smartSplit function is a variation on the split function that allows you to split the cell content on any string of characters and then select the part you want to work on. This function is very useful for extracting or removing a string within cells without creating multiple columns and then merging them back.
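
The split-then-select idea described above can be sketched outside OpenRefine. The following is a minimal Python analogue, not the GREL implementation of smartSplit; the function names `select_part` and `remove_part` and the sample value are assumptions made for illustration.

```python
def select_part(cell, separator, index):
    """Split the cell on the separator and return the part at the given 0-based index."""
    parts = cell.split(separator)
    return parts[index]


def remove_part(cell, separator, index):
    """Split the cell, drop the part at the given index, and join the rest back together."""
    parts = cell.split(separator)
    del parts[index]
    return separator.join(parts)


if __name__ == "__main__":
    value = "Montreal - Quebec - Canada"  # hypothetical cell content
    print(select_part(value, " - ", 1))   # Quebec
    print(remove_part(value, " - ", 1))   # Montreal - Canada
```

Both helpers work on a single cell value, which mirrors the point of the post: the extraction happens in place, with no extra columns to create and merge back.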

23.1.12

Remove all numbers from a string

To remove all digits from a string, the following expression should do the job (from the menu Edit cells > Transform):
replace(value, /\d/, '')

This expression replaces every digit (matched by the regular expression /\d/) with an empty string.
In Google Refine, a regular expression (regex) must start and end with the character: /
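
For readers who want to try the same regular expression outside Google Refine, here is a rough Python equivalent of the expression above. It is only a sketch: the function name `remove_digits` and the sample string are assumptions, and Python writes the pattern as a raw string rather than between slashes.

```python
import re


def remove_digits(text):
    """Replace every digit in the text with an empty string."""
    return re.sub(r"\d", "", text)


if __name__ == "__main__":
    print(remove_digits("Room 101, Floor 3"))  # Room , Floor
```

Note that only the digits are removed; the surrounding spaces and punctuation are left as they were.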

17.1.12

Delete multiple projects at once (using the explorer)

At the end of a large project, you may have created multiple Google Refine projects and want to clear them from the Open Project view. The screenshots below (click a picture to enlarge it) are from Google Refine 2.5. The same process applies to previous versions; only the location of the Browse workspace directory button has changed.

5.10.11

Extract hashtags and references from Twitter


This case was brought to me by cosmin, who wanted to extract hashtags from tweets for some analysis and data visualization. The data were gathered using ScraperWiki, which can scrape Twitter data into a single document (see the video tutorial).
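
The extraction itself is covered in the full post. As a rough illustration of the task only, here is a Python sketch that pulls hashtags and @-references out of a tweet with two regular expressions. The patterns, the function name `extract_tags`, and the sample tweet are assumptions, not the method used in the original tutorial.

```python
import re

# Assumed patterns: a marker character followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def extract_tags(tweet):
    """Return (hashtags, mentions) found in the tweet, without the leading # or @."""
    return HASHTAG.findall(tweet), MENTION.findall(tweet)


if __name__ == "__main__":
    sample = "Cleaning data with #OpenRefine and #opendata, thanks @cosmin"  # made-up tweet
    hashtags, mentions = extract_tags(sample)
    print(hashtags)  # ['OpenRefine', 'opendata']
    print(mentions)  # ['cosmin']
```

These simple patterns would also match the part after @ in an email address, so real tweet data may need a stricter rule.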

4.10.11

Video tutorial to clean up your dataset (by Free Your Metadata)

A great video tutorial from Free Your Metadata that shows you how to:

18.9.11

Google Refine 2.0 Training video

In this video you will learn to:

15.8.11

Remove duplicate rows

This is a quick tutorial on removing duplicate rows or records based on one field. It is adapted (with added screenshots) from David Huynh's answer on the Google Refine mailing list.
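
The Google Refine steps are in the full tutorial. To make the goal concrete, here is a minimal Python sketch of removing duplicates based on one field, keeping the first row seen for each value. The function name `remove_duplicates`, the row layout (a list of dictionaries), and the sample data are assumptions for illustration; this is not the procedure from the mailing list answer.

```python
def remove_duplicates(rows, key):
    """Keep only the first row for each distinct value of the given field."""
    seen = set()
    unique = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


if __name__ == "__main__":
    rows = [  # hypothetical records
        {"email": "a@example.com", "name": "Ann"},
        {"email": "b@example.com", "name": "Bob"},
        {"email": "a@example.com", "name": "Ann B."},
    ]
    for row in remove_duplicates(rows, "email"):
        print(row["email"], row["name"])
```

The original row order is preserved, and later rows that share a value with an earlier row are dropped.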

28.7.11

Remove the " (quotation) mark

Having a hard time removing the " (quote sign) from your expression? Instead of wrapping your quote mark in double quotes, wrap it in single quotes, like this:

22.7.11

Remove or replace a specific character in a column

You want to remove a space or a specific character from your column, such as the # sign before some numbers.
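
As a rough illustration of this case, the Python sketch below removes or replaces a literal character in a value. No regular expression is needed when the target is a fixed character. The function names `remove_char` and `replace_char` and the sample values are assumptions for illustration.

```python
def remove_char(text, char):
    """Remove every occurrence of the character from the text."""
    return text.replace(char, "")


def replace_char(text, char, substitute):
    """Replace every occurrence of the character with the substitute."""
    return text.replace(char, substitute)


if __name__ == "__main__":
    print(remove_char("#1024", "#"))             # 1024
    print(replace_char("New York", " ", "_"))    # New_York
```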