This is the RefinePro knowledge base about OpenRefine. We have built it over the years and keep adding to it. From great tutorials and how-tos to handy GREL expressions and links to external resources, you will find here one of the most comprehensive lists of resources to learn OpenRefine.

For comprehensive documentation, refer to the official OpenRefine wiki.

Don't know where to get started? Search for a specific function below, or read our most popular articles from the right side menu.

Showing posts with label reconciliation. Show all posts

21.12.12

The Named Entity Extractor extension by Free Your Metadata (from around the web)

The Free Your Metadata Named Entity Extractor extension helps you enrich your data in OpenRefine using AlchemyAPI, DBpedia Lookup and Zemanta. The extension works on plain-text fields and any unstructured (meta)data.

7.11.12

From Excel file to RDF with links to DBpedia and Europeana (from around the web)

DERI Galway, the author of the RDF extension (download and documentation here), shows step by step how to use the RDF extension to reconcile your data against DBpedia and Europeana. This tutorial also goes through the steps to create an RDF schema.

27.6.12

Google Refine Reconciliation Service support for Apache Stanbol (from around the web)

Add support for the Reconciliation Service API to the Apache Stanbol Entityhub RESTful API (see documentation). The Google Refine Reconciliation Service API lets you reconcile string values with entities. The Entityhub is very well suited for implementing this service, as it can execute those queries very efficiently based on the SolrYard implementation.
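To make the idea of reconciling strings with entities more concrete, here is a minimal Python sketch of the batch query format used by the Reconciliation Service API (one keyed query per input string) and of picking the best-scoring candidate from a reply. The helper names `build_queries` and `best_matches`, the entity ids and the sample response are assumptions made up for illustration; they are not taken from Stanbol or from the original post, and no network call is made.

```python
import json


def build_queries(values, limit=3):
    """Build the batch payload: keys q0, q1, ... each holding one query."""
    return {f"q{i}": {"query": v, "limit": limit} for i, v in enumerate(values)}


def best_matches(response):
    """Return the id of the highest-scoring candidate per query key (or None)."""
    best = {}
    for key, body in response.items():
        candidates = body.get("result", [])
        if candidates:
            best[key] = max(candidates, key=lambda c: c["score"])["id"]
        else:
            best[key] = None
    return best


queries = build_queries(["Galway", "Dublin"])
payload = json.dumps(queries)  # typically sent as the "queries" form parameter

# Hypothetical response, hand-written for illustration only.
sample_response = {
    "q0": {"result": [
        {"id": "ex:galway-city", "name": "Galway", "type": [], "score": 92.0, "match": True},
        {"id": "ex:galway-county", "name": "County Galway", "type": [], "score": 71.5, "match": False},
    ]},
    "q1": {"result": []},
}
matches = best_matches(sample_response)
```

In practice Refine sends the payload and interprets the candidates for you; the sketch only shows the shape of the data exchanged.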

9.3.12

Difference between a record and a row

Google Refine makes a clear distinction between a row and a record. We will see what the difference between the two is and the advantages of working in records mode.
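As a rough illustration of the distinction, the Python sketch below groups rows into records the way records mode does: a row with a non-blank value in the first (key) column starts a new record, and the following rows with a blank key belong to it. The function name `group_into_records` and the sample data are made up for this example.

```python
def group_into_records(rows, key_column=0):
    """Group rows into records: a non-blank key cell starts a new record."""
    records = []
    for row in rows:
        key = row[key_column]
        starts_record = key is not None and str(key).strip() != ""
        if starts_record or not records:
            records.append([row])
        else:
            # A blank key cell means this row continues the previous record.
            records[-1].append(row)
    return records


rows = [
    ["Ada", "ada@example.org"],
    ["", "ada@example.com"],
    ["Grace", "grace@example.org"],
]
records = group_into_records(rows)
```

Here the three rows form two records: the first record spans two rows (both e-mail addresses for "Ada"), the second a single row.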

29.11.11

Facet by facet count


Google Refine offers the possibility to facet by name or by choice count. This can be useful to focus an analysis or transformation only on values having more than twenty records, for example.
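Outside of Refine, the same idea can be sketched in a few lines of Python: count how often each value occurs in a column, then keep only the values above a threshold. The function name `values_with_count_above` and the sample column are invented for illustration; this is not how Refine implements the facet.

```python
from collections import Counter


def values_with_count_above(values, threshold):
    """Return {value: count} for values occurring more than `threshold` times."""
    counts = Counter(values)
    return {v: n for v, n in counts.items() if n > threshold}


# Made-up column: two frequent values and one rare one.
column = ["Montreal"] * 25 + ["Toronto"] * 21 + ["Halifax"] * 3
frequent = values_with_count_above(column, 20)
```

With the twenty-record threshold from the text, only "Montreal" and "Toronto" are kept.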

Sort facet by name using toTitlecase(value) expression


When using the text facet option, Google Refine sorts all available values either by choice count or by name. When sorting by name, values are sorted by number first and then in alphabetical order (capitals first, then lower case) into something like this:
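The ordering described here can be reproduced in Python with made-up values (these are not the values from the original post). A plain string sort puts digits first, then capitals, then lower case, while normalizing to title case first merges case variants and gives a case-insensitive order. Note that Python's `str.title()` is only an approximation of GREL's `toTitlecase(value)` and may differ on edge cases such as apostrophes.

```python
# Made-up facet values, for illustration only.
values = ["banana", "Apple", "42 Main St", "cherry", "Banana", "apple"]

# Plain sort: digits, then capital letters, then lower-case letters.
default_order = sorted(values)

# Normalize to title case first, so "apple" and "Apple" collapse into one
# choice and the remaining choices sort alphabetically regardless of case.
titlecased_order = sorted({v.title() for v in values})
```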

23.10.11

Fetch City and Province / State based on the postal code


In the US, Canada and the UK, postal codes are a pretty good key for retrieving information about a location. In this tutorial we will use the Yahoo PlaceFinder API to add geographical content to a data set based on the postal code. This tutorial can easily be turned around and used to run a query based on a latitude and longitude (see the end of this post).
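The first step of such a lookup is to build one request URL per postal code. The Python sketch below shows only that step. The endpoint `https://example.com/geocode` and the parameter names `q` and `format` are placeholders chosen for this example, not the actual Yahoo API; substitute the values documented by whichever geocoding service you use.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not a real geocoding service.
BASE_URL = "https://example.com/geocode"


def lookup_url(postal_code, base_url=BASE_URL):
    """Build a lookup URL for one postal code (parameter names are assumed)."""
    params = {"q": postal_code.strip().upper(), "format": "json"}
    return f"{base_url}?{urlencode(params)}"


url = lookup_url(" h2x 1y4 ")
```

Trimming and upper-casing the code before encoding it avoids sending several variants of the same postal code to the service.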

19.10.11

Reconcile against the OpenCorporates database

Here is a great video tutorial on reconciliation. It also introduces OpenCorporates, a reconciliation source that contains more than 26 million companies across 31 jurisdictions.

18.9.11

Google Refine 2.0 Training video

In this video you will learn to:

24.6.11