This is the RefinePro knowledge base about OpenRefine. We have built it over the years and keep adding to it. From great tutorials and how-tos to handy GREL expressions and links to external resources, you will find here one of the most comprehensive lists of resources to learn OpenRefine.

For comprehensive documentation, refer to the official OpenRefine wiki.

Don't know where to get started? Search for a specific function below, or read our most popular articles from the right-side menu.

30.3.20

OpenRefine March 2020 review

The latest edition of the OpenRefine review is ready. Through March the community published a LOT of new video tutorials in six languages! 

Do not forget to subscribe to our newsletter to get our monthly update right in your mailbox. 

29.3.20

Concatenate Column in OpenRefine 3.0 and 3.3


We all know the pain of merging different columns in OpenRefine when you have null values. Before version 3.0, it required writing a complex GREL expression or managing multiple filters to ensure we are not losing any data. 

Those shortcomings have been addressed in the latest version! 

Starting with OpenRefine 3.0, we have the coalesce() function, which handles null values natively.
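To make the null handling concrete, here is a minimal Python sketch of the idea behind coalesce(): return the first value that is not null, so a null cell can be swapped for an empty string before concatenating. This is only an illustration of the behavior, not OpenRefine's implementation; the merge_columns helper and the sample column names are made up for the example.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are None."""
    for v in values:
        if v is not None:
            return v
    return None


def merge_columns(row, columns):
    """Concatenate the given columns of a row, treating null cells as empty strings.

    Hypothetical helper mirroring a GREL expression along the lines of:
        coalesce(cells["city"].value, "") + coalesce(cells["region"].value, "")
    """
    return "".join(str(coalesce(row.get(name), "")) for name in columns)


# Sample rows (made-up data): the second row has a null region.
rows = [
    {"city": "Montreal", "region": "QC"},
    {"city": "Paris", "region": None},
]

merged = [merge_columns(row, ["city", "region"]) for row in rows]
```

Without the coalesce step, the second row would fail or come out blank instead of keeping "Paris".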

But even more importantly, OpenRefine 3.3 introduced a user interface that offers a lot of flexibility, including defining how you want to concatenate one or multiple columns together.

I recorded a quick video demonstration: 

27.3.20

Solving Google’s reCAPTCHA v2 with ParseHub Agent


ParseHub is a great point-and-click web scraping tool. While projects run on ParseHub servers, you can connect them with third-party proxies like BrightData or captcha resolution services like 2Captcha.

In this tutorial, we will show you how to bypass the Google reCAPTCHA v2 test page with ParseHub Agent and the 2Captcha service. You will need to create an account with 2Captcha and have an API key to complete this tutorial.

Don't hesitate to contact us if you want to access the ParseHub project, have questions, or need help implementing web scraping projects.