This is the RefinePro knowledge base about OpenRefine. We build it over the years, and keep adding to it. From great tutorials and how-to, to handy GREL expressions and links to external resources, you will find here one of the most comprehensive list of resources to learn OpenRefine.

For a comprehensive documentation you should refer to the official OpenRefine wiki.

Don't where to get started? Search for a specific function below, or read our most popular article from the right side menu.

Showing posts with label html. Show all posts
Showing posts with label html. Show all posts

17.2.15

Parsing and extracting HTML tag and links in Refine

I recently helped someone on stackoverflow to parse and extract information from an HTML page.  Refine with GREL offer multiple ways to select specific element and contant. This article will review the main functions and specific use cases to illustrate when to use them.

18.10.11

Parse mark up language (JSON, html, xml ...)


In this tutorial we will see how to parse mark up language like JSON, html or xml. Those language are great to parse because there is often an easily identifiable markup right before or after the content you want to extract.  In this tutorial we will use a JSON language and extract relevant information by following a six steps process.

On a similar topic: