Remove duplicate rows

This is a quick tutorial on removing duplicate rows or records based on one field. It is adapted from David Huynh's answer on the Google Refine mailing list.

1. First, sort the column that contains the duplicate values (the email column in this example).

2. Then invoke "Re-order rows permanently" from the "Sort" dropdown menu that appears above the middle of the data table.
3. Then invoke Edit cells > Blank down on the email column.

4. Then, on that column, invoke Facet > Customized facets > Facet by blank.
5. Select true in that facet, and invoke Remove matching rows in the leftmost "All" dropdown menu.
6. Remove the facet.
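
For readers who want to see the logic behind these steps outside the user interface, here is a minimal Python sketch of the same idea. It is not OpenRefine code: the function name `remove_duplicate_rows`, the sample rows and the `email` key are assumptions made for illustration, and it assumes every value in the chosen field is a non-empty string.

```python
def remove_duplicate_rows(rows, key):
    """Illustrative sketch of sort -> blank down -> drop blank rows."""
    # Steps 1-2: sort by the chosen field and keep that order.
    # sorted() is stable, so the first occurrence of each value stays first.
    ordered = sorted(rows, key=lambda row: row[key])

    # Step 3: "blank down" - blank a cell that repeats the value directly above it.
    blanked = []
    previous = object()  # sentinel that never equals a real value
    for row in ordered:
        value = row[key]
        blanked.append({**row, key: None if value == previous else value})
        previous = value

    # Steps 4-6: select the rows whose field is blank and remove them.
    return [row for row in blanked if row[key] is not None]


# Hypothetical sample data, not taken from the original post.
rows = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Bob", "email": "bob@example.com"},
    {"name": "Ann B.", "email": "ann@example.com"},
]
deduped = remove_duplicate_rows(rows, "email")
```

Sorting matters because blanking down only compares a cell with the one directly above it, so duplicates must be adjacent before they can be detected. The first row of each group survives and the later ones are removed.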


