19.9.11

Using Google Refine solely to navigate data (facet, filter, flag)

Google Refine can also be used purely to navigate and explore data, without editing it. I find it very useful for exploring large data sets thanks to its good user interface, which avoids having to develop a dedicated one just to visualize the data. In this post I will present the three main options for interacting with the data: facet, text filter and flags.



Text facet

Watch the video tutorial





This is perhaps the most basic feature of Google Refine. A text facet is very similar to a filter in spreadsheet software and lets the user quickly see the different values in a column. For every column, the Text facet option is available in option > text facet (see 1). A list of choices is displayed on the left part of the screen (see 2). The list of values can be sorted by name (alphabetical order) or by count (from the value with the most entries to the value with the fewest) (see 3).
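To make the idea concrete, here is a minimal Python sketch of what a text facet computes: a count of rows per distinct value, sortable by name or by count. It is only an illustration of the behaviour, not how Refine is implemented; the rows, the `hashtag` column name and the `text_facet` helper are all made up for the example.

```python
from collections import Counter

# Hypothetical rows; the column name and values are invented for illustration.
rows = [
    {"user": "@alice", "hashtag": "#eDiscovery"},
    {"user": "@bob", "hashtag": "#Session"},
    {"user": "@carol", "hashtag": "#eDiscovery"},
    {"user": "@dave", "hashtag": "#Change"},
]

def text_facet(rows, column):
    """Count how many rows carry each distinct value of `column`."""
    return Counter(row[column] for row in rows)

facet = text_facet(rows, "hashtag")

# Sort by name (alphabetical order, ignoring case).
by_name = sorted(facet.items(), key=lambda item: item[0].lower())

# Sort by count (most frequent value first, ties broken by name).
by_count = sorted(facet.items(), key=lambda item: (-item[1], item[0].lower()))

print(by_name)
print(by_count)
```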


AND vs. OR
You can combine several facets to narrow the displayed data set down to a limited number of values. To do so, add as many facets as needed, as described previously. You can facet the same column several times to add further (AND) constraints.
In the example below, Google Refine displays the records matching the hashtags #eDiscovery AND #Change, which represent 37 records.


If you select multiple values within the same facet module, Google Refine will display all records matching any of them. To select multiple values within the same facet, click on the include option displayed in orange close to the value name.
In the example below, Google Refine displays the records matching the hashtag #eDiscovery OR #Session, which represent 668 records.
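The difference between the two combinations can be sketched in a few lines of Python. This is an assumption-laden illustration, not Refine's code: the records, the `hashtags` field and the `match_all` / `match_any` helpers are hypothetical, and each record is assumed to carry a set of hashtags.

```python
# Hypothetical records: each one carries a set of hashtags.
records = [
    {"id": 1, "hashtags": {"#eDiscovery", "#Change"}},
    {"id": 2, "hashtags": {"#eDiscovery"}},
    {"id": 3, "hashtags": {"#Session"}},
    {"id": 4, "hashtags": {"#Other"}},
]

def match_all(records, *facets):
    """AND: one facet per argument; a record must satisfy every facet."""
    return [r for r in records
            if all(r["hashtags"] & selected for selected in facets)]

def match_any(records, selected):
    """OR: several values included in one facet; matching any one is enough."""
    return [r for r in records if r["hashtags"] & selected]

# Two separate facets on the same column: #eDiscovery AND #Change.
and_ids = [r["id"] for r in match_all(records, {"#eDiscovery"}, {"#Change"})]

# Two values included in a single facet: #eDiscovery OR #Session.
or_ids = [r["id"] for r in match_any(records, {"#eDiscovery", "#Session"})]

print(and_ids, or_ids)
```

Stacking facets can only shrink the result, while including more values inside one facet can only grow it.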


Invert
An invert option is available in the header of the facet module. When you click it, Google Refine displays all records that do not match the selection. The example below displays all records that do not contain the hashtag #eDiscovery, which represent 6 220 records.
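In terms of logic, inverting a facet simply keeps the complement of the selection. A small Python sketch, using invented records and a hypothetical `facet` helper rather than anything from Refine itself:

```python
# Hypothetical records: each one carries a (possibly empty) set of hashtags.
records = [
    {"id": 1, "hashtags": {"#eDiscovery", "#Change"}},
    {"id": 2, "hashtags": {"#eDiscovery"}},
    {"id": 3, "hashtags": {"#Session"}},
    {"id": 4, "hashtags": set()},
]

def facet(records, selected, invert=False):
    """Keep records matching `selected`, or the non-matching ones if inverted."""
    kept = []
    for record in records:
        matches = bool(record["hashtags"] & selected)
        if matches != invert:
            kept.append(record)
    return kept

matching_ids = [r["id"] for r in facet(records, {"#eDiscovery"})]
inverted_ids = [r["id"] for r in facet(records, {"#eDiscovery"}, invert=True)]

print(matching_ids, inverted_ids)
```

Every record ends up in exactly one of the two lists, so the two counts always add up to the size of the data set.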




Click on the cross at the top left of the facet module to remove it. More about facets in the Gridworks presentation videos.

Text filter

Watch the video tutorial



Available directly from the drop-down menu, Text filter operates like Text facet. It accepts free text and displays the records containing that text string. It can be useful for finding a substring present in multiple words: for example, when searching for the string ilta, Google Refine displays records containing #ilta11 or ILTA.
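The behaviour amounts to a substring search over one column. Here is a rough Python equivalent, assuming (as the ilta example suggests) that matching ignores case unless asked otherwise; the rows, the `text` column and the `text_filter` helper are hypothetical.

```python
# Hypothetical tweets; the "text" column name and its contents are invented.
rows = [
    {"text": "Heading to #ilta11 tomorrow"},
    {"text": "ILTA keynote was great"},
    {"text": "Nothing to see here"},
]

def text_filter(rows, column, query, case_sensitive=False):
    """Keep rows whose `column` contains `query` as a substring."""
    if not case_sensitive:
        query = query.lower()
    matches = []
    for row in rows:
        value = row[column] if case_sensitive else row[column].lower()
        if query in value:
            matches.append(row)
    return matches

loose = [r["text"] for r in text_filter(rows, "text", "ilta")]
strict = [r["text"] for r in text_filter(rows, "text", "ilta", case_sensitive=True)]

print(loose)
print(strict)
```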






Star and flag



I recommend using the star function to mark interesting items and the flag function for junk, spam and other out-of-scope records. You can either star / flag a single row by clicking the icon next to it, or star / flag in bulk with the option edit rows > star / flag rows. This will star / flag all rows matching the current facets and filters.

In the example below, Refine will star the 3 rows matching the facet twitter name = @76mel. Please note that stars and flags are independent: a row can be flagged and starred at the same time.



If you keep a facet on flagged rows set to false, flagged rows will no longer be displayed, regardless of whatever other facets you add on top. Likewise, by using the facet by star and selecting true, only the records you marked as interesting will be displayed.
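The star and flag workflow can be pictured as two independent boolean columns. The Python sketch below is only a model of that idea under made-up data: the `starred` / `flagged` fields, the `twitter_name` column and the `star_matching` helper are assumptions, not Refine internals.

```python
# Hypothetical rows with two independent markers, both off or on per row.
rows = [
    {"twitter_name": "@76mel", "text": "first", "starred": False, "flagged": False},
    {"twitter_name": "@76mel", "text": "second", "starred": False, "flagged": True},
    {"twitter_name": "@spam", "text": "buy now", "starred": False, "flagged": True},
    {"twitter_name": "@other", "text": "third", "starred": False, "flagged": False},
]

def star_matching(rows, column, value):
    """Bulk-star every row matching a facet selection (column == value)."""
    for row in rows:
        if row[column] == value:
            row["starred"] = True

star_matching(rows, "twitter_name", "@76mel")

# Facet "flagged = false": junk rows stay hidden whatever else is added on top.
visible = [r for r in rows if not r["flagged"]]

# Facet "starred = true" on top of it: only the interesting rows remain.
interesting = [r for r in visible if r["starred"]]

print([r["text"] for r in visible])
print([r["text"] for r in interesting])
```

Note that the row "second" ends up both starred and flagged, which is allowed since the two markers do not affect each other.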


I guess you are now all set to explore your data set and find interesting correlations and links.