vlookup is a very useful formula in Excel and it has is equivalent in google refine under the name cell.cross. Using this expression you can import data from a google refine project to an other based on a key value.
The complete expression:
cell.cross("Name of the source project", "name of the reference column").cells["Name of the column you want to import"].value[0]
The Process
In this example we have two files / projects:
- city containing name of 10 cities in the World
- state containing the matching state / country for those cities.
As you notice the two files share a common id. We will use this id to import the name of the country / state for every cities. For example Toronto have two matching state / country: Ontario and Canada.
In the city project, create add a new column based on ID field
In the formula enter: cell.cross("state", "ID").cells["Country / State"].value[0] (click on the image to zoom)
and click OK to import the data:
Multiple value available for an unique ID
The value in your reference column might return to more than one records in your source file. In our example this is the state and country for some records (Ontario / Canada for ID 1). If you want to display the second or third or n value, update the last parameter:
- value will display all value in the preview but empty cells once validated
- Value[0] will return the first value
- Value[1] will return the second value
- Value[n-1] will retourn the n value
only add value |
for value[1] |
Error: cannot retrieved field from null
A bug exist in this formula and even if you keyed your expression properly google refine will return: Error: cannot retrieved field from null
This bug have been reported and should be corrected soon (see the discussion list and issue 432)