Selecting a string within a cell using smartSplit

The function smartSplit is a variation on split function that allow you to split the cell content based on any string of character and then select the leg you want to work on. This function is very useful to extract or remove string within cells without creating multiple columns and then merging them back.

smartSplit, introduction
For example, in a address field, we want to extract the first word. We key the following expression: smartSplit(value, ' ')[0] remember that in google refine 0 represent always the first group of character.

Remove specific words based on their position
If we want to remove the first word, we just have to embed this expression within the replace function:
replace(value, smartSplit(value,' ')[0],'')

Select from the end of a string
One of the nice thing about the smartSplit function is that is also accept reverse selection. To select the last word of a cell, use the expression:
smartSplit(value, ' ')[-1] 

And to select the second word starting from the end of the string:
 smartSplit(value, ' ')[-2]

Edit a only words within a cell
In this example, we set the first name to upper case and leave the rest of the cells with one single formula:
toUppercase(smartSplit(value,' ')[0])+ ' '+value.replace(smartSplit(value,' ')[0],'')

This post was


Post a Comment