Count how often a character occurs in a cell

Did you know that Refine can count how often an string or character appears in a cell?

To achieve this, I first recommend that you store the count result in a separate column (so you do not write over your initial content). Select your reference column (where you want to do the count per cells) and create a new column based on this column. An other option is to store the result in a custom text facet.

We will use the Grel expression value.split(" ").length().

However if the cells does not contains the value Refine will still return '1'. I found two ways to work around this issue.

By substracting 1 at the final result
In the following expression replace you string by the expression you want to count. The -1 will correct the final result properly:
value.split("your string").length()-1 

By using a if condition
An other solution is to count only for cells that contains the expression your are counting. The if expression works quite like the Excel one: if(test, if_true, if_false).
Please note that I had inconsistent result with this expression.
 if(value.contains("your string"), value.split( "  your string " ).length(),"0") 

Remember that Google refine is case sensitive and "Paris" and "paris" are not the same string. You can use toLowercase function to harmonized your cells at the beginning of your expression: toLowercase(value).split .... 

Feel free to discuss your findings in the comments section, I'll be happy to have your feedback on those two solutions.

This post was


Post a Comment