23.10.11

Fetch City and Province / State based on the postal code


In the US, Canada and UK postal code are pretty good code to retrieve information on a location. In this tutorial we will use the yahoo place finder API to add geographical content to a data set based on the postal code. This tutorial can be easily turned around and used to run a query based on a  latitude and longitude (see the end of this post).


1. Convert postal into Geocoding
  1. Create a yahoo api account.

  2. Clean your postal code column to match the following format A1A 1A1 (please note the space separating the two legs). Use the split by length function as describe here.

  3. Add new column by fetching URLS... using the following expression
    "http://where.yahooapis.com/geocode?q=" + escape( join( [ forNonBlank(value, v, v, "") ], " "), "url" )+"+country=CA&appid=**YOUR_YAHOO_API_KEY**"Copy past your yahoo api key instead of **YOUR_YAHOO_API_KEY**


  4. Yahoo place finder answer is by default xml. But JSON and php are also supported. To change to different answer type, update the flag parameter in the query as describe in the API documentation.

    In our example we did not precise and flag parameters, so results will be in xml format. However we specify the country to Canada  by adding the following parameter in the query:  country=CA

2. Runig the query / throttle delay.
Within google refine adjust the throttle delay (time in millisecond between two queries). If you set it on 1000, refine will sent a query every second to the yahoo API service. So if your data set contains 60 records, the total time to run this query will be a minute.

Depending on the size of the project, the query can take more or less time. Please note that google refine currently does not support query that last more than 30 minutes. This should be fixed in the release 2.5.

3. Clean the geocoding
A new column will be added with the full xml (or JSON) answer from yahoo place finder API. Now you need to parse the query result to extract the state or province and city name. Check this post to learn how to parse and extract the information from an API.

4. Query based on a latitude and longitude
If you want to retrieve information based on a latitude and longitude:

  1. Concatenate your two values and separated them by a comma within a new column named Lat/Long. 
  2. Then add a column by fetching urls based on this new column and use the following expression:
    where.yahooapis.com/geocode?location="+value+"&gflags=R&appid=**YOUR_YAHOO_API_KEY**"Copy past your yahoo api key instead of **YOUR_YAHOO_API_KEY**


The new column will be created with the yahoo place finder xml answer.