What is Geocoding?

Geocoding takes a human-readable address (e.g. 1600 Pennsylvania Ave NW, Washington DC 20500) or the name of a place (e.g. The White House) and turns it into a geographic position expressed as latitude and longitude.

Reverse Geocoding is what it sounds like; it takes a lat-long pair and converts it into an address or a place.

The Geocoding Process

Why is geocoding helpful? When is it used?

Geocoding is helpful when we want to do spatial work. For example, maybe we have data on voter addresses and want to visualize party allegiance. Keep in mind that geocoding gives us very precise data. As another example, maybe we are wondering who is affected by a certain watershed. With postal addresses alone we don’t know much, but by geocoding we learn a good deal more. Commuter habits, crime trends, pandemic evolution, and (fill in your example here) analyses are all improved with geocoding. Thanks, geocoding!

Geocodio

There are many geocoding platforms (services that do the geocoding or reverse geocoding for you). Today we are going to look at Geocodio and its R interface, rgeocodio, and briefly touch on Google’s geocoding platform.

Let’s check out how the website works: geocodio’s website.

Things to consider:

So you fill out a big spreadsheet and send it to the good people of geocodio.

Using rgeocodio

Why use the R package? It can be nice when you want to build geocoding into a function or a cron job. It is also nice for teaching… NOTE: be careful, because every time you run the R command it makes a request to the Geocodio platform (which can cost you big dollars!).
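One way to limit repeat requests is to cache results within a session. The following is a minimal sketch, not part of rgeocodio: make_cached_geocoder() and fake_geocode() are hypothetical names invented here, and the stand-in geocoder makes no network call so the example runs without an API key.

```r
# Wrap any geocoding function so that repeated addresses are answered from
# an in-memory cache instead of triggering a new (billable) request.
make_cached_geocoder <- function(geocode_fn) {
  cache <- new.env(parent = emptyenv())
  function(address) {
    if (!exists(address, envir = cache, inherits = FALSE)) {
      assign(address, geocode_fn(address), envir = cache)
    }
    get(address, envir = cache, inherits = FALSE)
  }
}

# Hypothetical stand-in for a real geocoder: it makes no network request and
# only counts how many times it was called.
n_requests <- 0
fake_geocode <- function(address) {
  n_requests <<- n_requests + 1
  data.frame(address = address, lat = NA_real_, lon = NA_real_)
}

cached_geocode <- make_cached_geocoder(fake_geocode)
first  <- cached_geocode("1600 Pennsylvania Ave NW, Washington DC 20500")
second <- cached_geocode("1600 Pennsylvania Ave NW, Washington DC 20500")
n_requests  # counts calls that actually reached fake_geocode
```

In practice you would pass rgeocodio::gio_geocode in place of fake_geocode. The cache lives only as long as the R session, so a cron job that starts a fresh session each time would need to save results to disk instead.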

To install rgeocodio you will need the devtools package. If you haven’t installed it already, run install.packages("devtools"). Then run: devtools::install_github('hrbrmstr/rgeocodio').

geocodio uses…wait for it…an API! To get an API key, visit geocodio’s website. Then save it in your .Renviron file. Recall that you can do this by typing usethis::edit_r_environ() and pasting in your API key from the geocodio website.
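Before making any requests, it can help to confirm that R actually sees the key. This is a small sketch that assumes the key is stored under the name GEOCODIO_API_KEY (check the rgeocodio documentation for the name it expects); has_api_key() is a hypothetical helper, not a package function.

```r
# Assumption: the key is stored in .Renviron under the name GEOCODIO_API_KEY.
# Sys.getenv() returns "" for an unset variable, so nzchar() tells us
# whether a non-empty value is present.
has_api_key <- function(var = "GEOCODIO_API_KEY") {
  nzchar(Sys.getenv(var))
}

if (has_api_key()) {
  message("Geocodio key found.")
} else {
  message("No key found; add GEOCODIO_API_KEY=<your key> to .Renviron and restart R.")
}
```

Changes to .Renviron are only picked up when R starts, so restart your session after editing the file.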

We still need to authorize our API key in our R session. Do so by running gio_auth().

library(pacman)
p_load(rgeocodio, ggmap)


# gio_auth(force = FALSE)  # uncomment and run this to authorize


rgeocodio::gio_geocode('1600 Pennsylvania Ave NW, Washington DC 20500')
## # A tibble: 2 x 16
##   formatted_addre… accuracy accuracy_type source number street suffix
##   <chr>               <dbl> <chr>         <chr>  <chr>  <chr>  <chr> 
## 1 1600 Pennsylvan…      1   rooftop       City … 1600   Penns… Ave   
## 2 1600 Pennsylvan…      0.7 rooftop       City … 1600   Penns… Ave   
## # … with 9 more variables: postdirectional <chr>, formatted_street <chr>,
## #   city <chr>, county <chr>, state <chr>, zip <chr>, country <chr>,
## #   location_lat <dbl>, location_lng <dbl>

Most of these variables are intuitive, but I want to spend a few seconds on accuracy and accuracy_type, which we can learn more about here.

Essentially, geocodio wants you to know how confident you should be in the results of your geocoded query. For more resources, visit the GitHub page for rgeocodio.
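When a query returns more than one candidate, the accuracy column gives a simple way to choose among them. Here is a minimal sketch using a hand-typed data frame that mimics a few columns of the 1600 Pennsylvania Ave output above. It is not a live API response, and the 0.8 cutoff is an arbitrary value chosen for illustration.

```r
# Hand-typed from the columns visible in the 1600 Pennsylvania Ave output;
# this is not a live API response.
results <- data.frame(
  accuracy      = c(1, 0.7),
  accuracy_type = c("rooftop", "rooftop"),
  number        = c("1600", "1600"),
  stringsAsFactors = FALSE
)

# Keep only the candidate(s) with the highest accuracy score.
best <- results[results$accuracy == max(results$accuracy), , drop = FALSE]

# Hypothetical cutoff (0.8 is arbitrary): flag anything below it for
# manual review rather than trusting it blindly.
min_accuracy <- 0.8
needs_review <- results[results$accuracy < min_accuracy, , drop = FALSE]
```

The same subsetting should work on the tibble that gio_geocode() returns, since it has an accuracy column with the same meaning.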

Let’s look at a few examples:

rgeocodio::gio_geocode("Oregon")
## # A tibble: 1 x 8
##   formatted_addre… accuracy accuracy_type source state country location_lat
##   <chr>               <int> <chr>         <chr>  <chr> <chr>          <dbl>
## 1 Oregon                  1 state         US Ce… OR    US              44.0
## # … with 1 more variable: location_lng <dbl>
rgeocodio::gio_geocode("Multnomah County, Oregon")
## # A tibble: 1 x 11
##   formatted_addre… accuracy accuracy_type source city  county state zip  
##   <chr>               <dbl> <chr>         <chr>  <chr> <chr>  <chr> <chr>
## 1 Multnomah, OR 9…      0.9 place         TIGER… Mult… Multn… OR    97219
## # … with 3 more variables: country <chr>, location_lat <dbl>,
## #   location_lng <dbl>
rgeocodio::gio_geocode("Multnomah County Courthouse, Oregon, 97204")
## # A tibble: 1 x 11
##   formatted_addre… accuracy accuracy_type source city  county state zip  
##   <chr>               <dbl> <chr>         <chr>  <chr> <chr>  <chr> <chr>
## 1 Portland, OR 97…     0.67 place         TIGER… Port… Multn… OR    97204
## # … with 3 more variables: country <chr>, location_lat <dbl>,
## #   location_lng <dbl>
rgeocodio::gio_geocode("1021 SW 4th ave, Portland, OR, 97204")
## # A tibble: 2 x 16
##   formatted_addre… accuracy accuracy_type source number predirectional street
##   <chr>               <int> <chr>         <chr>  <chr>  <chr>          <chr> 
## 1 1021 SW 4th Ave…        1 rooftop       Portl… 1021   SW             4th   
## 2 1021 SW 4th Ave…        1 rooftop       Multn… 1021   SW             4th   
## # … with 9 more variables: suffix <chr>, formatted_street <chr>, city <chr>,
## #   county <chr>, state <chr>, zip <chr>, country <chr>, location_lat <dbl>,
## #   location_lng <dbl>

What about geocoding the rest of the world, chico?

rgeocodio::gio_geocode('523-303, 350 Mokdongdong-ro, Yangcheon-Gu, Seoul, South Korea 07987')
## # A tibble: 0 x 0

gasp Geocodio only works, from my understanding, in the United States and Canada, just like–never mind. We can use Google’s geocoder to do the rest of the world.

Here is the website to help you get started. Google’s R interface is the package ggmap. Note that you will need a Google API key for ggmap, which you can register in your session with register_google().

p_load(ggmap)
ggmap::geocode('523-303, 350 Mokdongdong-ro, Yangcheon-Gu, Seoul, South Korea 07987')
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=523-303,+350+Mokdongdong-ro,+Yangcheon-Gu,+Seoul,+South+Korea+07987&key=xxx-ljl00ko
## # A tibble: 1 x 2
##     lon   lat
##   <dbl> <dbl>
## 1  127.  37.5

I have found that Google’s geocoder is also better at guessing what we are trying to geocode:

rgeocodio::gio_geocode("The White House")
## # A tibble: 1 x 11
##   formatted_addre… accuracy accuracy_type source city  county state zip  
##   <chr>               <int> <chr>         <chr>  <chr> <chr>  <chr> <chr>
## 1 Whitehouse, TX …        1 place         TIGER… Whit… Smith… TX    75791
## # … with 3 more variables: country <chr>, location_lat <dbl>,
## #   location_lng <dbl>
ggmap::geocode("the White House") 
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=the+White+House&key=xxx-ljl00ko
## # A tibble: 1 x 2
##     lon   lat
##   <dbl> <dbl>
## 1 -77.0  38.9

You can easily plot your geocoded data:

address_vector<- c('1814 N underwood st, Arlington VA',
  '43 Kruse St Omak, WA 98841',
  '3337 chestnut avenue, trevose PA, 19053',
  '426 n hambden st chardon, OH 44024',
  '1009 Brookwood Road, Jacksonville Florida 32207',
  '4922 N ardmore whitesfish bay, WI, 53217',
  'beaverton, oregon', 
  'chatanooga, Tennesse'
  )

z<- geocode(address_vector)
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=1814+N+underwood+st,+Arlington+VA&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=43+Kruse+St+Omak,+WA+98841&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=3337+chestnut+avenue,+trevose+PA,+19053&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=426+n+hambden+st+chardon,+OH+44024&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=1009+Brookwood+Road,+Jacksonville+Florida+32207&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=4922+N+ardmore+whitesfish+bay,+WI,+53217&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=beaverton,+oregon&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=chatanooga,+Tennesse&key=xxx-ljl00ko
y<-get_map(zoom=3, maptype = "roadmap", location = 'united states')
## Source : https://maps.googleapis.com/maps/api/staticmap?center=united%20states&zoom=3&size=640x640&scale=2&maptype=roadmap&language=en-EN&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=united+states&key=xxx-ljl00ko
ggmap(y)+geom_point(aes(x = lon, y = lat), data = z, size = 3, color="black")+
  xlab('Loungin Dude')+ylab('Later Dude')

######

z<- geocode('523-303, 350 Mokdongdong-ro, Yangcheon-Gu, Seoul, South Korea 07987')
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=523-303,+350+Mokdongdong-ro,+Yangcheon-Gu,+Seoul,+South+Korea+07987&key=xxx-ljl00ko
y<-get_map(zoom=10, maptype = "watercolor", location = 'Seoul')
## maptype = "watercolor" is only available with source = "stamen".
## resetting to source = "stamen"...
## Source : https://maps.googleapis.com/maps/api/staticmap?center=Seoul&zoom=10&size=640x640&scale=2&maptype=terrain&key=xxx-ljl00ko
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=Seoul&key=xxx-ljl00ko
## Source : http://tile.stamen.com/watercolor/10/871/395.jpg
## Source : http://tile.stamen.com/watercolor/10/872/395.jpg
## Source : http://tile.stamen.com/watercolor/10/873/395.jpg
## Source : http://tile.stamen.com/watercolor/10/874/395.jpg
## Source : http://tile.stamen.com/watercolor/10/871/396.jpg
## Source : http://tile.stamen.com/watercolor/10/872/396.jpg
## Source : http://tile.stamen.com/watercolor/10/873/396.jpg
## Source : http://tile.stamen.com/watercolor/10/874/396.jpg
## Source : http://tile.stamen.com/watercolor/10/871/397.jpg
## Source : http://tile.stamen.com/watercolor/10/872/397.jpg
## Source : http://tile.stamen.com/watercolor/10/873/397.jpg
## Source : http://tile.stamen.com/watercolor/10/874/397.jpg
ggmap(y)+geom_point(aes(x = lon, y = lat), data = z, size = 5, shape=8, color="purple")+
  xlab('Loungin Dude')+ylab('Later Dude')+ggtitle("Song Bird's Roost")

?get_map

You can compare and contrast and decide which platform is best for you. Here is one of the comparisons. As I mentioned, Google seems better equipped for foreign addresses and for filling in missing info. However, if you have the address numbers and the addresses are domestic, Geocodio is the cheaper and superior option.
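If you want to compare platforms yourself, one simple check is how far apart two geocoders place the same address. The sketch below computes great-circle distance with the haversine formula, assuming a spherical Earth with a 6371 km radius. haversine_km() is a hypothetical helper written for this example, and the second coordinate pair is made up for illustration rather than taken from a real query.

```r
# Great-circle (haversine) distance in kilometers between two lon/lat points.
# Assumes a spherical Earth with mean radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# First point: the rounded lon/lat Google returned for "the White House".
# Second point: made-up coordinates standing in for another geocoder's answer.
haversine_km(-77.0, 38.9, -77.1, 38.9)
```

Because the function is vectorized, you can pass whole lon and lat columns from two geocoded data frames and get one distance per address.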