More than one time, we have found a dataset with coordinates (longitude and latitude), and we struggle to transform it into a human-readable format (a process called reverse geocoding). The difficulty of the transformation will depend on the number of observations that we want to change and the level of detail that we want in the result.

Available Methods

Google Maps API: In my opinion, this is the best method if you want a fast and detailed result. The only downside, you have to pay if you’re going to use it.
Photon (Alternative to Google Maps): The best of using the API of Photon to reverse geocoding is that it is free. However, it is a bit slow, and sometimes the locations are not found.
Maps Package: This is the fastest method that I found to reverse geocoding if you want only a general location, like a country or state. Nevertheless, the transformation to state is only available for some countries.
sp and rgdal Packages: This is the most versatile method. It will work on anything that you want if you have a specific map with the boundaries. The difficulty resides in finding the map and the time that the process takes.

Data

To test each of the previous methods, we are going to use the data about bushfires provided by NASA. They collect the information using Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In the dataset, each row corresponds to the intensity of a fire in an area of 0.742 km x 0.776 km. Table 1 shows the first rows of the dataset.

Table 1: Sample observations of the fire dataset
latitude	longitude	date	intensity
-42.60197	147.0059	2020-01-01	29.9
-42.60178	147.0112	2020-01-01	29.9
-42.59640	146.9995	2020-01-01	18.5
-41.63182	147.9495	2020-01-01	9.1
-41.62063	147.9539	2020-01-01	13.8
-41.59829	147.8400	2020-01-01	5.7

Two partitions of the dataset are used to analyze the reverse geocoding methods, one with data of the United States and other with data of Australia. Each partition has 1000 observations.

Reverse geocoding

Using Photon

I am going to use the United States fire data to test the method using Photon. Table 2 shows some of the results obtained. The code I used is the following:

library(revgeo)

start <- Sys.time()
#This line do all the reverse geocoding using Photon as a provider
results<-revgeo(longitude=usafire$longitude, 
                            latitude=usafire$latitude, 
                            provider = 'photon', output="frame")

end <- Sys.time()

Table 2: First rows of the result using Photon
housenumber	street	city	state	zip	country
House Number Not Found	Street Not Found	City Not Found	State Not Found	Postcode Not Found	Country Not Found
House Number Not Found	Street Not Found	Long Beach	California	90813	United States of America
House Number Not Found	Street Not Found	City Not Found	Arizona	Postcode Not Found	United States of America
House Number Not Found	Street Not Found	City Not Found	State Not Found	Postcode Not Found	Country Not Found
House Number Not Found	Street Not Found	Pelican	Louisiana	71063	United States of America
House Number Not Found	Street Not Found	City Not Found	State Not Found	Postcode Not Found	Country Not Found

As you can see in the above table, some observations have values not found because most of the coordinates provided were remote locations. The process took almost 9.9 minutes, so it is something to take into consideration with large datasets.

Maps Package

This package has only a few included maps to reverse geocoding. Therefore, if some of the following maps have the information that you want, this is the package for you.

world: Have the boundaries of each country, so it will only return the name of the country.
state: It has the boundaries of each state of the USA. It returns the name of the state.
county: It has the boundaries of each county of the USA. It returns the name of the county.
nz: It has the boundaries of the 3 main islands and 19 smaller coastal islands.

Because the maps for the USA are more detailed, I am going to use the United States fire data to obtain the country, state, and county. Table 3 shows the first results. The code that I used is the following:

library(maps)

startm <- Sys.time()
#I am doing the reverse geocoding using the function map.where from the maps package.
#Here I used three maps of the package; world, state and county
country<-map.where(database="world", 
                           usafire$longitude, usafire$latitude)
state<-map.where(database="state", 
                         usafire$longitude, usafire$latitude)
county<-map.where(database="county", 
                        usafire$longitude, usafire$latitude)
endm <- Sys.time()

Table 3: First rows of the result using maps package
country	state	county
USA	ohio	ohio,cuyahoga
USA	ohio	ohio,cuyahoga
USA	ohio	ohio,cuyahoga
USA	ohio	ohio,cuyahoga
USA	ohio	ohio,cuyahoga
USA	pennsylvania	pennsylvania,allegheny

The advantage of this method is that it only took 0.2 seconds, so it is one of the best alternatives if you only need general information.

Using sp and rgdal Packages

This method works in a way similar to the maps package. However, you need to provide the maps with the boundaries. In this example, I am going to use the fire data of Australia and the following three maps:

Countries: It indicates the boundaries of 247 countries. Link here.
States: It shows the boundaries for states and provinces around the world. Link here
LGAs NSW: It indicates the boundaries of the LGAs in NSW. Link here

Each of the maps is stored in a separate folder. See my GitHub respository if you need more details about how I stored the maps. The code that I used is the following:

library(sp)
library(rgdal)

startg <- Sys.time()

#Reading each of the maps. dsn is the folder of the map and layer is the name of the .shp file inside.
countries_map<- readOGR(dsn="country_map", layer="ne_10m_admin_0_countries")
states_map <- readOGR(dsn="states_map", layer="ne_10m_admin_1_states_provinces")
lgas_map <- readOGR(dsn="lga_nsw_map", layer="NSW_LGA_POLYGON_shp")

#This is a function to reverse geocoding based on coordinates
rev_geo<-function(lat,long){
              #First the coordinates are transformed to spatialpoints
              points<-SpatialPoints(matrix(c(long,
                                             lat),ncol=2,nrow=1))
              #Creating a projection of the coordinates on the map of countries
              proj4string(points) <- proj4string(countries_map)
              #To see where the name of the country is stored in the map object, you need to explore it in R and see the “data” element. In this case, “NAME” has the information that we want. The function over returns the name of the country given the coordinates projected in the countries_map
              country<-as.character(over(points, countries_map)$NAME)
              
              #The same for state
              proj4string(points) <- proj4string(states_map)
              state<-as.character(over(points, states_map)$name)
              
              #The same for LGA (I have only the map for NSW LGAs)
              proj4string(points) <- proj4string(lgas_map)
              LGA<-as.character(over(points, lgas_map)$NSW_LGA__3)
              
              return(as.vector(c(country,state,LGA)))
}

library(snow)
library(foreach)
library(doParallel)

cl <- makeCluster(detectCores() - 1) 
registerDoParallel(cl)

#Now for each row in the dataset I am going to return the reverse geocoding
# I am using parallel processing here to make the process faster
map_info<-foreach(i=1:nrow(ausfire), 
                      .packages = c("sp", "rgdal"), .combine=rbind) %dopar% {
                                    rev_geo(as.numeric(ausfire[i,"latitude"]),
                                            as.numeric(ausfire[i,"longitude"]))
}

stopCluster(cl)

endg <- Sys.time()

Table 4 shows the results of using this method.

Table 4: First rows of the result using sp and rgdal package
Country	State	LGA
Australia	New South Wales	SNOWY VALLEYS
Australia	New South Wales	SNOWY VALLEYS
Australia	New South Wales	EUROBODALLA
Australia	New South Wales	EUROBODALLA
Australia	New South Wales	SNOWY VALLEYS
Australia	New South Wales	SNOWY VALLEYS

This method took 3.3 minutes.

Conclusion

Table 5 shows the processing time of each of the methods. However, they are not enterily comparable because the result is different.

Table 5: Summary of the results
Method	Time_minutes
Photon	9.883
Maps	0.004
sp & rgdal	3.263

Depending on your objective, now you have three methods that you can use. If you only need general information maps package is the fastest. On the other extreme, you can get detailed info using Photon. Finally, in the middle, you have a customized method to reverse geocoding using sp and rgdal packages.

How to reverse geocoding in R

Felipe Monroy

26-05-2020