More than one time, we have found a dataset with coordinates (longitude and latitude), and we struggle to transform it into a human-readable format (a process called reverse geocoding). The difficulty of the transformation will depend on the number of observations that we want to change and the level of detail that we want in the result.

Available Methods

Data

To test each of the previous methods, we are going to use the data about bushfires provided by NASA. They collect the information using Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In the dataset, each row corresponds to the intensity of a fire in an area of 0.742 km x 0.776 km. Table 1 shows the first rows of the dataset.

Table 1: Sample observations of the fire dataset
latitude longitude date intensity
-42.60197 147.0059 2020-01-01 29.9
-42.60178 147.0112 2020-01-01 29.9
-42.59640 146.9995 2020-01-01 18.5
-41.63182 147.9495 2020-01-01 9.1
-41.62063 147.9539 2020-01-01 13.8
-41.59829 147.8400 2020-01-01 5.7

Two partitions of the dataset are used to analyze the reverse geocoding methods, one with data of the United States and other with data of Australia. Each partition has 1000 observations.

Reverse geocoding

Using Photon

I am going to use the United States fire data to test the method using Photon. Table 2 shows some of the results obtained. The code I used is the following:

library(revgeo)

start <- Sys.time()
#This line do all the reverse geocoding using Photon as a provider
results<-revgeo(longitude=usafire$longitude, 
                            latitude=usafire$latitude, 
                            provider = 'photon', output="frame")

end <- Sys.time()
Table 2: First rows of the result using Photon
housenumber street city state zip country
House Number Not Found Street Not Found City Not Found State Not Found Postcode Not Found Country Not Found
House Number Not Found Street Not Found Long Beach California 90813 United States of America
House Number Not Found Street Not Found City Not Found Arizona Postcode Not Found United States of America
House Number Not Found Street Not Found City Not Found State Not Found Postcode Not Found Country Not Found
House Number Not Found Street Not Found Pelican Louisiana 71063 United States of America
House Number Not Found Street Not Found City Not Found State Not Found Postcode Not Found Country Not Found

As you can see in the above table, some observations have values not found because most of the coordinates provided were remote locations. The process took almost 9.9 minutes, so it is something to take into consideration with large datasets.

Maps Package

This package has only a few included maps to reverse geocoding. Therefore, if some of the following maps have the information that you want, this is the package for you.

  • world: Have the boundaries of each country, so it will only return the name of the country.
  • state: It has the boundaries of each state of the USA. It returns the name of the state.
  • county: It has the boundaries of each county of the USA. It returns the name of the county.
  • nz: It has the boundaries of the 3 main islands and 19 smaller coastal islands.

Because the maps for the USA are more detailed, I am going to use the United States fire data to obtain the country, state, and county. Table 3 shows the first results. The code that I used is the following:

library(maps)

startm <- Sys.time()
#I am doing the reverse geocoding using the function map.where from the maps package.
#Here I used three maps of the package; world, state and county
country<-map.where(database="world", 
                           usafire$longitude, usafire$latitude)
state<-map.where(database="state", 
                         usafire$longitude, usafire$latitude)
county<-map.where(database="county", 
                        usafire$longitude, usafire$latitude)
endm <- Sys.time()
Table 3: First rows of the result using maps package
country state county
USA ohio ohio,cuyahoga
USA ohio ohio,cuyahoga
USA ohio ohio,cuyahoga
USA ohio ohio,cuyahoga
USA ohio ohio,cuyahoga
USA pennsylvania pennsylvania,allegheny

The advantage of this method is that it only took 0.2 seconds, so it is one of the best alternatives if you only need general information.

Using sp and rgdal Packages

This method works in a way similar to the maps package. However, you need to provide the maps with the boundaries. In this example, I am going to use the fire data of Australia and the following three maps:

  • Countries: It indicates the boundaries of 247 countries. Link here.
  • States: It shows the boundaries for states and provinces around the world. Link here
  • LGAs NSW: It indicates the boundaries of the LGAs in NSW. Link here

Each of the maps is stored in a separate folder. See my GitHub respository if you need more details about how I stored the maps. The code that I used is the following:

library(sp)
library(rgdal)

startg <- Sys.time()

#Reading each of the maps. dsn is the folder of the map and layer is the name of the .shp file inside.
countries_map<- readOGR(dsn="country_map", layer="ne_10m_admin_0_countries")
states_map <- readOGR(dsn="states_map", layer="ne_10m_admin_1_states_provinces")
lgas_map <- readOGR(dsn="lga_nsw_map", layer="NSW_LGA_POLYGON_shp")

#This is a function to reverse geocoding based on coordinates
rev_geo<-function(lat,long){
              #First the coordinates are transformed to spatialpoints
              points<-SpatialPoints(matrix(c(long,
                                             lat),ncol=2,nrow=1))
              #Creating a projection of the coordinates on the map of countries
              proj4string(points) <- proj4string(countries_map)
              #To see where the name of the country is stored in the map object, you need to explore it in R and see the “data” element. In this case, “NAME” has the information that we want. The function over returns the name of the country given the coordinates projected in the countries_map
              country<-as.character(over(points, countries_map)$NAME)
              
              #The same for state
              proj4string(points) <- proj4string(states_map)
              state<-as.character(over(points, states_map)$name)
              
              #The same for LGA (I have only the map for NSW LGAs)
              proj4string(points) <- proj4string(lgas_map)
              LGA<-as.character(over(points, lgas_map)$NSW_LGA__3)
              
              return(as.vector(c(country,state,LGA)))
}

library(snow)
library(foreach)
library(doParallel)

cl <- makeCluster(detectCores() - 1) 
registerDoParallel(cl)

#Now for each row in the dataset I am going to return the reverse geocoding
# I am using parallel processing here to make the process faster
map_info<-foreach(i=1:nrow(ausfire), 
                      .packages = c("sp", "rgdal"), .combine=rbind) %dopar% {
                                    rev_geo(as.numeric(ausfire[i,"latitude"]),
                                            as.numeric(ausfire[i,"longitude"]))
}

stopCluster(cl)

endg <- Sys.time()

Table 4 shows the results of using this method.

Table 4: First rows of the result using sp and rgdal package
Country State LGA
Australia New South Wales SNOWY VALLEYS
Australia New South Wales SNOWY VALLEYS
Australia New South Wales EUROBODALLA
Australia New South Wales EUROBODALLA
Australia New South Wales SNOWY VALLEYS
Australia New South Wales SNOWY VALLEYS

This method took 3.3 minutes.

Conclusion

Table 5 shows the processing time of each of the methods. However, they are not enterily comparable because the result is different.

Table 5: Summary of the results
Method Time_minutes
Photon 9.883
Maps 0.004
sp & rgdal 3.263

Depending on your objective, now you have three methods that you can use. If you only need general information maps package is the fastest. On the other extreme, you can get detailed info using Photon. Finally, in the middle, you have a customized method to reverse geocoding using sp and rgdal packages.