More than one time, we have found a dataset with coordinates (longitude and latitude), and we struggle to transform it into a human-readable format (a process called reverse geocoding). The difficulty of the transformation will depend on the number of observations that we want to change and the level of detail that we want in the result.
Google Maps API: In my opinion, this is the best method if you want a fast and detailed result. The only downside, you have to pay if you’re going to use it.
Photon (Alternative to Google Maps): The best of using the API of Photon to reverse geocoding is that it is free. However, it is a bit slow, and sometimes the locations are not found.
Maps Package: This is the fastest method that I found to reverse geocoding if you want only a general location, like a country or state. Nevertheless, the transformation to state is only available for some countries.
sp and rgdal Packages: This is the most versatile method. It will work on anything that you want if you have a specific map with the boundaries. The difficulty resides in finding the map and the time that the process takes.
To test each of the previous methods, we are going to use the data about bushfires provided by NASA. They collect the information using Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In the dataset, each row corresponds to the intensity of a fire in an area of 0.742 km x 0.776 km. Table 1 shows the first rows of the dataset.
| latitude | longitude | date | intensity |
|---|---|---|---|
| -42.60197 | 147.0059 | 2020-01-01 | 29.9 |
| -42.60178 | 147.0112 | 2020-01-01 | 29.9 |
| -42.59640 | 146.9995 | 2020-01-01 | 18.5 |
| -41.63182 | 147.9495 | 2020-01-01 | 9.1 |
| -41.62063 | 147.9539 | 2020-01-01 | 13.8 |
| -41.59829 | 147.8400 | 2020-01-01 | 5.7 |
Two partitions of the dataset are used to analyze the reverse geocoding methods, one with data of the United States and other with data of Australia. Each partition has 1000 observations.
I am going to use the United States fire data to test the method using Photon. Table 2 shows some of the results obtained. The code I used is the following:
library(revgeo)
start <- Sys.time()
#This line do all the reverse geocoding using Photon as a provider
results<-revgeo(longitude=usafire$longitude,
latitude=usafire$latitude,
provider = 'photon', output="frame")
end <- Sys.time()
| housenumber | street | city | state | zip | country |
|---|---|---|---|---|---|
| House Number Not Found | Street Not Found | City Not Found | State Not Found | Postcode Not Found | Country Not Found |
| House Number Not Found | Street Not Found | Long Beach | California | 90813 | United States of America |
| House Number Not Found | Street Not Found | City Not Found | Arizona | Postcode Not Found | United States of America |
| House Number Not Found | Street Not Found | City Not Found | State Not Found | Postcode Not Found | Country Not Found |
| House Number Not Found | Street Not Found | Pelican | Louisiana | 71063 | United States of America |
| House Number Not Found | Street Not Found | City Not Found | State Not Found | Postcode Not Found | Country Not Found |
As you can see in the above table, some observations have values not found because most of the coordinates provided were remote locations. The process took almost 9.9 minutes, so it is something to take into consideration with large datasets.
This package has only a few included maps to reverse geocoding. Therefore, if some of the following maps have the information that you want, this is the package for you.
Because the maps for the USA are more detailed, I am going to use the United States fire data to obtain the country, state, and county. Table 3 shows the first results. The code that I used is the following:
library(maps)
startm <- Sys.time()
#I am doing the reverse geocoding using the function map.where from the maps package.
#Here I used three maps of the package; world, state and county
country<-map.where(database="world",
usafire$longitude, usafire$latitude)
state<-map.where(database="state",
usafire$longitude, usafire$latitude)
county<-map.where(database="county",
usafire$longitude, usafire$latitude)
endm <- Sys.time()
| country | state | county |
|---|---|---|
| USA | ohio | ohio,cuyahoga |
| USA | ohio | ohio,cuyahoga |
| USA | ohio | ohio,cuyahoga |
| USA | ohio | ohio,cuyahoga |
| USA | ohio | ohio,cuyahoga |
| USA | pennsylvania | pennsylvania,allegheny |
The advantage of this method is that it only took 0.2 seconds, so it is one of the best alternatives if you only need general information.
This method works in a way similar to the maps package. However, you need to provide the maps with the boundaries. In this example, I am going to use the fire data of Australia and the following three maps:
Each of the maps is stored in a separate folder. See my GitHub respository if you need more details about how I stored the maps. The code that I used is the following:
library(sp)
library(rgdal)
startg <- Sys.time()
#Reading each of the maps. dsn is the folder of the map and layer is the name of the .shp file inside.
countries_map<- readOGR(dsn="country_map", layer="ne_10m_admin_0_countries")
states_map <- readOGR(dsn="states_map", layer="ne_10m_admin_1_states_provinces")
lgas_map <- readOGR(dsn="lga_nsw_map", layer="NSW_LGA_POLYGON_shp")
#This is a function to reverse geocoding based on coordinates
rev_geo<-function(lat,long){
#First the coordinates are transformed to spatialpoints
points<-SpatialPoints(matrix(c(long,
lat),ncol=2,nrow=1))
#Creating a projection of the coordinates on the map of countries
proj4string(points) <- proj4string(countries_map)
#To see where the name of the country is stored in the map object, you need to explore it in R and see the “data” element. In this case, “NAME” has the information that we want. The function over returns the name of the country given the coordinates projected in the countries_map
country<-as.character(over(points, countries_map)$NAME)
#The same for state
proj4string(points) <- proj4string(states_map)
state<-as.character(over(points, states_map)$name)
#The same for LGA (I have only the map for NSW LGAs)
proj4string(points) <- proj4string(lgas_map)
LGA<-as.character(over(points, lgas_map)$NSW_LGA__3)
return(as.vector(c(country,state,LGA)))
}
library(snow)
library(foreach)
library(doParallel)
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl)
#Now for each row in the dataset I am going to return the reverse geocoding
# I am using parallel processing here to make the process faster
map_info<-foreach(i=1:nrow(ausfire),
.packages = c("sp", "rgdal"), .combine=rbind) %dopar% {
rev_geo(as.numeric(ausfire[i,"latitude"]),
as.numeric(ausfire[i,"longitude"]))
}
stopCluster(cl)
endg <- Sys.time()
Table 4 shows the results of using this method.
| Country | State | LGA |
|---|---|---|
| Australia | New South Wales | SNOWY VALLEYS |
| Australia | New South Wales | SNOWY VALLEYS |
| Australia | New South Wales | EUROBODALLA |
| Australia | New South Wales | EUROBODALLA |
| Australia | New South Wales | SNOWY VALLEYS |
| Australia | New South Wales | SNOWY VALLEYS |
This method took 3.3 minutes.
Table 5 shows the processing time of each of the methods. However, they are not enterily comparable because the result is different.
| Method | Time_minutes |
|---|---|
| Photon | 9.883 |
| Maps | 0.004 |
| sp & rgdal | 3.263 |
Depending on your objective, now you have three methods that you can use. If you only need general information maps package is the fastest. On the other extreme, you can get detailed info using Photon. Finally, in the middle, you have a customized method to reverse geocoding using sp and rgdal packages.