GeoJSON is an open data format that extends JSON, the JavaScript Object Notation to encode spatial information.
GeoJSON has the following advantages:
The disadvantages are:
As illustrated by this post, it is easy to read and write GeoJSON files in R, using the rgdal package, which calls gdal commands. Here’s an example using publicly available GeoJSON file:
library(rgdal) # the library we'll be using
## Loading required package: sp
## rgdal: version: 0.9-3, (SVN revision 530)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.11.2, released 2015/02/10
## Path to GDAL shared files: /usr/share/gdal/1.11
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: (autodetected)
## Linking to sp version: 1.1-0
u <- "https://raw.githubusercontent.com/codeforboston/open_data_cambridge/master/Infrastructure/UtilityFacilities.geojson"
downloader::download(url = u, destfile = "/tmp/gas.GeoJSON")
gas <- readOGR(dsn = "/tmp/gas.GeoJSON", layer = "OGRGeoJSON")
## OGR data source with driver: GeoJSON
## Source: "/tmp/gas.GeoJSON", layer: "OGRGeoJSON"
## with 14 features
## It has 3 fields
summary(gas)
## Object of class SpatialPointsDataFrame
## Coordinates:
## min max
## coords.x1 -71.14731 -71.07930
## coords.x2 42.36032 42.39874
## Is projected: FALSE
## proj4string :
## [+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0]
## Number of points: 14
## Data attributes:
## SITE_NAME ADDRESS
## 3rd Street :1 10-12 Moulton St :1
## Alewife Bulk :1 10 Ware St :1
## AT&T Bent St Switch:1 12 Cambridge Center :1
## BBN Technology :1 191 Alweife Brook Parkway:1
## Biogen Inc :1 200 Bent St :1
## ComGas :1 250 Bent St :1
## (Other) :8 (Other) :8
## TYPE
## Electric Generating Plant :1
## Gas Regulating Station :2
## Major Data Hub :1
## Major Electric Transformer Station :4
## Power Generating Plant :2
## Steam Power Generating Plant :1
## Telephone office and longline switch:3
Note that to test whether the GeoJSON driver works on your system, you can use the following command:
"GeoJSON" %in% ogrDrivers()$name
## [1] TRUE
As shown in the above example, GeoJSON files can have coordinate reference system (CRS) allocated to them through the "crs"
‘member’:
"crs": {
"type": "name",
"properties": {
"name": "urn:ogc:def:crs:OGC:1.3:CRS84"
}
}
Note that we can see this definition in the gas.GeoJSON
file:
readLines("/tmp/gas.GeoJSON")[3]
## [1] " "
This ‘member’ is defined in plain text in the GeoJSON file before the spatial "features"
are defined. Please see this small example of gas stations to see how the CRS is defined in the context of a complete GeoJSON file.
To write GeoJSON files, the sensibly named corrollary of readOGR
is used, writeOGR
:
if(file.exists("/tmp/gas2.geojson")){
file.remove("/tmp/gas2.geojson")
}
geojsonio::geojson_write(gas, file = "/tmp/gas2.geojson")
## Success! File is at /tmp/gas2.geojson
## <geojson>
## Path: /tmp/gas2.geojson
## From class: SpatialPointsDataFrame
Based on this, geojson files can usually be loaded into R with readOGR
, as illustrated above.
However, this fails for multifeature geojsons.
Reproducible example:
downloader::download("https://github.com/Robinlovelace/Creating-maps-in-R/raw/master/data/test-multifeature.geojson", "test.geojson")
try(test <- rgdal::readOGR("test.geojson", "OGRGeoJSON")) # fails with:
There are 2 solutions. One uses rgdal and the gdal argument require_geomType
. The other uses geojsonio.
The first solution uses new features introduced in gdal 0.9-3. It is worth checking (and maybe updating) your gdal version. (Thanks to [@hrbrmstr](https://stackoverflow.com/users/1457051/hrbrmstr) for this detailed information.)
rgdal::getGDALVersionInfo() # check the version of gdal
## [1] "GDAL 1.11.2, released 2015/02/10"
system("gdal-config --version") # asking gdal its version from bash
Before update, this was:
[1] "GDAL 1.11.1, released 2014/09/24"
After updating gdal, e.g. like this (Ubuntu), it changed:
$ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable && sudo apt-get update
$ sudo apt-get install gdal-bin
gdal can provide us with information about the layers in the multipart geojson.
ogrListLayers("test.geojson")
## [1] "OGRGeoJSON"
## attr(,"driver")
## [1] "GeoJSON"
## attr(,"nlayers")
## [1] 1
It can also provide more general information about the object, if we ask it correctly.
try(ogrInfo("test.geojson")) # no information on the file
system("ogrinfo test.geojson ")
try(ogrInfo("test.geojson", "OGRGeoJSON"))
The output from the previous lines indicates how we can access just one layer. We use the argument require_geomType
to specify which of the geometries to load:
testp_info <-
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint")
## NOTE: keeping only 98 wkbPoint of 781 features
## note that extent applies to all features
testp_df <- as.data.frame(testp_info$iteminfo)
head(testp_df)
## name type length typeName maxListCount
## 1 area 4 0 String 0
## 2 bicycle 4 0 String 0
## 3 highway 4 0 String 0
## 4 lit 4 0 String 0
## 5 name 4 0 String 0
## 6 surface 4 0 String 0
Note we can use the same command for any of the geometries:
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType = "wkbPolygon")
NOTE: keeping only 23 wkbPolygon of 781 features
note that extent applies to all features
Source: "test.geojson", layer: "OGRGeoJSON"
Driver: GeoJSON number of rows 781
selected geometry type: wkbPolygon with 23 rows
Feature type: wkbPolygon with 2 dimensions
Extent: (12.48326 41.88355) - (12.5033 41.89629)
CRS: +proj=longlat +datum=WGS84 +no_defs
Number of fields: 78
name type length typeName
1 area 4 0 String
2 bicycle 4 0 String
3 highway 4 0 String
Armed with this information, we can read in the features:
points <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint")
## OGR data source with driver: GeoJSON
## Source: "test.geojson", layer: "OGRGeoJSON"
## with 781 features;
## Selected wkbPoint feature type, with 98 rows
## It has 78 fields
## NOTE: keeping only 98 wkbPoint of 781 features
lines <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbLineString")
## OGR data source with driver: GeoJSON
## Source: "test.geojson", layer: "OGRGeoJSON"
## with 781 features;
## Selected wkbLineString feature type, with 660 rows
## It has 78 fields
## NOTE: keeping only 660 wkbLineString of 781 features
polygons <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPolygon")
## OGR data source with driver: GeoJSON
## Source: "test.geojson", layer: "OGRGeoJSON"
## with 781 features;
## Selected wkbPolygon feature type, with 23 rows
## It has 78 fields
## NOTE: keeping only 23 wkbPolygon of 781 features
Let’s plot the data (see https://stackoverflow.com/questions/30583048/ )
plot(lines, col="#7f7f7f")
plot(polygons, add=TRUE)
plot(points, add=TRUE, col="red")
This is an alternative solution:
test <- geojsonio::geojson_read("test.geojson")
Then we can filter out the feature types of interest:
obj = Filter(
function(f){f$geometry$type=="LineString"},
test$features)
mess <- 'ogr2ogr -where "OGR_GEOMETRY=\'LINESTRING\'" -f "GeoJSON" lines.geojson test.geojson'
system(mess)
testl <- rgdal::readOGR("lines.geojson", layer = "OGRGeoJSON")
## OGR data source with driver: GeoJSON
## Source: "lines.geojson", layer: "OGRGeoJSON"
## with 660 features
## It has 78 fields
Let’s see what R has created from writeOGR:
gas2 <- readLines("/tmp/gas.GeoJSON")
gas2[1:4]
## [1] "{"
## [2] "\"type\": \"FeatureCollection\","
## [3] " "
## [4] "\"features\": ["
Frustratingly, this fails to write the CRS into the file. rgdal’s ability to read a crs from a GeoJSON file is no guarantee of it’s ability to then write it out again. It seems rgdal has lost the CRS in translation.
This can problem be caused by using an old version of gdal
. Not in this case though: incomplete CRS values can lead rgdal
to omit the "crs"
in the file, as explained by Roger Bivand in a lengthy conversation on the R-sig-geo email list. To test that the problem is the fault of gdal and not R, we can use the ogr2ogr
command line tool, supplied by gdal.
# use call gdal from the operating system to convert original GeoJSON:
system("ogr2ogr -f 'ESRI Shapefile' /tmp/gas3.shp '/tmp/gas.GeoJSON' ")
gas3 <- readOGR(dsn = "/tmp", layer = "gas3")
## OGR data source with driver: ESRI Shapefile
## Source: "/tmp", layer: "gas3"
## with 14 features
## It has 3 fields
proj4string(gas3)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(gas)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
The above code shows that gdal has successfully converted the original GeoJSON file to a shapefile, maintaining the CRS. (Note that the CRS is identical but its definition is slightly different.) Let’s take a look at the .proj
file that gdal created:
gas3_proj <- readLines("/tmp/gas3.prj", skipNul = T)
## Warning in readLines("/tmp/gas3.prj", skipNul = T): incomplete final line
## found on '/tmp/gas3.prj'
gas3_proj
## [1] "GEOGCS[\"GCS_WGS_1984\",DATUM[\"D_WGS_1984\",SPHEROID[\"WGS_1984\",6378137,298.257223563]],PRIMEM[\"Greenwich\",0],UNIT[\"Degree\",0.017453292519943295]]"
There are various ways to solve the problem. The ‘outside R’ solution would be to write the file first as a shapefile, and then use ogr2ogr
to convert it to a GeoJSON. But as shown in the code below, this fails:
writeOGR(gas, "/tmp/", layer = "gas4", driver = "ESRI Shapefile")
system("ogr2ogr -f 'GeoJSON' /tmp/gas5.GeoJSON '/tmp/gas4.shp' ")
readLines("/tmp/gas4.prj")
## Warning in readLines("/tmp/gas4.prj"): incomplete final line found on '/
## tmp/gas4.prj'
## [1] "GEOGCS[\"GCS_WGS_1984\",DATUM[\"D_WGS_1984\",SPHEROID[\"WGS_1984\",6378137,298.257223563]],PRIMEM[\"Greenwich\",0],UNIT[\"Degree\",0.017453292519943295]]"
readLines("/tmp/gas5.GeoJSON")[1:3]
## [1] "{"
## [2] "\"type\": \"FeatureCollection\","
## [3] " "
As Roger Bivand shows, if the CRS is defined in certain ways, the "crs"
member will be written to the GeoJSON file. But it is difficult to know how to define a CRS other than by its EPSG name:
gas6 <- gas # duplicate the gas object
proj4string(gas6) <- CRS("+init=epsg:4283")
## Warning in `proj4string<-`(`*tmp*`, value = <S4 object of class structure("CRS", package = "sp")>): A new CRS was assigned to an object with an existing CRS:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## without reprojecting.
## For reprojection, use function spTransform in package rgdal
proj4string(gas6)
## [1] "+init=epsg:4283 +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs"
# the below works, but incorrect CRS
# proj4string(gas6) <- CRS("+proj=longlat +ellps=clrk66 +datum=NAD27")
geojsonio::geojson_write(gas6, file = "/tmp/gas6.geojson")
## Success! File is at /tmp/gas6.geojson
## <geojson>
## Path: /tmp/gas6.geojson
## From class: SpatialPointsDataFrame
readLines("/tmp/gas6.geojson")[1:4]
## [1] "{"
## [2] "\"type\": \"FeatureCollection\","
## [3] " "
## [4] "\"features\": ["
To solve the problem manually, we can simply add the correct projection to the GeoJSON file that has been created. Thankfully this is straightforward, as line 3 of the files are left empty by gdal, presumably ready for such an eventuality:
gas7 <- readLines("/tmp/gas2.geojson")
gas7[1:4]
## [1] "{"
## [2] "\"type\": \"FeatureCollection\","
## [3] "\"crs\": { \"type\": \"name\", \"properties\": { \"name\": \"urn:ogc:def:crs:OGC:1.3:CRS84\" } },"
## [4] " "
gas7[3] <- '"crs": {"type": "EPSG", "properties": {"code": 4283}},'
writeLines(gas7, con = "/tmp/gas8.GeoJSON")
gas8 <- readOGR(dsn = "/tmp/gas8.GeoJSON", layer = "OGRGeoJSON")
## OGR data source with driver: GeoJSON
## Source: "/tmp/gas8.GeoJSON", layer: "OGRGeoJSON"
## with 14 features
## It has 3 fields
proj4string(gas8)
## [1] "+proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs"
This is a messy solution but it’s the only one I could find to write GeoJSON files with a CRS defined for Australian data services’ default CRS.