In this class we study the basic commands and functions for spatial analysis. In order to start with the basic understanding of spatial analysis, we install basic packages sf, raster, spData, and spDataLarge.
sf: This library helps us to work with Vector data, i.e. maps, lines, points, and geography information in a 2D maps.raster: This library helps us to work with raster data, i.e. maps with other kind of information than vectors such as: land altitute, mountains information, and so on.spData: This library helps us to upload geographic data and work with it easily.spDataLarge: This library helps us to upload geographic data with large information that commonly can not be done in spData.#install.packages("sf")
#install.packages("raster")
#install.packages("spData")
#remotes::install_github("Nowosad/spDataLarge")
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(raster)
## Loading required package: sp
library(spData)
library(spDataLarge)
The vignette commnad let us to search information from the command we are curious.
#vignette(package = "sf") # see which vignettes are available
#vignette("sf1") # an introduction to the package
Here by using data got from spData we have access to:
We list the name of the variables in some of our datasets, then we plot them by using the comand plot:
names(world)
## [1] "iso_a2" "name_long" "continent" "region_un" "subregion" "type"
## [7] "area_km2" "pop" "lifeExp" "gdpPercap" "geom"
names(baltimore)
## [1] "STATION" "PRICE" "NROOM" "DWELL" "NBATH" "PATIO" "FIREPL"
## [8] "AC" "BMENT" "NSTOR" "GAR" "AGE" "CITCOU" "LOTSZ"
## [15] "SQFT" "X" "Y"
names(us_states)
## [1] "GEOID" "NAME" "REGION" "AREA" "total_pop_10"
## [6] "total_pop_15" "geometry"
plot(world)
## Warning: plotting the first 9 out of 10 attributes; use max.plot = 10 to plot
## all
plot(baltimore)
plot(us_states)
We obtain information and summarize it by using the command summary:
data(world)
summary(world["lifeExp"])
## lifeExp geom
## Min. :50.62 MULTIPOLYGON :177
## 1st Qu.:64.96 epsg:4326 : 0
## Median :72.87 +proj=long...: 0
## Mean :70.85
## 3rd Qu.:76.78
## Max. :83.59
## NA's :10
summary(world)
## iso_a2 name_long continent region_un
## Length:177 Length:177 Length:177 Length:177
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## subregion type area_km2 pop
## Length:177 Length:177 Min. : 2417 Min. :5.630e+04
## Class :character Class :character 1st Qu.: 46185 1st Qu.:3.755e+06
## Mode :character Mode :character Median : 185004 Median :1.040e+07
## Mean : 832558 Mean :4.282e+07
## 3rd Qu.: 621860 3rd Qu.:3.075e+07
## Max. :17018507 Max. :1.364e+09
## NA's :10
## lifeExp gdpPercap geom
## Min. :50.62 Min. : 597.1 MULTIPOLYGON :177
## 1st Qu.:64.96 1st Qu.: 3752.4 epsg:4326 : 0
## Median :72.87 Median : 10734.1 +proj=long...: 0
## Mean :70.85 Mean : 17106.0
## 3rd Qu.:76.78 3rd Qu.: 24232.7
## Max. :83.59 Max. :120860.1
## NA's :10 NA's :17
Lets reduce the data set for get a sample from the first two rows and the first three columns, i.e. first 3 variables. Note that the information includes the variable geometry because it includes the geographic needed information.
world_mini <- world[1:2, 1:3]
world_mini
## Simple feature collection with 2 features and 3 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -180 ymin: -18.28799 xmax: 180 ymax: -0.95
## geographic CRS: WGS 84
## # A tibble: 2 x 4
## iso_a2 name_long continent geom
## <chr> <chr> <chr> <MULTIPOLYGON [°]>
## 1 FJ Fiji Oceania (((180 -16.06713, 180 -16.55522, 179.3641 -16.8013~
## 2 TZ Tanzania Africa (((33.90371 -0.95, 34.07262 -1.05982, 37.69869 -3.~
With the st_as_sf we modify our data to simple feature type data.
library(sp)
(world_sp <- as(world, Class = "Spatial"))
## class : SpatialPolygonsDataFrame
## features : 177
## extent : -180, 180, -90, 83.64513 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## variables : 10
## names : iso_a2, name_long, continent, region_un, subregion, type, area_km2, pop, lifeExp, gdpPercap
## min values : AE, Afghanistan, Africa, Africa, Antarctica, Country, 2416.87048266498, 56295, 50.621, 597.135168986395
## max values : ZW, Zimbabwe, South America, Seven seas (open ocean), Western Europe, Sovereign country, 17018507.4094666, 1364270000, 83.5878048780488, 120860.06755829
(world_sf <- st_as_sf(world_sp))
## Simple feature collection with 177 features and 10 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513
## geographic CRS: WGS 84
## First 10 features:
## iso_a2 name_long continent region_un subregion
## 1 FJ Fiji Oceania Oceania Melanesia
## 2 TZ Tanzania Africa Africa Eastern Africa
## 3 EH Western Sahara Africa Africa Northern Africa
## 4 CA Canada North America Americas Northern America
## 5 US United States North America Americas Northern America
## 6 KZ Kazakhstan Asia Asia Central Asia
## 7 UZ Uzbekistan Asia Asia Central Asia
## 8 PG Papua New Guinea Oceania Oceania Melanesia
## 9 ID Indonesia Asia Asia South-Eastern Asia
## 10 AR Argentina South America Americas South America
## type area_km2 pop lifeExp gdpPercap
## 1 Sovereign country 19289.97 885806 69.96000 8222.254
## 2 Sovereign country 932745.79 52234869 64.16300 2402.099
## 3 Indeterminate 96270.60 NA NA NA
## 4 Sovereign country 10036042.98 35535348 81.95305 43079.143
## 5 Country 9510743.74 318622525 78.84146 51921.985
## 6 Sovereign country 2729810.51 17288285 71.62000 23587.338
## 7 Sovereign country 461410.26 30757700 71.03900 5370.866
## 8 Sovereign country 464520.07 7755785 65.23000 3709.082
## 9 Sovereign country 1819251.33 255131116 68.85600 10003.089
## 10 Sovereign country 2784468.59 42981515 76.25200 18797.548
## geometry
## 1 MULTIPOLYGON (((180 -16.067...
## 2 MULTIPOLYGON (((33.90371 -0...
## 3 MULTIPOLYGON (((-8.66559 27...
## 4 MULTIPOLYGON (((-122.84 49,...
## 5 MULTIPOLYGON (((-122.84 49,...
## 6 MULTIPOLYGON (((87.35997 49...
## 7 MULTIPOLYGON (((55.96819 41...
## 8 MULTIPOLYGON (((141.0002 -2...
## 9 MULTIPOLYGON (((141.0002 -2...
## 10 MULTIPOLYGON (((-68.63401 -...
Lets plot data for some variables from our world dataset:
plot(world[3:6])
plot(world["pop"])
In order to get information only for part of the data, first we explore what we information we have in the variable continent, and thwen we can create subsamples for Aia and South America:
table(world$continent)
##
## Africa Antarctica Asia
## 51 1 47
## Europe North America Oceania
## 39 18 7
## Seven seas (open ocean) South America
## 1 13
(world_asia <- world[world$continent == "Asia", ])
## Simple feature collection with 47 features and 10 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 26.04335 ymin: -10.35999 xmax: 145.5431 ymax: 55.38525
## geographic CRS: WGS 84
## # A tibble: 47 x 11
## iso_a2 name_long continent region_un subregion type area_km2 pop lifeExp
## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 KZ Kazakhst~ Asia Asia Central ~ Sove~ 2729811. 1.73e7 71.6
## 2 UZ Uzbekist~ Asia Asia Central ~ Sove~ 461410. 3.08e7 71.0
## 3 ID Indonesia Asia Asia South-Ea~ Sove~ 1819251. 2.55e8 68.9
## 4 TL Timor-Le~ Asia Asia South-Ea~ Sove~ 14715. 1.21e6 68.3
## 5 IL Israel Asia Asia Western ~ Coun~ 22991. 8.22e6 82.2
## 6 LB Lebanon Asia Asia Western ~ Sove~ 10099. 5.60e6 79.2
## 7 PS Palestine Asia Asia Western ~ Disp~ 5037. 4.29e6 73.1
## 8 JO Jordan Asia Asia Western ~ Sove~ 89207. 8.81e6 74.0
## 9 AE United A~ Asia Asia Western ~ Sove~ 79881. 9.07e6 76.9
## 10 QA Qatar Asia Asia Western ~ Sove~ 11328. 2.37e6 77.9
## # ... with 37 more rows, and 2 more variables: gdpPercap <dbl>,
## # geom <MULTIPOLYGON [°]>
(asia <- st_union(world_asia))
## Geometry set for 1 feature
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 26.04335 ymin: -10.35999 xmax: 145.5431 ymax: 55.38525
## geographic CRS: WGS 84
## MULTIPOLYGON (((120.295 -10.25865, 118.9678 -9....
world_samerica = world[world$continent == "South America", ]
samerica = st_union(world_samerica)
Then, we can plot the data for the variable continent:
For this plot, cex it help us to refer to some information we are going to use in our map, in this case we use population. st_centroid help us to calculate the center of a polygon that going to be included in our map by converting the geometry of polygons into points.
plot(world["continent"], reset = FALSE)
cex <- sqrt(world$pop) / 10000
world_cents <- st_centroid(world, of_largest = TRUE)
## Warning in st_centroid.sf(world, of_largest = TRUE): st_centroid assumes
## attributes are constant over geometries of x
## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon =
## of_largest_polygon): st_centroid does not give correct centroids for longitude/
## latitude data
plot(st_geometry(world_cents), add = TRUE, cex = cex)
Lets plot the map for china: expandBB takes a numeric vector of lenght four that expands the bounding bo of th eplot relative to zero in the following order: bottom, left, top, right.
china = world[world$name_long == "China", ]
plot(st_geometry(china), expandBB = c(0, 0.2, 0.1, 1), col = "gray", lwd = 3)
plot(world_asia[0], add = TRUE)
Lets try with another country:
peru = world[world$name_long == "Peru", ]
plot(st_geometry(peru), expandBB = c(0.5, 1, 0.5, 1), col = "red", lwd = 4)
plot(world_samerica[0], add = TRUE)
Geometries are composed by a set of mathematical representations of spatial relationships, measures, and properties. We have points, lines, and polygons.
## MULTIPOINT
multipoint_matrix = rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))
st_multipoint(multipoint_matrix)
## MULTIPOINT ((5 2), (1 3), (3 4), (3 2))
plot(st_multipoint(multipoint_matrix), axes = TRUE)
## LINESTRING
linestring_matrix = rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2))
st_linestring(linestring_matrix)
## LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)
plot(st_linestring(linestring_matrix), axes = TRUE)
## POLYGON
polygon_list = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
st_polygon(polygon_list)
## POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))
plot(st_polygon(polygon_list), axes = TRUE)
## POLYGON with a hole
polygon_border = rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))
polygon_hole = rbind(c(2, 4), c(3, 4), c(3, 3), c(2, 3), c(2, 4))
polygon_with_hole_list = list(polygon_border, polygon_hole)
st_polygon(polygon_with_hole_list)
## POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5), (2 4, 3 4, 3 3, 2 3, 2 4))
plot(st_polygon(polygon_with_hole_list), axes = TRUE)
## MULTILINESTRING
multilinestring_list = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)),
rbind(c(1, 2), c(2, 4)))
st_multilinestring((multilinestring_list))
## MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))
plot(st_multilinestring((multilinestring_list)), axes = TRUE)
## MULTIPOLYGON
multipolygon_list = list(list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))),
list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2))))
st_multipolygon(multipolygon_list)
## MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5)), ((0 2, 1 2, 1 3, 0 3, 0 2)))
plot(st_multipolygon(multipolygon_list), axes = TRUE)
## GEOMETRYCOLLECTION
gemetrycollection_list = list(st_multipoint(multipoint_matrix),
st_linestring(linestring_matrix))
st_geometrycollection(gemetrycollection_list)
## GEOMETRYCOLLECTION (MULTIPOINT ((5 2), (1 3), (3 4), (3 2)), LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))
plot(st_geometrycollection(gemetrycollection_list), axes = TRUE)
A sfc object only contains a single simple feature geometry.
# sfc POINT
point1 = st_point(c(5, 2))
point2 = st_point(c(1, 3))
points_sfc = st_sfc(point1, point2)
points_sfc
## Geometry set for 2 features
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1 ymin: 2 xmax: 5 ymax: 3
## CRS: NA
## POINT (5 2)
## POINT (1 3)
plot(points_sfc)
# sfc POLYGON
polygon_list1 = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
polygon1 = st_polygon(polygon_list1)
polygon_list2 = list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2)))
polygon2 = st_polygon(polygon_list2)
polygon_sfc = st_sfc(polygon1, polygon2)
st_geometry_type(polygon_sfc)
## [1] POLYGON POLYGON
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
plot(polygon_sfc)
# sfc MULTILINESTRING
multilinestring_list1 = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)),
rbind(c(1, 2), c(2, 4)))
multilinestring1 = st_multilinestring((multilinestring_list1))
multilinestring_list2 = list(rbind(c(2, 9), c(7, 9), c(5, 6), c(4, 7), c(2, 7)),
rbind(c(1, 7), c(3, 8)))
multilinestring2 = st_multilinestring((multilinestring_list2))
multilinestring_sfc = st_sfc(multilinestring1, multilinestring2)
st_geometry_type(multilinestring_sfc)
## [1] MULTILINESTRING MULTILINESTRING
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
plot(multilinestring_sfc)
# sfc GEOMETRY
point_multilinestring_sfc = st_sfc(point1, multilinestring1)
st_geometry_type(point_multilinestring_sfc)
## [1] POINT MULTILINESTRING
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
The crs stands for Coordinate Reference System, which refers the way we represent earth’s surface in a 2D surface.
The WGS84 is the most popular coordination reference system. It stands for World Geodetic System made in 1984, and used for GPS systems. Other popular reference system is the NAD83, the North American Datum of 1983, it uses the center of the earth as coordinate origin.
## sfc objects can additionally store information on the coordinate reference systems (CRS).
st_crs(points_sfc)
## Coordinate Reference System: NA
# EPSG definition
points_sfc_wgs = st_sfc(point1, point2, crs = 4326)
st_crs(points_sfc_wgs)
## Coordinate Reference System:
## User input: EPSG:4326
## wkt:
## GEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## USAGE[
## SCOPE["unknown"],
## AREA["World"],
## BBOX[-90,-180,90,180]],
## ID["EPSG",4326]]
# PROJ4STRING definition
st_sfc(point1, point2, crs = "+proj=longlat +datum=WGS84 +no_defs")
## Geometry set for 2 features
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1 ymin: 2 xmax: 5 ymax: 3
## CRS: +proj=longlat +datum=WGS84 +no_defs
## POINT (5 2)
## POINT (1 3)
############## The sf class
lnd_point = st_point(c(0.1, 51.5)) # sfg object
lnd_geom = st_sfc(lnd_point, crs = 4326) # sfc object
lnd_attrib = data.frame( # data.frame object
name = "London",
temperature = 25,
date = as.Date("2017-06-21")
)
lnd_sf = st_sf(lnd_attrib, geometry = lnd_geom) # sf object
lnd_sf
## Simple feature collection with 1 feature and 3 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 0.1 ymin: 51.5 xmax: 0.1 ymax: 51.5
## geographic CRS: WGS 84
## name temperature date geometry
## 1 London 25 2017-06-21 POINT (0.1 51.5)
class(lnd_sf)
## [1] "sf" "data.frame"
It is composed by cells formed by vertical and horizontal lines, each of those cells are called pixeles. Each of those cells might has information saved in it such as color, elevation information, and so on.
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
new_raster = raster(raster_filepath)
# Zion National Park (Utah, USA). For example, srtm.tif is a digital elevation model of this area
new_raster
## class : RasterLayer
## dimensions : 457, 465, 212505 (nrow, ncol, ncell)
## resolution : 0.0008333333, 0.0008333333 (x, y)
## extent : -113.2396, -112.8521, 37.13208, 37.51292 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : C:/Program Files/R/R-4.0.2/library/spDataLarge/raster/srtm.tif
## names : srtm
## values : 1024, 2892 (min, max)
ncell(new_raster) #number of cells (pixels)
## [1] 212505
res(new_raster) #spatial resolution
## [1] 0.0008333333 0.0008333333
plot(new_raster)
spplot(new_raster) #new plot type
#Rasters can also be created from scratch using the raster() function.
new_raster2 = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 1:36)
plot(new_raster2)
A rasterbrick consists of multiple layers, which typically correspond to a single mutispectral stellite file or a single multilayer object in memory. the brick() function creates a Rasterbrick object.
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
r_brick = brick(multi_raster_file)
r_brick
## class : RasterBrick
## dimensions : 1428, 1128, 1610784, 4 (nrow, ncol, ncell, nlayers)
## resolution : 30, 30 (x, y)
## extent : 301905, 335745, 4111245, 4154085 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=12 +datum=WGS84 +units=m +no_defs
## source : C:/Program Files/R/R-4.0.2/library/spDataLarge/raster/landsat.tif
## names : landsat.1, landsat.2, landsat.3, landsat.4
## min values : 7550, 6404, 5678, 5252
## max values : 19071, 22051, 25780, 31961
nlayers(r_brick)
## [1] 4
raster_on_disk = raster(r_brick, layer = 1)
raster_in_memory = raster(xmn = 301905, xmx = 335745,
ymn = 4111245, ymx = 4154085,
res = 30)
values(raster_in_memory) = sample(seq_len(ncell(raster_in_memory)))
crs(raster_in_memory) = crs(raster_on_disk)
r_stack = stack(raster_in_memory, raster_on_disk)
r_stack
## class : RasterStack
## dimensions : 1428, 1128, 1610784, 2 (nrow, ncol, ncell, nlayers)
## resolution : 30, 30 (x, y)
## extent : 301905, 335745, 4111245, 4154085 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=12 +datum=WGS84 +units=m +no_defs
## names : layer, landsat.1
## min values : 1, 7550
## max values : 1610784, 19071
Decision on which Raster class should be used depends mostly on the character of input data. Processing of a single mulitilayer file or object is the most effective with RasterBrick, while RasterStack allows calculations based on many files, many Raster* objects, or both.