This markdown file is adapted from week 01 from Applied Spatial Analysis in Public Health online course available from https://hughst.github.io/. We would like to thank the contributors of the course who have provided the datasets and course in open domain. Though the datasets used in this code are downloaded from the gihub repository of the course, the major differences from the course website are:-
1. We have used sf package as compared to sp and rgdal package used in the course.
2. We have made plots using ggplot, and ggspatial packages as compared to base plot and spplot functions in the course.
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(raster)
## Loading required package: sp
library(leaflet)
suppressMessages(library(tidyverse))
Malaria data from Ethopia has been used in the course and we will also demonstrate learnings using same dataset.
ETH_malaria_data <- read.csv("https://raw.githubusercontent.com/HughSt/HughSt.github.io/master/course_materials/week1/Lab_files/Data/mal_data_eth_2009_no_dups.csv",
header=T)
names(ETH_malaria_data)
## [1] "country" "country_id" "continent_id" "site_id" "site_name"
## [6] "latitude" "longitude" "rural_urban" "year_start" "lower_age"
## [11] "upper_age" "examined" "pf_pos" "pf_pr" "method"
## [16] "title1" "citation1"
examined column tells about about how many tested
pf_pos tells about how many were falciparum positive
pf_pr tells about falciparum rate or proportion positive
options(tibble.width = Inf)
head(ETH_malaria_data,4)
plot(Hmisc::describe(ETH_malaria_data))
## $Categorical
##
## $Continuous
As a first step, since we have csv file with lat long columns, a scatter plot can provide basic visualisation of dataset
ETH_malaria_data %>%
ggplot(aes(x = longitude, y = latitude))+
geom_point()+
labs(title = "Mapping school locations using csv dataset")
The above figure helps us understand school locations but all points are same. To get further insights, we will determine the size of the point based on falciparum rate in the school.
ETH_malaria_data %>%
ggplot(aes(x = longitude, y = latitude, size = pf_pr))+
geom_point()+
labs(title = "Location of schools with their size representing the falciparum rate in the school")
To perform spatial operations, the csv dataframe should be converted to sf object.
dat <- st_as_sf(
ETH_malaria_data,
coords = c("longitude", "latitude"),
crs = 4326
)
dat %>%
ggplot()+
geom_sf()+
labs(title = "Location of Schools")+
xlab("Longitude")+
ylab("Latitude")
ggplot()+
geom_sf(data = dat, aes(size = pf_pr, color = pf_pr))+
labs(title = "Location of schools with size and color according to falciparum rate",
color = "Falciparum Rate",
size = "Falciparum\nRate")+
xlab("Longitude")+
ylab("Latitude")+
ggspatial::annotation_north_arrow(location = "br")+
ggspatial::annotation_scale(location = "bl")
The shape file for Ethopia is available as a zipped folder on github repository for the course. You can click on the link below to download the same. Once downloaded, store the extracted files into the working directory and load it. https://github.com/HughSt/HughSt.github.io/raw/master/course_materials/week1/Lab_files/Data/ETH_Adm1_shapefile.zip
shp <- st_read("./ETH_Adm1_shapefile/ETH_Adm_1.shp")
## Reading layer `ETH_Adm_1' from data source `C:\Users\Gurpreet\OneDrive\r_scripts\Applied_spatial_analysis_course\ETH_Adm1_shapefile\ETH_Adm_1.shp' using driver `ESRI Shapefile'
## Simple feature collection with 11 features and 10 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 33.00154 ymin: 3.398823 xmax: 47.95823 ymax: 14.84548
## geographic CRS: WGS 84
ggplot()+
geom_sf(data = shp)+
labs(title = "Ethopia shapefile overview")
ggplot()+
geom_sf(data = shp)+
geom_sf(data = dat, aes(size = pf_pr, color = pf_pr))+
labs(title = "Location of schools studied in Ethopia with size and color according to falciparum rate",
color = "Falciparum Rate",
size = "Falciparum\nRate")+
xlab("Longitude")+
ylab("Latitude")+
ggspatial::annotation_north_arrow(location = "br")+
ggspatial::annotation_scale(location = "bl")
Many a times it becomes difficult to get a shape file of the stud area. Leaflet package helps us to use web maps from open sources such as Open Street Map, ESRI maps, cartoDB, etc.
The first step when using web maps include creation of a basemp. So, lets create a basemap from open street view as well as ESRI and cartoDB.
basemap <- leaflet() %>% addTiles()
basemap
basemap <- leaflet() %>% addProviderTiles("Esri.WorldImagery")
basemap
basemap <- leaflet() %>% addProviderTiles("CartoDB.Positron")
basemap
Once basemap is chosen, further layers can be added using pipe ( %>% )command
basemap %>% addPolygons(data = shp)
The color, fillOpacity, line opacity, lineweight can be modified further.
basemap %>% addPolygons(data = shp,
color = "green",
fillOpacity = 0.1,
opacity = 0.5,
weight = 1)
Pop ups can also be added if required. To know the name, click on the respective polygon
basemap %>% addPolygons(data = shp,
color = "green",
fillOpacity = 0.1,
opacity = 0.5,
weight = 1,
popup = shp$NAME_1)
Point data can be added using addCircleMarkers function.
basemap %>% addCircleMarkers(data = dat,
color = "#990000",
radius = 1)
In case you want to add both polygon and point data, the same can be carried out using pipe links.
basemap %>% addPolygons(data = shp,
color = "green",
fillOpacity = 0.1,
opacity = 0.5,
weight = 1,
popup = shp$NAME_1) %>%
addCircleMarkers(data = dat,
color = "#990000",
radius = 1)
In leaflet package, using colorNumeric function customised color paletters can be developed.
We can use wes_palette function from wesanderson package to feed the palette details to leaflet function.
custom_palette <- colorNumeric(wesanderson::wes_palette("Zissou1")[1:5], dat$pf_pr, n = 5)
basemap %>% addPolygons(data = shp,
color = "green",
fillOpacity = 0.1,
opacity = 0.5,
weight = 1,
popup = shp$NAME_1) %>%
addCircleMarkers(data = dat,
color = custom_palette(dat$pf_pr),
radius = 1)
basemap %>% addPolygons(data = shp,
color = "green",
fillOpacity = 0.1,
opacity = 0.5,
weight = 1,
popup = shp$NAME_1) %>%
addCircleMarkers(data = dat,
color = custom_palette(dat$pf_pr),
radius = 1,
popup = paste("Prevalence = ",as.character(dat$pf_pr))) %>%
addLegend(pal = custom_palette,
title = "Falciparum rate",
values = dat$pf_pr)
Now, we will load and plot raster data. We will be using the same raster file as provided in the course from github repository. The raster file is on elevation profile of the study area.
elevation_raster <- raster("https://github.com/HughSt/HughSt.github.io/raw/master/course_materials/week1/Lab_files/Data/elev_ETH.tif")
plot(elevation_raster)
basemap %>% addRasterImage(elevation_raster)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition
custom_palette <- colorNumeric(topo.colors(64), values(elevation_raster), na.color = NA)
basemap %>% addRasterImage(elevation_raster, color = custom_palette) %>%
addLegend(values = values(elevation_raster), pal = custom_palette)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition