Vector Data Analysis using R

Objectives

  1. Gain hands-on experience using the sf package to read and manipulate spatial data.
  2. Familiarize oneself with some commonly used dplyr functions.
  3. Perform table and spatial joins on spatial and non-spatial data.
  4. Learn basic mapping skills using the tmap and leaflet packages.

Overview

The sf package is very well documented. It offers well-built functionality for easy processing and manipulation of vector geographic data. It has a lot in common with PostGIS, because both support the objects and functions specified by the Open Geospatial Consortium (OGC) and refer to the corresponding formal standard (ISO 19125-1:2004). Leaflet is an open-source JavaScript library for web mapping. It is usually used in conjunction with HTML (HyperText Markup Language) and CSS (Cascading Style Sheets), and it is very functional and effective for interactive web mapping. The tidyverse is yet another powerful tool for data wrangling and processing in R. It is made up of a collection of packages including dplyr, tidyr, readr and purrr.

Let's load the following libraries.
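
The original code chunk is not shown here, so the list below is an assumption inferred from the functions used later in this tutorial. It is a minimal sketch that presumes all of these packages are already installed.

```r
# Assumed package set, inferred from the functions this tutorial mentions
library(sf)         # read and manipulate vector data
library(tidyverse)  # dplyr, tidyr, readr, purrr, stringr, ggplot2
library(tmap)       # thematic maps
library(leaflet)    # interactive web maps
library(htmltools)  # HTML helpers for leaflet labels
```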

Let's load all our data using read_sf() and read.csv().
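
The file paths used in the original workflow are not shown, so the sketch below writes a tiny toy layer and a toy roster to temporary files and reads them back. The file names, the column names (District, Student) and the toy geometry are all assumptions made for illustration. Only the names GhanaDistrict and ClassRoster come from this tutorial.

```r
library(sf)

# Hypothetical stand-ins for the tutorial's files: a tiny layer and a tiny
# roster are written to temporary files so the read calls below can run.
layer_path  <- tempfile(fileext = ".gpkg")
roster_path <- tempfile(fileext = ".csv")

square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
toy_layer <- st_sf(District = "Ho Municipal",
                   geometry = st_sfc(square, crs = 4326))
st_write(toy_layer, layer_path, quiet = TRUE)
write.csv(data.frame(Student = "S1", District = "Ho Municipal"),
          roster_path, row.names = FALSE)

GhanaDistrict <- read_sf(layer_path)    # sf object backed by a tibble
ClassRoster   <- read.csv(roster_path)  # plain data.frame
```

With real data, the two temporary paths would simply be replaced by the paths to the shapefile and the CSV file.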

Checking and Setting CRS

The coordinate reference system (CRS) defines how a two-dimensional spatial object relates to the surface of the earth. In this workflow we will use the WGS84 datum, with coordinates expressed in degrees of longitude and latitude.

Let's check the class and CRS of each spatial object we loaded, using the class() and st_crs() functions.
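
As a rough sketch of this check, the example below builds a toy layer without a CRS (an assumption standing in for the real GhanaDistrict data) and inspects it with class() and sf's st_crs().

```r
library(sf)

# Toy stand-in for one loaded layer, deliberately created without a CRS
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
GhanaDistrict <- st_sf(District = "Ho Municipal", geometry = st_sfc(square))

class(GhanaDistrict)          # the class vector includes "sf"
st_crs(GhanaDistrict)         # reports NA because no CRS is attached
is.na(st_crs(GhanaDistrict))  # TRUE when the CRS is missing
```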

They are all of class sf, which is what we want.

The GhanaDistrict spatial object has no CRS, so we need to set one.
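
One way to do this is with st_set_crs(). The sketch below assumes the coordinates are already in longitude/latitude on WGS84 (EPSG:4326) and uses a toy polygon in place of the real layer.

```r
library(sf)

# Toy layer without a CRS, standing in for the real GhanaDistrict data
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
GhanaDistrict <- st_sf(District = "Ho Municipal", geometry = st_sfc(square))

# Attach WGS84 (EPSG:4326); this labels the coordinates, it does not move them
GhanaDistrict <- st_set_crs(GhanaDistrict, 4326)
st_crs(GhanaDistrict)$epsg
```

Note that st_set_crs() only declares which CRS the existing coordinates are in. Converting coordinates from one CRS to another is the job of st_transform().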

Note that the data are in unprojected longitude/latitude (WGS84). Since we won't be computing any distance or area measurements, there is no need to transform or reproject them.

Table Join

This allows us to join attribute data to a spatial (vector) object based on a common field (key) shared by the two objects.

Let's create a table join, specifically an inner join. This narrows the data down to the districts we are most interested in working with.
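
The sketch below illustrates an inner join on toy data. The key column name District, the Student column and the toy geometries are assumptions, and one roster entry is deliberately misspelled to show how an unmatched key silently drops out of the result.

```r
library(sf)
library(dplyr)

# Helper that builds a unit square starting at x0 (toy geometry only)
make_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

GhanaDistrict <- st_sf(
  District = c("Ho Municipal", "Wassa Amenfi West Municipal", "Bole"),
  geometry = st_sfc(lapply(c(0, 1, 2), make_square), crs = 4326)
)
ClassRoster <- data.frame(
  Student  = c("S1", "S2", "S3"),
  District = c("Ho Municipal", "Ho Municipal", "Wasa Amenfi West Municipal")
)

# Keep only rows whose District value appears in both tables
joined <- inner_join(GhanaDistrict, ClassRoster, by = "District")
nrow(joined)     # 2: the misspelled "Wasa ..." entry found no match
joined$District
```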

We notice that Wasa Amenfi West Municipal was left out. Let's find out why using the str_which() function.
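
str_which() from stringr returns the positions of the elements that match a pattern. The vector and the search pattern below are assumptions chosen for illustration, since the original call is not shown.

```r
library(stringr)

# Toy vector of district names as they might appear in the roster
roster_districts <- c("Ho Municipal",
                      "Wasa Amenfi West Municipal",
                      "Oforikrom Municipal")

# Position(s) of the entries that contain "Amenfi"
hits <- str_which(roster_districts, "Amenfi")
hits                    # 2
roster_districts[hits]  # shows the spelling actually stored
```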

Note the spelling Wassa (double ‘ss’) compared with Wasa (single ‘s’) in the ClassRoster.
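
One possible fix, assuming the roster is the table being corrected so that it matches the district layer, is to rewrite the leading word with str_replace(). The column name District is an assumption.

```r
library(stringr)

# Toy roster with the single 's' spelling
ClassRoster <- data.frame(
  District = c("Ho Municipal", "Wasa Amenfi West Municipal")
)

# Replace a leading "Wasa " with "Wassa " and leave other names untouched
ClassRoster$District <- str_replace(ClassRoster$District, "^Wasa ", "Wassa ")
ClassRoster$District
```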

We can see that the spelling is now corrected.

We perform the inner join again with inner_join(). Note: the spatial object must come first; otherwise the output will be a plain data frame.
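
The effect of argument order can be checked on toy data. In the sketch below (column names and geometry are assumptions), the same join is run both ways and the class of each result is inspected.

```r
library(sf)
library(dplyr)

square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
GhanaDistrict <- st_sf(District = "Wassa Amenfi West Municipal",
                       geometry = st_sfc(square, crs = 4326))
ClassRoster <- data.frame(Student  = "S1",
                          District = "Wassa Amenfi West Municipal")

sf_first <- inner_join(GhanaDistrict, ClassRoster, by = "District")
df_first <- inner_join(ClassRoster, GhanaDistrict, by = "District")

inherits(sf_first, "sf")  # TRUE: the result keeps the sf class
inherits(df_first, "sf")  # FALSE: a data frame with a geometry list column
```

If the data frame does have to come first, the result can be turned back into a spatial object with st_as_sf().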

The result now has 30 rows, unlike before, so the join is complete.

We have both a Region and a region column showing. They hold the same data, so let's set one to NULL.
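
A minimal sketch on toy data follows. Which of the two columns the original workflow dropped is not shown, so removing the lower-case region here is an assumption.

```r
library(sf)

square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
joined <- st_sf(District = "Ho Municipal",
                Region   = "Volta",
                region   = "Volta",
                geometry = st_sfc(square, crs = 4326))

# Assigning NULL removes the duplicate attribute column
joined$region <- NULL
names(joined)
```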

Let's compute the top 3 districts with the highest number of MPhil students, 2019-2021.
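
One way to produce such a summary with dplyr is to count rows per district and keep the first three. The roster below is toy data with an assumed District column, not the real class roster. When working on an sf object, st_drop_geometry() can be applied first so that only the attribute table is summarised.

```r
library(dplyr)

# Toy roster: one row per student (illustrative values only)
roster <- data.frame(
  District = c("Ho Municipal", "Ho Municipal",
               "Mfantseman Municipal", "Mfantseman Municipal",
               "Oforikrom Municipal", "Oforikrom Municipal",
               "Bole")
)

# Count students per district, largest counts first, and keep three rows
top3 <- roster %>%
  count(District, sort = TRUE, name = "Students") %>%
  head(3)
top3
```

Because head(3) simply cuts the sorted table, districts tied with the third row would be left out. slice_max() with its default with_ties = TRUE is an alternative when ties should be kept.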

From the table summary above, we can deduce that Ho Municipal, Mfantseman Municipal and Oforikrom Municipal were each represented by two students, with every other municipality and district represented by one student.

Spatial Join

This describes how spatial objects relate to one another based on their location (x and y coordinates) and how they interact at that location (spatial overlay).

Let's create a spatial join between the Region and District shapefiles. This allows us to have all the information in one object.
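
The sketch below shows a spatial join with st_join() on toy polygons. The layer names, column names and geometries are assumptions, and so is the choice of predicate. st_join() defaults to st_intersects, which can match a district to every region whose boundary it touches, so st_within is used here instead.

```r
library(sf)

# Helper that builds an axis-aligned rectangle (toy geometry only)
box <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

regions <- st_sf(
  Region   = c("Region A", "Region B"),
  geometry = st_sfc(box(0, 0, 2, 2), box(2, 0, 4, 2), crs = 4326)
)
districts <- st_sf(
  District = c("District 1", "District 2"),
  geometry = st_sfc(box(0.5, 0.5, 1, 1), box(2.5, 0.5, 3, 1), crs = 4326)
)

# Attach to each district the attributes of the region that contains it
joined <- st_join(districts, regions, join = st_within)
joined$Region
```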

Centroids

This enables us to create a single central point for each region polygon, which allows us to easily plot the regional counts within their respective region polygons.
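
A minimal sketch with st_centroid() on a toy region polygon (the names and geometry are assumptions) could look like this. Depending on the sf version, the call may emit a warning about attributes being assumed constant over geometries.

```r
library(sf)

# Toy region: a 2 x 2 degree square
square <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
regions <- st_sf(Region = "Region A", geometry = st_sfc(square, crs = 4326))

# One point per polygon; attribute columns are carried over
region_centroids <- st_centroid(regions)
st_coordinates(region_centroids)  # close to longitude 1, latitude 1
```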

Let's now look at the student count at the regional level and see whether there are any surprises. Note: these are not necessarily the places where students currently reside, but the birth district of each student.

Column Chart and Proportional Symbol Map

The column chart depicts the raw counts of the students in the various regions within Ghana.

Let's design a basic column chart using the ggplot() function.
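
A basic version might look like the sketch below. The region names and counts are toy values, and the column name region_count is an assumption.

```r
library(ggplot2)

# Toy regional counts (illustrative values only)
counts <- data.frame(
  Region       = c("Region A", "Region B", "Region C"),
  region_count = c(5, 3, 1)
)

# geom_col() draws one bar per row, with height taken from the y aesthetic;
# reorder() sorts the bars from the highest count to the lowest
p <- ggplot(counts, aes(x = reorder(Region, -region_count), y = region_count)) +
  geom_col() +
  labs(x = "Region", y = "Number of students")
```

Printing p draws the chart.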

Let's now map the region_count (the number of MPhil students) of each region within its regional boundary. Note that these are just raw counts. The result is shown as a proportional symbol map below.
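
With tmap, a proportional symbol map can be sketched by drawing the region borders and then adding bubbles sized by the count at each region's centroid. The data below are toy values and the column name region_count is an assumption. The original styling options are not shown, so none are set here.

```r
library(sf)
library(tmap)

# Helper that builds an axis-aligned rectangle (toy geometry only)
box <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

regions <- st_sf(
  Region       = c("Region A", "Region B"),
  region_count = c(5, 2),
  geometry     = st_sfc(box(0, 0, 2, 2), box(2, 0, 4, 2), crs = 4326)
)

# One point per region; sf may warn about attributes assumed constant
region_centroids <- suppressWarnings(st_centroid(regions))

# Borders from the polygons, bubbles sized by the count at the centroids
m <- tm_shape(regions) +
  tm_borders() +
  tm_shape(region_centroids) +
  tm_bubbles(size = "region_count")
```

Printing m draws the map.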

Interactive Choropleth Map

A choropleth map is a type of thematic map in which regions or areas are shaded in proportion to a particular statistical variable. The variable can be a continuous value or a discrete integer.

Let's try to create a basic interactive choropleth map using leaflet() and the htmltools package.
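
A minimal sketch follows, using toy polygons and counts. The column names, the palette and the styling values are assumptions rather than the settings used in the original map.

```r
library(sf)
library(leaflet)
library(htmltools)

# Helper that builds an axis-aligned rectangle (toy geometry only)
box <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

regions <- st_sf(
  Region       = c("Region A", "Region B"),
  region_count = c(5, 2),
  geometry     = st_sfc(box(0, 0, 2, 2), box(2, 0, 4, 2), crs = 4326)
)

# Map counts onto a colour ramp
pal <- colorNumeric("YlOrRd", domain = regions$region_count)

m <- leaflet(regions) %>%
  addTiles() %>%
  addPolygons(fillColor = ~pal(region_count), fillOpacity = 0.7,
              color = "white", weight = 1,
              # htmlEscape() keeps the label text safe to show as HTML
              label = ~htmlEscape(paste0(Region, ": ", region_count))) %>%
  addLegend(pal = pal, values = ~region_count, title = "Students")
```

Printing m opens the interactive map. leaflet expects longitude/latitude coordinates on WGS84, which is why the toy layer is given EPSG:4326.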

References


Lovelace, R., Nowosad, J. and Muenchow, J. (2019) Geocomputation with R. 1st edn. Boca Raton: Chapman and Hall/CRC.

Maxwell, A. (2020) Vector-Based Spatial Analysis. Available at: http://wvview.org/spatial_analytics/Vector_Analysis/_site/index.html (Accessed April 2021).

RStudio, Inc. (2014) Leaflet for R - Introduction. Available at: rstudio.github.io/leaflet/ (Accessed April 2021).