Using artificial student recruitment data for the Arizona-based Faux University, this code-through explores how to create simple maps, primarily with the ‘usmap’ package, that display frequencies for each of the following recruitment funnel stages: students who applied to Faux University, students who were admitted, and students who enrolled (matriculated).
This code-through will first walk you through wrangling and transforming the recruitment data prior to mapping each stage separately using the ‘usmap’ package.
Secondary packages include:
mapview
pander
ggplot2
dplyr
sp
Use the install.packages() function when necessary.
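One way to install only what is missing is sketched below. The variable names `pkgs` and `missing_pkgs` are illustrative, and the `interactive()` guard is an assumption made here so that a non-interactive render does not trigger downloads.

```r
# Packages loaded in this code-through
pkgs <- c("usmap", "mapview", "pander", "ggplot2", "dplyr", "sp")

# Keep only the names that are not yet in the local library
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install only what is missing, and only when running interactively
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}
```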
Spatial data has applications across many sectors, so it is useful to learn how to show both the size of each group and its location on a single map.
Within this code-through, you will learn how to do the following:
1) Create separate datasets from the Fall 2023 Faux University applicant pool:
Applicants
Admitted Students
Enrolled Students
2) Retain key fields:
Longitude
Latitude
ID
City
3) Complete a coordinate map test:
Conversion to spatial data
Plot maps
4) Transform data for mapping
5) Plot maps using the ‘usmap’ package
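Steps 1 and 2 of the outline above can be previewed on a small toy data frame. This is only a sketch: `toy_pool`, its five rows, and the approximate city coordinates are made up for illustration, while the column names mirror those in the Faux University file.

```r
library(dplyr)

# Hypothetical applicant pool (not the real Faux University file)
toy_pool <- data.frame(
  ID          = 1:5,
  City        = c("Phoenix", "Tucson", "Mesa", "Tempe", "Yuma"),
  Longitude   = c(-112.07, -110.97, -111.83, -111.94, -114.62),
  Latitude    = c(33.45, 32.22, 33.42, 33.43, 32.69),
  ADMIT_COUNT = c(1, 1, 1, 0, 0),
  ENRL_COUNT  = c(1, 0, 1, 0, 0)
)

# Fields retained for mapping
key_fields <- c("Longitude", "Latitude", "ID", "City")

# One dataset per funnel stage, each reduced to the key fields
applicants <- select(toy_pool, all_of(key_fields))
admitted   <- toy_pool %>% filter(ADMIT_COUNT == 1) %>% select(all_of(key_fields))
enrolled   <- toy_pool %>% filter(ENRL_COUNT == 1) %>% select(all_of(key_fields))

# Row counts per stage show how the funnel narrows
funnel <- c(applied = nrow(applicants), admitted = nrow(admitted), enrolled = nrow(enrolled))
funnel
```

Each stage is a subset of the previous one, so the counts should never increase from left to right.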
#Be sure to set your working directory
library( pander )
library( usmap )
library( ggplot2 )
library( dplyr )
library( sp )
library( mapview )
Applicant_Data <- read.csv( "https://raw.githubusercontent.com/Bslyter/Faux_Files/main/Fall23_Faux_University_Recruit.csv")
head(Applicant_Data)
#Using 'filter' Function
Admit_Data <- filter( Applicant_Data, ADMIT_COUNT == "1" )
Enroll_Data <- filter( Applicant_Data, ENRL_COUNT == "1" )
#Important note: The Fall23_Faux_University_Recruit dataset already contains latitude and longitude.
#More often than not, the data you intend to plot will not include coordinates.
#At best, it will include a full street address, a city and state, or just a ZIP code.
#You will need to either look up coordinates manually or update your data in bulk
#using a geocoding service.
#I recommend the tidygeocoder package, which uses the Nominatim
#("osm") geocoding service by default: https://cran.r-project.org/web/packages/tidygeocoder/readme/README.html
#Note that many geocoding services require an API key; Nominatim does not.
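For data that lacks coordinates, a bulk lookup with tidygeocoder might look like the sketch below. The `places` data frame and its `State` column are hypothetical. The call needs an internet connection, so the sketch skips it when the package is not installed and falls back to `NULL` if the request fails.

```r
# Hypothetical records with a city and state but no coordinates
places <- data.frame(
  ID    = 1:2,
  City  = c("Phoenix", "Tucson"),
  State = c("AZ", "AZ")
)

geocoded <- NULL

# Only attempt the lookup when tidygeocoder is installed
if (requireNamespace("tidygeocoder", quietly = TRUE)) {
  geocoded <- tryCatch(
    tidygeocoder::geocode(
      places,
      city   = City,
      state  = State,
      method = "osm",      # Nominatim; no API key required
      lat    = Latitude,   # names for the new coordinate columns
      long   = Longitude
    ),
    error = function(e) NULL  # e.g. no network access
  )
}

geocoded
```

On success, the result keeps the original columns and adds `Latitude` and `Longitude`, matching the field names used in the rest of this code-through.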
#Applicant_Data1 will be used for a coordinate map test,
#just to ensure the dummy longitude and latitude values
#make sense—spatially speaking.
Applicant_Data1 <- select(Applicant_Data, c( 'Longitude','Latitude' ) )
Applicant_Data2 <- select(Applicant_Data, c( 'Longitude','Latitude','ID','City' ) )
Admit_Data2 <- select(Admit_Data, c('Longitude','Latitude','ID','City') )
Enroll_Data2 <- select(Enroll_Data, c( 'Longitude','Latitude','ID','City' ) )
Applicant_Data.sp <- SpatialPointsDataFrame(coords = Applicant_Data1,
data = Applicant_Data1, proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
mapview(Applicant_Data.sp)
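A numeric range check can complement the visual test above. The sketch below uses a hypothetical `coords` data frame in which the second row has its longitude and latitude swapped. It flags only values outside the valid WGS84 ranges, so a swap where both values fall within ±90 would still need the map to catch it.

```r
# Hypothetical coordinates; row 2 has longitude and latitude swapped
coords <- data.frame(
  Longitude = c(-112.07, 33.45, -110.97),
  Latitude  = c(33.45, -112.07, 32.22)
)

# A valid WGS84 point has longitude in [-180, 180] and latitude in [-90, 90]
valid <- !is.na(coords$Longitude) & !is.na(coords$Latitude) &
  abs(coords$Longitude) <= 180 & abs(coords$Latitude) <= 90

# Rows that fail the check should be fixed or dropped before mapping
coords[!valid, ]
```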