DIDA 370 Tutorial: Visualizing Point Data
In this tutorial, you will learn how to analyze point data by visualizing its distribution across space. Often, you will have a high density of points, and it can be challenging to discern patterns simply by looking at the points themselves. In this case, analyzing the distribution of points instead is a better way to visualize patterns across a geographic region of interest. This is often referred to as quadrat analysis (visualizing points as they are distributed in a regularly sized grid) or tessellation (more generally, converting continuous point data to a discretized space, typically using a square grid or other geometric shape). Let’s see what this looks like! Along the way, I will also introduce you to mapview, a quick (but somewhat limited) way to make leaflet maps.
#load in packages
library(sf)
## Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(spData)
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`
library(urbnmapr)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.3
library(mapview)
## Warning: package 'mapview' was built under R version 4.3.3
library(RColorBrewer)
#load in basemap
counties <- get_urbn_map("counties", sf = TRUE)
#filter the data to get just NYS
counties <- counties %>%
filter(state_abbv == "NY") %>%
st_transform("EPSG:32116")
## old-style crs object detected; please recreate object with a recent sf::st_crs()
## Warning in CPL_crs_from_input(x): GDAL Message 1: CRS EPSG:2163 is deprecated.
## Its non-deprecated replacement EPSG:9311 will be used instead. To use the
## original CRS, set the OSR_USE_NON_DEPRECATED configuration option to NO.
#load in invasive species point data
setwd("~/Binghamton/DIDA 370/invasives_ny")
invasives <- st_read("PRESENCE_POINT_CONFIRMED__2024_04_15_170054853.shp")
## Reading layer `PRESENCE_POINT_CONFIRMED__2024_04_15_170054853' from data source
## `C:\Users\melha\OneDrive\Documents\Binghamton\DIDA 370\invasives_ny\PRESENCE_POINT_CONFIRMED__2024_04_15_170054853.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 9828 features and 35 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 1269700 ymin: 259948.3 xmax: 2050797 ymax: 815175.4
## Projected CRS: North_America_Albers_Equal_Area_Conic
#convert to correct projection, get rid of points outside of NY
invasives <- invasives %>% st_transform("EPSG:32116")
invasives <- invasives[counties,]
#ggplot version of our data
ggplot()+
geom_sf(data = counties, mapping = aes())+
geom_sf(data = invasives, mapping = aes(), color = "blue")
#mapview version of the data!
mapview_points = mapview(invasives, cex = 3, alpha = .5, popup = NULL)
mapview_points
#let's make a square grid to visualize the point pattern
#the grid needs to be in the same units as your map - our map is in meters, which is why the units look so large
#you can play around with grid size until you get one that looks good
area_grid = st_make_grid(invasives, c(20000, 20000),
what = "polygons",
#this will make it a square grid
square = TRUE)
#convert the grid to an sf object and create an ID column
grid_sf = st_sf(area_grid) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_grid)))
# count number of points in each grid
# https://gis.stackexchange.com/questions/323698/counting-points-in-polygons-with-sf-package-of-r
grid_sf$n_invasives = lengths(st_intersects(grid_sf, invasives))
#Grid visualization 1
ggplot()+
geom_sf(data = counties, mapping = aes())+
geom_sf(data = grid_sf, mapping = aes(fill = n_invasives), alpha = .5)+
scale_fill_gradient(low = "white", high = "tomato")+
theme_minimal()
#In this second version, we'll make a hexagonal grid and delete unneeded cells in the grid
area_hex = st_make_grid(invasives, c(18000, 18000), what = "polygons", square = FALSE)
# To sf and add grid ID
hex_sf = st_sf(area_hex) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_hex)))
# count number of points in each grid
# https://gis.stackexchange.com/questions/323698/counting-points-in-polygons-with-sf-package-of-r
hex_sf$n_invasives = lengths(st_intersects(hex_sf, invasives))
# remove grid without value of 0 (i.e. no points in side that grid)
hex_count = filter(hex_sf, n_invasives > 0)
mapview(hex_count, zcol = "n_invasives",
#name the variable
layer.name = "Invasives",
alpha = 0.6,
#add a color palette
col.regions=brewer.pal(6, "PuRd"),
#color palette specifications: range from 0 to 200, 50 units between breaks
at = seq(0, 300, 50))
## Warning: Found less unique colors (6) than unique zcol values (7)!
## Interpolating color vector to match number of zcol values.
#ggplot2 version
ggplot()+
geom_sf(data = counties, mapping = aes())+
geom_sf(data = hex_count, mapping = aes(fill = n_invasives), alpha = .5)+
#uses colorbrewer palettes for continuous data
#direction=1 reverses the direction of the palette
scale_fill_distiller(palette = "PuRd", direction = 1)+
theme_minimal()
Part 2: Plotting Point Density
You may also want to look at a smoothed image of the density of points across your study region. To do so, we can determine the density of points across space and convert it to a raster. Here’s how:
library(spatstat)
## Warning: package 'spatstat' was built under R version 4.3.3
## Loading required package: spatstat.data
## Warning: package 'spatstat.data' was built under R version 4.3.3
## Loading required package: spatstat.geom
## Warning: package 'spatstat.geom' was built under R version 4.3.3
## spatstat.geom 3.2-9
## Loading required package: spatstat.random
## Warning: package 'spatstat.random' was built under R version 4.3.3
## spatstat.random 3.2-3
## Loading required package: spatstat.explore
## Warning: package 'spatstat.explore' was built under R version 4.3.3
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
##
## collapse
## spatstat.explore 3.2-7
## Loading required package: spatstat.model
## Warning: package 'spatstat.model' was built under R version 4.3.3
## Loading required package: rpart
## spatstat.model 3.2-11
## Loading required package: spatstat.linnet
## Warning: package 'spatstat.linnet' was built under R version 4.3.3
## spatstat.linnet 3.1-5
##
## spatstat 3.0-8
## For an introduction to spatstat, type 'beginner'
library(stars)
## Warning: package 'stars' was built under R version 4.3.3
## Loading required package: abind
#convert to a point object to do the density analysis
invasive_sf <- invasives$geometry
ppp_points <- as.ppp(invasive_sf)
#create a NY object to use as the boundaries for the density calculation:
ny <- counties %>% summarise(mean = mean(state_fips))
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `mean = mean(state_fips)`.
## Caused by warning in `mean.default()`:
## ! argument is not numeric or logical: returning NA
#set as the boundary
Window(ppp_points) <- as.owin(ny)
plot(ppp_points, main = "")
This measure of density is called Kernel Density, and it is calculated
using a moving sub-region window, called a Kernel window. This is quite
similar to the focal operations that we discussed in our raster unit.
This measure of density is nice because it gives us a more detailed
picture of point density across the study unit. We can also convert the
output of the kernel density function into a raster object, which would
allow us to use raster methods on it. In this tutorial, I will show you
how to convert it back to an SF object for plotting using ggplot2.
#calculate density
density_spatstat <- density(ppp_points)
#first, convert to a spatial object called a stars object
density_stars <- stars::st_as_stars(density_spatstat)
#then convert back to sf
density_sf <- st_as_sf(density_stars) %>%
st_set_crs(32116)
#plot it!
ggplot() +
geom_sf(data = density_sf, aes(fill = v), col = NA) +
geom_sf(data = st_boundary(ny))+
scale_fill_distiller(palette = "PuRd", direction = 1)+
theme_void()
#you can also plot it on mapview, although ggplot2 may be better suited for this task
mapview(density_sf)
Resources Appelhans et al. (n.d.). mapview. Retrieved from: https://r-spatial.github.io/mapview/articles/mapview_01-basics.html
Dorman, M. (2022). Chapter 13 Point pattern analysis (in preparation). Retrieved from: https://geobgu.xyz/r/point-pattern-analysis.html.
Gimond, M. (2023). Chapter 11: Point Pattern Analysis. Retreived from: https://mgimond.github.io/Spatial/chp11_0.html
Peeples, M. (N. D.). Point Pattern Analysis. Retrieved from: https://www.mattpeeples.net/modules/PointPattern.html
RColorBrewer’s Palettes. Retreived from: https://r-graph-gallery.com/38-rcolorbrewers-palettes.html
Urban Data Palette (2021). Create spatial square/hexagon grids and count points inside in R with sf. Retrieved from: https://urbandatapalette.com/post/2021-08-tessellation-sf/
Wilkin, J. (2015). 8 Analysing Spatial Patterns III: Point Pattern Analysis. Retrieved from: https://jo-wilkin.github.io/GEOG0030/coursebook/analysing-spatial-patterns-iii-point-pattern-analysis.html.