Introduction

Hello, and welcome to my short analysis of some Utah golf course data! The data come from UGRC; I used a CSV file for the statistical parts of the analysis and shapefiles for the maps. This was done as a final project for a geography class. With that, let’s hop right into the first part.

Packages and finding the number of courses for each type by county

First off, I wanted to know which counties have the most golf courses of each type. Before doing so, some packages need to be loaded.

library(terra)
## terra 1.7.78
library(RColorBrewer)
library(sf)
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(viridis)
## Loading required package: viridisLite
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:terra':
## 
##     intersect, union
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

Then the CSV file needs to be read in so the data can be used.

golf = read.csv("GolfCourses_UGRC.csv")

Now I can run the following code to get a figure of the number of courses in each county, split by course type.

county_bar = ggplot(golf, aes(y = County))
county_bar + geom_bar() + facet_grid(~Type)

As you can see, the faceted bar chart shows, for each course type, which counties have the largest number of courses.
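
The per-type leaders can also be read off as a table rather than from the bars. Here is a minimal sketch of one way to do that with dplyr, assuming columns named County and Type as in the plot code above. The toy data frame and its rows are invented for illustration and are not values from the UGRC file.

```r
library(dplyr)

# Toy stand-in for the golf table; these rows are made up for illustration
# and are NOT values from GolfCourses_UGRC.csv.
toy <- data.frame(
  County = c("A", "A", "A", "B", "C", "C"),
  Type   = c("Public", "Public", "Private", "Public", "Private", "Private")
)

# Count courses for every Type/County pair, then keep the top county per type.
# with_ties = TRUE keeps every county that shares the maximum.
top_by_type <- toy %>%
  count(Type, County, name = "n_courses") %>%
  group_by(Type) %>%
  slice_max(n_courses, n = 1, with_ties = TRUE) %>%
  ungroup()

print(top_by_type)
```

Swapping `toy` for `golf` should give the same kind of table for the real data, provided the column names match.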

Using ANOVA to test whether course type is related to the number of holes, plus a second model with par

As the title suggests, I will begin by fitting a linear model of the number of holes on the type of course with the following code:

golfFit = lm(Holes ~ Type, data = golf)
golfFit
## 
## Call:
## lm(formula = Holes ~ Type, data = golf)
## 
## Coefficients:
## (Intercept)  TypePrivate   TypePublic   TypeResort  
##     16.7500       1.6786       0.2857      -3.2500
anova(golfFit)
## Analysis of Variance Table
## 
## Response: Holes
##            Df Sum Sq Mean Sq F value Pr(>F)
## Type        3   93.5  31.162  1.0974 0.3533
## Residuals 113 3208.8  28.397
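
To make the table easier to read, here is a small sketch of how the F value and Pr(>F) follow from the other columns. The numbers are copied from the table above (so they carry its rounding), and the variable names are just for this illustration.

```r
# Values copied from the ANOVA table above.
ms_type     <- 31.162   # Mean Sq for Type
ms_residual <- 28.397   # Mean Sq for Residuals
df_type     <- 3        # Df for Type
df_residual <- 113      # Df for Residuals

# The F value is the ratio of the two mean squares.
f_value <- ms_type / ms_residual

# Pr(>F) is the upper tail of the F distribution with (3, 113) degrees of freedom.
p_value <- pf(f_value, df_type, df_residual, lower.tail = FALSE)

round(f_value, 4)
round(p_value, 4)
```

An F value close to 1 means the variation between course types is about the same size as the variation within them.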

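As you can see, we got an F-value of 1.0974 and a Pr(>F) value of 0.3533. Since that p-value is well above 0.05, course type alone does not explain a significant share of the variation in the number of holes. Next, let’s fit a second model with Par as an additional variable and compare it with the first model to see whether the improvement is larger than chance alone would explain.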

golfFitTwo = lm(Holes ~ Type + Par, data = golf)
golfFitTwo
## 
## Call:
## lm(formula = Holes ~ Type + Par, data = golf)
## 
## Coefficients:
## (Intercept)  TypePrivate   TypePublic   TypeResort          Par  
##      1.1321       0.6547       0.6011      -0.8384       0.2446
anova(golfFitTwo, golfFit)
## Analysis of Variance Table
## 
## Model 1: Holes ~ Type + Par
## Model 2: Holes ~ Type
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    112 1580.7                                  
## 2    113 3208.8 -1   -1628.1 115.35 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

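And it did! The comparison gives an F-value of 115.35 with a Pr(>F) value below 2.2e-16, so the p-value is vastly smaller and the F-value vastly bigger than in the first table. This means that adding Par significantly improves the fit over the model with Type alone: the residual sum of squares drops from 3208.8 to 1580.7. Note that this comparison tests the contribution of Par, not of Type, so it does not by itself show that the type of course matters.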
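
For reference, the F statistic in the comparison table can be rebuilt from the two residual sums of squares. This is a sketch using the rounded values printed above, so the result only matches the table to about two decimals. The variable names are mine.

```r
# Residual sums of squares and residual degrees of freedom from the table above.
rss_small <- 3208.8; df_small <- 113   # Holes ~ Type
rss_big   <- 1580.7; df_big   <- 112   # Holes ~ Type + Par

# Extra-sum-of-squares F test: the drop in RSS per added parameter,
# divided by the residual mean square of the larger model.
f_value <- ((rss_small - rss_big) / (df_small - df_big)) / (rss_big / df_big)
p_value <- pf(f_value, df_small - df_big, df_big, lower.tail = FALSE)

round(f_value, 2)
p_value < 2.2e-16
```

Because only one parameter (Par) was added, this is the same test as the t-test on the Par coefficient in `summary(golfFitTwo)`.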

Map(s) of golf courses by county in Utah

The last part I will be looking at is producing two maps with the number of golf courses in each county in Utah to serve as a helpful reference for those of you looking for an ideal spot with many courses to choose from.

This came in multiple parts, so to break it up, we start with reading the shapefile version of the courses and a shapefile for the counties in Utah:

golf_shp <- st_read(dsn = "GolfCourses/GolfCourses.shp") 
## Reading layer `GolfCourses' from data source 
##   `C:\Users\karls\Desktop\GEOG 5680 Assignments\FinalProject\GolfCourses\GolfCourses.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 117 features and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -12652210 ymin: 4442456 xmax: -12171850 ymax: 5146475
## Projected CRS: WGS 84 / Pseudo-Mercator
utahCounty <- st_read("utahcountyF/utahcounty/utahcounty.shp")
## Reading layer `utahcounty' from data source 
##   `C:\Users\karls\Desktop\GEOG 5680 Assignments\FinalProject\utahcountyF\utahcounty\utahcounty.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 29 features and 13 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -114.0529 ymin: 36.99766 xmax: -109.0416 ymax: 42.00171
## CRS:           NA

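Next, let’s ensure that the coordinate systems of the two files match. The county shapefile was read in with no CRS (CRS: NA), but its bounding box is in degrees of longitude and latitude, so I first declare it as WGS 84 (EPSG:4326) and then transform it to the CRS of the golf layer (WGS 84 / Pseudo-Mercator).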

st_crs(utahCounty) <- 4326
utahCounty <- st_transform(utahCounty, st_crs(golf_shp))
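
The two lines above do different jobs, which is easy to mix up. Here is a minimal sketch on a made-up one-degree square (not the county data) showing that setting a CRS only labels the existing coordinates, while `st_transform()` recomputes them.

```r
library(sf)

# A hypothetical 1-degree square in lon/lat, created without any CRS,
# mimicking a shapefile that was read in with CRS: NA.
sq <- st_polygon(list(rbind(
  c(-112, 40), c(-111, 40), c(-111, 41), c(-112, 41), c(-112, 40)
)))
no_crs <- st_sf(name = "toy", geometry = st_sfc(sq))
is.na(st_crs(no_crs))

# Step 1: declare what the numbers already are (coordinates are unchanged).
declared <- st_set_crs(no_crs, 4326)

# Step 2: reproject to Web Mercator (coordinates are recomputed, now in meters).
projected <- st_transform(declared, 3857)

st_bbox(declared)
st_bbox(projected)
```

`st_set_crs(x, 4326)` is the pipe-friendly form of `st_crs(x) <- 4326`. Declaring the wrong CRS in step 1 would silently put every later step in the wrong place, so it is worth checking the bounding box first, as the `st_read()` output allows here.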

Now let’s create breaks for the plot and add a new variable to the county layer that holds the number of golf courses intersecting each county.

breaksP = c(0, 2, 4, 6, 8, 10)

utahCounty$nCourse <- lengths(st_intersects(utahCounty, golf_shp))
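
To show what `lengths(st_intersects(...))` is counting, here is a small sketch on invented rectangles (two toy "counties" and three toy "courses", with no CRS). The `make_rect()` helper is hypothetical and exists only for this example. One thing the sketch makes visible is that a course polygon straddling a border is counted in both counties.

```r
library(sf)

# Hypothetical helper for this sketch only: an axis-aligned rectangle polygon.
make_rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two adjacent toy "counties" sharing the edge x = 10.
counties <- st_sf(
  name = c("West", "East"),
  geometry = st_sfc(make_rect(0, 0, 10, 10), make_rect(10, 0, 20, 10))
)

# Three toy "courses": one in West, one in East, one straddling the border.
courses <- st_sf(
  id = 1:3,
  geometry = st_sfc(
    make_rect(2, 2, 3, 3), make_rect(15, 5, 16, 6), make_rect(9, 4, 10.5, 5)
  )
)

# st_intersects() returns, for each county, the indices of the courses that
# touch it; lengths() turns that list into a count per county.
counties$nCourse <- lengths(st_intersects(counties, courses))
counties$nCourse

# Alternative: count each course once, in the county containing its centroid.
centroids <- st_centroid(st_geometry(courses))
n_by_centroid <- lengths(st_intersects(counties, centroids))
n_by_centroid
```

Whether double counting matters for the real data depends on whether any course polygons cross a county line, which I have not checked here.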

Now we make the plot.

ggplot() + 
  geom_sf(data = utahCounty, aes(fill = nCourse))

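So it does its work and shows the number of courses in each county, but there is a concern: the counties differ a lot in size, and a larger county has more room for courses, so the raw counts are not directly comparable. To address this, we will do a little math and compute the number of courses per square kilometer.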

## Convert the shape area to square kilometers (assuming Shape_Area is in square meters)

utahCounty$area_km2 <- utahCounty$Shape_Area / 1e6

## Number of courses per km squared

utahCounty$course_per_km2 <- utahCounty$nCourse / utahCounty$area_km2

What we are doing is taking Shape_Area and dividing it by 1e6 to convert it to square kilometers (this treats Shape_Area as being in square meters), and then dividing the nCourse variable created earlier by that area to get the number of courses per square kilometer.
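
The division by 1e6 relies on Shape_Area being stored in square meters, which the attribute table does not state. One way to cross-check is to compute areas from the geometry with `st_area()`, but the projection matters. The sketch below uses a made-up 0.1 by 0.1 degree box near Salt Lake City and compares Web Mercator (the CRS of the golf layer) with UTM zone 12N (EPSG:26912), which I am assuming is a reasonable low-distortion choice for Utah.

```r
library(sf)

# Hypothetical 0.1 x 0.1 degree box near Salt Lake City, in lon/lat (EPSG:4326).
box <- st_sfc(st_polygon(list(rbind(
  c(-112.0, 40.7), c(-111.9, 40.7), c(-111.9, 40.8), c(-112.0, 40.8),
  c(-112.0, 40.7)
))), crs = 4326)

# Area measured in Web Mercator (EPSG:3857), converted from m^2 to km^2.
area_mercator_km2 <- as.numeric(st_area(st_transform(box, 3857))) / 1e6

# Area measured in UTM zone 12N (EPSG:26912), converted from m^2 to km^2.
area_utm_km2 <- as.numeric(st_area(st_transform(box, 26912))) / 1e6

c(mercator = area_mercator_km2, utm = area_utm_km2)
area_mercator_km2 / area_utm_km2
```

Web Mercator inflates areas noticeably at Utah's latitude, so areas for a density map should come from an attribute known to be in real-world units or from `st_area()` in a suitable projection, not from `st_area()` on the Pseudo-Mercator layer.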

Now, with all that we have, let’s make another map, following similar steps as before.

breaksKM = c(0, 0.0025, 0.0050, 0.0075, 0.01)

ggplot() + 
  geom_sf(data = utahCounty, aes(fill = course_per_km2)) +
  scale_fill_viridis(option = "viridis", 
                     breaks = breaksKM, labels = breaksKM) +
  theme_bw()

And there’s the map that takes county area into account! It looks like Salt Lake County is on the higher end for the number of golf courses per square kilometer.

Final words

Hopefully this helped to provide some statistical insight into the golf courses in Utah. Thank you very much for your time in reading my report, and happy golfing!

Sources for data:

GolfCourses File: https://opendata.gis.utah.gov/datasets/utah-golf-courses/about (Both csv and shapefile). Author: Utah Automated Geographic Reference Center (AGRC). Last Update: September 9, 2022.

utahcounty File: See https://gis.utah.gov/products/sgid/categories/ for a list of various data from UGRC. Author: Utah Automated Geographic Reference Center (AGRC).