# load all necessary packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidycensus)
library(sf)
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(yelpr)
library(tigris)
## To enable caching of data, set `options(tigris_use_cache = TRUE)`
## in your R script or .Rprofile.
library(httr)
library(reshape2)
##
## Attaching package: 'reshape2'
##
## The following object is masked from 'package:tidyr':
##
## smiths
library(here)
## here() starts at C:/Users/wpgeorgia/Documents/GT MSUA/CP 8883/Intro to UA R Projects
library(knitr)
library(jsonlite)
##
## Attaching package: 'jsonlite'
##
## The following object is masked from 'package:purrr':
##
## flatten
library(tmap)
## Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
## remotes::install_github('r-tmap/tmap')
tmap_mode("view")
## tmap mode set to interactive viewing
# activate census API
tidycensus::census_api_key(Sys.getenv("census_api"))
## To install your API key for use in future sessions, run this function with `install = TRUE`.
# retrieve census tracts for Brown County, WI
GB_tracts <- suppressMessages(
  get_acs(geography = "tract",
          state = "WI",
          county = c("Brown"),
          variables = c(hhincome = 'B19019_001'),
          year = 2021,
          survey = "acs5",
          geometry = TRUE,
          output = "wide")
)
# get boundaries of Green Bay, WI
green_bay <- tigris::places('WI') |>
  filter(NAME == 'Green Bay')
## Retrieving data for the year 2022
# define census tracts that intersect Green Bay, WI
tract_green_bay <- GB_tracts[green_bay,]
# draw tract shapes overlaid with city limits
tm_shape(tract_green_bay) + tm_borders(lwd = 2.5) +
  tm_shape(green_bay) + tm_polygons(col = 'blue', alpha = 0.4)
get_radius <- function(poly, epsg_id){
  # Build a search circle for one tract: centered on the center of the tract's
  # bounding box, with a radius reaching just past a corner of that box.
  # epsg_id is the CRS of poly's coordinates (supplied by the caller).
  bb <- st_bbox(poly)
  # Get the long & lat coordinates of one corner of the bounding box (xmin, ymin).
  bb_corner <- st_point(c(bb[1], bb[2])) |> st_sfc(crs = epsg_id)
  # Get the center of the bounding box
  bb_center_x <- (bb[3]+bb[1])/2
  bb_center_y <- (bb[4]+bb[2])/2
  bb_center <- st_point(c(bb_center_x, bb_center_y)) |> st_sfc(crs = epsg_id) |> st_sf()
  # Get the distance between bb_corner and bb_center
  rad <- st_distance(bb_corner, bb_center)
  # Multiply by 1.1 to make the circle a bit larger than the Census Tract.
  bb_center$radius <- rad*1.1
  return(bb_center)
}
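To get a feel for the size of radius this produces, the corner-to-center calculation can be sketched in base R. This is only an illustration: the `haversine_m()` helper, the spherical-Earth radius, and the bounding-box values are all assumptions made for the example, and the result will differ slightly from what `st_distance()` returns.

```r
# Great-circle (haversine) distance in meters between two lon/lat points.
# Assumes a spherical Earth with mean radius 6,371,008.8 m (an approximation).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371008.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(a))
}

# Hypothetical bounding box in degrees (not taken from the real tract data)
bb <- c(xmin = -88.05, ymin = 44.50, xmax = -88.00, ymax = 44.53)

# Center of the bounding box
center_x <- (bb[["xmin"]] + bb[["xmax"]]) / 2
center_y <- (bb[["ymin"]] + bb[["ymax"]]) / 2

# Corner-to-center distance, padded by 10% as in get_radius()
radius_m <- 1.1 * haversine_m(bb[["xmin"]], bb[["ymin"]], center_x, center_y)
radius_m
```

Because the circle reaches past the corners of the bounding box, circles for neighboring tracts overlap, which matters later when the results are combined.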
# Use lapply() to apply the radius function to each Census Tract geometry.
epsg_id <- 4326
rad4all_apply <- tract_green_bay %>%
  st_geometry() %>%
  st_transform(crs = epsg_id) %>%
  lapply(., function(x) get_radius(x, epsg_id = epsg_id))
rad4all_apply <- bind_rows(rad4all_apply)
# putting x,y coordinates in separate columns
data_4_yelp <- rad4all_apply %>%
  mutate(x = st_coordinates(.)[,1],
         y = st_coordinates(.)[,2])
data_4_yelp %>%
  # Draw buffers centered at the bounding-box centers of the Tract polygons
  st_buffer(., dist = .$radius) %>%
  # Display this buffer in yellow
  tm_shape(.) + tm_polygons(alpha = 0.5, col = 'yellow') +
  # Display the census tracts in purple
  tm_shape(tract_green_bay) + tm_borders(col= 'purple')
get_yelp <- function(tract, category){
  # Takes one row of tract information (x, y, radius) and a category name (string).
  # Returns a data frame of businesses or, if the search reports 1000 or more
  # businesses, a list holding only the first page of results.
  Sys.sleep(1)
  n <- 1
  resp <- business_search(api_key = Sys.getenv("yelp_api"),
                          categories = category,
                          latitude = tract$y,
                          longitude = tract$x,
                          offset = (n - 1) * 50, # = 0 when n = 1
                          radius = round(tract$radius),
                          limit = 50)
  # Calculate how many requests are needed in total (50 results per request)
  required_n <- ceiling(resp$total/50)
  # out is the list that collects one page of results per request
  out <- vector("list", required_n)
  # Store the business information in the nth slot of out
  out[[n]] <- resp$businesses
  # Name the first element with required_n so that, if we stop early below,
  # we still know how many requests the full result set would have needed
  names(out)[n] <- required_n
  # Stop early (with a message, not an error) if there are 1000 or more businesses
  if (resp$total >= 1000)
  {
    print("This search area has >= 1000 businesses.")
    # Return before going into the loop because we need to
    # break down the Census Tract into something smaller.
    return(out)
  }
  else
  {
    n <- n + 1
    # Request the remaining pages, 50 businesses at a time
    while(n <= required_n){
      resp <- business_search(api_key = Sys.getenv("yelp_api"),
                              categories = category,
                              latitude = tract$y,
                              longitude = tract$x,
                              offset = (n - 1) * 50,
                              radius = round(tract$radius),
                              limit = 50)
      out[[n]] <- resp$businesses
      n <- n + 1
    }
    out <- out %>% bind_rows()
    return(out)
  }
}
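The paging arithmetic inside get_yelp() can be checked on its own, without calling the API. The sketch below uses a hypothetical helper, `page_offsets()`, which is not part of yelpr; it only reproduces the `ceiling(total / 50)` and `(n - 1) * 50` logic from the function above.

```r
# Offsets needed to page through `total` results, `limit` results per request.
# Hypothetical helper for illustration; mirrors the arithmetic in get_yelp().
page_offsets <- function(total, limit = 50) {
  n_pages <- ceiling(total / limit)
  (seq_len(n_pages) - 1) * limit
}

page_offsets(120)  # three requests: offsets 0, 50, 100
page_offsets(50)   # one request: offset 0
page_offsets(0)    # no results, so no offsets
```

A search that reports zero businesses needs no further requests, which is consistent with the empty tibble returned for the first tract below.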
# Apply the function for the first Census Tract
yelp_first_tract <- get_yelp(data_4_yelp[1,], "yoga") %>%
as_tibble()
## No encoding supplied: defaulting to UTF-8.
# Print
yelp_first_tract %>% print
## # A tibble: 0 × 0
# Prepare an empty list for YOGA category
yelp_allyoga_list <- vector("list", nrow(data_4_yelp))
# Looping through all Census Tracts
for (row in seq_len(nrow(data_4_yelp))){
  yelp_allyoga_list[[row]] <- suppressMessages(get_yelp(data_4_yelp[row,], "yoga"))
  print(paste0("Current row: ", row))
}
## [1] "Current row: 1"
## [1] "Current row: 2"
## [1] "Current row: 3"
## [1] "Current row: 4"
## [1] "Current row: 5"
## [1] "Current row: 6"
## [1] "Current row: 7"
## [1] "Current row: 8"
## [1] "Current row: 9"
## [1] "Current row: 10"
## [1] "Current row: 11"
## [1] "Current row: 12"
## [1] "Current row: 13"
## [1] "Current row: 14"
## [1] "Current row: 15"
## [1] "Current row: 16"
## [1] "Current row: 17"
## [1] "Current row: 18"
## [1] "Current row: 19"
## [1] "Current row: 20"
## [1] "Current row: 21"
## [1] "Current row: 22"
## [1] "Current row: 23"
## [1] "Current row: 24"
## [1] "Current row: 25"
## [1] "Current row: 26"
## [1] "Current row: 27"
## [1] "Current row: 28"
## [1] "Current row: 29"
## [1] "Current row: 30"
## [1] "Current row: 31"
## [1] "Current row: 32"
## [1] "Current row: 33"
## [1] "Current row: 34"
## [1] "Current row: 35"
## [1] "Current row: 36"
## [1] "Current row: 37"
## [1] "Current row: 38"
# Collapsing the list into a data.frame
yelp_allyoga <- yelp_allyoga_list %>% bind_rows() %>% as_tibble()
# Prepare empty list for GYMS category
yelp_allgyms_list <- vector("list", nrow(data_4_yelp))
for (row in seq_len(nrow(data_4_yelp))){
  yelp_allgyms_list[[row]] <- suppressMessages(get_yelp(data_4_yelp[row,], "gyms"))
  print(paste0("Current row: ", row))
}
## [1] "Current row: 1"
## [1] "Current row: 2"
## [1] "Current row: 3"
## [1] "Current row: 4"
## [1] "Current row: 5"
## [1] "Current row: 6"
## [1] "Current row: 7"
## [1] "Current row: 8"
## [1] "Current row: 9"
## [1] "Current row: 10"
## [1] "Current row: 11"
## [1] "Current row: 12"
## [1] "Current row: 13"
## [1] "Current row: 14"
## [1] "Current row: 15"
## [1] "Current row: 16"
## [1] "Current row: 17"
## [1] "Current row: 18"
## [1] "Current row: 19"
## [1] "Current row: 20"
## [1] "Current row: 21"
## [1] "Current row: 22"
## [1] "Current row: 23"
## [1] "Current row: 24"
## [1] "Current row: 25"
## [1] "Current row: 26"
## [1] "Current row: 27"
## [1] "Current row: 28"
## [1] "Current row: 29"
## [1] "Current row: 30"
## [1] "Current row: 31"
## [1] "Current row: 32"
## [1] "Current row: 33"
## [1] "Current row: 34"
## [1] "Current row: 35"
## [1] "Current row: 36"
## [1] "Current row: 37"
## [1] "Current row: 38"
yelp_allgyms <- yelp_allgyms_list %>% bind_rows() %>% as_tibble()
# Combine all yoga and all gyms into one data frame
yelp_allbusinesses <- bind_rows(yelp_allyoga, yelp_allgyms)
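Because the search circles overlap, and because one business can be listed under both categories, the same business can appear in several rows of the combined table. A minimal sketch of removing such repeats by `id` is shown here on a made-up data frame; the ids and names are hypothetical and only stand in for the Yelp results.

```r
# Toy stand-in for the Yelp results; ids and names are made up for illustration.
toy_businesses <- data.frame(
  id = c("a1", "b2", "a1", "c3", "b2"),
  name = c("Studio A", "Gym B", "Studio A", "Gym C", "Gym B"),
  stringsAsFactors = FALSE
)

# Keep only the first row seen for each id
unique_businesses <- toy_businesses[!duplicated(toy_businesses$id), ]

nrow(toy_businesses)     # 5 rows before
nrow(unique_businesses)  # 3 rows after
```

With dplyr loaded, the equivalent on the real table would be something like `distinct(yelp_allbusinesses, id, .keep_all = TRUE)`.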
# Convert each table to sf points, dropping rows without coordinates
yelp_sfyoga <- yelp_allyoga %>%
  mutate(x = .$coordinates$longitude,
         y = .$coordinates$latitude) %>%
  filter(!is.na(x) & !is.na(y)) %>%
  st_as_sf(coords = c("x", "y"), crs = 4326)
yelp_sfgyms <- yelp_allgyms %>%
  mutate(x = .$coordinates$longitude,
         y = .$coordinates$latitude) %>%
  filter(!is.na(x) & !is.na(y)) %>%
  st_as_sf(coords = c("x", "y"), crs = 4326)
yelp_sfall <- yelp_allbusinesses %>%
  mutate(x = .$coordinates$longitude,
         y = .$coordinates$latitude) %>%
  filter(!is.na(x) & !is.na(y)) %>%
  st_as_sf(coords = c("x", "y"), crs = 4326)
# Map for yoga studios
tm_shape(yelp_sfyoga) +
  tm_dots(col = "review_count", style="pretty", title="Yoga Studios")
# Map for gyms
tm_shape(yelp_sfgyms) +
  tm_dots(col = "review_count", style="pretty", title="Gyms")
# Map for both sets of businesses
tm_shape(yelp_sfall) +
  tm_dots(col = "review_count", style="pretty", title="All Yoga Studios and Gyms")
# Print data for All businesses, just yoga studios, and just gyms
yelp_allbusinesses %>% print
## # A tibble: 274 × 17
## id alias name image_url is_closed url review_count categories rating
## <chr> <chr> <chr> <chr> <lgl> <chr> <int> <list> <dbl>
## 1 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 2 mPdo9z6… jens… Jens… "https:/… FALSE http… 8 <df> 4.5
## 3 3LshRsE… grac… Grac… "https:/… FALSE http… 4 <df> 5
## 4 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 5 Uz-HorO… pedr… Pedr… "https:/… FALSE http… 1 <df> 5
## 6 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 7 5jQz1lv… judi… Judi… "" FALSE http… 0 <df> 0
## 8 l8rImG1… ever… Ever… "https:/… FALSE http… 6 <df> 5
## 9 voA-as6… odys… Odys… "https:/… FALSE http… 2 <df> 5
## 10 5DKy9mh… east… East… "https:/… FALSE http… 3 <df> 4
## # ℹ 264 more rows
## # ℹ 8 more variables: coordinates <df[,2]>, transactions <list>,
## # location <df[,8]>, phone <chr>, display_phone <chr>, distance <dbl>,
## # business_hours <list>, attributes <df[,3]>
yelp_allyoga %>% print
## # A tibble: 111 × 17
## id alias name image_url is_closed url review_count categories rating
## <chr> <chr> <chr> <chr> <lgl> <chr> <int> <list> <dbl>
## 1 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 2 mPdo9z6… jens… Jens… "https:/… FALSE http… 8 <df> 4.5
## 3 3LshRsE… grac… Grac… "https:/… FALSE http… 4 <df> 5
## 4 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 5 Uz-HorO… pedr… Pedr… "https:/… FALSE http… 1 <df> 5
## 6 CpyuQ1z… morn… Morn… "https:/… FALSE http… 1 <df> 5
## 7 5jQz1lv… judi… Judi… "" FALSE http… 0 <df> 0
## 8 l8rImG1… ever… Ever… "https:/… FALSE http… 6 <df> 5
## 9 voA-as6… odys… Odys… "https:/… FALSE http… 2 <df> 5
## 10 5DKy9mh… east… East… "https:/… FALSE http… 3 <df> 4
## # ℹ 101 more rows
## # ℹ 8 more variables: coordinates <df[,2]>, transactions <list>,
## # location <df[,8]>, phone <chr>, display_phone <chr>, distance <dbl>,
## # business_hours <list>, attributes <df[,3]>
yelp_allgyms %>% print
## # A tibble: 163 × 17
## id alias name image_url is_closed url review_count categories rating
## <chr> <chr> <chr> <chr> <lgl> <chr> <int> <list> <dbl>
## 1 HvtJHD2… plan… Plan… "https:/… FALSE http… 13 <df> 3.2
## 2 IRKoEjB… redd… RedD… "" FALSE http… 1 <df> 5
## 3 vOIJx1n… anyt… Anyt… "" FALSE http… 7 <df> 2.3
## 4 cKFAfLv… ferg… Ferg… "https:/… FALSE http… 3 <df> 5
## 5 cKFAfLv… ferg… Ferg… "https:/… FALSE http… 3 <df> 5
## 6 9cwfQ84… viki… Viki… "https:/… FALSE http… 1 <df> 5
## 7 voA-as6… odys… Odys… "https:/… FALSE http… 2 <df> 5
## 8 a6JtG3g… gree… Gree… "https:/… FALSE http… 0 <df> 0
## 9 KaUm844… reps… Reps… "https:/… FALSE http… 0 <df> 0
## 10 5DKy9mh… east… East… "https:/… FALSE http… 3 <df> 4
## # ℹ 153 more rows
## # ℹ 8 more variables: coordinates <df[,2]>, transactions <list>,
## # location <df[,8]>, phone <chr>, display_phone <chr>, distance <dbl>,
## # business_hours <list>, attributes <df[,3]>
Which city did you choose?
Green Bay, WI
How many businesses are there in total?
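The combined table has 274 rows across yoga studios and gyms. Because the search circles of neighboring tracts overlap, some businesses are returned more than once (the same id repeats in the printed output above), so the number of distinct businesses is lower than 274.
How many businesses are there for each business category?
The yoga search returned 111 rows and the gyms search returned 163 rows. Both counts include the repeated businesses noted above.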
Upon visual inspection, can you see any noticeable spatial patterns to the way they are distributed across the city (e.g., clustering of businesses at some parts of the city)? (Optional) Are there any other interesting findings?
The fitness businesses seem fairly dispersed, but a sizable share of them, especially yoga studios, sit near downtown, close to the Fox River. As the urban core of the city, that area may be home to a younger population that is likely more active. Proximity to the river is another interesting pattern: a number of businesses, again especially yoga studios, are located near it. Many health and fitness businesses connect wellbeing to nature, so river views and river trails may draw some of them to that location. Finally, there appears to be a concentration of gyms in the southwest part of the area. That area is wealthier than other parts of the city, and because gym memberships can be quite expensive, it makes sense that many gyms locate there.
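One way to move the income observation beyond visual inspection would be to count businesses per tract and set the counts beside median household income. The sketch below shows the idea on made-up geometry: the two square "tracts", their incomes, the three points, and the `n_businesses` column name are all hypothetical.

```r
library(sf)

# Helper that builds a unit square whose lower-left corner is (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two hypothetical tracts with made-up median household incomes
tracts <- st_sf(
  tract = c("A", "B"),
  hhincome = c(40000, 90000),
  geometry = st_sfc(unit_square(0), unit_square(1))
)

# Three hypothetical business locations
pts <- st_sf(geometry = st_sfc(
  st_point(c(0.5, 0.5)),
  st_point(c(1.2, 0.3)),
  st_point(c(1.7, 0.8))
))

# Count the points that fall inside each tract
tracts$n_businesses <- lengths(st_intersects(tracts, pts))
st_drop_geometry(tracts)
```

Applied to the real data, the same step would use tract_green_bay and the business points in place of the toy layers, assuming repeated businesses are removed first and both layers are put into the same CRS with st_transform().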