KP Wells Urban Analytics Mini Assignment 1 Due 9/20/22

First, we need to decide which area we’re covering and which businesses we want to scrape Yelp! for. I live in DeKalb County, Georgia, which is part of the Atlanta metro, so let’s go with that. As for businesses, let’s say I want to have brunch and a spa day with my friends. So I’m going to be looking at restaurants that serve brunch and at medical spas. That means I’ll look at the categories breakfast_brunch and medicalspa. Let’s start by loading the libraries I might need.

library(tidycensus) #Lets us use Census api
library(sf) #allows us to read and write shapefiles
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(tmap) #visualizes simple maps 
library(jsonlite) #reads and writes json files
library(tidyverse)
## ── Attaching packages
## ───────────────────────────────────────
## tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()  masks stats::filter()
## ✖ purrr::flatten() masks jsonlite::flatten()
## ✖ dplyr::lag()     masks stats::lag()
library(httr) #lets us make api requests
library(jsonlite)
library(reshape2)
## 
## Attaching package: 'reshape2'
## 
## The following object is masked from 'package:tidyr':
## 
##     smiths
library(here) #stores paths
## here() starts at C:/Users/kwells65/OneDrive - Georgia Institute of Technology/Assignments
library(yelpr)
library(knitr)

Next, I need to set my Census API key for this session.

tidycensus::census_api_key(Sys.getenv("census_api")) 
## To install your API key for use in future sessions, run this function with `install = TRUE`.
#NOTE: My Census API key is stored as an environment variable for security. It isn't installed on this machine yet; next time I need it, I'll add install = TRUE to this call.
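
Because the key lives in an environment variable, it can help to confirm the variable is actually set before calling the API. Below is a minimal sketch in base R. The helper name require_env and the throwaway variable demo_key are made up for illustration and are not part of tidycensus.

```r
# Hypothetical helper: stop early with a readable message if an
# environment variable is missing or empty.
require_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Demonstration with a throwaway variable so no real key is needed.
Sys.setenv(demo_key = "abc123")
require_env("demo_key")
```

With a helper like this, the call above could be written as tidycensus::census_api_key(require_env("census_api")), which fails with a clear message instead of passing an empty string.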

Step One: Getting Census Tract Boundaries

Now I’m ready to get my polygonal data. The block of code below gets the census tract boundaries I want for DeKalb.

dek_tracts <- suppressMessages(
  get_acs(geography = "tract", #you can use other geographies like 'county' or 'state' here
          state = "GA",
          county = "Dekalb", 
          variables = c(hhincome = "B19019_001", population = "B01003_001"),
          year = 2019,
          survey = "acs5", #This is American Community Survey 5-yr estimates
          geometry = TRUE, #allows us to return sf objects
          output = "wide"))

I like to give data frames specific names, which is why I used ‘dek_tracts’ instead of just ‘tracts’. It makes it easier for me to avoid confusion when I’m repeating the analysis multiple times, especially if I’m using multiple geographies or looking at different locations.

#Let me check the output before moving on
message(sprintf("nrow: %s, ncol: %s", nrow(dek_tracts), ncol(dek_tracts)))
## nrow: 145, ncol: 7
dek_tracts %>% head() %>% knitr::kable()
|GEOID       |NAME                                        | hhincomeE| hhincomeM| populationE| populationM|geometry                     |
|:-----------|:-------------------------------------------|---------:|---------:|-----------:|-----------:|:----------------------------|
|13089021213 |Census Tract 212.13, DeKalb County, Georgia |    154063|     19674|        3526|         204|MULTIPOLYGON (((-84.34783 3… |
|13089023506 |Census Tract 235.06, DeKalb County, Georgia |     45924|     13793|        6465|         927|MULTIPOLYGON (((-84.25237 3… |
|13089021305 |Census Tract 213.05, DeKalb County, Georgia |     55109|      4607|        4970|         391|MULTIPOLYGON (((-84.28811 3… |
|13089023313 |Census Tract 233.13, DeKalb County, Georgia |     55143|      5672|        5294|         576|MULTIPOLYGON (((-84.14593 3… |
|13089021604 |Census Tract 216.04, DeKalb County, Georgia |    159306|     38073|        3237|         254|MULTIPOLYGON (((-84.31051 3… |
|13089021913 |Census Tract 219.13, DeKalb County, Georgia |     32983|      3760|        4450|         559|MULTIPOLYGON (((-84.1905 33… |

Next, I’m going to keep only the variables I want. The ‘E’ suffix stands for ‘Estimate’; the ‘M’ columns are the margins of error, which I’m dropping here.

dek_tracts2 <- dek_tracts %>% 
  select(GEOID, 
         hhincome = hhincomeE, # New name = old name
         population = populationE)

I want to visualize my data periodically to make sure everything looks okay. So let’s take a look at the tracts on a map.

tmap_mode("view")
## tmap mode set to interactive viewing

tm_shape(dek_tracts2) + tm_borders()
#The above block of code should show us all the census tracts for DeKalb.

Steps Two and Three: Draw a Bounding Box

In this block of code, we’re going to get the bounding box (bb) of a polygon, take the coordinates of one of its corners, and find the center of the bounding box. The distance from the corner to the center gives us a search radius.

get_r <- function(poly, epsg_id){
  bb <- st_bbox(poly)
  bb_corner <- st_point(c(bb[1], bb[2])) %>% st_sfc(crs = epsg_id)
  bb_center_x <- (bb[3]+bb[1])/2
  bb_center_y <- (bb[4]+bb[2])/2
  bb_center <- st_point(c(bb_center_x, bb_center_y)) %>% st_sfc(crs = epsg_id) %>% st_sf()
  
  #Next, I get the distance between bb_corner and bb_center and multiply it by 1.2 to make the circle a bit bigger than the tract
  r <- st_distance(bb_corner, bb_center)
  bb_center$radius <- r*1.2
  return(bb_center)
}
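
To make the geometry concrete, here is a small worked sketch of the same center-and-radius arithmetic in base R. It is only an illustration: the bounding box values are made up, and the haversine formula (with an assumed spherical Earth of radius 6,371,000 m) stands in for sf's own st_distance(), so the numbers can differ slightly from what get_r() returns.

```r
# Great-circle distance in meters between two lon/lat points (haversine formula).
# Assumes a spherical Earth with radius 6,371,000 m.
haversine_m <- function(lon1, lat1, lon2, lat2, earth_r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_r * asin(sqrt(a))
}

# A made-up bounding box (xmin, ymin, xmax, ymax) in lon/lat degrees.
bb <- c(xmin = -84.35, ymin = 33.90, xmax = -84.30, ymax = 33.95)

# Center of the bounding box: the midpoint of each axis.
center_x <- (bb[["xmin"]] + bb[["xmax"]]) / 2
center_y <- (bb[["ymin"]] + bb[["ymax"]]) / 2

# Corner-to-center distance, then the same 1.2 padding used in get_r().
r <- haversine_m(bb[["xmin"]], bb[["ymin"]], center_x, center_y)
radius <- r * 1.2
```

Because the center is equidistant from all four corners of the box, a circle with radius r already touches every corner; the 1.2 factor only adds a margin.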

Now that I’ve defined the bounding box function, let’s apply it to each of our polygons using a for loop. First, I need to create an empty list for our results to fill.

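epsg_id <- 4326 #NOTE: 4326 is WGS84 longitude/latitude in degrees; sf still reports st_distance() in meters for it. A projected CRS such as 26967 also measures in meters, but Yelp needs longitude/latitude.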

r4all_loop <- vector("list", nrow(dek_tracts2))

#for loop starts here

for (i in 1:nrow(dek_tracts2)){
  r4all_loop[[i]] <- dek_tracts2 %>% 
    st_transform(crs = epsg_id) %>% 
    st_geometry() %>% 
    .[[i]] %>% 
    get_r(epsg_id = epsg_id)
}

r4all_loop <- bind_rows(r4all_loop)

#Now, I append my x/y coordinates
ready_4_yelp <- r4all_loop %>% 
  mutate(x = st_coordinates(.)[,1],
         y = st_coordinates(.)[,2])

Let’s map it!

tmap_mode('view')
## tmap mode set to interactive viewing
#To check my data, I'll take the first 10 rows, draw a red buffer centered on the center of each tract's bounding box, and display the original polygons in blue
ready_4_yelp[1:10,] %>% 
  st_buffer(., dist = .$radius) %>% 
  tm_shape(.) + tm_polygons(alpha = 0.5, col = 'red') +
  tm_shape(dek_tracts2[1:10,]) + tm_borders(col= 'blue')

Step Four: Downloading Yelp! Data With the yelpr Package

I’ll start off by defining my function. It takes one row of tract information (a one-row sf object) and a category name (a string). The output is a data frame of businesses or, if the search returns 1,000 or more results, a list holding only the first page.

get_yelp <- function(tract, category){
  n <- 1
  resp <- business_search(api_key = Sys.getenv("yelp_api"), 
                          categories = category, 
                          latitude = tract$y, 
                          longitude = tract$x, 
                          offset = (n - 1) * 50, # = 0 when n = 1
                          radius = round(tract$radius), 
                          limit = 50)
  required_n <- ceiling(resp$total/50)
  
  out <- vector("list", required_n)
  #'out' is where the results are stored, one page per slot.
  # Store the business information in the nth slot of out
  out[[n]] <- resp$businesses
  
  #Next, I name this element with required_n (the number of pages needed),
  #so that if there are 1000 or more businesses, we know how many pages there were.
  names(out)[n] <- required_n
  
  #print a notice and stop early if there are 1000 or more businesses
  if (resp$total >= 1000)
  {
    #glue formats strings of text by replacing {n} with what's currently stored in object n.
    print(glue::glue("{n}th row has >= 1000 businesses."))
    #Return before going into the loop, because the Census Tract needs to be broken down into something smaller.
    return(out)
  } 
  else 
  {
    # add 1 to n
    n <- n + 1
    
    
#here's where the while loop starts
    while(n <= required_n){
      resp <- business_search(api_key = Sys.getenv("yelp_api"), 
                              categories = category, 
                              latitude = tract$y, 
                              longitude = tract$x, 
                              offset = (n - 1) * 50, 
                              radius = round(tract$radius), 
                              limit = 50)
      
      out[[n]] <- resp$businesses
      
      n <- n + 1
    } #<< this signifies the end of the while loop
    
#Finally, we merge all elements in the list into a single data frame
    dek_out <- out %>% bind_rows()
    
    return(dek_out)
  }
}
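
The paging arithmetic inside get_yelp() can be checked on its own, without calling the API. Below is a minimal sketch in base R; the helper name page_offsets is hypothetical and only mirrors the ceiling(total/50) and (n - 1) * 50 expressions used above.

```r
# Offsets needed to page through `total` results, `limit` per request.
page_offsets <- function(total, limit = 50) {
  n_pages <- ceiling(total / limit)
  (seq_len(n_pages) - 1) * limit
}

page_offsets(17)   # one page, offset 0
page_offsets(120)  # three pages, offsets 0, 50, 100
page_offsets(0)    # no pages, an empty vector
```

The zero-result case is worth noticing: required_n is 0 there, so the while loop never runs and only the (empty) first response is kept.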

Step Five: Defining a Function for Accessing Yelp! API for one Census Tract

First, let’s test things by applying the function for the first Census Tract.

yelp_first_tract_brunch <- get_yelp(ready_4_yelp[1,], "breakfast_brunch") %>% 
  as_tibble()
## No encoding supplied: defaulting to UTF-8.
yelp_first_tract_spa <- get_yelp(ready_4_yelp[1,], "medicalspa") %>% 
  as_tibble()
## No encoding supplied: defaulting to UTF-8.
#This gets the data for both business types I want to look at. Let's combine them.
yelp_first_tract <- bind_rows(yelp_first_tract_brunch, yelp_first_tract_spa)

#As always, let's check it!
yelp_first_tract %>% print
## # A tibble: 17 × 16
##    id           alias name  image…¹ is_cl…² url   revie…³ categ…⁴ rating coord…⁵
##    <chr>        <chr> <chr> <chr>   <lgl>   <chr>   <int> <list>   <dbl>   <dbl>
##  1 Sh7BBAHsDkN… firs… Firs… "https… FALSE   http…     268 <df>       3.5    33.9
##  2 C8SrEYsWjjG… j-ch… J Ch… "https… FALSE   http…      88 <df>       3      33.9
##  3 iJ0DwsHhE75… bell… Bell… "https… FALSE   http…       5 <df>       3.5    33.9
##  4 HW36mkQQdcX… park… Park… "https… FALSE   http…      35 <df>       5      34.0
##  5 aaajyKLRtLL… atla… Atla… "https… FALSE   http…      48 <df>       5      33.8
##  6 Cldc9nU5XRY… hydr… Hydr… "https… FALSE   http…      63 <df>       4      33.8
##  7 W4_ucUE30B4… hydr… Hydr… "https… FALSE   http…      37 <df>       4.5    33.8
##  8 Ns4Cu0YlZlv… b-ne… B Ne… "https… FALSE   http…      10 <df>       4.5    33.8
##  9 pLaH_zlvbF1… lux-… LUX … "https… FALSE   http…      16 <df>       4      33.9
## 10 5-OUTmlfQwB… hydr… Hydr… "https… FALSE   http…      11 <df>       4      33.9
## 11 Sl34mZJVY7B… mirr… Mirr… "https… FALSE   http…       1 <df>       5      33.9
## 12 WOcIAFJsOtG… adva… Adva… "https… FALSE   http…       3 <df>       5      33.9
## 13 OfyqEX-Mb3j… body… Body… "https… FALSE   http…       2 <df>       5      33.8
## 14 l5DR2U_9o4T… body… Body… "https… FALSE   http…       4 <df>       5      33.6
## 15 1PKvJcXAkBv… the-… The … "https… FALSE   http…       2 <df>       5      33.9
## 16 0rJVLBKHycm… bell… Bell… ""      FALSE   http…       3 <df>       2.5    34.0
## 17 WunHdCt2Rpr… 360-… 360 … "https… FALSE   http…       1 <df>       1      33.8
## # … with 7 more variables: coordinates$longitude <dbl>, transactions <list>,
## #   price <chr>, location <df[,8]>, phone <chr>, display_phone <chr>,
## #   distance <dbl>, and abbreviated variable names ¹​image_url, ²​is_closed,
## #   ³​review_count, ⁴​categories, ⁵​coordinates$latitude
## # ℹ Use `colnames()` to see all variable names

Step Six: Applying the Function to All Other Census Tracts

First, I’ll prepare my collectors.

yelp_all_list <- vector("list", nrow(ready_4_yelp))
yelp_brunch_list <- vector("list", nrow(ready_4_yelp))
yelp_spa_list <- vector("list", nrow(ready_4_yelp))

Now I can write my loop. My advice is to really think about how to structure the loop when you’re working with multiple variables.

for (row in 1:nrow(ready_4_yelp)){
  yelp_brunch <- suppressMessages(get_yelp(ready_4_yelp[row,], "breakfast_brunch"))
  yelp_spa <- suppressMessages(get_yelp(ready_4_yelp[row,], "medicalspa"))
  yelp_all_list[[row]] <- yelp_brunch %>% bind_rows(yelp_spa)
  yelp_brunch_list[[row]] <- yelp_brunch #reuse the results above instead of calling the API a second time
  yelp_spa_list[[row]]  <- yelp_spa
  
  if (row %% 10 == 0){
    print(paste0("Current row: ", row))
  }
}
## Warning: Outer names are only allowed for unnamed scalar atomic inputs

## [1] "Current row: 10"
## [1] "Current row: 20"
## [1] "Current row: 30"
## [1] "Current row: 40"
## [1] "Current row: 50"
## [1] "Current row: 60"
## [1] "Current row: 70"
## [1] "Current row: 80"
## [1] "Current row: 90"
## [1] "Current row: 100"
## [1] "Current row: 110"
## [1] "Current row: 120"
## [1] "Current row: 130"
## [1] "Current row: 140"
#Finally, we can collapse everything into a single df
yelp_all_dek <- yelp_all_list %>% bind_rows() %>% as_tibble()
yelp_bunch_df <- yelp_brunch_list  %>% bind_rows() %>% as_tibble()
yelp_spa_df <- yelp_spa_list  %>% bind_rows() %>% as_tibble()
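
The loop-and-collect pattern above can also be driven by a vector of category names, which may scale better if more business types are added later. The sketch below is only an illustration in base R: fake_fetch is a made-up stand-in for get_yelp() that returns a tiny data frame instead of calling the API, and the three tract ids are invented.

```r
# Stand-in for get_yelp(): returns a small data frame instead of calling the API.
fake_fetch <- function(tract_id, category) {
  data.frame(id = paste(category, tract_id, sep = "_"),
             category = category,
             stringsAsFactors = FALSE)
}

categories <- c("breakfast_brunch", "medicalspa")
tract_ids <- 1:3  # pretend there are three tracts

# One collector per category, each pre-sized to the number of tracts.
collectors <- lapply(categories, function(x) vector("list", length(tract_ids)))
names(collectors) <- categories

for (row in seq_along(tract_ids)) {
  for (category in categories) {
    # Fetch once per tract/category pair and store the result.
    collectors[[category]][[row]] <- fake_fetch(tract_ids[row], category)
  }
}

# Collapse each category's list into one data frame, then stack them all.
by_category <- lapply(collectors, function(x) do.call(rbind, x))
all_rows <- do.call(rbind, by_category)
```

Each tract/category pair is fetched exactly once, and the combined table is built from the per-category tables rather than from separate requests.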

#let's take a look
yelp_all_dek %>% print(width=1000)
## # A tibble: 1,529 × 16
##    id                     alias                              
##    <chr>                  <chr>                              
##  1 Sh7BBAHsDkNKIv91xLetgg first-watch-dunwoody-3             
##  2 C8SrEYsWjjGYfeF41g9wjw j-christophers-atlanta-3           
##  3 iJ0DwsHhE75_MwjZI54_Sg bellas-g-kitchen-sandy-spring      
##  4 HW36mkQQdcXEqZn6zlgRjA park-ave-cosmetic-center-roswell-3 
##  5 aaajyKLRtLL_MjzLjRH-sA atlanta-medical-aesthetics-atlanta 
##  6 Cldc9nU5XRYehC3-ByS-_Q hydra-buckhead-atlanta-2           
##  7 W4_ucUE30B4mtGpBYAFldg hydra-virginia-highlands-atlanta-5 
##  8 Ns4Cu0YlZlvGfpYTzO5QAg b-new-beauty-studios-atlanta       
##  9 pLaH_zlvbF1hGRB8pIQS_w lux-med-spa-atlanta                
## 10 5-OUTmlfQwBhfohoh9-MOA hydra-sandy-springs-sandy-springs-2
##    name                      
##    <chr>                     
##  1 First Watch               
##  2 J Christopher's           
##  3 Bella's G. Kitchen        
##  4 Park Ave Cosmetic Center  
##  5 Atlanta Medical Aesthetics
##  6 Hydra+ Buckhead           
##  7 Hydra+ Virginia Highlands 
##  8 B New Beauty Studio       
##  9 LUX Med Spa               
## 10 Hydra+ Sandy Springs      
##    image_url                                                           
##    <chr>                                                               
##  1 https://s3-media4.fl.yelpcdn.com/bphoto/S8_zbrjLpaStDXJ_rlTMWA/o.jpg
##  2 https://s3-media2.fl.yelpcdn.com/bphoto/Qx_bVCh68BPassOe-EDRQw/o.jpg
##  3 https://s3-media2.fl.yelpcdn.com/bphoto/Tb8Kg_uDjwcFbCOM8MVLfg/o.jpg
##  4 https://s3-media4.fl.yelpcdn.com/bphoto/CRpFyrLc1jt3UNMMESH_AQ/o.jpg
##  5 https://s3-media4.fl.yelpcdn.com/bphoto/ya_3PALu7D7kJMYFIOYuDg/o.jpg
##  6 https://s3-media1.fl.yelpcdn.com/bphoto/2x3IGy4wCQPnhxe4VVzaeQ/o.jpg
##  7 https://s3-media4.fl.yelpcdn.com/bphoto/Wkv45cHYZYSTVWguGctIDQ/o.jpg
##  8 https://s3-media1.fl.yelpcdn.com/bphoto/9QgXi2PdnpXBw7m8q5xO_A/o.jpg
##  9 https://s3-media1.fl.yelpcdn.com/bphoto/fW7Dwiv1xbfSedpdhscaFQ/o.jpg
## 10 https://s3-media3.fl.yelpcdn.com/bphoto/6nFtxxpwtnU4v_Vic2Z-gg/o.jpg
##    is_closed
##    <lgl>    
##  1 FALSE    
##  2 FALSE    
##  3 FALSE    
##  4 FALSE    
##  5 FALSE    
##  6 FALSE    
##  7 FALSE    
##  8 FALSE    
##  9 FALSE    
## 10 FALSE    
##    url                                                                          
##    <chr>                                                                        
##  1 https://www.yelp.com/biz/first-watch-dunwoody-3?adjust_creative=D_azJCkzTpdR…
##  2 https://www.yelp.com/biz/j-christophers-atlanta-3?adjust_creative=D_azJCkzTp…
##  3 https://www.yelp.com/biz/bellas-g-kitchen-sandy-spring?adjust_creative=D_azJ…
##  4 https://www.yelp.com/biz/park-ave-cosmetic-center-roswell-3?adjust_creative=…
##  5 https://www.yelp.com/biz/atlanta-medical-aesthetics-atlanta?adjust_creative=…
##  6 https://www.yelp.com/biz/hydra-buckhead-atlanta-2?adjust_creative=D_azJCkzTp…
##  7 https://www.yelp.com/biz/hydra-virginia-highlands-atlanta-5?adjust_creative=…
##  8 https://www.yelp.com/biz/b-new-beauty-studios-atlanta?adjust_creative=D_azJC…
##  9 https://www.yelp.com/biz/lux-med-spa-atlanta?adjust_creative=D_azJCkzTpdR6H0…
## 10 https://www.yelp.com/biz/hydra-sandy-springs-sandy-springs-2?adjust_creative…
##    review_count categories   rating coordinates$latitude $longitude transactions
##           <int> <list>        <dbl>                <dbl>      <dbl> <list>      
##  1          268 <df [3 × 2]>    3.5                 33.9      -84.3 <chr [1]>   
##  2           88 <df [3 × 2]>    3                   33.9      -84.3 <chr [1]>   
##  3            5 <df [1 × 2]>    3.5                 33.9      -84.4 <chr [1]>   
##  4           35 <df [3 × 2]>    5                   34.0      -84.3 <list [0]>  
##  5           48 <df [2 × 2]>    5                   33.8      -84.4 <list [0]>  
##  6           63 <df [3 × 2]>    4                   33.8      -84.4 <list [0]>  
##  7           37 <df [2 × 2]>    4.5                 33.8      -84.4 <list [0]>  
##  8           10 <df [3 × 2]>    4.5                 33.8      -84.4 <list [0]>  
##  9           16 <df [3 × 2]>    4                   33.9      -84.4 <list [0]>  
## 10           11 <df [3 × 2]>    4                   33.9      -84.4 <list [0]>  
##    price location$address1          $address2  $address3 $city         $zip_code
##    <chr> <chr>                      <chr>      <chr>     <chr>         <chr>    
##  1 $$    1317 Dunwoody Village Pkwy "Ste 101"  ""        Dunwoody      30338    
##  2 $$    5482 Chamblee Dunwoody Rd  ""         ""        Atlanta       30338    
##  3 <NA>  6600 Peachtree Dunwoody Rd ""         ""        Sandy Spring  30328    
##  4 $$$   633 Holcomb Bridge Rd      "Ste A"    ""        Roswell       30076    
##  5 <NA>  77 12th St NE              "Loft 6"   <NA>      Atlanta       30309    
##  6 $$    2221 Peachtree Rd NE       "Ste Q"    ""        Atlanta       30309    
##  7 <NA>  675 N Highland Ave NE      "Ste 4000" <NA>      Atlanta       30306    
##  8 <NA>  1465 Howell Mill Rd 200A   "Ste 200A" <NA>      Atlanta       30318    
##  9 $$$   4684 Roswell Rd NE         "Ste A1"   ""        Atlanta       30342    
## 10 <NA>  6400 Blue Stone Rd         "Ste 120"  ""        Sandy Springs 30328    
##    $country $state $display_address phone        display_phone  distance
##    <chr>    <chr>  <list>           <chr>        <chr>             <dbl>
##  1 US       GA     <chr [3]>        +16784433447 (678) 443-3447     706.
##  2 US       GA     <chr [2]>        +17703951642 (770) 395-1642     831.
##  3 US       GA     <chr [2]>        +14043887873 (404) 388-7873    2400.
##  4 US       GA     <chr [3]>        +17702991493 (770) 299-1493    8856.
##  5 US       GA     <chr [3]>        +17706535173 (770) 653-5173   19301.
##  6 US       GA     <chr [3]>        +14049486780 (404) 948-6780   16179.
##  7 US       GA     <chr [3]>        +14046206915 (404) 620-6915   20180.
##  8 US       GA     <chr [3]>        +16788206955 (678) 820-6955   18947.
##  9 US       GA     <chr [3]>        +14043679005 (404) 367-9005    8705.
## 10 US       GA     <chr [3]>        +14049961135 (404) 996-1135    4989.
## # … with 1,519 more rows
## # ℹ Use `print(n = ...)` to see more rows

Step Seven: Let’s map the results

First, I’ll extract coordinates.

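dek_yelp_sf <- yelp_all_dek %>% 
  # Pull longitude/latitude out of the nested coordinates column
  mutate(x = .$coordinates$longitude,
         y = .$coordinates$latitude) %>% 
  # Drop businesses with missing coordinates; st_as_sf() can't build points from NA
  filter(!is.na(x) & !is.na(y)) %>% 
  # x = longitude, y = latitude; 4326 is WGS84
  st_as_sf(coords = c("x", "y"), crs = 4326)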
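Before mapping, it can help to check how many rows the NA filter throws away. Here's a minimal sketch of that check on made-up data. The toy_yelp table and its values are hypothetical stand-ins for yelp_all_dek, and I drop the nested coordinates column only to keep the toy object flat.

```r
library(dplyr)
library(sf)

# Hypothetical stand-in for yelp_all_dek: three businesses, one with a missing latitude
toy_yelp <- tibble::tibble(
  id = c("a", "b", "c"),
  review_count = c(268L, 88L, 5L),
  coordinates = tibble::tibble(
    latitude  = c(33.9, NA, 33.8),
    longitude = c(-84.3, -84.3, -84.4)
  )
)

n_before <- nrow(toy_yelp)

toy_sf <- toy_yelp %>%
  mutate(x = coordinates$longitude,   # x is longitude
         y = coordinates$latitude) %>% # y is latitude
  select(-coordinates) %>%             # keep the toy object flat
  filter(!is.na(x), !is.na(y)) %>%
  st_as_sf(coords = c("x", "y"), crs = 4326)

# Number of businesses lost because they had no usable coordinates
n_dropped <- n_before - nrow(toy_sf)
n_dropped
```

If n_dropped is large on the real data, the map would be undercounting businesses, so it's worth reporting alongside the total.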

Now to visualize it using tmap.

tm_shape(dek_yelp_sf) +
  tm_dots(col = "review_count", style="quantile")

Q1. What’s the county and state of your choice? DeKalb County, GA

Q2. How many businesses are there in total? There are 1,529 businesses in total.

Q3. How many businesses are there for each business category? There are 734 places that serve brunch and 795 medical spas.
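One way to get per-category counts like these is to look inside the nested categories column. This is only a sketch on toy data, not necessarily how the numbers above were produced. It assumes each element of categories is a data frame with an alias column (as in Yelp's API responses), and toy_yelp and has_alias() are hypothetical names.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for yelp_all_dek with a list-column of category data frames
toy_yelp <- tibble::tibble(
  id = c("a", "b", "c", "d"),
  categories = list(
    data.frame(alias = c("breakfast_brunch", "cafes"),
               title = c("Breakfast & Brunch", "Cafes")),
    data.frame(alias = "medicalspa", title = "Medical Spas"),
    data.frame(alias = c("skincare", "medicalspa"),
               title = c("Skin Care", "Medical Spas")),
    data.frame(alias = "breakfast_brunch", title = "Breakfast & Brunch")
  )
)

# TRUE for each business whose category aliases include `target`
has_alias <- function(cats, target) {
  map_lgl(cats, ~ target %in% .x$alias)
}

category_counts <- toy_yelp %>%
  summarise(
    total  = n(),
    brunch = sum(has_alias(categories, "breakfast_brunch")),
    medspa = sum(has_alias(categories, "medicalspa"))
  )

category_counts
```

A business tagged with both aliases would be counted in both columns, so the per-category counts only add up to the total when no business carries both tags.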

Q4. Upon visual inspection, can you see any noticeable spatial patterns to the way they are distributed across the county (e.g., clustering of businesses at some parts of the county)? The businesses seem to cluster in and around what I think are the more affluent parts of the county. I don’t have a shapefile to confirm it, but this looks especially true in and around Commission District 2 and the incorporated areas, particularly near the City of Atlanta and to the north. Historically, the push toward incorporation in the north of the county has been driven by persistent NIMBYism among more affluent, predominantly white residents.
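If I did have a boundary shapefile, one way to check the clustering impression would be a spatial join that counts businesses per polygon. Below is a rough sketch with made-up geometry. The two square "districts" and the three points are purely hypothetical, and make_square() is a helper I wrote just for this illustration.

```r
library(sf)
library(dplyr)

# Helper (hypothetical): build a square polygon from its lower-left corner
make_square <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Two made-up districts standing in for real boundary polygons
districts <- st_sf(
  district = c("north", "south"),
  geometry = st_sfc(
    make_square(-84.4, 33.8, 0.2),
    make_square(-84.4, 33.6, 0.2),
    crs = 4326
  )
)

# Three made-up business locations standing in for dek_yelp_sf
biz_pts <- st_as_sf(
  data.frame(id = c("a", "b", "c"),
             x = c(-84.3, -84.35, -84.3),
             y = c(33.9, 33.95, 33.7)),
  coords = c("x", "y"), crs = 4326
)

# Attach the containing district to each point, then count points per district
counts <- st_join(biz_pts, districts, join = st_within) %>%
  st_drop_geometry() %>%
  count(district, name = "n_businesses")

counts
```

Raw counts favor bigger or more populated districts, so dividing by area or population would make for a fairer comparison.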

Q5. (Optional) Are there any other interesting findings? It’s not exactly a finding, but I think it’d be interesting to dig into demographics more. I’d like to look at things like housing stock characteristics, housing tenure, and commute characteristics to see whether these are places where the built environment is more walkable or more car-centered (it’s Atlanta, so I’m guessing the latter, haha). Pulling crime statistics and looking at DUIs might also be interesting. A lot of brunch places do bottomless cocktails, and many spas offer their patrons wine. In a car-centric environment, there might be instances where patrons are unwittingly over the limit when they get behind the wheel.