Lab 04 - La Quinta is Spanish for next to Denny’s, Pt. 2

Brady May 5/20/26

Load packages and data

library(tidyverse) 
library(dsbox) 
states <- read_csv("states.csv")
  1. Filter the Denny's data frame for Alaska (AK) and save the result as dn_ak. How many Denny's locations are there in Alaska?
dn_ak <- dennys %>%
  filter(state == "AK")
nrow(dn_ak)
## [1] 3

There are three Denny's locations in Alaska.

  1. Filter the La Quinta dataframe for Alaska (AK) and save the result as lq_ak. How many La Quinta locations are there in Alaska?
lq_ak <- laquinta %>%
  filter(state == "AK")
nrow(lq_ak)
## [1] 2

There are two La Quinta locations in Alaska.

  1. How many pairings are there between all Denny's and all La Quinta locations in Alaska? How many distances do we need to calculate between the locations of these establishments in Alaska?

3 x 2 = 6, so there are 6 possible pairings. We therefore need to calculate 6 distances, one for each pairing.
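
The count above can be checked with a minimal sketch in base R. The two small data frames and their `id` values are hypothetical stand-ins for dn_ak and lq_ak; only their row counts (3 and 2) come from the lab.

```r
# Hypothetical stand-ins for dn_ak and lq_ak: only the row counts matter here.
dn <- data.frame(id = c("dn1", "dn2", "dn3"))
lq <- data.frame(id = c("lq1", "lq2"))

# The number of pairings is the product of the two row counts.
n_pairs <- nrow(dn) * nrow(lq)

# expand.grid() enumerates every combination explicitly.
pairs <- expand.grid(dennys = dn$id, laquinta = lq$id)

n_pairs      # 6
nrow(pairs)  # 6
```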

  1. How many observations are in the joined dn_lq_ak data frame? What are the names of the variables in this data frame?
dn_lq_ak <- full_join(dn_ak, lq_ak, by = "state")
## Warning in full_join(dn_ak, lq_ak, by = "state"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
dn_lq_ak
## # A tibble: 6 × 11
##   address.x     city.x state zip.x longitude.x latitude.x address.y city.y zip.y
##   <chr>         <chr>  <chr> <chr>       <dbl>      <dbl> <chr>     <chr>  <chr>
## 1 2900 Denali   Ancho… AK    99503       -150.       61.2 3501 Min… "\nAn… 99503
## 2 2900 Denali   Ancho… AK    99503       -150.       61.2 4920 Dal… "\nFa… 99709
## 3 3850 Debarr … Ancho… AK    99508       -150.       61.2 3501 Min… "\nAn… 99503
## 4 3850 Debarr … Ancho… AK    99508       -150.       61.2 4920 Dal… "\nFa… 99709
## 5 1929 Airport… Fairb… AK    99701       -148.       64.8 3501 Min… "\nAn… 99503
## 6 1929 Airport… Fairb… AK    99701       -148.       64.8 4920 Dal… "\nFa… 99709
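## # ℹ 2 more variables: longitude.y <dbl>, latitude.y <dbl>

The joined data frame has 6 observations and 11 variables: address.x, city.x, state, zip.x, longitude.x, latitude.x, address.y, city.y, zip.y, longitude.y, and latitude.y.
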
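
The many-to-many warning appears because every Alaska row in one table matches every Alaska row in the other, which is exactly what is wanted here. As a sketch under that assumption, the relationship can be declared explicitly (this needs dplyr 1.1.0 or later). The toy tibbles and their addresses below are made up for illustration.

```r
library(dplyr)

# Hypothetical miniature versions of dn_ak and lq_ak.
dn <- tibble(address = c("A St", "B St", "C St"), state = "AK")
lq <- tibble(address = c("X Ave", "Y Ave"), state = "AK")

# Declaring the relationship tells dplyr the many-to-many match is intended,
# so no warning is raised.
pairs <- full_join(dn, lq, by = "state", relationship = "many-to-many")

dim(pairs)    # 6 rows, 3 columns
names(pairs)  # "address.x" "state" "address.y"
```

Columns that exist in both tables but are not join keys get the `.x` and `.y` suffixes, which is where names such as address.x and address.y come from.
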
  1. What function from the tidyverse do we use to add a new variable to a data frame while keeping the existing variables?

We use mutate().

  1. Calculate the distances between all pairs of Denny's and La Quinta locations and save this variable as distance. Make sure to save this variable in the dn_lq_ak data frame so that you can use it later.
haversine <- function(long1, lat1, long2, lat2, round = 3) {
  # convert to radians
  long1 = long1 * pi / 180
  lat1  = lat1  * pi / 180
  long2 = long2 * pi / 180
  lat2  = lat2  * pi / 180
  
  R = 6371 # Earth mean radius in km
  
  a = sin((lat2 - lat1)/2)^2 + cos(lat1) * cos(lat2) * sin((long2 - long1)/2)^2
  d = R * 2 * asin(sqrt(a))
  
  return( round(d,round) ) # distance in km
}
dn_lq_ak <- dn_lq_ak %>%
  mutate(distance = haversine(longitude.x, latitude.x, longitude.y, latitude.y))

Using mutate(), we create the distance variable in the dn_lq_ak data frame. It stores the haversine distance, in km, for each pair of Denny's and La Quinta locations in Alaska.
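
One way to sanity-check the distance calculation is to try inputs whose answer is known in advance. The sketch below repeats the same formula in compact form so that it runs on its own. The test coordinates are arbitrary choices, not lab data. With R = 6371 km, one degree of latitude along a meridian should come out at about 111.195 km.

```r
# Same formula as the lab's haversine(), repeated so this snippet runs on its own.
haversine <- function(long1, lat1, long2, lat2, round = 3) {
  rad <- pi / 180  # degrees-to-radians factor
  R <- 6371        # Earth mean radius in km
  a <- sin((lat2 - lat1) * rad / 2)^2 +
    cos(lat1 * rad) * cos(lat2 * rad) * sin((long2 - long1) * rad / 2)^2
  base::round(R * 2 * asin(sqrt(a)), round)
}

same_point <- haversine(-150, 61.2, -150, 61.2)  # identical coordinates
one_degree <- haversine(0, 0, 0, 1)              # one degree of latitude

same_point  # 0
one_degree  # 111.195
```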

  1. Calculate the minimum distance between a Denny’s and La Quinta for each Denny’s location. To do so we group by Denny’s locations and calculate a new variable that stores the information for the minimum distance.
dn_lq_ak_mindlist <- dn_lq_ak %>%
  group_by(address.x) %>%
  summarise(closest = min(distance))
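
summarise() keeps only the grouping variable and the minimum, so the result does not say which La Quinta is the nearest one. A possible alternative, sketched here on made-up pairings, is slice_min(), which keeps the whole row of the closest pair. The addresses and distances below are hypothetical.

```r
library(dplyr)

# Hypothetical pairings: two Denny's, each paired with two La Quintas.
pairs <- tibble(
  address.x = c("A St", "A St", "B St", "B St"),
  address.y = c("X Ave", "Y Ave", "X Ave", "Y Ave"),
  distance  = c(5.2, 410.0, 2.3, 415.8)
)

# Keep the single closest La Quinta row for each Denny's.
nearest <- pairs %>%
  group_by(address.x) %>%
  slice_min(distance, n = 1, with_ties = FALSE) %>%
  ungroup()

nearest
```
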
  1. Describe the distribution of the distances between Denny's and the nearest La Quinta locations in Alaska. Also include an appropriate visualization and relevant summary statistics.
dn_lq_ak %>%
  # keep only the pairs whose distance is the minimum for some Denny's
  filter(distance %in% dn_lq_ak_mindlist$closest) %>%
  ggplot() +
  geom_point(mapping = aes(
    x = longitude.x,
    y = latitude.x,
    color = "Denny's"
  )) +
  geom_point(mapping = aes(
    x = longitude.y,
    y = latitude.y,
    color = "La Quinta"
  )) +
  labs(
    title = "Denny's and La Quinta Locations",
    subtitle = "In Alaska",
    x = "Longitude of Establishments",
    y = "Latitude of Establishments",
    color = "Establishment"
  )

The Denny's and La Quinta locations in Alaska are very spread out.
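
The summary statistics the question asks for could be computed along the following lines. This is only a sketch: the `mindist` tibble and its three distances are invented placeholders standing in for dn_lq_ak_mindlist, not the real Alaska values.

```r
library(dplyr)
library(ggplot2)

# Hypothetical minimum distances (km), one row per Denny's location.
mindist <- tibble(
  address.x = c("A St", "B St", "C St"),
  closest   = c(2.0, 5.0, 6.0)
)

# Summary statistics of the nearest-La-Quinta distances.
dist_stats <- mindist %>%
  summarise(
    n      = n(),
    mean   = mean(closest),
    median = median(closest),
    sd     = sd(closest),
    min    = min(closest),
    max    = max(closest)
  )

# A histogram shows the distribution of the distances themselves.
p <- ggplot(mindist, aes(x = closest)) +
  geom_histogram(bins = 10) +
  labs(x = "Distance to nearest La Quinta (km)", y = "Number of Denny's")

dist_stats
```

With only three Denny's locations in Alaska, listing the three minimum distances directly is about as informative as any plot.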

  1. Repeat the same analysis for North Carolina: (i) filter Denny's and La Quinta data frames for NC, (ii) join these data frames to get a complete list of all possible pairings, (iii) calculate the distances between all possible pairings of Denny's and La Quinta in NC, (iv) find the minimum distance between each Denny's and La Quinta location, (v) visualize and describe the distribution of these shortest distances using appropriate summary statistics.
dn_nc <- dennys %>% filter(state == "NC")
lq_nc <- laquinta %>% filter(state == "NC")

dn_tx <- dennys %>% filter(state == "TX")
lq_tx <- laquinta %>% filter(state == "TX")
ggplot() +
  geom_point(data = dn_nc, aes(x = longitude, y = latitude, color = "Denny's"),
             size = 2, alpha = 0.5) +
  geom_point(data = lq_nc, aes(x = longitude, y = latitude, color = "La Quinta"),
             size = 2, alpha = 0.5) +
  labs(
    title = "Denny's and La Quinta Locations in North Carolina",
    x = "Longitude",
    y = "Latitude",
    color = "Establishment"
  ) +
  theme_minimal()
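
Steps (ii) to (iv) follow the same pattern as the Alaska analysis, so they could be wrapped in one function and reused for each state. The sketch below is one possible way to do that. `nearest_lq()` is a hypothetical helper name, and the toy coordinates are made up. The helper assumes both inputs have address, state, longitude, and latitude columns, as the Alaska join output suggests.

```r
library(dplyr)

# Compact copy of the lab's haversine() so the sketch runs on its own.
haversine <- function(long1, lat1, long2, lat2, round = 3) {
  rad <- pi / 180
  a <- sin((lat2 - lat1) * rad / 2)^2 +
    cos(lat1 * rad) * cos(lat2 * rad) * sin((long2 - long1) * rad / 2)^2
  base::round(6371 * 2 * asin(sqrt(a)), round)
}

# Hypothetical helper: distance to the nearest La Quinta for each Denny's in one state.
nearest_lq <- function(dn, lq, st) {
  dn_st <- dn %>% filter(state == st)
  lq_st <- lq %>% filter(state == st)
  full_join(dn_st, lq_st, by = "state", relationship = "many-to-many") %>%
    mutate(distance = haversine(longitude.x, latitude.x,
                                longitude.y, latitude.y)) %>%
    group_by(address.x) %>%
    summarise(closest = min(distance))
}

# Made-up coordinates, for illustration only.
dn_toy <- tibble(address = c("A St", "B St"), state = "NC",
                 longitude = c(-80, -79), latitude = c(35, 36))
lq_toy <- tibble(address = c("X Ave", "Y Ave"), state = "NC",
                 longitude = c(-80, -79), latitude = c(36, 36))

nc_min <- nearest_lq(dn_toy, lq_toy, "NC")
nc_min
```

With the real data, a call such as `nearest_lq(dennys, laquinta, "NC")` would return one minimum distance per Denny's address, which can then be summarised and plotted.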

  1. Repeat the same analysis for Texas.
ggplot() +
  geom_point(data = dn_tx, aes(x = longitude, y = latitude, color = "Denny's"),
             size = 2, alpha = 0.5) +
  geom_point(data = lq_tx, aes(x = longitude, y = latitude, color = "La Quinta"),
             size = 2, alpha = 0.5) +
  labs(
    title = "Denny's and La Quinta Locations in Texas",
    x = "Longitude",
    y = "Latitude",
    color = "Establishment"
  ) +
  theme_minimal()  

  1. Repeat the same analysis for a state of your choosing, different than the ones we covered so far.
dn_ca <- dennys %>% filter(state == "CA")
lq_ca <- laquinta %>% filter(state == "CA")

ggplot() +
  geom_point(data = dn_ca, aes(x = longitude, y = latitude, color = "Denny's"),
             size = 2, alpha = 0.5) +
  geom_point(data = lq_ca, aes(x = longitude, y = latitude, color = "La Quinta"),
             size = 2, alpha = 0.5) +
  labs(
    title = "Denny's and La Quinta Locations in California",
    x = "Longitude",
    y = "Latitude",
    color = "Establishment"
  ) +
  theme_minimal()

  1. Among the states you examined, where is Mitch Hedberg’s joke most likely to hold true? Explain your reasoning.

Texas has the smallest distances overall, which means that Denny's and La Quinta locations there are usually right next to each other. California is also close, but Texas shows the strongest pattern.