Lab 04 - La Quinta is Spanish for next to Denny’s, Pt. 1

Insert your name here Insert date here

Load packages and data

library(tidyverse) 
library(dsbox)

states <- read_csv("states.csv")

Exercise 1

# Exercise 1
dim(dennys)

## [1] 1643    6

The Denny’s dataset has 1643 rows and 6 columns. Each row represents one Denny’s location. The variables are address, city, state, zip, longitude, and latitude.

Exercise 2

# Exercise 2
dim(laquinta)

## [1] 909   6

The La Quinta dataset has 909 rows and 6 columns. Each row represents one La Quinta location. The variables are address, city, state, zip, longitude, and latitude.
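As a quick cross-check on counts like these, the row count, column count, and variable names can each be read directly from the data frame. The sketch below uses a small made-up tibble (`toy_locations` is a hypothetical stand-in, not the lab data), so the numbers it prints are illustrative only.

```r
library(dplyr)

# Made-up stand-in for a locations dataset; values are invented.
toy_locations <- tibble(
  address = c("1 Main St", "2 Oak Ave", "3 Pine Rd"),
  state   = c("NC", "TX", "CA")
)

n_rows    <- nrow(toy_locations)   # number of locations (rows)
n_cols    <- ncol(toy_locations)   # number of variables (columns)
var_names <- names(toy_locations)  # variable names

n_rows
n_cols
var_names
```

Running the same three calls on `dennys` and `laquinta` should agree with the `dim()` output shown for each exercise.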

Exercise 3

La Quinta has locations outside the US, including in Canada, Mexico, and Colombia. The Denny’s data, by contrast, appear to contain only US locations.

Exercise 4

One way is to look at the unique values in the state column and see if there are any codes that are not normal US state abbreviations.
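The check described above can be sketched on toy data. The tibbles below are made-up stand-ins for `laquinta` and `states` (only a `state` column and an `abbreviation` column are assumed), so the codes shown are illustrative rather than real results.

```r
library(dplyr)

# Toy stand-ins for the lab data; values are invented for illustration.
toy_states   <- tibble(abbreviation = c("NC", "TX", "CA"))
toy_laquinta <- tibble(state = c("TX", "NC", "ON", "TX", "ANT"))

# Unique codes in the state column that are not US state abbreviations.
non_us_codes <- setdiff(unique(toy_laquinta$state), toy_states$abbreviation)
non_us_codes
```

Any code left over after the set difference is a candidate for a location outside the US and can then be looked up by hand.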

Exercise 5

# Exercise 5
dennys %>%
  filter(!(state %in% states$abbreviation))

## # A tibble: 0 × 6
## # ℹ 6 variables: address <chr>, city <chr>, state <chr>, zip <chr>,
## #   longitude <dbl>, latitude <dbl>

There are no Denny’s locations outside the US.

Exercise 6

# Exercise 6
dennys <- dennys %>%
  mutate(country = "United States")

Exercise 7

# Exercise 7
laquinta %>%
  filter(!(state %in% states$abbreviation)) %>%
  count(state)

## # A tibble: 11 × 2
##    state     n
##    <chr> <int>
##  1 AG        1
##  2 ANT       1
##  3 BC        1
##  4 CH        1
##  5 FM        1
##  6 NL        3
##  7 ON        1
##  8 PU        2
##  9 QR        1
## 10 SL        1
## 11 VE        1

Exercise 8

# Exercise 8
laquinta %>%
  mutate(country = case_when(
    state %in% states$abbreviation ~ "United States",
    state %in% c("ON", "BC") ~ "Canada",
    state == "ANT"           ~ "Colombia",
    state == "FM"            ~ "Honduras",  # FM appears in the Exercise 7 output
    state %in% c("AG", "QR", "CH", "NL", "VE", "PU", "SL") ~ "Mexico"
  )) %>%
  filter(country == "United States")

## # A tibble: 895 × 7
##    address                         city   state zip   longitude latitude country
##    <chr>                           <chr>  <chr> <chr>     <dbl>    <dbl> <chr>  
##  1 793 W. Bel Air Avenue           "\nAb… MD    21001     -76.2     39.5 United…
##  2 3018 CatClaw Dr                 "\nAb… TX    79606     -99.8     32.4 United…
##  3 3501 West Lake Rd               "\nAb… TX    79601     -99.7     32.5 United…
##  4 184 North Point Way             "\nAc… GA    30102     -84.7     34.1 United…
##  5 2828 East Arlington Street      "\nAd… OK    74820     -96.6     34.8 United…
##  6 14925 Landmark Blvd             "\nAd… TX    75254     -96.8     33.0 United…
##  7 909 East Frontage Rd            "\nAl… TX    78516     -98.1     26.2 United…
##  8 2116 Yale Blvd Southeast        "\nAl… NM    87106    -107.      35.1 United…
##  9 7439 Pan American Fwy Northeast "\nAl… NM    87109    -107.      35.2 United…
## 10 2011 Menaul Blvd Northeast      "\nAl… NM    87107    -107.      35.1 United…
## # ℹ 885 more rows
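Because `case_when()` returns `NA` for rows that match none of its conditions, it can be worth checking for unmatched state codes before filtering on `country`. The following is a minimal sketch on made-up data; `toy_laquinta` and `us_codes` are hypothetical stand-ins for the lab objects.

```r
library(dplyr)

# Toy data with one code ("FM") that no case_when() branch covers.
toy_laquinta <- tibble(state = c("TX", "ON", "ANT", "FM"))
us_codes     <- c("TX", "NC")

toy_countries <- toy_laquinta %>%
  mutate(country = case_when(
    state %in% us_codes ~ "United States",
    state == "ON"       ~ "Canada",
    state == "ANT"      ~ "Colombia"
  ))

# Rows whose country is NA reveal state codes the mapping missed.
unmatched <- toy_countries %>%
  filter(is.na(country)) %>%
  pull(state)
unmatched
```

An empty result from the last step would indicate that every state code received a country.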

Exercise 9

dennys %>% count(state, sort = TRUE)

## # A tibble: 51 × 2
##    state     n
##    <chr> <int>
##  1 CA      403
##  2 TX      200
##  3 FL      140
##  4 AZ       83
##  5 IL       56
##  6 NY       56
##  7 WA       49
##  8 OH       44
##  9 MO       42
## 10 PA       40
## # ℹ 41 more rows

laquinta %>% count(state, sort = TRUE)

## # A tibble: 59 × 2
##    state     n
##    <chr> <int>
##  1 TX      237
##  2 FL       74
##  3 CA       56
##  4 GA       41
##  5 TN       30
##  6 OK       29
##  7 LA       28
##  8 CO       27
##  9 NM       19
## 10 NY       19
## # ℹ 49 more rows

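California has the most Denny’s locations (403) and Texas has the most La Quinta locations (237). California, Texas, and Florida make up the top three for both chains, which is not surprising because they are the three most populous states.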

Exercise 10

dennys <- dennys %>%
  mutate(establishment = "Denny's")
laquinta <- laquinta %>%
  mutate(establishment = "La Quinta")

dn_lq <- bind_rows(dennys, laquinta)

ggplot(dn_lq, mapping = aes(x = longitude, y = latitude, color = establishment)) +
  geom_point()

When locations are measured per thousand square miles rather than as raw counts, small states like New Jersey and Rhode Island tend to rank highest, because dividing by a small area raises the density.
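One way to compute locations per thousand square miles is to count locations by state, join the counts to the state areas, and divide. The sketch below uses invented counts and areas, and it assumes the states table has an `area` column in square miles alongside `abbreviation`; that column name is an assumption, not something shown in this report.

```r
library(dplyr)

# Toy stand-ins: location counts per state and state areas in square miles.
# State codes and all numbers are invented for illustration.
toy_counts <- tibble(
  state = c("AA", "BB", "CC"),
  n     = c(10, 200, 5)
)
toy_states <- tibble(
  abbreviation = c("AA", "BB", "CC"),
  area         = c(1000, 250000, 8000)
)

density <- toy_counts %>%
  inner_join(toy_states, by = c("state" = "abbreviation")) %>%
  mutate(per_thousand_sq_mi = n / area * 1000) %>%
  arrange(desc(per_thousand_sq_mi))
density
```

In this toy example the smallest state ranks first even though it has far fewer locations than the largest one, which is the pattern described above.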

Exercise 11

# Combine datasets
dn <- dennys %>%
  mutate(establishment = "Denny's")

lq <- laquinta %>%
  mutate(establishment = "La Quinta")

dn_lq <- bind_rows(dn, lq)

dn_lq %>%
  filter(state == "NC") %>%
  ggplot(aes(x = longitude, y = latitude, color = establishment)) +
  geom_point(alpha = 0.5, size = 2) +
  labs(title = "Denny's and La Quinta Locations in North Carolina",
       x = "Longitude",
       y = "Latitude",
       color = "Establishment") +
  theme_minimal() +
  theme(legend.position = "bottom")

Yes, the joke appears to hold in North Carolina: in the plot, many La Quinta locations sit close to a Denny’s.

Exercise 12

# Exercise 12 - Texas
dn_lq %>%
  filter(state == "TX") %>%
  ggplot(aes(x = longitude, y = latitude, color = establishment)) +
  geom_point(alpha = 0.5, size = 2) +
  labs(title = "Denny's and La Quinta in Texas",
       x = "Longitude", y = "Latitude") +
  theme_minimal()

Yes, the joke appears to hold very well in Texas. In the plot, Denny’s and La Quinta locations cluster in the same areas.
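The plots only suggest closeness visually. One possible way to put a number on it is to compute, for each La Quinta, the great-circle distance to the nearest Denny’s. The sketch below is not part of this lab's required output: `haversine_km()` is a hypothetical helper written here using the standard haversine formula, and the coordinates are invented.

```r
# Great-circle distance in kilometers between points given in degrees.
haversine_km <- function(lon1, lat1, lon2, lat2) {
  r      <- 6371      # mean Earth radius in km
  to_rad <- pi / 180  # degrees to radians
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(a))
}

# Toy coordinates, invented for illustration.
toy_dn <- data.frame(longitude = c(-97.0, -95.4), latitude = c(32.8, 29.8))
toy_lq <- data.frame(longitude = c(-97.0, -100.0), latitude = c(32.9, 31.0))

# For each toy La Quinta, distance to the closest toy Denny's.
nearest_km <- sapply(seq_len(nrow(toy_lq)), function(i) {
  min(haversine_km(toy_lq$longitude[i], toy_lq$latitude[i],
                   toy_dn$longitude, toy_dn$latitude))
})
nearest_km
```

A distribution of these nearest distances dominated by small values would support the joke more firmly than the scatterplot alone.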