## [1] 1643 6
The Denny’s dataset has 1643 rows and 6 columns. Each row represents one Denny’s location. The variables are address, city, state, zip, longitude, and latitude.
## [1] 909 6
The La Quinta dataset has 909 rows and 6 columns. Each row represents one La Quinta location. The variables are address, city, state, zip, longitude, and latitude.
La Quinta has locations outside the US, including in Canada, Mexico, and Colombia. Denny’s has locations only in the United States.
One way to check is to filter each dataset for rows whose state value is not among the US state abbreviations in states$abbreviation. Any rows that remain are locations outside the US.
## # A tibble: 0 × 6
## # ℹ 6 variables: address <chr>, city <chr>, state <chr>, zip <chr>,
## # longitude <dbl>, latitude <dbl>
There are no Denny’s locations outside the US.
## # A tibble: 11 × 2
## state n
## <chr> <int>
## 1 AG 1
## 2 ANT 1
## 3 BC 1
## 4 CH 1
## 5 FM 1
## 6 NL 3
## 7 ON 1
## 8 PU 2
## 9 QR 1
## 10 SL 1
## 11 VE 1
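The two checks above (an empty result for Denny’s, eleven non-US codes for La Quinta) can be sketched as follows. This is a minimal illustration, not the exact code that produced the output: `states_toy` and `hotels_toy` are hypothetical stand-ins for `states` and `laquinta`, and their rows are made up.

```r
library(dplyr)

# Hypothetical stand-ins for the lab data; the rows are made up for illustration.
states_toy <- tibble::tibble(abbreviation = c("NC", "TX", "CA"))
hotels_toy <- tibble::tibble(
  address = c("1 Main St", "2 Oak Ave", "3 Calle Uno", "4 King St"),
  state   = c("NC", "TX", "NL", "ON")
)

# Keep only rows whose state code is NOT a US state abbreviation.
non_us <- hotels_toy %>%
  filter(!(state %in% states_toy$abbreviation))

# Count how often each non-US code occurs.
non_us_counts <- non_us %>%
  count(state)

print(non_us_counts)
```

Running the same filter on a dataset with only US locations returns a tibble with zero rows, which is what the Denny’s output shows.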
laquinta %>%
  mutate(country = case_when(
    state %in% states$abbreviation ~ "United States",
    state %in% c("ON", "BC") ~ "Canada",
    state == "ANT" ~ "Colombia",
    state == "FM" ~ "Honduras",  # assumption: FM is Francisco Morazán, Honduras
    state %in% c("AG", "QR", "CH", "NL", "VE", "PU", "SL") ~ "Mexico"
  )) %>%
  filter(country == "United States")
## # A tibble: 895 × 7
## address city state zip longitude latitude country
## <chr> <chr> <chr> <chr> <dbl> <dbl> <chr>
## 1 793 W. Bel Air Avenue "\nAb… MD 21001 -76.2 39.5 United…
## 2 3018 CatClaw Dr "\nAb… TX 79606 -99.8 32.4 United…
## 3 3501 West Lake Rd "\nAb… TX 79601 -99.7 32.5 United…
## 4 184 North Point Way "\nAc… GA 30102 -84.7 34.1 United…
## 5 2828 East Arlington Street "\nAd… OK 74820 -96.6 34.8 United…
## 6 14925 Landmark Blvd "\nAd… TX 75254 -96.8 33.0 United…
## 7 909 East Frontage Rd "\nAl… TX 78516 -98.1 26.2 United…
## 8 2116 Yale Blvd Southeast "\nAl… NM 87106 -107. 35.1 United…
## 9 7439 Pan American Fwy Northeast "\nAl… NM 87109 -107. 35.2 United…
## 10 2011 Menaul Blvd Northeast "\nAl… NM 87107 -107. 35.1 United…
## # ℹ 885 more rows
## # A tibble: 51 × 2
## state n
## <chr> <int>
## 1 CA 403
## 2 TX 200
## 3 FL 140
## 4 AZ 83
## 5 IL 56
## 6 NY 56
## 7 WA 49
## 8 OH 44
## 9 MO 42
## 10 PA 40
## # ℹ 41 more rows
## # A tibble: 59 × 2
## state n
## <chr> <int>
## 1 TX 237
## 2 FL 74
## 3 CA 56
## 4 GA 41
## 5 TN 30
## 6 OK 29
## 7 LA 28
## 8 CO 27
## 9 NM 19
## 10 NY 19
## # ℹ 49 more rows
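California has the most Denny’s locations (403) and Texas has the most La Quinta locations (237). This is not surprising because they are the two most populous states. Florida, the third most populous state, is also in the top three for both chains.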
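The per-state tables above are the kind of output produced by counting rows per state and sorting in descending order. A minimal sketch, assuming a `state` column as in the lab data; `toy` is a made-up data frame used only for illustration:

```r
library(dplyr)

# Made-up rows for illustration; each row stands for one location.
toy <- tibble::tibble(state = c("TX", "CA", "TX", "FL", "TX", "CA"))

# count() tallies rows per state; sort = TRUE orders by n, largest first.
by_state <- toy %>%
  count(state, sort = TRUE)

print(by_state)
```

Applying the same pipeline to `dennys` and `laquinta` should give one row per state code, with the most common states at the top.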
dennys <- dennys %>%
  mutate(establishment = "Denny's")
laquinta <- laquinta %>%
  mutate(establishment = "La Quinta")
For Denny’s and La Quinta locations per thousand square miles, smaller states such as New Jersey and Rhode Island tend to rank highest, because dividing by a small area inflates the density.
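The density calculation can be sketched as a join between per-state counts and state areas. This is an illustration under assumptions: the `area` column (taken to be in square miles) is assumed, since only `states$abbreviation` appears in the code above, and all values below are made up.

```r
library(dplyr)

# Made-up counts and areas, for illustration only (not the real lab values).
counts_toy <- tibble::tibble(state = c("AA", "BB"), n = c(10L, 40L))
states_toy <- tibble::tibble(
  abbreviation = c("AA", "BB"),
  area         = c(2000, 100000)  # assumed unit: square miles
)

# Join counts to areas, then compute locations per thousand square miles.
density <- counts_toy %>%
  inner_join(states_toy, by = c("state" = "abbreviation")) %>%
  mutate(per_thousand_sq_mi = n / area * 1000) %>%
  arrange(desc(per_thousand_sq_mi))

print(density)
```

In this toy example the small state ranks first even though it has fewer locations, which is the effect described above.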
# Combine datasets
dn <- dennys %>%
  mutate(establishment = "Denny's")
lq <- laquinta %>%
  mutate(establishment = "La Quinta")
dn_lq <- bind_rows(dn, lq)
dn_lq %>%
  filter(state == "NC") %>%
  ggplot(aes(x = longitude, y = latitude, color = establishment)) +
  geom_point(alpha = 0.5, size = 2) +
  labs(title = "Denny's and La Quinta Locations in North Carolina",
       x = "Longitude",
       y = "Latitude",
       color = "Establishment") +
  theme_minimal() +
  theme(legend.position = "bottom")
Yes, the joke that La Quinta is Spanish for “next to Denny’s” appears to hold in North Carolina: many La Quinta locations on the map sit close to a Denny’s.
# Exercise 12 - Texas
dn_lq %>%
  filter(state == "TX") %>%
  ggplot(aes(x = longitude, y = latitude, color = establishment)) +
  geom_point(alpha = 0.5, size = 2) +
  labs(title = "Denny's and La Quinta in Texas",
       x = "Longitude", y = "Latitude") +
  theme_minimal()
Yes, the joke appears to hold very well in Texas. Denny’s and La Quinta locations cluster in the same areas of the map.
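Judging closeness from a scatterplot is subjective, so one way to back up these answers is to compute, for each La Quinta, the distance to its nearest Denny’s. The sketch below uses the haversine great-circle formula in base R. It is not part of the analysis above: `haversine_km` is a helper written for this illustration, the Earth radius of 6371 km is an approximation, and the coordinates are made up.

```r
# Great-circle distance in km between points given in decimal degrees.
# lon2/lat2 may be vectors; r is an approximate mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(pmin(1, a)))
}

# Made-up coordinates for illustration only.
dn_toy <- data.frame(longitude = c(-97.0, -95.4), latitude = c(32.8, 29.8))
lq_toy <- data.frame(longitude = c(-97.0, -98.5), latitude = c(32.9, 29.4))

# For each La Quinta, the distance to the closest Denny's.
nearest_km <- sapply(seq_len(nrow(lq_toy)), function(i) {
  min(haversine_km(lq_toy$longitude[i], lq_toy$latitude[i],
                   dn_toy$longitude, dn_toy$latitude))
})

print(nearest_km)
```

With the real data, the same idea applies after filtering both datasets to one state. A distribution of nearest distances concentrated near zero would support the joke more firmly than the plot alone.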