Languages spoken at home in the United States

library(tidyverse) # provides read_csv(), dplyr, forcats, and ggplot2
languages <- read_csv(path_data)

Explore

glimpse(), from the dplyr package, provides a high-level overview of the data set.

glimpse(languages)
Rows: 2,142
Columns: 5
$ geoid    <chr> "04000US01", "04000US01", "04000US01", "04000US01", "04000US0…
$ state    <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alaba…
$ language <chr> "Spanish", "French", "Haitian", "Italian", "Portuguese", "Ger…
$ speakers <dbl> 156752, 6515, 1100, 1345, 2546, 8607, 741, 781, 1995, 364, 33…
$ percent  <dbl> 3.3336998407, 0.1385567933, 0.0233940864, 0.0286045874, 0.054…

The following code computes the total number of at-home speakers of each language across all states, sorted from most to fewest speakers. Only the first 10 of the 42 languages are printed.

languages |>
  group_by(language) |>
  summarize(total = sum(speakers, na.rm = TRUE)) |>
  arrange(desc(total))
# A tibble: 42 × 2
   language      total
   <chr>         <dbl>
 1 Spanish    41157140
 2 Chinese     3460422
 3 Tagalog     1723342
 4 Vietnamese  1528461
 5 Arabic      1305000
 6 French      1210216
 7 Korean      1085969
 8 Russian      984024
 9 Hindi        875078
10 German       871275
# ℹ 32 more rows

Spanish has the most at-home speakers of any non-English language.

Top languages spoken in MA

Spanish, Portuguese, and Chinese are the most common non-English languages spoken in Massachusetts. You can edit the code inside filter() to visualize a different state.

languages |>
  filter(state == "Massachusetts") |> # Edit this line to explore a different state
  mutate(language = fct_reorder(language, percent, .desc = TRUE)) |>
  slice_max(percent, n = 9) |>
  ggplot(aes(language, percent)) +
  geom_col(fill = pal[[1]]) +
  theme_minimal()
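The fct_reorder() call in the code above controls the order of the bars: it reorders the levels of language by percent, largest first when .desc = TRUE. A minimal sketch of that step in isolation, using three Alabama rows from the glimpse() output with percent rounded to three decimals:

```r
library(forcats)

# Alabama rows from the glimpse() output above (percent rounded)
language <- c("French", "Spanish", "Haitian")
percent  <- c(0.139, 3.334, 0.023)

# Reorder the levels of `language` by `percent`, largest first
f <- fct_reorder(language, percent, .desc = TRUE)
levels(f)
# [1] "Spanish" "French"  "Haitian"
```

Without this step, ggplot2 would place the bars in alphabetical order of language.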

How does Massachusetts compare to Connecticut?

languages |>
  filter(state %in% c("Massachusetts", "Connecticut")) |> # Edit this line to compare other states
  mutate(language = fct_reorder(language, percent, .desc = TRUE)) |>
  slice_max(percent, n = 9) |> # Keeps the 9 largest rows across both states combined
  ggplot(aes(language, percent, fill = state)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = pal[1:2]) +
  theme_minimal() +
  labs(title = "Languages spoken at home in CT and MA")
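slice_max() respects grouping: on an ungrouped tibble it ranks all rows together, while after group_by() it ranks rows within each group. When comparing states, that choice determines whether each state contributes the same number of languages. The sketch below illustrates the difference on a toy tibble; the state names and percent values are made up for illustration and are not taken from the data.

```r
library(dplyr)

# Toy data: states and percents are hypothetical, for illustration only
toy <- tibble(
  state    = rep(c("State A", "State B"), each = 3),
  language = rep(c("Spanish", "Portuguese", "Chinese"), times = 2),
  percent  = c(9, 3, 2, 12, 1, 0.5)
)

# Ungrouped: the 2 largest rows overall, regardless of state
top_overall <- toy |>
  slice_max(percent, n = 2)

# Grouped: the 2 largest rows within each state
top_by_state <- toy |>
  group_by(state) |>
  slice_max(percent, n = 2) |>
  ungroup()
```

Here top_overall contains only the two Spanish rows, while top_by_state contains Spanish and Portuguese for each state.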

States where Tagalog is spoken most commonly

languages |> 
  filter(language == "Tagalog") |> # Edit this line to explore a different language
  slice_max(order_by = percent, n = 10) |> 
  mutate(state = fct_reorder(state, percent, .desc = TRUE)) |> 
  ggplot(aes(state, percent)) +
  geom_col(fill = pal[[1]]) +
  theme_minimal() +
  labs(title = "Tagalog speakers by state")

Most common language by state

Spanish is the most common non-English language spoken at home in 47 of the 51 state-level areas in the data (the counts below sum to 51).

languages |> 
  group_by(state) |> 
  filter(speakers == max(speakers)) |> 
  ungroup() |> 
  count(language, sort = TRUE) |> 
  rename(`Most common language` = language)
# A tibble: 4 × 2
  `Most common language`                                         n
  <chr>                                                      <int>
1 Spanish                                                       47
2 French                                                         2
3 Ilocano, Samoan, Hawaiian, or other Austronesian languages     1
4 Other Native languages of North America                        1
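Note that filter(speakers == max(speakers)) keeps every row that ties for a state's maximum, so a state with a tie would be counted more than once. If exactly one row per state is needed, slice_max() with with_ties = FALSE is one alternative. A small sketch: the Alabama rows come from the glimpse() output above, while "State X" and its values are made up to force a tie.

```r
library(dplyr)

# Alabama rows are from the glimpse() output; "State X" is hypothetical
toy <- tribble(
  ~state,    ~language, ~speakers,
  "Alabama", "Spanish",    156752,
  "Alabama", "French",       6515,
  "Alabama", "Haitian",      1100,
  "State X", "Spanish",       100,
  "State X", "French",        100
)

# filter() keeps every row that ties for the group maximum
with_filter <- toy |>
  group_by(state) |>
  filter(speakers == max(speakers)) |>
  ungroup()

# slice_max() with with_ties = FALSE keeps exactly one row per state
one_per_state <- toy |>
  group_by(state) |>
  slice_max(speakers, n = 1, with_ties = FALSE) |>
  ungroup()
```

In this toy example with_filter has three rows (two for "State X"), while one_per_state has one row per state.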

The only states where Spanish is not the most common non-English language spoken at home are Alaska, Hawaii, Maine, and Vermont.

languages |> 
  group_by(state) |> 
  filter(speakers == max(speakers)) |> 
  ungroup() |> 
  filter(language != "Spanish")
# A tibble: 4 × 5
  geoid     state   language                                    speakers percent
  <chr>     <chr>   <chr>                                          <dbl>   <dbl>
1 04000US02 Alaska  Other Native languages of North America        27126    3.96
2 04000US15 Hawaii  Ilocano, Samoan, Hawaiian, or other Austro…   125233    9.17
3 04000US23 Maine   French                                         33580    2.60
4 04000US50 Vermont French                                          8356    1.36

Two of these states (Alaska and Hawaii) have multiple official languages:

  • Hawaii: English and Hawaiian (included in “Ilocano, Samoan, Hawaiian, or other Austronesian languages”)

  • Alaska: English and 20 Alaska Native languages, including Yup’ik and Inupiaq (included in “Other Native languages of North America”)

Maine and Vermont

French is the most common non-English language spoken at home in both Maine (2.60%) and Vermont (1.36%). Other common languages in these two states include German, Spanish, and Chinese.

languages |> 
  filter(
    state %in% c("Maine", "Vermont"),
    language %in% c("German", "Spanish", "Chinese", "French") 
  ) |> 
  ggplot(aes(language, percent, fill = state)) +
  geom_col() +
  scale_fill_manual(values = pal[1:2]) +
  facet_wrap(vars(state)) +
  guides(fill = "none") +
  theme_minimal()