languages <- read_csv(path_data)
Languages spoken at home in the United States
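The code throughout assumes that the tidyverse is attached and that `path_data` and `pal` are already defined; neither definition appears in this section. A minimal setup sketch follows, in which the file location, the toy rows, and the palette colors are all placeholders rather than the author's actual values:

```r
library(tidyverse)  # attaches readr, dplyr, forcats, and ggplot2

# Hypothetical stand-in for the real data file: a few made-up rows written
# to a temporary CSV so that this sketch runs on its own.
path_data <- tempfile(fileext = ".csv")
write_csv(
  tibble(
    geoid = c("X1", "X1", "X2", "X2"),
    state = c("State A", "State A", "State B", "State B"),
    language = c("Spanish", "French", "Spanish", "French"),
    speakers = c(100, 20, 50, 80),
    percent = c(10, 2, 5, 8)
  ),
  path_data
)

# Assumed palette; any character vector with at least two colors would do.
pal <- c("#1b9e77", "#d95f02")

languages <- read_csv(path_data, show_col_types = FALSE)
```

With the real file, only `path_data` would change; the later chunks use `pal[[1]]` and `pal[1:2]`, so the palette needs at least two entries.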
Explore
glimpse(), from the dplyr package, provides a high-level overview of the data set.
glimpse(languages)
Rows: 2,142
Columns: 5
$ geoid <chr> "04000US01", "04000US01", "04000US01", "04000US01", "04000US0…
$ state <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alaba…
$ language <chr> "Spanish", "French", "Haitian", "Italian", "Portuguese", "Ger…
$ speakers <dbl> 156752, 6515, 1100, 1345, 2546, 8607, 741, 781, 1995, 364, 33…
$ percent <dbl> 3.3336998407, 0.1385567933, 0.0233940864, 0.0286045874, 0.054…
The following code shows us the total number of at-home speakers of each language, sorted by the number of speakers. Page through the output to see the totals for other languages.
languages |>
  group_by(language) |>
  summarize(total = sum(speakers, na.rm = TRUE)) |>
  arrange(desc(total))
# A tibble: 42 × 2
language total
<chr> <dbl>
1 Spanish 41157140
2 Chinese 3460422
3 Tagalog 1723342
4 Vietnamese 1528461
5 Arabic 1305000
6 French 1210216
7 Korean 1085969
8 Russian 984024
9 Hindi 875078
10 German 871275
# ℹ 32 more rows
Spanish has the most at-home speakers of any non-English language.
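In a static rendering the output cannot be paged. One option is to ask tibble to print every row with `print(n = Inf)`. A small sketch on made-up data (the `toy` tibble and its values are hypothetical, not drawn from the real data set):

```r
library(dplyr)

# Hypothetical toy data standing in for `languages`
toy <- tibble(
  state = c("State A", "State A", "State B", "State B"),
  language = c("Spanish", "French", "Spanish", "French"),
  speakers = c(100, 20, 50, NA)
)

totals <- toy |>
  group_by(language) |>
  summarize(total = sum(speakers, na.rm = TRUE)) |>  # NA counts are skipped
  arrange(desc(total))

print(totals, n = Inf)  # n = Inf prints all rows, not just the first 10
```

Applied to the real summary, the same `print(n = Inf)` call would list all 42 languages.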
Top languages spoken in MA
Spanish, Portuguese, and Chinese are the most common non-English languages spoken in Massachusetts. You can edit the code inside filter() to visualize a different state.
languages |>
  filter(state == "Massachusetts") |> # Edit this line to explore a different state
  mutate(language = fct_reorder(language, percent, .desc = TRUE)) |>
  slice_max(percent, n = 9) |>
  ggplot(aes(language, percent)) +
  geom_col(fill = pal[[1]]) +
  theme_minimal()
How does Massachusetts compare to Connecticut?
languages |>
  filter(state %in% c("Massachusetts", "Connecticut")) |> # Edit this line to compare other states
  mutate(language = fct_reorder(language, percent, .desc = TRUE)) |>
  slice_max(percent, n = 9) |> # Keeps the 9 largest rows across both states combined
  ggplot(aes(language, percent, fill = state)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = pal[1:2]) +
  theme_minimal() +
  labs(title = "Languages spoken at home in CT and MA")
States where Tagalog is spoken most commonly
languages |>
  filter(language == "Tagalog") |> # Edit this line to explore a different language
  slice_max(order_by = percent, n = 10) |>
  mutate(state = fct_reorder(state, percent, .desc = TRUE)) |>
  ggplot(aes(state, percent)) +
  geom_col(fill = pal[[1]]) +
  theme_minimal() +
  labs(title = "Tagalog speakers by state")
Most common language by state
In 47 states, Spanish is the most common non-English language spoken at home.
languages |>
  group_by(state) |>
  filter(speakers == max(speakers)) |>
  ungroup() |>
  count(language, sort = TRUE) |>
  rename(`Most common language` = language)
# A tibble: 4 × 2
`Most common language` n
<chr> <int>
1 Spanish 47
2 French 2
3 Ilocano, Samoan, Hawaiian, or other Austronesian languages 1
4 Other Native languages of North America 1
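Note that `filter(speakers == max(speakers))` keeps every row tied for the maximum within a state, so a tie would make that state count more than once. If exactly one row per state is needed, `slice_max()` with `with_ties = FALSE` is one alternative. A sketch on hypothetical toy data with a deliberate tie:

```r
library(dplyr)

# Hypothetical toy data; "State B" has a deliberate tie
toy <- tibble(
  state = c("State A", "State A", "State B", "State B"),
  language = c("Spanish", "French", "Spanish", "French"),
  speakers = c(100, 20, 50, 50)
)

# filter() keeps both tied rows for State B
with_ties <- toy |>
  group_by(state) |>
  filter(speakers == max(speakers)) |>
  ungroup()

# slice_max(with_ties = FALSE) keeps exactly one row per state
one_per_state <- toy |>
  group_by(state) |>
  slice_max(speakers, n = 1, with_ties = FALSE) |>
  ungroup()

nrow(with_ties)      # 3
nrow(one_per_state)  # 2
```

Which of the tied rows survives under `with_ties = FALSE` depends on row order, so the choice between the two approaches is a judgment call about how ties should be reported.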
The only states where Spanish is not the most common non-English language spoken at home are Alaska, Hawaii, Maine, and Vermont.
languages |>
  group_by(state) |>
  filter(speakers == max(speakers)) |>
  ungroup() |>
  filter(language != "Spanish")
# A tibble: 4 × 5
geoid state language speakers percent
<chr> <chr> <chr> <dbl> <dbl>
1 04000US02 Alaska Other Native languages of North America 27126 3.96
2 04000US15 Hawaii Ilocano, Samoan, Hawaiian, or other Austro… 125233 9.17
3 04000US23 Maine French 33580 2.60
4 04000US50 Vermont French 8356 1.36
Two of these states, Alaska and Hawaii, have multiple official languages:
Hawaii: English and Hawaiian (included in “Ilocano, Samoan, Hawaiian, or other Austronesian languages”)
Alaska: English and 20 Alaska Native languages, including Yup’ik and Inupiaq (included in “Other Native languages of North America”)
Maine and Vermont
Maine and Vermont both have a relatively large proportion of French speakers. Other common languages include German, Spanish, and Chinese.
languages |>
  filter(
    state %in% c("Maine", "Vermont"),
    language %in% c("German", "Spanish", "Chinese", "French")
  ) |>
  ggplot(aes(language, percent, fill = state)) +
  geom_col() +
  scale_fill_manual(values = pal[1:2]) +
  facet_wrap(vars(state)) +
  guides(fill = "none") +
  theme_minimal()
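The same kind of two-state comparison can also be read as a small table, with one row per language and one column per state. A sketch using `tidyr::pivot_wider()` on hypothetical numbers (the `toy` tibble, its state names, and its percentages are placeholders, not the real values for Maine and Vermont):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; values are made up for illustration
toy <- tibble(
  state = c("State A", "State A", "State B", "State B"),
  language = c("French", "Spanish", "French", "Spanish"),
  percent = c(2.5, 1.0, 1.5, 1.2)
)

# One row per language, one column per state
wide <- toy |>
  pivot_wider(names_from = state, values_from = percent)

wide
```

On the real data, the same `pivot_wider()` call could follow the `filter()` step in place of the plotting code.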