Assignment 6

Author

Colby Chavarie

Open the assign06.qmd file and complete the exercises.

This is a very open-ended assignment. There are three musts:

  1. You must use the tidycensus package to get either decennial or ACS data from the US Census Bureau.

  2. You must get data for two different variables and they can’t be population or median home values.

  3. You must show all the code you used to get the data and create the table or chart.

You can then create either a cool table or a chart comparing the two variables. They can be from any region and for any geography; it doesn’t necessarily need to be Maine.

Note: you will receive deductions for not using tidyverse syntax in this assignment. That includes the use of filter, mutate, and the up-to-date pipe operator |>.

The Grading Rubric is available at the end of this document.

We’ll preload the following potentially useful packages:

library(tidyverse)
library(tidycensus)
library(gapminder)
library(gt)
library(gtExtras)
library(scales)

This is your work area. Add as many code cells as you need.

# Load tidycensus and tidyverse
library(tidycensus)
library(tidyverse)

# Set your Census API key (read from an environment variable rather than hard-coding it)
census_api_key(Sys.getenv("CENSUS_API_KEY"))
To install your API key for use in future sessions, run this function with `install = TRUE`.
# Choose the variables (both are raw counts, not percentages)
# B23025_003: Civilian labor force (population 16 years and over)
# B15003_022: People 25 years and over whose highest degree is a bachelor's

# Fetch the data
acs_data <- get_acs(
  geography = "state", 
  variables = c("B23025_003", "B15003_022"),
  year = 2021, 
  survey = "acs5"
)
Getting data from the 2017-2021 5-year ACS
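As an alternative to relabeling variable codes after the fact, the request itself can carry readable names. The sketch below wraps the same `get_acs()` call in a hypothetical helper, `fetch_state_acs_wide()`, and assumes a Census API key is available in the `CENSUS_API_KEY` environment variable. The names `labor_force` and `bachelors` are illustrative choices, not part of the original code.

```r
# Hypothetical helper: not part of the original assignment code.
# Calling it needs the tidycensus package, a network connection, and a
# Census API key (tidycensus looks for the CENSUS_API_KEY environment variable).
fetch_state_acs_wide <- function(year = 2021) {
  tidycensus::get_acs(
    geography = "state",
    # A named vector replaces the raw codes with these names in the result
    variables = c(labor_force = "B23025_003", bachelors = "B15003_022"),
    year = year,
    survey = "acs5",
    # One row per state; estimates end in "E" and margins of error in "M"
    output = "wide"
  )
}

# Not run here:
# acs_wide <- fetch_state_acs_wide()
# Expected columns: GEOID, NAME, labor_forceE, labor_forceM, bachelorsE, bachelorsM
```

With wide output there is one row per state, so no reshaping is needed before building a table.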
# View the first few rows of the data
head(acs_data)
# A tibble: 6 × 5
  GEOID NAME    variable   estimate   moe
  <chr> <chr>   <chr>         <dbl> <dbl>
1 01    Alabama B15003_022   563628  5772
2 01    Alabama B23025_003  2298013  8669
3 02    Alaska  B15003_022    92691  2130
4 02    Alaska  B23025_003   363718  2386
5 04    Arizona B15003_022   923339  9073
6 04    Arizona B23025_003  3401906  7579
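The `moe` column above is easy to overlook, and the cleaning step that follows drops it. As a rough sketch, the rows printed by `head(acs_data)` can be used to check how large each margin of error is relative to its estimate. The column names `moe_share` and `cv` are illustrative; the 1.645 divisor reflects that ACS margins of error are published at the 90% confidence level.

```r
library(dplyr)
library(tibble)

# Rows copied from the head(acs_data) output above
acs_sample <- tribble(
  ~NAME,     ~variable,    ~estimate, ~moe,
  "Alabama", "B15003_022",    563628, 5772,
  "Alabama", "B23025_003",   2298013, 8669,
  "Alaska",  "B15003_022",     92691, 2130,
  "Alaska",  "B23025_003",    363718, 2386,
  "Arizona", "B15003_022",    923339, 9073,
  "Arizona", "B23025_003",   3401906, 7579
)

acs_reliability <- acs_sample |>
  mutate(
    # Margin of error as a share of the estimate
    moe_share = moe / estimate,
    # Approximate coefficient of variation: standard error / estimate
    cv = (moe / 1.645) / estimate
  )

acs_reliability
```

For state-level counts like these the margins are small relative to the estimates, but the same check matters more for smaller geographies such as tracts.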
# Clean and manipulate the data
acs_data_clean <- acs_data |>
  # Keep the relevant columns and rename NAME for clarity
  select(state = NAME, variable, estimate) |>
  # Add a readable label for each variable code
  mutate(variable_label = case_when(
    variable == "B23025_003" ~ "Civilian labor force (16 years and over)",
    variable == "B15003_022" ~ "Bachelor's degree holders (25 years and over)"
  ))

# View the cleaned data
head(acs_data_clean)
# A tibble: 6 × 4
  state   variable   estimate variable_label
  <chr>   <chr>         <dbl> <chr>
1 Alabama B15003_022   563628 Bachelor's degree holders (25 years and over)
2 Alabama B23025_003  2298013 Civilian labor force (16 years and over)
3 Alaska  B15003_022    92691 Bachelor's degree holders (25 years and over)
4 Alaska  B23025_003   363718 Civilian labor force (16 years and over)
5 Arizona B15003_022   923339 Bachelor's degree holders (25 years and over)
6 Arizona B23025_003  3401906 Civilian labor force (16 years and over)
# Create a bar plot comparing the civilian labor force and bachelor's degree holders
ggplot(acs_data_clean, aes(x = state, y = estimate, fill = variable_label)) +
  geom_col(position = "dodge") +
  coord_flip() +
  labs(title = "Civilian Labor Force vs Bachelor's Degree Holders by State",
       x = "State",
       y = "Estimate (number of people)",
       fill = "Variable") +
  # Show counts with thousands separators instead of scientific notation
  scale_y_continuous(labels = scales::label_comma()) +
  theme_minimal() +
  scale_fill_manual(values = c("blue", "green"))

# Create a table using gt
acs_data_clean |>
  # Drop the variable code so each state collapses to a single row
  select(-variable) |>
  pivot_wider(names_from = variable_label, values_from = estimate) |>
  gt() |>
  tab_header(
    title = "ACS 2017-2021: Civilian Labor Force vs Bachelor's Degree Holders"
  ) |>
  tab_spanner(
    label = "Labor Force & Education",
    columns = c("Civilian labor force (16 years and over)", "Bachelor's degree holders (25 years and over)")
  )
ACS 2017-2021: Civilian Labor Force vs Bachelor's Degree Holders
                      Labor Force & Education
state                 Civilian labor force (16 years and over)  Bachelor's degree holders (25 years and over)
Alabama               2298013                                   563628
Alaska                363718                                    92691
Arizona               3401906                                   923339
Arkansas              1384596                                   313527
California            19980462                                  5855383
Colorado              3120868                                   1051023
Connecticut           1940626                                   561567
Delaware              492450                                    134252
District of Columbia  402460                                    124285
Florida               10377036                                  3038293
Georgia               5274596                                   1426415
Hawaii                717453                                    226399
Idaho                 883059                                    231259
Illinois              6686514                                   1910757
Indiana               3411413                                   797977
Iowa                  1686696                                   423852
Kansas                1512063                                   415201
Kentucky              2121880                                   461841
Louisiana             2160206                                   511447
Maine                 711350                                    209253
Maryland              3296484                                   934036
Massachusetts         3876978                                   1215939
Michigan              5002960                                   1287856
Minnesota             3105784                                   944751
Mississippi           1331967                                   280355
Missouri              3084786                                   789957
Montana               548944                                    166825
Nebraska              1046463                                   274664
Nevada                1538959                                   359703
New Hampshire         767453                                    230314
New Jersey            4893875                                   1611515
New Mexico            952564                                    225538
New York              10306430                                  2996306
North Carolina        5119397                                   1481848
North Dakota          416764                                    112023
Ohio                  5970869                                   1483021
Oklahoma              1881598                                   457256
Oregon                2146693                                   644813
Pennsylvania          6662890                                   1813647
Rhode Island          588135                                    160523
South Carolina        2444002                                   653988
South Dakota          463198                                    119331
Tennessee             3380708                                   859255
Texas                 14390216                                  3791665
Utah                  1648313                                   450953
Vermont               348907                                    110000
Virginia              4422588                                   1338831
Washington            3899915                                   1217575
West Virginia         786365                                    165975
Wisconsin             3123629                                   833670
Wyoming               297398                                    69809
Puerto Rico           1236011                                   469856
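Both variables are raw counts, so larger states dominate any side-by-side comparison. One rough way to put states on a common scale is to divide one count by the other. The sketch below does this for three states using counts copied from the table above; the column names are illustrative. This ratio is not the share of adults with a bachelor's degree, because the two variables cover different populations (25 and over versus the labor force 16 and over). A true percentage would need the total population 25 and over (B15003_001) as the denominator, which this document does not fetch.

```r
library(dplyr)
library(tibble)

# Counts copied from the table above; the column names are illustrative
new_england <- tribble(
  ~state,          ~labor_force, ~bachelors,
  "Maine",               711350,     209253,
  "New Hampshire",       767453,     230314,
  "Vermont",             348907,     110000
)

ratio_tbl <- new_england |>
  # Bachelor's degree holders per 100 people in the civilian labor force
  mutate(bachelors_per_100_labor_force = 100 * bachelors / labor_force) |>
  arrange(desc(bachelors_per_100_labor_force))

ratio_tbl
```

The same `mutate()` step would apply unchanged to the full 52-row wide table.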

Submission

To submit your assignment:

  • Change the author name to your name in the YAML portion at the top of this document.
  • Render your document to HTML and publish it to RPubs.
  • Submit the link to your RPubs document in the Brightspace comments section for this assignment.
  • Click on the “Add a File” button and upload your .qmd file for this assignment to Brightspace.

Grading Rubric

Each item (percent overall) is scored on a four-level scale: 100% - flawless; 67% - minor issues; 33% - moderate issues; 0% - major issues or not attempted.

  • Chart or table accuracy (45%). 100%: no errors, good labels, everything is clearly visible in the rendered document.
  • At least two valid variables used from US census data, can be census or ACS (40%).
  • Messages and/or errors suppressed from rendered document and all code is shown (7%).
  • Submitted properly to Brightspace (8%). 67%: NA. 33%: NA. 0%: you must submit according to instructions to receive any credit for this portion.