Assignment 6

Author

Patrick Weed

Go to the shared posit.cloud workspace for this class and open the assign06 project. Open the assign06.qmd file and complete the exercises.

This is a very open-ended assignment. There are three musts:

  1. You must use the tidycensus package to get either decennial or ACS data from the US Census Bureau.

  2. You must get data for two different variables and they can’t be population or median home values.

  3. You must show all the code you used to get the data and create the table or chart.

You can then create either a cool table or a chart comparing the two variables. They can be from any region and for any geography; it doesn’t necessarily need to be Maine.

Note: you will receive deductions for not using tidyverse syntax in this assignment. That includes the use of filter, mutate, and the up-to-date pipe operator |>.

The Grading Rubric is available at the end of this document.

We’ll preload the following potentially useful packages:

library(tidyverse)
library(tidycensus)
library(gapminder)
library(gt)
library(gtExtras)
library(scales)

This is your work area. Add as many code cells as you need.

# Load necessary libraries
library(tidyverse)
library(tidycensus)
library(gt)
library(gtExtras)
library(scales)

# Set your Census API key
# Read the key from the CENSUS_API_KEY environment variable instead of hard-coding the secret in a published document
census_api_key(Sys.getenv("CENSUS_API_KEY"))
To install your API key for use in future sessions, run this function with `install = TRUE`.
# Define New England states
new_england_states <- c("CT", "ME", "MA", "NH", "RI", "VT")

# Get ACS data for the desired variables for New England
education_health_data <- get_acs(
  geography = "state",
  variables = c(
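    bachelors_degree = "B15003_022",  # Count of people 25+ whose highest attainment is a bachelor's degree (not "or higher")
    uninsured_rate = "B27001_004"     # A count cell from B27001 (coverage by sex and age), not an uninsured rate; confirm its label with load_variables()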
  ),
  year = 2021,
  survey = "acs1",
  state = new_england_states
)
Getting data from the 2021 1-year ACS
The 1-year ACS provides data for geographies with populations of 65,000 and greater.
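The estimates requested above are raw counts, so turning them into rates needs a denominator. The following is a minimal sketch of that step, assuming the data were requested with the `summary_var` argument of `get_acs()`, which adds a `summary_est` column holding the chosen total. The tibble `toy_acs` is a hand-made stand-in with hypothetical numbers and state names, not Census data.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the output of get_acs(..., summary_var = "<table total>").
# The names and numbers are made up for illustration only.
toy_acs <- tribble(
  ~NAME,     ~variable,          ~estimate, ~summary_est,
  "State A", "bachelors_degree",       250,         1000,
  "State B", "bachelors_degree",       120,          800
)

# share = count / table total, kept as a proportion between 0 and 1
toy_rates <- toy_acs |>
  mutate(share = estimate / summary_est) |>
  select(NAME, variable, share)

print(toy_rates)
```

Keeping the result as a proportion (rather than multiplying by 100) matters because percent formatters typically do that scaling themselves.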
# Clean and reshape the data
education_health_data <- education_health_data |>
  select(GEOID, NAME, variable, estimate) |>
  pivot_wider(names_from = variable, values_from = estimate) |>
  rename(
    Bachelors_Degree_Percent = bachelors_degree,
    Uninsured_Rate_Percent = uninsured_rate
  )

# Create a comparison table
comparison_table <- education_health_data |>
  select(NAME, Bachelors_Degree_Percent, Uninsured_Rate_Percent)

# Display the comparison table using gt
# The estimates are raw counts (despite the column names), so format them as
# whole numbers rather than percentages; c() replaces the deprecated vars().
# B27001_004 is a count cell from the health insurance coverage table, not an uninsured rate.
comparison_table |>
  gt() |>
  tab_header(
    title = "Bachelor's Degree Holders and B27001_004 Estimate in New England (2021)"
  ) |>
  fmt_number(columns = c(Bachelors_Degree_Percent, Uninsured_Rate_Percent), decimals = 0) |>
  cols_label(
    NAME = "State",
    Bachelors_Degree_Percent = "Bachelor's Degree Holders (count)",
    Uninsured_Rate_Percent = "B27001_004 Estimate (count)"
  )
Bachelor's Degree Holders and B27001_004 Estimate in New England (2021)
State Bachelor's Degree Holders (count) B27001_004 Estimate (count)
Connecticut 587,690 108,099
Maine 224,987 36,291
Massachusetts 1,250,748 209,269
New Hampshire 247,356 37,059
Rhode Island 161,594 32,087
Vermont 120,373 15,701
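Percent formatters expect proportions between 0 and 1 and multiply by 100 by default, so they should be applied only to computed shares, never to raw counts. The sketch below illustrates the arithmetic with `scales::label_percent()` as a stand-in for the table formatter (to my knowledge gt's `fmt_percent()` scales the same way unless `scale_values = FALSE` is set). The value 587690 is the Connecticut bachelor's degree estimate; the 0.25 share is hypothetical.

```r
library(scales)

# One-decimal percent formatter with comma separators
pct <- label_percent(accuracy = 0.1, big.mark = ",")

# A proportion formats as expected (0.25 is a hypothetical share)
share_label <- pct(0.25)

# A raw count is also multiplied by 100, which produces a meaningless percentage
count_label <- pct(587690)

# Counts are better shown with a plain number formatter
count_plain <- label_comma()(587690)

print(c(share_label, count_label, count_plain))
```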
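# Create a scatter plot to compare the two variables (both are raw counts, not percentages)
ggplot(comparison_table, aes(x = Bachelors_Degree_Percent, y = Uninsured_Rate_Percent, label = NAME)) +
  geom_point(color = "blue", size = 3) +
  geom_text(vjust = -0.5, hjust = 0.5, size = 3) +
  # Show axis values with thousands separators instead of scientific notation
  scale_x_continuous(labels = label_comma()) +
  scale_y_continuous(labels = label_comma()) +
  labs(
    title = "Bachelor's Degree Holders vs. B27001_004 Estimate in New England (2021)",
    x = "People with a Bachelor's Degree (count)",
    y = "B27001_004 Estimate (count)"
  ) +
  theme_minimal()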

Submission

To submit your assignment:

  • Change the author name to your name in the YAML portion at the top of this document.
  • Render your document to HTML and publish it to RPubs.
  • Submit the link to your RPubs document in the Brightspace comments section for this assignment.
  • Click on the “Add a File” button and upload your .qmd file for this assignment to Brightspace.

Grading Rubric

Each item is scored on this scale: 100% (flawless), 67% (minor issues), 33% (moderate issues), 0% (major issues or not attempted).

  • Chart or table accuracy (45%): no errors, good labels, everything is clearly visible in the rendered document.
  • At least two valid variables used from US census data, either decennial census or ACS (40%).
  • Messages and/or errors suppressed from rendered document and all code is shown (7%).
  • Submitted properly to Brightspace (8%): partial-credit levels are NA. You must submit according to instructions to receive any credit for this portion.
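
For the rubric item on suppressed messages, one common approach in a Quarto document is to set execution options at the top of a code cell. A minimal sketch, assuming the standard Quarto `#|` cell options; the cell contents here are only an example:

```r
#| message: false
#| warning: false
#| echo: true

# With the options above, package startup messages and warnings are hidden
# in the rendered document while the code itself is still shown.
library(tidyverse)
library(tidycensus)
```

The same options can instead be set once for the whole document under an `execute:` key in the YAML header.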