Class 12

Class 13: Final Project Lab 1

Class 14: Final Project Lab 2

Class 15: Final Project Presentations





New functions and concepts




Data analysis and visualization project examples and frameworks

American Community Survey Table Types


In the American Community Survey, you will most commonly want to use Detailed Tables and Subject Tables:


Educational Attainment data example of Detailed tables

library(tidycensus)
library(tidyverse)

# load in acs variables
acs_vars <- load_variables(2019, "acs1", cache = TRUE)

# import and process population data
raw_city_pop <- get_acs(survey = "acs1", 
                         geography = "place", 
                         variables = "B01003_001",
                         year = 2019)

city_pop <- raw_city_pop %>% 
  rename(pop = estimate) %>% 
  select(GEOID, pop)

# select the variables in the C15003 table and create a variable-label key
c15003_vars <- acs_vars %>% 
  filter(grepl("C15003", name)) %>% 
  mutate(label = str_replace_all(label, "Estimate!!", ""), # use string replace to make the label more readable
         label = str_replace_all(label, ":!!", "_"), # and usable for a column header if we want
         label = str_replace_all(label, " ", "_"),
         label = str_replace_all(label, "[',]", "")) %>% 
  rename(variable = name) %>% 
  mutate(concept = tolower(concept))

# select the variables in the B15003 table and create a variable-label key
b15003_vars <- acs_vars %>% 
  filter(grepl("B15003", name)) %>% 
  mutate(label = str_replace_all(label, "Estimate!!", ""),
         label = str_replace_all(label, ":!!", "_"),
         label = str_replace_all(label, " ", "_"),
         label = str_replace_all(label, "[',]", "")) %>% 
  rename(variable = name) %>% 
  mutate(concept = tolower(concept))

# import B15003 table for cities to see if this table works for my analysis
b_education_raw <- get_acs(survey = "acs1", 
                       geography = "place", 
                       table = "B15003",
                       year = 2019)

b_education <- b_education_raw %>% 
  left_join(b15003_vars, by = "variable") %>% 
  select(GEOID, NAME, label, estimate, moe) %>% 
  pivot_wider(names_from = label, values_from = c(estimate, moe)) %>% 
  left_join(city_pop, by = "GEOID") %>% 
  select(GEOID, NAME, pop, everything())

# import C15003 table for cities to see if this table works for my analysis
c_education_raw <- get_acs(survey = "acs1", 
                       geography = "place", 
                       table = "C15003",
                       year = 2019)

c_education <- c_education_raw %>% 
  left_join(c15003_vars, by = "variable") %>% 
  select(GEOID, NAME, label, estimate, moe) %>% 
  pivot_wider(names_from = label, values_from = c(estimate, moe)) %>% 
  left_join(city_pop, by = "GEOID") %>% 
  select(GEOID, NAME, pop, everything())

rm(c_education_raw, b_education_raw)
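
The same four str_replace_all() calls are applied to both variable lists above. As a minimal sketch, they could be wrapped in a helper and tried on a few labels. Note that clean_acs_label() is a hypothetical name (it is not part of tidycensus or the script above), and the sample labels are typed in by hand, assuming the "Estimate!!Total:!!..." form that the cleaned column names suggest.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: applies the same four replacements used in the script above
clean_acs_label <- function(label) {
  label %>%
    str_replace_all("Estimate!!", "") %>% # drop the leading "Estimate!!"
    str_replace_all(":!!", "_") %>%       # turn the level separator into "_"
    str_replace_all(" ", "_") %>%         # replace spaces with "_"
    str_replace_all("[',]", "")           # drop apostrophes and commas
}

# Hand-typed sample labels (assumed format, not pulled from the API)
sample_labels <- c("Estimate!!Total:",
                   "Estimate!!Total:!!Bachelor's degree",
                   "Estimate!!Total:!!Some college, less than 1 year")

cleaned_labels <- clean_acs_label(sample_labels)
cleaned_labels
# "Total:" "Total_Bachelors_degree" "Total_Some_college_less_than_1_year"
```

The top-level label keeps its trailing colon, which is why the total column later appears as estimate_Total: and needs backticks when you refer to it.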


The B15003 table has 25 variables; these are the first 8.
variable    label                         concept
B15003_001  Total:                        educational attainment for the population 25 years and over
B15003_002  Total_No_schooling_completed  educational attainment for the population 25 years and over
B15003_003  Total_Nursery_school          educational attainment for the population 25 years and over
B15003_004  Total_Kindergarten            educational attainment for the population 25 years and over
B15003_005  Total_1st_grade               educational attainment for the population 25 years and over
B15003_006  Total_2nd_grade               educational attainment for the population 25 years and over
B15003_007  Total_3rd_grade               educational attainment for the population 25 years and over
B15003_008  Total_4th_grade               educational attainment for the population 25 years and over


The C15003 table has 18 variables; these are the first 8.
variable    label                         concept
C15003_001  Total:                        educational attainment for the population 25 years and over
C15003_002  Total_No_schooling_completed  educational attainment for the population 25 years and over
C15003_003  Total_Nursery_to_4th_grade    educational attainment for the population 25 years and over
C15003_004  Total_5th_and_6th_grade       educational attainment for the population 25 years and over
C15003_005  Total_7th_and_8th_grade       educational attainment for the population 25 years and over
C15003_006  Total_9th_grade               educational attainment for the population 25 years and over
C15003_007  Total_10th_grade              educational attainment for the population 25 years and over
C15003_008  Total_11th_grade              educational attainment for the population 25 years and over


The universe population in the B15003 table has 88 NAs.

GEOID NAME pop estimate_Total: estimate_Total_No_schooling_completed estimate_Total_Nursery_school estimate_Total_Kindergarten estimate_Total_1st_grade estimate_Total_2nd_grade estimate_Total_3rd_grade estimate_Total_4th_grade estimate_Total_5th_grade estimate_Total_6th_grade estimate_Total_7th_grade estimate_Total_8th_grade estimate_Total_9th_grade estimate_Total_10th_grade estimate_Total_11th_grade estimate_Total_12th_grade_no_diploma estimate_Total_Regular_high_school_diploma estimate_Total_GED_or_alternative_credential estimate_Total_Some_college_less_than_1_year estimate_Total_Some_college_1_or_more_years_no_degree estimate_Total_Associates_degree estimate_Total_Bachelors_degree estimate_Total_Masters_degree estimate_Total_Professional_school_degree estimate_Total_Doctorate_degree moe_Total: moe_Total_No_schooling_completed moe_Total_Nursery_school moe_Total_Kindergarten moe_Total_1st_grade moe_Total_2nd_grade moe_Total_3rd_grade moe_Total_4th_grade moe_Total_5th_grade moe_Total_6th_grade moe_Total_7th_grade moe_Total_8th_grade moe_Total_9th_grade moe_Total_10th_grade moe_Total_11th_grade moe_Total_12th_grade_no_diploma moe_Total_Regular_high_school_diploma moe_Total_GED_or_alternative_credential moe_Total_Some_college_less_than_1_year moe_Total_Some_college_1_or_more_years_no_degree moe_Total_Associates_degree moe_Total_Bachelors_degree moe_Total_Masters_degree moe_Total_Professional_school_degree moe_Total_Doctorate_degree
0103076 Auburn city, Alabama 66254 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
0107000 Birmingham city, Alabama 210080 149277 1423 0 0 219 0 68 54 49 620 489 682 1857 3430 5616 2845 36479 6982 8671 20904 14793 26271 11773 3764 2288 4283 734 206 206 277 206 110 87 80 436 497 365 704 1368 1290 932 3807 1807 1966 2544 2344 3203 1993 906 794
0121184 Dothan city, Alabama 68522 48558 874 12 0 0 74 63 0 180 348 173 459 748 1492 1539 1139 11269 2032 3387 7140 4975 8068 3095 1072 419 1165 309 21 206 206 104 53 206 182 267 127 251 263 490 516 339 1024 467 610 710 627 949 500 324 211
0135896 Hoover city, Alabama 85772 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
0137000 Huntsville city, Alabama 201594 137274 797 0 0 38 65 136 0 58 1008 298 919 1818 2775 2248 1700 17554 5286 7753 21270 12206 36462 20380 2099 2404 3421 448 206 206 66 115 194 206 101 655 235 427 704 959 785 714 2689 1359 1401 2422 1757 4204 2810 643 703
0150000 Mobile city, Alabama 188710 127563 1133 0 0 0 0 6 6 289 506 491 751 1245 2201 2973 2188 35375 4475 5496 19774 10355 25019 10469 3081 1730 3022 503 206 206 206 206 13 13 365 321 303 345 696 900 1126 644 2809 1187 1206 2285 1861 2375 1874 759 629
0151000 Montgomery city, Alabama 198525 130848 1367 172 0 0 96 11 229 127 610 743 1238 1880 3896 2258 1687 25681 6166 6393 21935 11053 26054 14546 3273 1433 1523 786 289 206 206 141 23 235 151 446 365 764 940 1474 881 654 2884 1661 1278 2723 2001 2961 1923 804 535
0177256 Tuscaloosa city, Alabama 101133 60968 347 14 0 0 0 0 0 0 80 977 218 701 1907 1464 1075 14168 2900 2601 9399 2998 12677 5690 1621 2131 2261 350 23 206 206 206 206 206 206 100 893 165 524 1085 670 565 2327 929 951 2317 960 2404 1396 706 814


The universe population in the C15003 table has 2 NAs.

GEOID NAME pop estimate_Total: estimate_Total_No_schooling_completed estimate_Total_Nursery_to_4th_grade estimate_Total_5th_and_6th_grade estimate_Total_7th_and_8th_grade estimate_Total_9th_grade estimate_Total_10th_grade estimate_Total_11th_grade estimate_Total_12th_grade_no_diploma estimate_Total_Regular_high_school_diploma estimate_Total_GED_or_alternative_credential estimate_Total_Some_college_less_than_1_year estimate_Total_Some_college_1_or_more_years_no_degree estimate_Total_Associates_degree estimate_Total_Bachelors_degree estimate_Total_Masters_degree estimate_Total_Professional_school_degree estimate_Total_Doctorate_degree moe_Total: moe_Total_No_schooling_completed moe_Total_Nursery_to_4th_grade moe_Total_5th_and_6th_grade moe_Total_7th_and_8th_grade moe_Total_9th_grade moe_Total_10th_grade moe_Total_11th_grade moe_Total_12th_grade_no_diploma moe_Total_Regular_high_school_diploma moe_Total_GED_or_alternative_credential moe_Total_Some_college_less_than_1_year moe_Total_Some_college_1_or_more_years_no_degree moe_Total_Associates_degree moe_Total_Bachelors_degree moe_Total_Masters_degree moe_Total_Professional_school_degree moe_Total_Doctorate_degree
0103076 Auburn city, Alabama 66254 33444 210 43 0 79 102 235 108 137 2858 632 1784 2200 2760 11436 7687 950 2223 2422 327 92 206 99 136 264 191 166 1033 533 957 973 1031 1936 1867 800 792
0107000 Birmingham city, Alabama 210080 149277 1423 341 669 1171 1857 3430 5616 2845 36479 6982 8671 20904 14793 26271 11773 3764 2288 4283 734 310 424 697 704 1368 1290 932 3807 1807 1966 2544 2344 3203 1993 906 794
0121184 Dothan city, Alabama 68522 48558 874 149 528 632 748 1492 1539 1139 11269 2032 3387 7140 4975 8068 3095 1072 419 1165 309 114 326 272 263 490 516 339 1024 467 610 710 627 949 500 324 211
0135896 Hoover city, Alabama 85772 58462 174 134 681 244 11 367 287 467 5928 1591 2788 9115 3430 18619 9630 3718 1278 2607 212 230 668 352 23 327 242 293 1468 941 832 1716 1192 2069 1485 952 598
0137000 Huntsville city, Alabama 201594 137274 797 239 1066 1217 1818 2775 2248 1700 17554 5286 7753 21270 12206 36462 20380 2099 2404 3421 448 234 649 464 704 959 785 714 2689 1359 1401 2422 1757 4204 2810 643 703
0150000 Mobile city, Alabama 188710 127563 1133 12 795 1242 1245 2201 2973 2188 35375 4475 5496 19774 10355 25019 10469 3081 1730 3022 503 27 472 424 696 900 1126 644 2809 1187 1206 2285 1861 2375 1874 759 629
0151000 Montgomery city, Alabama 198525 130848 1367 508 737 1981 1880 3896 2258 1687 25681 6166 6393 21935 11053 26054 14546 3273 1433 1523 786 412 475 862 940 1474 881 654 2884 1661 1278 2723 2001 2961 1923 804 535
0177256 Tuscaloosa city, Alabama 101133 60968 347 14 80 1195 701 1907 1464 1075 14168 2900 2601 9399 2998 12677 5690 1621 2131 2261 350 23 100 897 524 1085 670 565 2327 929 951 2317 960 2404 1396 706 814
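
One way to get NA counts like the ones reported above is to count the missing values in the universe (total) column. This is a small sketch on a toy data frame, not the full b_education table; the three rows copy the universe values from the B15003 rows shown above, and toy_education is a made-up name.

```r
library(dplyr)
library(tibble)

# Toy stand-in for b_education: universe estimates copied from the rows above
toy_education <- tribble(
  ~GEOID,    ~NAME,                      ~`estimate_Total:`,
  "0103076", "Auburn city, Alabama",     NA,
  "0107000", "Birmingham city, Alabama", 149277,
  "0135896", "Hoover city, Alabama",     NA
)

# Count places where the universe population is suppressed (NA)
n_suppressed <- toy_education %>%
  summarise(n_na = sum(is.na(`estimate_Total:`))) %>%
  pull(n_na)

n_suppressed
# 2
```

Running the same summarise() on the full b_education and c_education data frames is one way to compare how much each table is suppressed.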

Which table do you choose?

It depends on your analysis question. In this case, the variables we need to calculate the percent of adults with at least a Bachelor’s degree are in both tables, so we want the table with the least suppressed data: C15003.
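
As a rough sketch of the calculation mentioned here, the share of adults with at least a Bachelor's degree is the sum of the four degree columns divided by the universe column. The two rows below copy estimates from the C15003 output above; toy_c_education and the new column names are made up for illustration, and margins of error are ignored in this sketch.

```r
library(dplyr)
library(tibble)

# Two rows with estimates copied from the C15003 output above
toy_c_education <- tibble(
  NAME = c("Auburn city, Alabama", "Birmingham city, Alabama"),
  `estimate_Total:` = c(33444, 149277),
  estimate_Total_Bachelors_degree = c(11436, 26271),
  estimate_Total_Masters_degree = c(7687, 11773),
  estimate_Total_Professional_school_degree = c(950, 3764),
  estimate_Total_Doctorate_degree = c(2223, 2288)
)

pct_bachelors <- toy_c_education %>%
  mutate(bachelors_plus = estimate_Total_Bachelors_degree +
           estimate_Total_Masters_degree +
           estimate_Total_Professional_school_degree +
           estimate_Total_Doctorate_degree,
         # percent of the 25-and-over population with a Bachelor's degree or higher
         pct_bachelors_plus = round(100 * bachelors_plus / `estimate_Total:`, 1)) %>%
  select(NAME, bachelors_plus, pct_bachelors_plus)

pct_bachelors
# Auburn: 22296 people, 66.7 percent; Birmingham: 44096 people, 29.5 percent
```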





Poverty data example of Subject tables

Subject tables are very useful because they have lots of the data on a given subject in one place. The best way to find them is to:

  • Look at the subject table list to find the subject table you want
  • Then use the parameter dataset = "acs1/subject" (1-year data) or dataset = "acs5/subject" (5-year data) in the load_variables() function to view the subject table in R
  • Filter to create a subject table data frame of your table of interest
  • Identify the variables by comparing the subject table and the variables in your data frame
  • Import the variables using get_acs()

As an example, let’s find the poverty rate for all people in the subject table:

acs1_subject_tables <- load_variables(year = 2019, 
                                      dataset = "acs1/subject",
                                      cache = TRUE) %>%  # select only subject tables
  separate(label, into = paste0("label", 1:9), 
           sep = "!!", 
           fill = "right", 
           remove = FALSE)  # expand label column into multiple columns, at each "!!", to make it slightly more readable

s1701_vars <- acs1_subject_tables %>% 
  filter(grepl("S1701", name))
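
To see what the separate() step does, here is a minimal sketch on a toy variable list. The variable names and labels are invented for illustration (they are not real ACS variables), and only four label columns are used instead of nine to keep the output small.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Invented variable names and labels, just to show the splitting behavior
toy_vars <- tibble(
  name  = c("X0001_C01_001", "X0001_C02_001"),
  label = c("Estimate!!Total!!Population",
            "Estimate!!Below poverty level")
)

split_vars <- toy_vars %>%
  separate(label, into = paste0("label", 1:4),
           sep = "!!",
           fill = "right",  # pad missing pieces on the right with NA
           remove = FALSE)  # keep the original label column

split_vars
```

Each piece of the label lands in its own column (label1, label2, ...), and labels with fewer pieces are padded with NA, which makes it easier to scan a long subject table for the rows you want.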

We want the poverty rate for the ‘population for whom poverty status is determined’.

  • S1701_C01_001 - Total population for whom poverty status is determined
  • S1701_C02_001 - Below poverty level
  • S1701_C03_001 - Percent below poverty level
pov_vars <- c("S1701_C01_001", "S1701_C02_001", "S1701_C03_001")

s1701_var_labels <- s1701_vars %>% 
  filter(name %in% pov_vars) %>% # select the variables that are in our list above
  mutate(description = case_when(name == "S1701_C01_001" ~ "pop_s1701", ## rename for column names later
                                 name == "S1701_C02_001" ~ "poverty",
                                 name == "S1701_C03_001" ~ "percent_pov"))  %>% 
  select(name, description)  ## before you select only these columns, check that you renamed correctly

raw_poverty <- get_acs(survey = "acs1",
                       geography = "place", 
                       variables = pov_vars, 
                       year = 2019) %>% 
  left_join(s1701_var_labels, by = c("variable" = "name"))

poverty <- raw_poverty %>% 
  select(-variable) %>% 
  pivot_wider(names_from = description, values_from = c(estimate, moe))
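
The pivot_wider() call above turns three rows per place into one row with six value columns. Here is a small sketch with one invented place and invented numbers (not real ACS data) to show the resulting column names.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Invented place and numbers, shaped like raw_poverty after select(-variable)
toy_long <- tribble(
  ~GEOID,    ~NAME,          ~description,  ~estimate, ~moe,
  "0000001", "Example city", "pop_s1701",   1000,      50,
  "0000001", "Example city", "poverty",     150,       20,
  "0000001", "Example city", "percent_pov", 15,        2
)

toy_wide <- toy_long %>%
  pivot_wider(names_from = description, values_from = c(estimate, moe))

names(toy_wide)
# "GEOID" "NAME" "estimate_pop_s1701" "estimate_poverty" "estimate_percent_pov"
# "moe_pop_s1701" "moe_poverty" "moe_percent_pov"
```

Dropping the variable column first matters: if it were still present, it would be treated as an identifier and each place would stay spread across three rows.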





R Markdown / R Notebooks Intro

R Markdown is a file format for writing documents that run R code. The knitted output (for example, an HTML file) doesn’t require R or RStudio to open and view.



Create a simple R Markdown document

  • New R Markdown
    • Document
    • HTML


  • Read the text in the doc - it has instructions for how to use it!
  • Change the title on Line 2
  • Delete everything below line 10
  • Press Knit
  • View your document
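
If you prefer to see the same steps from the console, the sketch below writes a very small R Markdown file and knits it. The file name and contents are made up for illustration; rmarkdown::render() is what the Knit button is assumed to call behind the scenes, and the knit step is skipped if rmarkdown or pandoc is not available.

```r
# Build the chunk fence from single backticks so it can be written into the file
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"My first R Markdown document\"",
  "output: html_document",
  "---",
  "",
  "Some text that appears above the code chunk.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Write the document to a temporary folder (made-up file name)
rmd_path <- file.path(tempdir(), "first_document.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit to HTML only if rmarkdown and pandoc are installed
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
}
```

The lines between the two --- markers are the document header where the title lives, and the fenced {r} section is a code chunk that runs when the document is knit.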





Example of an R Markdown document that displays an interactive scatterplot



In-class Exercise: Practice creating an R Markdown document






Assignment 11

R Assignment

Create an interactive R Markdown document that shows the Unemployment Rate by Race and Hispanic Origin for New Jersey

  • Create a script to process your data, write out your data frame, then read it into an R Markdown document, create an interactive bar chart, and Knit.
    • Import the Unemployment Rate by Race and Hispanic Origin for each state (or just New Jersey) from the 2019 ACS 1-year summary table.
    • Create a tidy data frame with the unemployment rate for all people over 16, and for each RACE AND HISPANIC OR LATINO ORIGIN category in the dataset.
    • Write out your tidy data frame to use in your R Markdown document
  • Create an R Markdown document and display your bar chart as a web page
    • Make the title of the doc the title of your bar chart
    • Import your data frame from the above script
    • Create a bar chart of the unemployment rate, for all people over 16 and by race/ethnicity, for the state of New Jersey. Exclude from the bar chart any group whose margin of error is greater than 1 percentage point.
    • Make the bar chart interactive and include the following in the tooltip:
      • Racial/ethnic category
      • Unemployment rate
      • Margin of error
    • Knit your R Markdown document
    • Optional: create an RPubs account and publish your bar chart

Submit your data processing script and your R Markdown document to create your interactive bar chart (the .html file) on Canvas.


Readings:

Read R Markdown from RStudio, from the Introduction through Notebooks, to get a sense of how R Markdown works and what some of the possibilities are.

Final Project Framing:

Answer the “Beginning of the Project” framing questions for your final project:

  • What is the question I want to answer with this research?
  • What is the goal?
  • Who is the audience?
  • What will the audience expect the answer to be?
  • What data and specific research questions will answer the question?