library( knitr )
library( tidyverse )
library( tidycensus )
library( sf )
library( DT )

Introduction

This project shows how to retrieve, calculate, and display demographic census data for a county using tidycensus, sf, and the tidyverse (including ggplot2).

I’ve chosen the beautiful county of San Diego, California because I have fond memories of living there during law school.

Image credit: San Diego wood map by Raymond Inzitari.

Loading The Census Data

In order to allow R to talk to the Census data API, you must sign up for a Census API key. You can request one with your email address at https://api.census.gov/data/key_signup.html.

After you have your key, you register it with the census_api_key() command shown below. I also assigned mine to the key variable for easy reuse.

# Store the key in a variable for easy reuse
key <- "d0a4276115eb56db29e3593b6dbb95b201c615e4"
# Register the key with tidycensus for the current session
census_api_key( key )
# Also expose the key as an environment variable (current session only)
Sys.setenv(CENSUS_KEY = key)
# Reload .Renviron (picks up the key only if it has been saved in that file)
readRenviron("~/.Renviron")
# Check to see that the expected key is output in your R console
Sys.getenv("CENSUS_KEY")
## [1] "d0a4276115eb56db29e3593b6dbb95b201c615e4"

Note: Professor Anthony Howell, PhD has a great video tutorial on this.
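
Because the key is a credential, one option is to keep it out of the script and read it from an environment variable instead. The sketch below is only an illustration of that idea and not part of the original project: get_census_key() is a hypothetical helper, and it assumes the key is stored under the name CENSUS_API_KEY, which to my knowledge is the variable tidycensus itself looks for.

```r
# Hypothetical helper: read the Census key from an environment variable
# and fail with a clear message if it is missing.
get_census_key <- function(var = "CENSUS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("No Census API key found in environment variable ", var, call. = FALSE)
  }
  key
}

# Demonstration with a placeholder value (not a real key)
Sys.setenv(CENSUS_API_KEY = "demo-key-not-real")
get_census_key()
```

With a real key saved in ~/.Renviron, the result could then be passed along as census_api_key(get_census_key()).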

Location data and pre-processing

Using the get_acs() function from tidycensus, we can pull ACS estimates, and optionally the matching geographic boundaries, for portions of the US. Here I chose to select information down to the census tract level. The census variable that represents median household income is B19013_001. To see a list of the available variables, you can call load_variables() and specify the year and the survey that you want (below, the 2018 5-year ACS, "acs5").

I have loaded the R package “DT” because it creates interactive HTML tables by calling datatable() and makes navigating this massive dataset easier by providing a search box.

variable_check <- load_variables(2018, "acs5", cache = TRUE)

datatable(variable_check)
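
The search box is handy interactively, but the same lookup can also be done in code. Here is a minimal sketch, assuming the variable table has the columns name, label, and concept. The three rows below are a simplified stand-in I made up for illustration (real labels are longer), and find_vars() is a hypothetical helper rather than a tidycensus function.

```r
# Simplified stand-in for the table returned by load_variables()
vars <- data.frame(
  name    = c("B19013_001", "B25077_001", "B03002_001"),
  label   = c("Median household income", "Median value (dollars)", "Total"),
  concept = c("MEDIAN HOUSEHOLD INCOME", "MEDIAN VALUE", "HISPANIC OR LATINO ORIGIN BY RACE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose label or concept matches a pattern
find_vars <- function(vars, pattern) {
  hit <- grepl(pattern, vars$label, ignore.case = TRUE) |
    grepl(pattern, vars$concept, ignore.case = TRUE)
  vars[hit, , drop = FALSE]
}

income_vars <- find_vars(vars, "household income")
income_vars$name
```

Swapping vars for variable_check should let you run the same search against the full list.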

San Diego Compared to Other Counties

First I wanted to get an idea of where San Diego ranks compared to other California counties. To do this I created a variable ca_income, which holds the median household income for a hand-picked sample of counties, and then plotted the results with geom_point():

# Request median household income for the selected counties
# (no survey or year is set, so get_acs() uses its default 5-year ACS)
ca_income <- get_acs(geography = "county", 
                   state = "CA",
                   county = c("Los Angeles", "Santa Clara", "Orange", "San Diego", "Yolo", "Santa Barbara", "Humboldt"),
                   variables = "B19013_001")
## Getting data from the 2015-2019 5-year ACS
# Plot one point per county (not yet sorted)
ggplot(ca_income, aes(x = estimate, y = NAME)) + 
  geom_point()

I wanted to make the geom_point more readable and also let the reader know that I had, personally, selected these counties. California has many counties and I wanted a sampling from up and down the coast so that the viewer could get an idea of where San Diego lands. I also sorted the counties so that the highest was at the top.

# Set dot color and size
g_color <- ggplot(ca_income, aes(x = estimate, y = reorder(NAME, estimate))) + 
  geom_point(color = "forest green", size = 4)

# Format the x-axis labels
g_scale <- g_color + 
  scale_x_continuous(labels = scales::dollar) + 
  theme_minimal(base_size = 10) 

# Label your x-axis, y-axis, and title your chart
g_label <- g_scale + 
  labs(x = "2015-2019 5-year ACS estimate", 
       y = "", 
       title = "California Median Income by Select Counties")
  
g_label
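
The sorting comes from reorder(), which reorders the levels of NAME by the value of estimate. A small sketch with made-up counties and made-up numbers (not ACS data) shows what it does to the level order:

```r
# Toy data: three invented counties with invented income estimates
toy <- data.frame(
  NAME     = c("County A", "County B", "County C"),
  estimate = c(50000, 90000, 70000)
)

# reorder() sorts the factor levels by increasing estimate
ordered_names <- reorder(toy$NAME, toy$estimate)
levels(ordered_names)
```

Since ggplot2 draws the first level at the bottom of a discrete y-axis, the county with the highest estimate ends up at the top of the chart.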

San Diego Tract Income

After my quick overview of San Diego’s relationship to California, I wanted to get an idea of where the money was within San Diego County.

ca_income <- get_acs(geography = "tract", 
                   state = "CA",
                   county = "San Diego",
                   variables = "B19013_001",
                   geometry = TRUE)
datatable(ca_income)

San Diego Tract Income Map

It wouldn’t make much sense to create another geom_point() because nobody has tract IDs memorized, so I decided to plot the income onto a map. I’ve seen a handful of R bloggers talk about the colorful viridis color maps and so I decided to color my map using these scales. They are apparently easier to read for people with colorblindness and (selfishly) I like the color palette:

# Plot the estimate to view a map of the data
ggplot(ca_income, aes(fill = estimate, color = estimate)) + 
  geom_sf() + 
  scale_fill_viridis_c() +  
  scale_color_viridis_c(guide = "none") +  # guide = FALSE is deprecated in recent ggplot2
  theme_minimal() + 
  coord_sf(datum = NA) +                   # datum = NA removes the gridlines
  facet_wrap(~variable) +
  labs(title = "Median Household Income in Each Census Tract", 
       subtitle = "San Diego County, California")

As is usually the case in California, the high-income residents generally like to settle near the water (otherwise they’d move to a much nicer house somewhere like Arizona).

Note: In order for the map to render properly, you must pass the geometry = TRUE argument to get_acs() (I made this mistake several times).
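
A quick way to catch this mistake before plotting is to check the class of the object: with geometry = TRUE the result should carry the "sf" class, and without it you get an ordinary data frame. The sketch below is a minimal illustration using a plain toy data frame. has_geometry() is a hypothetical helper, not part of sf or tidycensus.

```r
# Hypothetical helper: TRUE only if the object carries the "sf" class
has_geometry <- function(x) {
  inherits(x, "sf")
}

# A plain data frame, like the result of get_acs() without geometry = TRUE
plain <- data.frame(GEOID = "tract-1", estimate = 1)
has_geometry(plain)
```

Calling has_geometry(ca_income) right after the download should tell you whether geom_sf() has boundaries to draw.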

Racial Demographics

In addition to income, I wanted to see the general demographics of San Diego. To do this, I created a named vector race_vars holding the census variable codes for six groups: White, Black, Native, Asian, HIPI (Native Hawaiian and Pacific Islander), and Hispanic. The goal is to determine each group’s percentage of the population within each San Diego County tract.

# Assign Census variables vector to race_vars  
race_vars <- c(White = "B03002_003", Black = "B03002_004", Native = "B03002_005", 
               Asian = "B03002_006", HIPI = "B03002_007", Hispanic = "B03002_012")

# Request a summary variable from the ACS
ca_race <- get_acs(geography = "tract", 
                   state = "CA",
                   county = "San Diego",
                   variables = race_vars, 
                   summary_var = "B03002_001",
                   geometry = TRUE)

# Calculate each group's share of the tract total and check the result
ca_race_pct <- ca_race %>%
  mutate(pct = 100 * (estimate / summary_est))
datatable(ca_race_pct)
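
To make the arithmetic concrete: the percentage for a group is its estimate divided by the tract total (summary_est), times 100. Here is a small worked sketch in base R with invented numbers. The column names match the get_acs() output, but the tract IDs and counts are made up for illustration.

```r
# Toy data: two invented tracts, two groups each, plus an empty tract
toy_race <- data.frame(
  GEOID       = c("t1", "t1", "t2", "t2", "t3"),
  variable    = c("White", "Asian", "White", "Asian", "White"),
  estimate    = c(600, 400, 150, 50, 0),
  summary_est = c(1000, 1000, 200, 200, 0)
)

# Share of each group within its tract; tracts with no population
# would give 0 / 0 (NaN), so they are set to NA instead
toy_race$pct <- ifelse(
  toy_race$summary_est > 0,
  100 * toy_race$estimate / toy_race$summary_est,
  NA_real_
)
toy_race
```

The groups listed here do not have to sum to 100 percent in a real tract, because race_vars covers only some of the categories in table B03002.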

Visualized on a Map

I then plotted each group on its own panel using geom_sf() with facet_wrap(). That way you can see how each demographic group is distributed across the census tracts.

# Generate one map per group, with gridlines removed
ggplot(ca_race_pct, aes(fill = estimate, color = estimate)) + 
  geom_sf() + 
  scale_fill_viridis_c() +  
  scale_color_viridis_c(guide = "none") +  # guide = FALSE is deprecated in recent ggplot2
  theme_minimal() + 
  coord_sf(datum = NA) +                   # datum = NA removes the gridlines
  facet_wrap(~variable) +                  # one panel per group
  labs(title = "Population in Each Census Tract", 
       subtitle = "San Diego County, California")

Median Housing Values

Finally, I wanted to get a sense of the median housing values in each tract to see if this lined up with household income (I assumed it would be very close but wanted to be certain). I pulled another variable from the census, B25077_001, which holds the median value of owner-occupied housing units.

# Get dataset with geometry set to TRUE
sd_homes <- get_acs(geography = "tract", state = "CA", 
                        county = "San Diego", 
                        variables = "B25077_001", 
                        geometry = TRUE)
# Plot the estimate to view a map of the data
ggplot(sd_homes, aes(fill = estimate, color = estimate)) + 
  geom_sf() + 
  scale_fill_viridis_c(labels = scales::dollar) +  
  scale_color_viridis_c(guide = "none") + 
  theme_minimal() + 
  coord_sf(crs = 26911, datum = NA) + 
  labs(title = "Median owner-occupied housing value by Census tract", 
       subtitle = "San Diego County, California", 
       fill = "ACS estimate")

This was a little disappointing because certain tracts have no estimate for this variable and so they are drawn in an ugly grey color. Otherwise, the map lined up pretty well with the household income map that I previously did.
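
Comparing two maps by eye only goes so far, so one way to check the match is to join the two tract tables on GEOID and compute a correlation, skipping the tracts with missing values. The sketch below uses invented numbers in place of the real downloads. The data frames income and homes are hypothetical stand-ins for the tract income table and sd_homes.

```r
# Toy stand-ins for the two tract-level tables (invented values)
income <- data.frame(
  GEOID    = c("t1", "t2", "t3", "t4"),
  estimate = c(50000, 80000, 120000, 60000)
)
homes <- data.frame(
  GEOID    = c("t1", "t2", "t3", "t4"),
  estimate = c(300000, 500000, NA, 350000)
)

# Join on the tract ID; suffixes keep the two estimate columns apart
both <- merge(income, homes, by = "GEOID", suffixes = c("_income", "_home"))

# Count tracts that would be drawn in grey on the housing map
n_missing <- sum(is.na(both$estimate_home))

# Correlation across tracts that have both values
r <- cor(both$estimate_income, both$estimate_home, use = "complete.obs")
```

With the real data, the geometry column would need to be dropped from at least one table before merging (for example with sf::st_drop_geometry()).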

Thank you for reading!

About this Project

This project was created using data from the United States Census Bureau, accessed through the tidycensus package. This dashboard was the Code-Through project for CPP 529: Foundations in Data Science I for my M.S. in Data Analytics at ASU. In addition to course materials, this project built on ideas from the DataCamp class on Analyzing US Census Data in R.

Any errors in reporting are the fault of the student (Sean Harrington) and are more likely a result of novice coding than of improper reporting by the city of San Diego or the US Census.

Dashboard Author

This dashboard was created by Sean Harrington, a novice data analyst in ASU’s MS in Data Analytics & Program Evaluation program.