Introduction
Census data plays a crucial role in understanding the demographics, economic conditions, and overall characteristics of different geographic regions. It lets us explore trends in labor force participation and unemployment, which shape social and economic conditions. This analysis examines unemployment rates across the United States by state, and in more detail by census tract for Brazos County, Texas, using data from the U.S. Census Bureau. The following steps outline how to calculate the unemployment rate from Census data and create maps that visualize these rates.
Library
library(tidycensus)
library(tidyverse)
library(viridis)
library(viridisLite)
library(tidyr)
library(DT)
library(tigris)
library(sf)
library(ggplot2)
library(dplyr)
Set Census API Key
# Replace the placeholder with your own key; avoid publishing a real key
census_api_key("YOUR_CENSUS_API_KEY")
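Rather than writing the key into the script, it can be read from an environment variable so that it never appears in a shared document. The following is a minimal sketch, assuming the key has been stored under the name CENSUS_API_KEY (for example in ~/.Renviron); the storage location is an assumption of this example, not something the original analysis specifies.

```r
library(tidycensus)

# Read the key from the environment; Sys.getenv() returns "" if it is not set.
# Storing the key in ~/.Renviron is an assumption of this sketch.
key <- Sys.getenv("CENSUS_API_KEY")

if (nzchar(key)) {
  census_api_key(key)  # register the key for the current session
} else {
  message("CENSUS_API_KEY is not set; add it to ~/.Renviron and restart R.")
}
```

With this pattern the script can be shared or knitted without exposing the credential.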
Step 1: Getting the Data
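With the API key set, the analysis starts with Texas, specifically the census tracts of Brazos County. The data come from the 2017 ACS 5-year estimates. Two variables from table B23025 are used: B23025_002 (population in the labor force) and B23025_005 (unemployed).
# B23025_002 = in labor force, B23025_005 = unemployed
# The vector is unnamed so the `variable` column keeps the codes used in Step 2
jobs <- c("B23025_002", "B23025_005")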
# Fetch data for Brazos County, Texas
texas <- get_acs(
  geography = "tract",
  year = 2017,
  survey = "acs5",
  variables = jobs,
  county = "Brazos",
  state = "TX",
  geometry = TRUE
)
head(texas)
## Simple feature collection with 6 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -96.4213 ymin: 30.60732 xmax: -96.29517 ymax: 30.72237
## Geodetic CRS: NAD83
## GEOID NAME variable estimate moe
## 1 48041000300 Census Tract 3, Brazos County, Texas B23025_002 3376 387
## 2 48041000300 Census Tract 3, Brazos County, Texas B23025_005 133 107
## 3 48041000800 Census Tract 8, Brazos County, Texas B23025_002 2401 407
## 4 48041000800 Census Tract 8, Brazos County, Texas B23025_005 45 60
## 5 48041001701 Census Tract 17.01, Brazos County, Texas B23025_002 3392 433
## 6 48041001701 Census Tract 17.01, Brazos County, Texas B23025_005 140 95
## geometry
## 1 MULTIPOLYGON (((-96.42003 3...
## 2 MULTIPOLYGON (((-96.42003 3...
## 3 MULTIPOLYGON (((-96.36543 3...
## 4 MULTIPOLYGON (((-96.36543 3...
## 5 MULTIPOLYGON (((-96.31898 3...
## 6 MULTIPOLYGON (((-96.31898 3...
names(texas)
## [1] "GEOID" "NAME" "variable" "estimate" "moe" "geometry"
Step 2: Transforming the Data
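The two variables are now given readable names and used to calculate the unemployment rate: the number of unemployed people divided by the labor force, multiplied by 100. Note that B23025_002 counts the total labor force, including the Armed Forces, so these rates can differ slightly from the official civilian unemployment rate.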
texas <- texas %>%
  mutate(variable = case_when(
    variable == "B23025_005" ~ "Unemployed",
    variable == "B23025_002" ~ "Workforce"
  )) %>%
  select(-moe) %>%
  spread(variable, estimate) %>%
  mutate(UnemploymentRate = round(Unemployed / Workforce * 100, 2))
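To make the reshaping and the rate calculation concrete, the sketch below repeats the same steps on three tracts copied from the head(texas) output, without calling the Census API. It uses pivot_wider(), the newer tidyr replacement for spread(); that substitution is a choice made for this example, not the method used in the analysis itself.

```r
library(dplyr)
library(tidyr)

# Three tracts copied from the head(texas) output (long format: one row per variable)
long <- tibble(
  GEOID    = rep(c("48041000300", "48041000800", "48041001701"), each = 2),
  variable = rep(c("B23025_002", "B23025_005"), times = 3),
  estimate = c(3376, 133, 2401, 45, 3392, 140)
)

wide <- long %>%
  mutate(variable = case_when(
    variable == "B23025_005" ~ "Unemployed",
    variable == "B23025_002" ~ "Workforce"
  )) %>%
  # One row per tract, with Workforce and Unemployed as columns
  pivot_wider(names_from = variable, values_from = estimate) %>%
  mutate(UnemploymentRate = round(Unemployed / Workforce * 100, 2))

wide$UnemploymentRate
# 3.94 1.87 4.13
```

For Census Tract 8, for example, 45 / 2401 * 100 rounds to 1.87, which matches the value reported for that tract in Step 3.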
Step 3: Data Exploration
Sorting the census tracts by unemployment rate shows which tracts in Brazos County have the lowest and highest rates.
# Lowest unemployment rate
lowest_unemployment <- texas %>% arrange(UnemploymentRate)
head(lowest_unemployment)
## Simple feature collection with 6 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -96.36938 ymin: 30.50335 xmax: -96.16038 ymax: 30.86951
## Geodetic CRS: NAD83
## GEOID NAME Unemployed Workforce
## 1 48041001801 Census Tract 18.01, Brazos County, Texas 35 3217
## 2 48041002013 Census Tract 20.13, Brazos County, Texas 41 3736
## 3 48041002008 Census Tract 20.08, Brazos County, Texas 67 4531
## 4 48041000102 Census Tract 1.02, Brazos County, Texas 60 3264
## 5 48041000800 Census Tract 8, Brazos County, Texas 45 2401
## 6 48041002001 Census Tract 20.01, Brazos County, Texas 62 2893
## UnemploymentRate geometry
## 1 1.09 MULTIPOLYGON (((-96.30938 3...
## 2 1.10 MULTIPOLYGON (((-96.35701 3...
## 3 1.48 MULTIPOLYGON (((-96.30178 3...
## 4 1.84 MULTIPOLYGON (((-96.31898 3...
## 5 1.87 MULTIPOLYGON (((-96.36543 3...
## 6 2.14 MULTIPOLYGON (((-96.30139 3...
# Highest unemployment rate
highest_unemployment <- texas %>% arrange(desc(UnemploymentRate))
head(highest_unemployment)
## Simple feature collection with 6 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -96.42343 ymin: 30.50488 xmax: -96.32343 ymax: 30.69686
## Geodetic CRS: NAD83
## GEOID NAME Unemployed Workforce
## 1 48041002015 Census Tract 20.15, Brazos County, Texas 263 1700
## 2 48041000500 Census Tract 5, Brazos County, Texas 300 2358
## 3 48041002012 Census Tract 20.12, Brazos County, Texas 408 3715
## 4 48041002014 Census Tract 20.14, Brazos County, Texas 140 1283
## 5 48041000603 Census Tract 6.03, Brazos County, Texas 258 2475
## 6 48041001400 Census Tract 14, Brazos County, Texas 107 1310
## UnemploymentRate geometry
## 1 15.47 MULTIPOLYGON (((-96.34547 3...
## 2 12.72 MULTIPOLYGON (((-96.40883 3...
## 3 10.98 MULTIPOLYGON (((-96.36407 3...
## 4 10.91 MULTIPOLYGON (((-96.42343 3...
## 5 10.42 MULTIPOLYGON (((-96.40734 3...
## 6 8.17 MULTIPOLYGON (((-96.35209 3...
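The arrange() plus head() pattern used above can also be written with dplyr's slice_min() and slice_max(), which return the extreme rows directly. The sketch below is an alternative, not the approach used in this analysis, and runs on a small hypothetical tibble `tracts` whose values are copied from the output above (geometry omitted).

```r
library(dplyr)

# Four tracts copied from the output above; `tracts` is a stand-in for `texas`
tracts <- tibble(
  NAME = c("Census Tract 18.01", "Census Tract 20.13",
           "Census Tract 20.15", "Census Tract 5"),
  UnemploymentRate = c(1.09, 1.10, 15.47, 12.72)
)

# Two tracts with the lowest rates, ordered from lowest upward
lowest <- tracts %>% slice_min(UnemploymentRate, n = 2)

# Two tracts with the highest rates, ordered from highest downward
highest <- tracts %>% slice_max(UnemploymentRate, n = 2)
```

By default both functions keep ties, so they can return more than n rows when several tracts share the same rate.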
Visualization
Step 4: Map the Data
The map of Brazos County is created here with ggplot2, shading each census tract by its unemployment rate.
ggplot(texas, aes(fill = UnemploymentRate)) +
  geom_sf(color = "white") +
  theme_void() +
  theme(panel.grid.major = element_line(colour = 'transparent')) +
  scale_fill_distiller(palette = "Reds", direction = 1, name = "Unemployment Rate (%)") +
  labs(
    title = "Percent Unemployed in Brazos County, Texas",
    caption = "Source: US Census/ACS5 2017"
  )
Step 5: Display the Data in an Interactive Table
The interactive table below lists the unemployment rate for every census tract in the Brazos County data.
datatable(texas, options = list(pageLength = 10))
Use of Census Data
The same steps are repeated to compute unemployment rates for each U.S. state.
# Set your Census API key (replace the placeholder with your own key)
census_api_key("YOUR_CENSUS_API_KEY")
# B23025_002 = in labor force, B23025_005 = unemployed
jobs <- c("B23025_002", "B23025_005")
# Get ACS data for the entire US at the State level
us_state_unemp <- get_acs(
  geography = "state",
  year = 2017,
  survey = "acs5",
  variables = jobs
)
us_state_unemp <- us_state_unemp %>%
  mutate(variable = case_when(
    variable == "B23025_005" ~ "Unemployed",
    variable == "B23025_002" ~ "Workforce"
  )) %>%
  select(-moe) %>%
  spread(variable, estimate) %>%
  mutate(UnemploymentRate = round(Unemployed / Workforce * 100, 2))
# Data Exploration: Sort data by UnemploymentRate to find areas with the lowest and highest rates
# Lowest unemployment rate
lowest_unemployment <- us_state_unemp %>% arrange(UnemploymentRate)
head(lowest_unemployment)
## # A tibble: 6 × 5
## GEOID NAME Unemployed Workforce UnemploymentRate
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 38 North Dakota 11085 417893 2.65
## 2 46 South Dakota 17018 458023 3.72
## 3 31 Nebraska 38881 1031005 3.77
## 4 19 Iowa 69018 1670448 4.13
## 5 27 Minnesota 130510 3036696 4.3
## 6 49 Utah 64314 1480657 4.34
# Highest unemployment rate
highest_unemployment <- us_state_unemp %>% arrange(desc(UnemploymentRate))
head(highest_unemployment)
## # A tibble: 6 × 5
## GEOID NAME Unemployed Workforce UnemploymentRate
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 72 Puerto Rico 220597 1262220 17.5
## 2 28 Mississippi 117786 1349864 8.73
## 3 11 District of Columbia 31279 392421 7.97
## 4 32 Nevada 116285 1465320 7.94
## 5 35 New Mexico 73710 962123 7.66
## 6 06 California 1491146 19612777 7.6
Summary Data
# Summary statistics for the state unemployment rates
summary_stats <- us_state_unemp %>%
  summarize(
    min = min(UnemploymentRate, na.rm = TRUE),
    max = max(UnemploymentRate, na.rm = TRUE),
    median = median(UnemploymentRate, na.rm = TRUE),
    mean = mean(UnemploymentRate, na.rm = TRUE),
    sd = sd(UnemploymentRate, na.rm = TRUE)
  )
print(summary_stats)
## # A tibble: 1 × 5
## min max median mean sd
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2.65 17.5 6.41 6.30 2.06
Map of USA
# Get geometries for US states
state_geoms <- states(cb = TRUE, year = 2017) %>%
  st_as_sf() # Convert to an sf object
# Assuming you have the unemployment data `us_state_unemp` already loaded
# Join the unemployment data to the geometries
us_state_unemp <- state_geoms %>%
  left_join(us_state_unemp, by = c("GEOID" = "GEOID"))
# Plot the unemployment rate by state with no padding
ggplot(us_state_unemp, aes(geometry = geometry, fill = UnemploymentRate)) +
  geom_sf(color = "black") +
  theme_void() +
  scale_fill_distiller(palette = "Oranges", direction = 1, name = "Unemployment Rate (%)") +
  labs(
    title = "Percent Unemployed by State in the USA (2017)",
    caption = "Source: US Census/ACS5 2017"
  ) +
  theme(
    plot.margin = margin(4, 4, 4, 4),                   # Remove extra margins
    legend.key.width = unit(1, "cm"),                   # Adjust legend size
    plot.title = element_text(hjust = 0.5, size = 24),  # Title size
    plot.caption = element_text(hjust = 1, size = 16),  # Caption size
    legend.title = element_text(size = 14),             # Legend title size
    legend.text = element_text(size = 12),              # Legend text size
    legend.position = "bottom",                         # Position the legend at the bottom
    legend.direction = "horizontal",                    # Make the legend horizontal
    legend.title.align = 0.5                            # Center the legend title
  ) +
  coord_sf(xlim = c(-125, -66), ylim = c(24, 50), expand = FALSE)
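One detail of the join used for this map is that both tables carry a NAME column, so left_join() keeps both copies and renames them NAME.x and NAME.y. The sketch below illustrates this on two small hypothetical tibbles, `geoms` and `rates`, which stand in for state_geoms and the state unemployment table. The values are copied from the output above, and dropping NAME before joining is one possible way to avoid the duplicate, not something the original code does.

```r
library(dplyr)

# Stand-ins for the geometry table and the unemployment table (no geometry here)
geoms <- tibble(
  GEOID  = c("38", "46"),
  NAME   = c("North Dakota", "South Dakota"),
  STUSPS = c("ND", "SD")
)
rates <- tibble(
  GEOID = c("38", "46"),
  NAME  = c("North Dakota", "South Dakota"),
  UnemploymentRate = c(2.65, 3.72)
)

# Joining as-is duplicates the shared NAME column
joined <- left_join(geoms, rates, by = "GEOID")
names(joined)
# "GEOID" "NAME.x" "STUSPS" "NAME.y" "UnemploymentRate"

# Dropping NAME from one side first keeps a single NAME column
joined_clean <- left_join(geoms, select(rates, -NAME), by = "GEOID")
names(joined_clean)
# "GEOID" "NAME" "STUSPS" "UnemploymentRate"
```

The map itself is unaffected, since it only uses the geometry and UnemploymentRate columns.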