The project undertaken was an ambitious and insightful endeavor to harness the power of the U.S. Census Bureau’s API, creating an R Shiny application capable of visualizing changes in the Gini Index and median household income across U.S. counties from 2015 to 2020. The goal was to present a dynamic, data-driven narrative that could illuminate socio-economic trends over this period. However, the journey was not without its hurdles and learning moments. Though not fully unraveled, the challenges encountered pointed to the complexity and nuances inherent in working with large-scale, real-world data sets. While impeding the final realization of the project’s full potential, these challenges provided valuable insights into the intricacies of data handling and application development in R.
Initially, the project started with extracting data from CSV files from the Census Bureau. While successful in creating a foundational Shiny application, this preliminary approach revealed limitations in terms of data comprehensiveness. Many counties were missing from the dataset, prompting a pivot to directly accessing the Census Bureau’s API for a more exhaustive dataset. This shift marked a significant step forward, enabling the creation of static visualizations that more accurately represented the socio-economic landscape of the United States at a county level. The process involved meticulously joining multiple datasets to form a cohesive and comprehensive view of the Gini Index and median household income trends. The resulting visualizations offered a static but insightful glimpse into the economic disparities and shifts across the nation.
However, transitioning from static visualizations to a dynamic Shiny application introduced a new set of challenges. The development was frequently hindered by unexpected errors, particularly in rendering the maps, which became a recurring issue. Efforts to troubleshoot included extensive debugging and strategically removing NA values from the dataset to stabilize the application. Despite these efforts, the application fell short of its full interactive potential, with issues in map display persisting. This project phase underscored the complexities involved in developing a fully functional Shiny application, particularly when handling large and intricate datasets. The experience, though marked by setbacks, was a profound learning journey in data manipulation, API utilization, and interactive application development in R.
## Reading layer `county' from data source
## `/Users/jcw81/Desktop/JHU/Semester 1/Data Visualization/Class Project/Module 11/county.geojson'
## using driver `GeoJSON'
## Simple feature collection with 3207 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -170 ymin: 17.6804 xmax: -64.6086 ymax: 71.3526
## Geodetic CRS: WGS 84
The intent of this code is to set environment variables for GDAL and PROJ library paths, which are essential for spatial data processing in R, especially when using libraries like rgdal or sf.
Sys.setenv(GDAL_DATA = "/usr/local/Cellar/gdal/3.7.2_1/share/gdal")
Sys.setenv(PROJ_LIB = "/usr/local/opt/proj")
df1 <- read.csv("~/Desktop/JHU/Semester 1/Data Visualization/Class Project/Module 11/County Gini Dataset.csv",header=TRUE, stringsAsFactors=FALSE)
head(spdf, n=20)
## Simple feature collection with 20 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -170 ymin: 31.1922 xmax: -84.9015 ymax: 56.6142
## Geodetic CRS: WGS 84
## First 10 features:
## COUNTY_STATE_NAME COUNTY_STATE_CODE GEOID FIPS_CODE
## 1 Barbour, Alabama Barbour, AL 1005 01-005
## 2 Choctaw, Alabama Choctaw, AL 1023 01-023
## 3 Conecuh, Alabama Conecuh, AL 1035 01-035
## 4 Elmore, Alabama Elmore, AL 1051 01-051
## 5 Hale, Alabama Hale, AL 1065 01-065
## 6 Pike, Alabama Pike, AL 1109 01-109
## 7 Russell, Alabama Russell, AL 1113 01-113
## 8 Shelby, Alabama Shelby, AL 1117 01-117
## 9 Aleutians West, Alaska Aleutians West, AK 2016 02-016
## 10 Mohave, Arizona Mohave, AZ 4015 04-015
## geometry
## 1 MULTIPOLYGON (((-85.6577 31...
## 2 MULTIPOLYGON (((-88.4732 31...
## 3 MULTIPOLYGON (((-86.9059 31...
## 4 MULTIPOLYGON (((-86.375 32....
## 5 MULTIPOLYGON (((-87.7157 33...
## 6 MULTIPOLYGON (((-86.1914 31...
## 7 MULTIPOLYGON (((-85.434 32....
## 8 MULTIPOLYGON (((-87.0268 33...
## 9 MULTIPOLYGON (((-166.3104 5...
## 10 MULTIPOLYGON (((-114.0506 3...
head(df1)
## GEO_ID NAME LSAD STATE X2010 X2011 X2012 X2013 X2014
## 1 0500000US01003 Baldwin County Alabama 0.459 0.4324 0.4371 0.4727 0.4648
## 2 0500000US01015 Calhoun County Alabama 0.460 0.4530 0.4563 0.4670 0.4561
## 3 0500000US01043 Cullman County Alabama 0.473 0.4478 0.4453 0.4543 0.4533
## 4 0500000US01049 DeKalb County Alabama 0.465 0.4857 0.4131 0.3940 0.4754
## 5 0500000US01051 Elmore County Alabama 0.403 0.4233 0.3744 0.4206 0.4160
## 6 0500000US01055 Etowah County Alabama 0.449 0.4539 0.4467 0.4351 0.4374
## X2015 X2016 X2017 X2018 X2019 X2021 X2022
## 1 0.4722 0.4498 0.4363 0.4725 0.4630 0.4539 0.4727
## 2 0.4219 0.4692 0.4892 0.4650 0.4572 0.4532 0.4600
## 3 0.4438 0.4518 0.4121 0.4981 0.3961 0.4274 0.4701
## 4 0.4639 0.4528 0.4975 0.5003 0.4459 0.5274 0.4363
## 5 0.4620 0.4535 0.4307 0.4162 0.4468 0.4370 0.4079
## 6 0.4125 0.4477 0.4657 0.4469 0.5008 0.4450 0.4437
The purpose of this code is to remove the prefix “US” from the GEO_ID column in the dataframe df1 and convert the modified GEO_ID values to numeric format, verifying the changes by viewing the first few rows.
# This removes everything before and including "US"
df1$GEO_ID <- gsub(".*US", "", df1$GEO_ID)
# Convert GEO_ID to numeric if necessary
df1$GEO_ID <- as.numeric(df1$GEO_ID)
# You can check the first few rows to confirm
head(df1)
## GEO_ID NAME LSAD STATE X2010 X2011 X2012 X2013 X2014 X2015
## 1 1003 Baldwin County Alabama 0.459 0.4324 0.4371 0.4727 0.4648 0.4722
## 2 1015 Calhoun County Alabama 0.460 0.4530 0.4563 0.4670 0.4561 0.4219
## 3 1043 Cullman County Alabama 0.473 0.4478 0.4453 0.4543 0.4533 0.4438
## 4 1049 DeKalb County Alabama 0.465 0.4857 0.4131 0.3940 0.4754 0.4639
## 5 1051 Elmore County Alabama 0.403 0.4233 0.3744 0.4206 0.4160 0.4620
## 6 1055 Etowah County Alabama 0.449 0.4539 0.4467 0.4351 0.4374 0.4125
## X2016 X2017 X2018 X2019 X2021 X2022
## 1 0.4498 0.4363 0.4725 0.4630 0.4539 0.4727
## 2 0.4692 0.4892 0.4650 0.4572 0.4532 0.4600
## 3 0.4518 0.4121 0.4981 0.3961 0.4274 0.4701
## 4 0.4528 0.4975 0.5003 0.4459 0.5274 0.4363
## 5 0.4535 0.4307 0.4162 0.4468 0.4370 0.4079
## 6 0.4477 0.4657 0.4469 0.5008 0.4450 0.4437
library(usmap)
us_states <- us_map(regions = "states")
# Check column names in both data frames
colnames(spdf)
## [1] "COUNTY_STATE_NAME" "COUNTY_STATE_CODE" "GEOID"
## [4] "FIPS_CODE" "geometry"
colnames(df1)
## [1] "GEO_ID" "NAME" "LSAD" "STATE" "X2010" "X2011" "X2012" "X2013"
## [9] "X2014" "X2015" "X2016" "X2017" "X2018" "X2019" "X2021" "X2022"
# Prepare the data by making GEO_IDs match
df1$GEO_ID <- as.numeric(sub("0500000US", "", df1$GEO_ID))
# Merge the datasets on the GEOID
merged_df <- left_join(spdf, df1, by = c("GEOID" = "GEO_ID"))
# Check the head to ensure the merge went correctly
head(merged_df, n = 10)
## Simple feature collection with 10 features and 19 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -170 ymin: 31.1922 xmax: -84.9015 ymax: 56.6142
## Geodetic CRS: WGS 84
## COUNTY_STATE_NAME COUNTY_STATE_CODE GEOID FIPS_CODE NAME LSAD
## 1 Barbour, Alabama Barbour, AL 1005 01-005 <NA> <NA>
## 2 Choctaw, Alabama Choctaw, AL 1023 01-023 <NA> <NA>
## 3 Conecuh, Alabama Conecuh, AL 1035 01-035 <NA> <NA>
## 4 Elmore, Alabama Elmore, AL 1051 01-051 Elmore County
## 5 Hale, Alabama Hale, AL 1065 01-065 <NA> <NA>
## 6 Pike, Alabama Pike, AL 1109 01-109 <NA> <NA>
## 7 Russell, Alabama Russell, AL 1113 01-113 <NA> <NA>
## 8 Shelby, Alabama Shelby, AL 1117 01-117 Shelby County
## 9 Aleutians West, Alaska Aleutians West, AK 2016 02-016 <NA> <NA>
## 10 Mohave, Arizona Mohave, AZ 4015 04-015 Mohave County
## STATE X2010 X2011 X2012 X2013 X2014 X2015 X2016 X2017 X2018
## 1 <NA> NA NA NA NA NA NA NA NA NA
## 2 <NA> NA NA NA NA NA NA NA NA NA
## 3 <NA> NA NA NA NA NA NA NA NA NA
## 4 Alabama 0.403 0.4233 0.3744 0.4206 0.4160 0.4620 0.4535 0.4307 0.4162
## 5 <NA> NA NA NA NA NA NA NA NA NA
## 6 <NA> NA NA NA NA NA NA NA NA NA
## 7 <NA> NA NA NA NA NA NA NA NA NA
## 8 Alabama 0.426 0.4008 0.4477 0.4250 0.4405 0.4424 0.4305 0.4208 0.4589
## 9 <NA> NA NA NA NA NA NA NA NA NA
## 10 Arizona 0.432 0.4265 0.4554 0.4339 0.4257 0.4286 0.4645 0.4551 0.4334
## X2019 X2021 X2022 geometry
## 1 NA NA NA MULTIPOLYGON (((-85.6577 31...
## 2 NA NA NA MULTIPOLYGON (((-88.4732 31...
## 3 NA NA NA MULTIPOLYGON (((-86.9059 31...
## 4 0.4468 0.4370 0.4079 MULTIPOLYGON (((-86.375 32....
## 5 NA NA NA MULTIPOLYGON (((-87.7157 33...
## 6 NA NA NA MULTIPOLYGON (((-86.1914 31...
## 7 NA NA NA MULTIPOLYGON (((-85.434 32....
## 8 0.4527 0.4470 0.4134 MULTIPOLYGON (((-87.0268 33...
## 9 NA NA NA MULTIPOLYGON (((-166.3104 5...
## 10 0.4214 0.4834 0.5238 MULTIPOLYGON (((-114.0506 3...
# Verify column names
print(names(merged_df))
## [1] "COUNTY_STATE_NAME" "COUNTY_STATE_CODE" "GEOID"
## [4] "FIPS_CODE" "NAME" "LSAD"
## [7] "STATE" "X2010" "X2011"
## [10] "X2012" "X2013" "X2014"
## [13] "X2015" "X2016" "X2017"
## [16] "X2018" "X2019" "X2021"
## [19] "X2022" "geometry"
# Check for NA values in the column of interest
summary(merged_df$X2021)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.3447 0.4300 0.4515 0.4533 0.4768 0.6272 2375
# Plot the choropleth map using the correct column name
ggplot(data = merged_df) +
geom_sf(aes(fill = X2021)) + # Make sure this is the correct column name
scale_fill_viridis_c() +
labs(title = "Gini Ratio 2021", fill = "Index Value") +
theme_minimal()
The creation of a Shiny application to visualize the Gini Index across
U.S. counties, despite the challenges with incomplete data, showcases a
successful implementation of data visualization techniques in R. The
application, built using libraries such as shiny, leaflet, viridis, and
dplyr, provides an interactive map for exploring income inequality as
measured by the Gini Index from 2015 to 2022.
In this application, the user interface (UI) is thoughtfully designed with a sidebar containing a slider for year selection and a reset button to restore the map to its initial state. The leaflet package is used to render the choropleth map, while the viridis color palette enhances visual appeal and clarity. The server function is where the application’s logic resides. It includes a reactive expression to filter data based on the selected year and dynamically updates the color palette.
library(shiny)
library(leaflet)
library(viridis)
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:maps':
##
## unemp
## The following object is masked from 'package:scales':
##
## viridis_pal
library(dplyr)
# Define the user interface
ui <- fluidPage(
titlePanel("Choropleth Map of Gini Index by County"),
sidebarLayout(
sidebarPanel(
sliderInput("year", "Year:", min = 2015, max = 2022, value = 2022, step = 1, sep = ""),
actionButton("reset", "Reset Map")
),
mainPanel(leafletOutput("choroplethMap"))
)
)
server <- function(input, output, session) {
# Store map view state
mapViewState <- reactiveValues(lat = 37.8, lng = -96, zoom = 4)
# Reactive expression to filter the data
filteredData <- reactive({
year_column <- paste0("X", input$year)
# Check if the year column exists in the dataframe
if (!year_column %in% names(merged_df)) {
return(NULL)
}
# Add COUNTY_STATE_NAME to the selection in the dataframe
df <- merged_df %>%
# Ensure there is at least one non-NA value in the year column
filter(!all(is.na(.[[year_column]]))) %>%
select(GEOID, NAME, COUNTY_STATE_NAME, geometry, !!sym(year_column))
if (nrow(df) == 0) {
return(NULL)
}
df
})
# Reactive expression for color palette
pal <- reactive({
df <- filteredData()
if (is.null(df)) {
return(NULL)
}
colorNumeric(palette = "viridis", domain = df[[paste0("X", input$year)]])
})
# Render the initial map
output$choroplethMap <- renderLeaflet({
leaflet() %>%
addProviderTiles(providers$CartoDB.Positron) %>%
setView(lng = mapViewState$lng, lat = mapViewState$lat, zoom = mapViewState$zoom)
})
# Update the map when the year changes
observeEvent(input$year, {
df <- filteredData()
if (is.null(df)) {
leafletProxy("choroplethMap") %>%
clearShapes() %>%
removeControl("legendID")
return()
}
color_palette <- pal()
leafletProxy("choroplethMap", data = df) %>%
clearShapes() %>%
removeControl("legendID") %>%
addPolygons(
fillColor = ~ifelse(is.na(df[[paste0("X", input$year)]]),
NA,
color_palette(df[[paste0("X", input$year)]])),
fillOpacity = 0.7,
color = "#BDBDC3",
weight = 0.2,
smoothFactor = 0.5,
highlightOptions = highlightOptions(
weight = 2,
color = "#667",
fillOpacity = 0.7,
bringToFront = TRUE
),
label = ~paste(NAME),
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"
),
popup = ~paste("<strong>County:</strong>", NAME,
"<br><strong>State:</strong>", COUNTY_STATE_NAME, # Ensure this line uses the correct variable
"<br><strong>Gini Value:</strong>", df[[paste0("X", input$year)]])
) %>%
addLegend(
pal = color_palette,
values = ~df[[paste0("X", input$year)]],
title = "Gini Index Value",
position = "bottomright",
layerId = "legendID"
)
})
# Reset Map View
observeEvent(input$reset, {
mapViewState$lat <- 37.8
mapViewState$lng <- -96
mapViewState$zoom <- 4
leafletProxy("choroplethMap") %>%
setView(lng = mapViewState$lng, lat = mapViewState$lat, zoom = mapViewState$zoom)
})
}
# Run the Shiny application
shinyApp(ui, server)
##
## Listening on http://127.0.0.1:4291
#Load the Census API Key
census_api_key("40281d2187403a277b174e30172129f2e2263059", install = TRUE, overwrite = TRUE)
## Your original .Renviron will be backed up and stored in your R HOME directory if needed.
## Your API key has been stored in your .Renviron and can be accessed by Sys.getenv("CENSUS_API_KEY").
## To use now, restart R or run `readRenviron("~/.Renviron")`
## [1] "40281d2187403a277b174e30172129f2e2263059"
#Load readRenviron to access API key
readRenviron("~/.Renviron")
#Assign the value of the API key to the variable api_key
api_key <- Sys.getenv("CENSUS_API_KEY")
library(purrr)
income_inequality <- get_acs(geography = "county",
variables = "B19083_001",
year = 2020,
survey = "acs5")
## Getting data from the 2016-2020 5-year ACS
# Define a vector of years for which you want to pull data
years <- 2015:2020
# Modify income_inequality to include year
income_inequality <- map_df(years, function(year) {
year_data <- get_acs(geography = "county",
variables = "B19083_001",
year = year,
survey = "acs5")
mutate(year_data, year = year)
})
## Getting data from the 2011-2015 5-year ACS
## Getting data from the 2012-2016 5-year ACS
## Getting data from the 2013-2017 5-year ACS
## Getting data from the 2014-2018 5-year ACS
## Getting data from the 2015-2019 5-year ACS
## Getting data from the 2016-2020 5-year ACS
# View the resulting data
head(income_inequality)
## # A tibble: 6 × 6
## GEOID NAME variable estimate moe year
## <chr> <chr> <chr> <dbl> <dbl> <int>
## 1 01001 Autauga County, Alabama B19083_001 0.423 0.0175 2015
## 2 01003 Baldwin County, Alabama B19083_001 0.456 0.0108 2015
## 3 01005 Barbour County, Alabama B19083_001 0.464 0.015 2015
## 4 01007 Bibb County, Alabama B19083_001 0.441 0.0475 2015
## 5 01009 Blount County, Alabama B19083_001 0.404 0.0145 2015
## 6 01011 Bullock County, Alabama B19083_001 0.465 0.0414 2015
median_income <- get_acs(geography = "county",
variables = "B19013_001",
year = 2020,
survey = "acs5")
## Getting data from the 2016-2020 5-year ACS
# Define a vector of years for which you want to pull data
years <- 2015:2020
# Modify median_income to include year
median_income <- map_df(years, function(year) {
year_data <- get_acs(geography = "county",
variables = "B19013_001",
year = year,
survey = "acs5")
mutate(year_data, year = year)
})
## Getting data from the 2011-2015 5-year ACS
## Getting data from the 2012-2016 5-year ACS
## Getting data from the 2013-2017 5-year ACS
## Getting data from the 2014-2018 5-year ACS
## Getting data from the 2015-2019 5-year ACS
## Getting data from the 2016-2020 5-year ACS
# View the resulting data
head(median_income)
## # A tibble: 6 × 6
## GEOID NAME variable estimate moe year
## <chr> <chr> <chr> <dbl> <dbl> <int>
## 1 01001 Autauga County, Alabama B19013_001 51281 2391 2015
## 2 01003 Baldwin County, Alabama B19013_001 50254 1263 2015
## 3 01005 Barbour County, Alabama B19013_001 32964 2973 2015
## 4 01007 Bibb County, Alabama B19013_001 38678 3995 2015
## 5 01009 Blount County, Alabama B19013_001 45813 3141 2015
## 6 01011 Bullock County, Alabama B19013_001 31938 5884 2015
# Perform the join with corrected column names and including year
combined_data <- income_inequality %>%
left_join(median_income, by = c("GEOID", "year"))
head(combined_data)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x estimate.x moe.x year NAME.y variable.y estimate.y
## <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr> <dbl>
## 1 01001 Autauga… B19083_001 0.423 0.0175 2015 Autau… B19013_001 51281
## 2 01003 Baldwin… B19083_001 0.456 0.0108 2015 Baldw… B19013_001 50254
## 3 01005 Barbour… B19083_001 0.464 0.015 2015 Barbo… B19013_001 32964
## 4 01007 Bibb Co… B19083_001 0.441 0.0475 2015 Bibb … B19013_001 38678
## 5 01009 Blount … B19083_001 0.404 0.0145 2015 Bloun… B19013_001 45813
## 6 01011 Bullock… B19083_001 0.465 0.0414 2015 Bullo… B19013_001 31938
## # ℹ 1 more variable: moe.y <dbl>
combined_data <- combined_data %>%
rename(
Gini_Index = estimate.x,
Median_Income = estimate.y
)
head(combined_data)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x Gini_Index moe.x year NAME.y variable.y
## <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr>
## 1 01001 Autauga County, Al… B19083_001 0.423 0.0175 2015 Autau… B19013_001
## 2 01003 Baldwin County, Al… B19083_001 0.456 0.0108 2015 Baldw… B19013_001
## 3 01005 Barbour County, Al… B19083_001 0.464 0.015 2015 Barbo… B19013_001
## 4 01007 Bibb County, Alaba… B19083_001 0.441 0.0475 2015 Bibb … B19013_001
## 5 01009 Blount County, Ala… B19083_001 0.404 0.0145 2015 Bloun… B19013_001
## 6 01011 Bullock County, Al… B19083_001 0.465 0.0414 2015 Bullo… B19013_001
## # ℹ 2 more variables: Median_Income <dbl>, moe.y <dbl>
# Load US map data
us_counties <- sf::st_as_sf(maps::map("county", plot = FALSE, fill = TRUE))
print(colnames(us_counties))
## [1] "ID" "geom"
head(us_counties)
## Simple feature collection with 6 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.01778 ymin: 30.24071 xmax: -85.06131 ymax: 34.2686
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom
## alabama,autauga alabama,autauga MULTIPOLYGON (((-86.50517 3...
## alabama,baldwin alabama,baldwin MULTIPOLYGON (((-87.93757 3...
## alabama,barbour alabama,barbour MULTIPOLYGON (((-85.42801 3...
## alabama,bibb alabama,bibb MULTIPOLYGON (((-87.02083 3...
## alabama,blount alabama,blount MULTIPOLYGON (((-86.9578 33...
## alabama,bullock alabama,bullock MULTIPOLYGON (((-85.66866 3...
# Split the ID at the comma to separate the state and county
us_counties_split <- strsplit(as.character(us_counties$ID), ",")
# Create a new GEOID by pasting the county and state together in the format "County, State"
us_counties$GEOID <- sapply(us_counties_split, function(x) paste(trimws(x[2]), trimws(x[1]), sep = ", "))
head(us_counties)
## Simple feature collection with 6 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.01778 ymin: 30.24071 xmax: -85.06131 ymax: 34.2686
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom GEOID
## alabama,autauga alabama,autauga MULTIPOLYGON (((-86.50517 3... autauga, alabama
## alabama,baldwin alabama,baldwin MULTIPOLYGON (((-87.93757 3... baldwin, alabama
## alabama,barbour alabama,barbour MULTIPOLYGON (((-85.42801 3... barbour, alabama
## alabama,bibb alabama,bibb MULTIPOLYGON (((-87.02083 3... bibb, alabama
## alabama,blount alabama,blount MULTIPOLYGON (((-86.9578 33... blount, alabama
## alabama,bullock alabama,bullock MULTIPOLYGON (((-85.66866 3... bullock, alabama
library(tools)
# Assuming that 'ID' in us_counties is formatted as 'state,county' and we need it as 'County, State'
us_counties$GEOID <- sapply(strsplit(as.character(us_counties$ID), ","),
function(x) {
# Capitalize each part and paste them together
paste(toTitleCase(trimws(x[2])), "County,", toTitleCase(trimws(x[1])))
})
head(us_counties)
## Simple feature collection with 6 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.01778 ymin: 30.24071 xmax: -85.06131 ymax: 34.2686
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom
## alabama,autauga alabama,autauga MULTIPOLYGON (((-86.50517 3...
## alabama,baldwin alabama,baldwin MULTIPOLYGON (((-87.93757 3...
## alabama,barbour alabama,barbour MULTIPOLYGON (((-85.42801 3...
## alabama,bibb alabama,bibb MULTIPOLYGON (((-87.02083 3...
## alabama,blount alabama,blount MULTIPOLYGON (((-86.9578 33...
## alabama,bullock alabama,bullock MULTIPOLYGON (((-85.66866 3...
## GEOID
## alabama,autauga Autauga County, Alabama
## alabama,baldwin Baldwin County, Alabama
## alabama,barbour Barbour County, Alabama
## alabama,bibb Bibb County, Alabama
## alabama,blount Blount County, Alabama
## alabama,bullock Bullock County, Alabama
# Since combined_data already has a NAME column with "County, State" format, we can use that as the GEOID
combined_data$GEOID <- combined_data$NAME.x
head(combined_data)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x Gini_Index moe.x year NAME.y variable.y
## <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr>
## 1 Autauga County, A… Autau… B19083_001 0.423 0.0175 2015 Autau… B19013_001
## 2 Baldwin County, A… Baldw… B19083_001 0.456 0.0108 2015 Baldw… B19013_001
## 3 Barbour County, A… Barbo… B19083_001 0.464 0.015 2015 Barbo… B19013_001
## 4 Bibb County, Alab… Bibb … B19083_001 0.441 0.0475 2015 Bibb … B19013_001
## 5 Blount County, Al… Bloun… B19083_001 0.404 0.0145 2015 Bloun… B19013_001
## 6 Bullock County, A… Bullo… B19083_001 0.465 0.0414 2015 Bullo… B19013_001
## # ℹ 2 more variables: Median_Income <dbl>, moe.y <dbl>
head(map_data)
##
## 1 function (map, region = ".", exact = FALSE, ...)
## 2 {
## 3 check_installed("maps", reason = "for `map_data()`")
## 4 map_obj <- maps::map(map, region, exact = exact, plot = FALSE,
## 5 fill = TRUE, ...)
## 6 fortify(map_obj)
# Perform the merge using the new GEOID columns
map_data <- left_join(us_counties, combined_data, by = "GEOID")
# Check column names
print(colnames(combined_data))
## [1] "GEOID" "NAME.x" "variable.x" "Gini_Index"
## [5] "moe.x" "year" "NAME.y" "variable.y"
## [9] "Median_Income" "moe.y"
print(colnames(income_inequality))
## [1] "GEOID" "NAME" "variable" "estimate" "moe" "year"
print(colnames(median_income))
## [1] "GEOID" "NAME" "variable" "estimate" "moe" "year"
print(colnames(us_counties))
## [1] "ID" "geom" "GEOID"
print(colnames(map_data))
## [1] "ID" "GEOID" "NAME.x" "variable.x"
## [5] "Gini_Index" "moe.x" "year" "NAME.y"
## [9] "variable.y" "Median_Income" "moe.y" "geom"
# Since combined_data has a NAME column with "County, State" format, we will use that for merging
combined_data$GEOID <- combined_data$NAME.x
# check if combined_data has a column that matches GEOID in us_counties
if("NAME.x" %in% names(combined_data)) {
# If yes, use that for GEOID
combined_data$GEOID <- combined_data$NAME.x
} else {
}
# Now perform the join using the GEOID columns
map_data <- left_join(us_counties, combined_data, by = "GEOID")
ggplot(map_data) +
geom_sf(aes(fill = Gini_Index)) +
labs(title = "Income Inequality by County",
fill = "Gini Index") +
theme_minimal()
ggplot(map_data) +
geom_sf(aes(fill = Gini_Index), color = "white", size = 0.1) +
scale_fill_viridis_c(
name = "Gini Index",
labels = scales::label_number(suffix = ""),
option = "D"
) +
labs(
title = "Income Inequality by County in 2020",
subtitle = "Visualized using Gini Index values",
caption = "Source: American Community Survey 2016-2020"
) +
theme_minimal() +
theme(
legend.position = "right",
plot.title = element_text(size = 14, face = "bold"),
plot.subtitle = element_text(size = 12),
plot.caption = element_text(size = 8)
)
# Filter for Louisiana entries in combined_data
louisiana_combined <- combined_data[grep(", Louisiana", combined_data$NAME.x), ]
# Check the first few entries to see the naming convention
head(louisiana_combined)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x Gini_Index moe.x year NAME.y variable.y
## <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr>
## 1 Acadia Parish, Lo… Acadi… B19083_001 0.468 0.0126 2015 Acadi… B19013_001
## 2 Allen Parish, Lou… Allen… B19083_001 0.469 0.0376 2015 Allen… B19013_001
## 3 Ascension Parish,… Ascen… B19083_001 0.404 0.0101 2015 Ascen… B19013_001
## 4 Assumption Parish… Assum… B19083_001 0.450 0.0284 2015 Assum… B19013_001
## 5 Avoyelles Parish,… Avoye… B19083_001 0.488 0.0189 2015 Avoye… B19013_001
## 6 Beauregard Parish… Beaur… B19083_001 0.449 0.0227 2015 Beaur… B19013_001
## # ℹ 2 more variables: Median_Income <dbl>, moe.y <dbl>
# Filter for Louisiana entries in us_counties
louisiana_counties <- us_counties[grep("louisiana", us_counties$ID, ignore.case = TRUE), ]
# Check the first few entries to see the naming convention
head(louisiana_counties)
## Simple feature collection with 6 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -93.74163 ymin: 29.62765 xmax: -90.63619 ymax: 31.35225
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom
## louisiana,acadia louisiana,acadia MULTIPOLYGON (((-92.61863 3...
## louisiana,allen louisiana,allen MULTIPOLYGON (((-92.5957 30...
## louisiana,ascension louisiana,ascension MULTIPOLYGON (((-91.12894 3...
## louisiana,assumption louisiana,assumption MULTIPOLYGON (((-91.23207 3...
## louisiana,avoyelles louisiana,avoyelles MULTIPOLYGON (((-92.32642 3...
## louisiana,beauregard louisiana,beauregard MULTIPOLYGON (((-93.12856 3...
## GEOID
## louisiana,acadia Acadia County, Louisiana
## louisiana,allen Allen County, Louisiana
## louisiana,ascension Ascension County, Louisiana
## louisiana,assumption Assumption County, Louisiana
## louisiana,avoyelles Avoyelles County, Louisiana
## louisiana,beauregard Beauregard County, Louisiana
# Create a new column GEOID with updated names
us_counties$GEOID <- ifelse(grepl("louisiana", us_counties$ID, ignore.case = TRUE),
gsub("County", "Parish", us_counties$GEOID),
us_counties$GEOID)
# Filter for Louisiana entries in us_counties
louisiana_counties <- us_counties[grep("louisiana", us_counties$ID, ignore.case = TRUE), ]
# Check the first few entries to see the naming convention
head(louisiana_counties)
## Simple feature collection with 6 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -93.74163 ymin: 29.62765 xmax: -90.63619 ymax: 31.35225
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom
## louisiana,acadia louisiana,acadia MULTIPOLYGON (((-92.61863 3...
## louisiana,allen louisiana,allen MULTIPOLYGON (((-92.5957 30...
## louisiana,ascension louisiana,ascension MULTIPOLYGON (((-91.12894 3...
## louisiana,assumption louisiana,assumption MULTIPOLYGON (((-91.23207 3...
## louisiana,avoyelles louisiana,avoyelles MULTIPOLYGON (((-92.32642 3...
## louisiana,beauregard louisiana,beauregard MULTIPOLYGON (((-93.12856 3...
## GEOID
## louisiana,acadia Acadia Parish, Louisiana
## louisiana,allen Allen Parish, Louisiana
## louisiana,ascension Ascension Parish, Louisiana
## louisiana,assumption Assumption Parish, Louisiana
## louisiana,avoyelles Avoyelles Parish, Louisiana
## louisiana,beauregard Beauregard Parish, Louisiana
# Since combined_data has a NAME column with "County, State" format, we will use that for merging
combined_data$GEOID <- combined_data$NAME.x
# Perform the merge using the new GEOID columns
map_data <- left_join(us_counties, combined_data, by = "GEOID")
ggplot(map_data) +
geom_sf(aes(fill = Gini_Index), color = "white", size = 0.1) + # Use 'estimate' for fill
scale_fill_viridis_c(
name = "Gini Index",
labels = scales::label_number(suffix = ""),
option = "D"
) +
labs(
title = "Income Inequality by County in 2020",
subtitle = "Visualized using Gini Index values",
caption = "Source: American Community Survey 2016-2020"
) +
theme_minimal() +
theme(
legend.position = "right",
plot.title = element_text(size = 14, face = "bold"),
plot.subtitle = element_text(size = 12),
plot.caption = element_text(size = 8)
)
ggplot(map_data) +
geom_sf(aes(fill = Median_Income), color = "white", size = 0.1) + # Use 'estimate' for fill
scale_fill_viridis_c(
name = "Median Income",
labels = scales::label_number(suffix = ""),
option = "D"
) +
labs(
title = "Median Household Income by County in 2020",
subtitle = "Visualized using Gini Index values",
caption = "Source: American Community Survey 2016-2020"
) +
theme_minimal() +
theme(
legend.position = "right",
plot.title = element_text(size = 14, face = "bold"),
plot.subtitle = element_text(size = 12),
plot.caption = element_text(size = 8)
)
head(combined_data)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x Gini_Index moe.x year NAME.y variable.y
## <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr>
## 1 Autauga County, A… Autau… B19083_001 0.423 0.0175 2015 Autau… B19013_001
## 2 Baldwin County, A… Baldw… B19083_001 0.456 0.0108 2015 Baldw… B19013_001
## 3 Barbour County, A… Barbo… B19083_001 0.464 0.015 2015 Barbo… B19013_001
## 4 Bibb County, Alab… Bibb … B19083_001 0.441 0.0475 2015 Bibb … B19013_001
## 5 Blount County, Al… Bloun… B19083_001 0.404 0.0145 2015 Bloun… B19013_001
## 6 Bullock County, A… Bullo… B19083_001 0.465 0.0414 2015 Bullo… B19013_001
## # ℹ 2 more variables: Median_Income <dbl>, moe.y <dbl>
str(combined_data)
## tibble [19,321 × 10] (S3: tbl_df/tbl/data.frame)
## $ GEOID : chr [1:19321] "Autauga County, Alabama" "Baldwin County, Alabama" "Barbour County, Alabama" "Bibb County, Alabama" ...
## $ NAME.x : chr [1:19321] "Autauga County, Alabama" "Baldwin County, Alabama" "Barbour County, Alabama" "Bibb County, Alabama" ...
## $ variable.x : chr [1:19321] "B19083_001" "B19083_001" "B19083_001" "B19083_001" ...
## $ Gini_Index : num [1:19321] 0.423 0.456 0.464 0.441 0.404 ...
## $ moe.x : num [1:19321] 0.0175 0.0108 0.015 0.0475 0.0145 0.0414 0.0184 0.0111 0.0356 0.0243 ...
## $ year : int [1:19321] 2015 2015 2015 2015 2015 2015 2015 2015 2015 2015 ...
## $ NAME.y : chr [1:19321] "Autauga County, Alabama" "Baldwin County, Alabama" "Barbour County, Alabama" "Bibb County, Alabama" ...
## $ variable.y : chr [1:19321] "B19013_001" "B19013_001" "B19013_001" "B19013_001" ...
## $ Median_Income: num [1:19321] 51281 50254 32964 38678 45813 ...
## $ moe.y : num [1:19321] 2391 1263 2973 3995 3141 ...
str(map_data)
## Classes 'sf' and 'data.frame': 18001 obs. of 12 variables:
## $ ID : chr "alabama,autauga" "alabama,autauga" "alabama,autauga" "alabama,autauga" ...
## $ GEOID : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ NAME.x : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ variable.x : chr "B19083_001" "B19083_001" "B19083_001" "B19083_001" ...
## $ Gini_Index : num 0.423 0.429 0.45 0.46 0.454 ...
## $ moe.x : num 0.0175 0.0177 0.0391 0.041 0.0392 0.0326 0.0108 0.0098 0.01 0.0097 ...
## $ year : int 2015 2016 2017 2018 2019 2020 2015 2016 2017 2018 ...
## $ NAME.y : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ variable.y : chr "B19013_001" "B19013_001" "B19013_001" "B19013_001" ...
## $ Median_Income: num 51281 53099 55317 58786 58731 ...
## $ moe.y : num 2391 2631 2838 2972 4410 ...
## $ geom :sfc_MULTIPOLYGON of length 18001; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:51, 1:2] -86.5 -86.5 -86.5 -86.6 -86.6 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geom"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr [1:11] "ID" "GEOID" "NAME.x" "variable.x" ...
head(us_counties)
## Simple feature collection with 6 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.01778 ymin: 30.24071 xmax: -85.06131 ymax: 34.2686
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## ID geom
## alabama,autauga alabama,autauga MULTIPOLYGON (((-86.50517 3...
## alabama,baldwin alabama,baldwin MULTIPOLYGON (((-87.93757 3...
## alabama,barbour alabama,barbour MULTIPOLYGON (((-85.42801 3...
## alabama,bibb alabama,bibb MULTIPOLYGON (((-87.02083 3...
## alabama,blount alabama,blount MULTIPOLYGON (((-86.9578 33...
## alabama,bullock alabama,bullock MULTIPOLYGON (((-85.66866 3...
## GEOID
## alabama,autauga Autauga County, Alabama
## alabama,baldwin Baldwin County, Alabama
## alabama,barbour Barbour County, Alabama
## alabama,bibb Bibb County, Alabama
## alabama,blount Blount County, Alabama
## alabama,bullock Bullock County, Alabama
# Load US map data
us_counties <- sf::st_as_sf(maps::map("county", plot = FALSE, fill = TRUE))
# Correct the GEOID in us_counties to match the format in combined_data
us_counties$GEOID <- sapply(strsplit(as.character(us_counties$ID), ","),
function(x) {
paste(trimws(x[2]), "County,", trimws(x[1]))
})
# Convert to title case and ensure whitespace is managed correctly
us_counties$GEOID <- gsub(" ,", ", ", toTitleCase(us_counties$GEOID))
# Join geographic data with combined data
map_data <- left_join(us_counties, combined_data, by = "GEOID")
# Remove rows with NA in any of the key columns
map_data <- map_data %>%
filter(!is.na(Gini_Index), !is.na(Median_Income))
# Optional: Check the structure and head of the final data
str(map_data)
## Classes 'sf' and 'data.frame': 17583 obs. of 12 variables:
## $ ID : chr "alabama,autauga" "alabama,autauga" "alabama,autauga" "alabama,autauga" ...
## $ GEOID : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ NAME.x : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ variable.x : chr "B19083_001" "B19083_001" "B19083_001" "B19083_001" ...
## $ Gini_Index : num 0.423 0.429 0.45 0.46 0.454 ...
## $ moe.x : num 0.0175 0.0177 0.0391 0.041 0.0392 0.0326 0.0108 0.0098 0.01 0.0097 ...
## $ year : int 2015 2016 2017 2018 2019 2020 2015 2016 2017 2018 ...
## $ NAME.y : chr "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" "Autauga County, Alabama" ...
## $ variable.y : chr "B19013_001" "B19013_001" "B19013_001" "B19013_001" ...
## $ Median_Income: num 51281 53099 55317 58786 58731 ...
## $ moe.y : num 2391 2631 2838 2972 4410 ...
## $ geom :sfc_MULTIPOLYGON of length 17583; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:51, 1:2] -86.5 -86.5 -86.5 -86.6 -86.6 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geom"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr [1:11] "ID" "GEOID" "NAME.x" "variable.x" ...
head(map_data, n = 20)
## Simple feature collection with 20 features and 11 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.01778 ymin: 30.24071 xmax: -85.06131 ymax: 33.24874
## Geodetic CRS: +proj=longlat +ellps=clrk66 +no_defs +type=crs
## First 10 features:
## ID GEOID NAME.x variable.x
## 1 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 2 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 3 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 4 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 5 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 6 alabama,autauga Autauga County, Alabama Autauga County, Alabama B19083_001
## 7 alabama,baldwin Baldwin County, Alabama Baldwin County, Alabama B19083_001
## 8 alabama,baldwin Baldwin County, Alabama Baldwin County, Alabama B19083_001
## 9 alabama,baldwin Baldwin County, Alabama Baldwin County, Alabama B19083_001
## 10 alabama,baldwin Baldwin County, Alabama Baldwin County, Alabama B19083_001
## Gini_Index moe.x year NAME.y variable.y Median_Income
## 1 0.4227 0.0175 2015 Autauga County, Alabama B19013_001 51281
## 2 0.4295 0.0177 2016 Autauga County, Alabama B19013_001 53099
## 3 0.4501 0.0391 2017 Autauga County, Alabama B19013_001 55317
## 4 0.4602 0.0410 2018 Autauga County, Alabama B19013_001 58786
## 5 0.4542 0.0392 2019 Autauga County, Alabama B19013_001 58731
## 6 0.4552 0.0326 2020 Autauga County, Alabama B19013_001 57982
## 7 0.4564 0.0108 2015 Baldwin County, Alabama B19013_001 50254
## 8 0.4608 0.0098 2016 Baldwin County, Alabama B19013_001 51365
## 9 0.4618 0.0100 2017 Baldwin County, Alabama B19013_001 52562
## 10 0.4609 0.0097 2018 Baldwin County, Alabama B19013_001 55962
## moe.y geom
## 1 2391 MULTIPOLYGON (((-86.50517 3...
## 2 2631 MULTIPOLYGON (((-86.50517 3...
## 3 2838 MULTIPOLYGON (((-86.50517 3...
## 4 2972 MULTIPOLYGON (((-86.50517 3...
## 5 4410 MULTIPOLYGON (((-86.50517 3...
## 6 4839 MULTIPOLYGON (((-86.50517 3...
## 7 1263 MULTIPOLYGON (((-87.93757 3...
## 8 991 MULTIPOLYGON (((-87.93757 3...
## 9 1348 MULTIPOLYGON (((-87.93757 3...
## 10 1204 MULTIPOLYGON (((-87.93757 3...
print(summary(map_data))
## ID GEOID NAME.x variable.x
## Length:17583 Length:17583 Length:17583 Length:17583
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## Gini_Index moe.x year NAME.y
## Min. :0.2567 Min. :0.00120 Min. :2015 Length:17583
## 1st Qu.:0.4200 1st Qu.:0.01440 1st Qu.:2016 Class :character
## Median :0.4418 Median :0.02180 Median :2017 Mode :character
## Mean :0.4440 Mean :0.02494 Mean :2018
## 3rd Qu.:0.4655 3rd Qu.:0.03140 3rd Qu.:2019
## Max. :0.6962 Max. :0.26180 Max. :2020
## variable.y Median_Income moe.y geom
## Length:17583 Min. : 19501 Min. : 247 MULTIPOLYGON :17583
## Class :character 1st Qu.: 41706 1st Qu.: 1865 epsg:NA : 0
## Mode :character Median : 49067 Median : 2723 +proj=long...: 0
## Mean : 50621 Mean : 3304
## 3rd Qu.: 56762 3rd Qu.: 4018
## Max. :147111 Max. :65791
The project to create a Shiny application for visualizing income inequality and median household income across U.S. counties encountered significant challenges, particularly in implementing the R script. The goal was to integrate data from the American Community Survey (ACS) from 2015 to 2020 using R packages such as shiny, ggplot2, sf, dplyr, tidycensus, maps, scales, and rlang. The user interface was thoughtfully designed to allow a selection of years, data display options, and visualization settings. However, difficulties arose during the rendering phase of the application. Despite debugging attempts and checks on the data structure, the persistent error “non-numeric argument to binary operator” indicated unresolved data manipulation or plotting issues.
This setback highlighted the complexities of working with diverse data sources and interactive web frameworks. Despite the logical structure of the code and its apparent potential for creating a dynamic and informative map, the application still needs to materialize as envisioned. The process, though fraught with obstacles, offered valuable insights into data handling, dataset merging, and the intricacies of interactive visualization in R. It underscored the ongoing nature of the project and the continuous journey in data science and application development, emphasizing that resolution of these challenges is still a work in progress.
ui <- fluidPage(
titlePanel("Median Household Income by US County"),
sidebarLayout(
sidebarPanel(
selectInput("yearInput", "Select Year", choices = 2015:2020),
hr(),
helpText("Visualization Settings:"),
sliderInput("opacityInput",
"Map Opacity:",
min = 0,
max = 1,
value = 0.7),
selectInput("colorSchemeInput",
"Select Color Scheme:",
choices = c("Blue", "Red", "Green", "Purple", "Orange"),
selected = "Orange")
),
mainPanel(
plotOutput("mapOutput")
)
)
)
server <- function(input, output) {
# Reactive expression for combined data
data_reactive <- reactive({
req(input$yearInput) # Ensure this runs only when a year is selected
# Get ACS data for the selected year
median_income <- get_acs(geography = "county",
variables = "B19013_001",
year = input$yearInput,
survey = "acs5")
# Rename the estimate to Median_Income and ensure it's numeric
median_income <- median_income %>%
mutate(Median_Income = as.numeric(estimate)) %>%
select(GEOID, Median_Income, year)
# Load and prepare US counties map data
us_counties <- sf::st_as_sf(maps::map("county", plot = FALSE, fill = TRUE))
us_counties <- us_counties %>%
mutate(GEOID = sapply(strsplit(as.character(ID), ","),
function(x) paste(trimws(x[2]), "County,", trimws(x[1])))) %>%
mutate(GEOID = gsub(" ,", ", ", toTitleCase(GEOID)))
# Merge map data with ACS data
map_data <- left_join(us_counties, median_income, by = "GEOID")
# Check data structure and handle NA values
map_data <- map_data %>%
drop_na(Median_Income) %>%
mutate(Median_Income = as.numeric(Median_Income))
str(map_data) # Check the structure of map_data
return(map_data)
})
# Render the map
output$mapOutput <- renderPlot({
# Retrieve the data
map_data <- data_reactive()
# Plotting code
ggplot(map_data) +
geom_sf(aes(fill = Median_Income), color = "white", size = 0.1) +
scale_fill_viridis_c(
na.value = "grey50",
labels = label_number(suffix = ""),
option = "D"
) +
labs(
title = paste("Median Household Income by County in", as.character(input$yearInput)),
caption = "Source: American Community Survey"
) +
theme_minimal() +
theme(
legend.position = "right",
plot.title = element_text(size = 14, face = "bold"),
plot.caption = element_text(size = 8)
) +
guides(fill = guide_legend(override.aes = list(alpha = input$opacityInput))) # Set the legend's opacity based on input
})
}
# Run the application
shinyApp(ui = ui, server = server)
##
## Listening on http://127.0.0.1:6198
## Warning: Error in -: non-numeric argument to binary operator