Rows: 810103 Columns: 24
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (17): StateAbbr, StateDesc, CityName, GeographicLevel, DataSource, Categ...
dbl (6): Year, Data_Value, Low_Confidence_Limit, High_Confidence_Limit, Cit...
num (1): PopulationCount
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
data(cities500)
Warning in data(cities500): data set 'cities500' not found
Split GeoLocation (lat, long) into two columns: lat and long
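The code for this step is not shown in the rendered output. A minimal sketch of one way to do the split follows, assuming `GeoLocation` is stored as a string of the form `"(lat, long)"`. The `toy` table and its coordinate values are hypothetical stand-ins, not rows from the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the real table; coordinates are illustrative only
toy <- tibble(GeoLocation = c("(42.8864, -78.8784)", "(33.9164, -118.3526)"))

toy_split <- toy |>
  # Drop the surrounding parentheses so only "lat, long" remains
  mutate(GeoLocation = gsub("[()]", "", GeoLocation)) |>
  # Split on the comma (plus optional whitespace); convert = TRUE makes the columns numeric
  separate(GeoLocation, into = c("lat", "long"), sep = ",\\s*", convert = TRUE)

str(toy_split)
```

Splitting this way replaces the single `GeoLocation` column with two numeric columns, which is consistent with the column count rising from 24 to 25 in the output.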
# A tibble: 6 × 25
Year StateAbbr StateDesc CityName GeographicLevel DataSource Category
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 CA California Hawthorne Census Tract BRFSS Health Outcom…
2 2017 CA California Hawthorne City BRFSS Unhealthy Beh…
3 2017 CA California Hayward City BRFSS Health Outcom…
4 2017 CA California Hayward City BRFSS Unhealthy Beh…
5 2017 CA California Hemet City BRFSS Prevention
6 2017 CA California Indio Census Tract BRFSS Health Outcom…
# ℹ 18 more variables: UniqueID <chr>, Measure <chr>, Data_Value_Unit <chr>,
# DataValueTypeID <chr>, Data_Value_Type <chr>, Data_Value <dbl>,
# Low_Confidence_Limit <dbl>, High_Confidence_Limit <dbl>,
# Data_Value_Footnote_Symbol <chr>, Data_Value_Footnote <chr>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
# A tibble: 6 × 18
Year StateAbbr StateDesc CityName GeographicLevel Category UniqueID Measure
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 AL Alabama Montgome… City Prevent… 151000 Choles…
2 2017 CA California Concord City Prevent… 616000 Visits…
3 2017 CA California Concord City Prevent… 616000 Choles…
4 2017 CA California Fontana City Prevent… 624680 Visits…
5 2017 CA California Richmond Census Tract Prevent… 0660620… Choles…
6 2017 FL Florida Davie Census Tract Prevent… 1216475… Choles…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
ny <- prevention |>
  filter(StateAbbr == "NY")
head(ny)
# A tibble: 6 × 18
Year StateAbbr StateDesc CityName GeographicLevel Category UniqueID Measure
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 NY New York Buffalo Census Tract Prevent… 3611000… "Chole…
2 2017 NY New York Rochester Census Tract Prevent… 3663000… "Curre…
3 2017 NY New York Rochester Census Tract Prevent… 3663000… "Visit…
4 2017 NY New York Rochester Census Tract Prevent… 3663000… "Chole…
5 2017 NY New York Schenecta… Census Tract Prevent… 3665508… "Takin…
6 2017 NY New York New York Census Tract Prevent… 3651000… "Curre…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
Check for NY state cities in dataset
unique(ny$CityName)
[1] "Buffalo" "Rochester" "Schenectady" "New York" "Mount Vernon"
[6] "New Rochelle" "Albany" "Syracuse" "Yonkers"
1. Filtering dataset to concentrate only on Buffalo, NY
# A tibble: 6 × 18
Year StateAbbr StateDesc CityName GeographicLevel Category UniqueID Measure
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 NY New York Buffalo Census Tract Prevention 3611000… Choles…
2 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
3 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
4 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
5 2017 NY New York Buffalo Census Tract Prevention 3611000… Choles…
6 2017 NY New York Buffalo Census Tract Prevention 3611000… Taking…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
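The filtering code itself is not shown in the output above. A sketch of how the Buffalo subset might be produced, assuming it is taken from the `ny` table by matching `CityName`; the `toy_ny` table is a hypothetical stand-in with made-up values.

```r
library(dplyr)

# Hypothetical stand-in for the `ny` table; values are illustrative only
toy_ny <- tibble(
  CityName   = c("Buffalo", "Rochester", "Buffalo"),
  StateAbbr  = "NY",
  Data_Value = c(75.1, 70.2, 24.3)
)

# Keep only the Buffalo rows
buff_toy <- toy_ny |>
  filter(CityName == "Buffalo")

nrow(buff_toy)

# On the real data the equivalent call would presumably be:
# buff_data <- ny |> filter(CityName == "Buffalo")
```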
nrow(buff_data)
[1] 320
names(buff_data) # lists all variables in cleaned dataset
ggplot(buff_data, aes(x = Short_Question_Text, y = Data_Value, fill = Short_Question_Text)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  scale_fill_manual(values = c("#FFA07A", "#98FB98", "#87CEEB", "#DA70D6"), name = "Indicators") +
  labs(title = "Health Indicators in Buffalo, NY - 2017",
       x = "Health Indicator",
       y = "Crude Prevalence (%)") +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(size = 16, face = "bold", color = "#2E4A62") # Adjust title font size, style, and color
  )
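One caveat about the bar chart above: `buff_data` has 320 rows, so each indicator appears many times, and `geom_bar(stat = "identity")` stacks those rows. The bar heights are therefore sums of the row values rather than a single prevalence figure. A possible alternative, assuming a simple unweighted mean across rows is an acceptable summary (it is not a population-weighted city estimate), is to aggregate first. The `toy` table and the `mean_value` column are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for buff_data; values are illustrative only
toy <- tibble(
  Short_Question_Text = c("Annual Checkup", "Annual Checkup",
                          "Cholesterol Screening", "Cholesterol Screening"),
  Data_Value = c(70, 80, 60, 90)
)

# One row per indicator: unweighted mean of the row-level values
indicator_means <- toy |>
  group_by(Short_Question_Text) |>
  summarise(mean_value = mean(Data_Value, na.rm = TRUE), .groups = "drop")

# geom_col() draws bars whose height is the y value, one bar per indicator
p <- ggplot(indicator_means,
            aes(x = Short_Question_Text, y = mean_value, fill = Short_Question_Text)) +
  geom_col() +
  theme_minimal() +
  labs(x = "Health Indicator", y = "Mean Crude Prevalence (%)", fill = "Indicators")

p
```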
3. Map of subsetted dataset
Double-checking numeric values and data summary
str(buff_data$lat)
num [1:320] 42.9 42.9 42.8 42.9 42.9 ...
str(buff_data$long)
num [1:320] -78.9 -78.9 -78.8 -78.8 -78.9 ...
head(buff_data)
# A tibble: 6 × 18
Year StateAbbr StateDesc CityName GeographicLevel Category UniqueID Measure
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 NY New York Buffalo Census Tract Prevention 3611000… Choles…
2 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
3 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
4 2017 NY New York Buffalo Census Tract Prevention 3611000… Visits…
5 2017 NY New York Buffalo Census Tract Prevention 3611000… Choles…
6 2017 NY New York Buffalo Census Tract Prevention 3611000… Taking…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
summary(buff_data$Data_Value)
Min. 1st Qu. Median Mean 3rd Qu. Max.
7.30 24.23 73.75 60.85 79.20 86.70
library(leaflet)
Warning: package 'leaflet' was built under R version 4.4.2
# Ensure lat and long are correctly referenced and not NA
buff_data <- buff_data[!is.na(buff_data$lat) & !is.na(buff_data$long), ]
leaflet(buff_data) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addCircles(
    lng = ~long,
    lat = ~lat,
    weight = 1,
    radius = ~scales::rescale(Data_Value, to = c(50, 500)), # Rescale to fixed range
    color = "#FDBB30", fillColor = "#002654", fillOpacity = 0.6,
    popup = ~paste("<strong>Health Indicator:</strong>", Short_Question_Text,
                   "<br><strong>Crude Prevalence:</strong>", Data_Value, "%")
  ) %>%
  setView(lng = -78.8784, lat = 42.8864, zoom = 12)
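To make the radius step concrete, here is a small sketch of what `scales::rescale()` does with the `to = c(50, 500)` range used in the map. The input vector is illustrative: it reuses the minimum and maximum from the `summary()` output plus their midpoint.

```r
library(scales)

# Minimum and maximum from summary(buff_data$Data_Value), plus their midpoint
values <- c(7.3, 47, 86.7)

# Linearly map the observed range onto circle radii between 50 and 500 metres
radii <- rescale(values, to = c(50, 500))
radii
# 50 275 500
```

Because the mapping is linear over the observed range, the smallest value always gets the 50 m circle and the largest the 500 m circle, whatever the indicator.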
To have a better look at the prevalence
library(ggplot2)
library(sf)
Warning: package 'sf' was built under R version 4.4.2
Linking to GEOS 3.12.2, GDAL 3.9.3, PROJ 9.4.1; sf_use_s2() is TRUE
buffalo_data_sf <- st_as_sf(buff_data, coords = c("long", "lat"), crs = 4326)
ggplot() +
  geom_sf(data = buffalo_data_sf, aes(size = Data_Value, color = Short_Question_Text), alpha = 0.7) +
  theme_minimal() +
  labs(title = "Health Indicators in Buffalo, 2017",
       color = "Indicator",
       size = "Prevalence (%)") +
  scale_color_manual(values = c("#FFA07A", "#98FB98", "#87CEEB", "#DA70D6")) +
  theme(plot.title = element_text(size = 16, face = "bold", color = "#2E4A62"))
4. Buffalo map with hover tooltip
library(leaflet)
library(scales)
Warning: package 'scales' was built under R version 4.4.1
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
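The code for the hover map is not included in the rendered output. A rough sketch of one way to get a mouseover tooltip in leaflet uses the `label` argument, which shows on hover, whereas `popup` shows on click. The `toy` table is a hypothetical stand-in, and the label shows only the indicator and its value; neighborhood names would need an additional column that is not among the variables listed above.

```r
library(leaflet)

# Hypothetical stand-in for buff_data; coordinates and values are illustrative only
toy <- data.frame(
  lat  = c(42.8864, 42.9000),
  long = c(-78.8784, -78.8500),
  Short_Question_Text = c("Annual Checkup", "Cholesterol Screening"),
  Data_Value = c(75.1, 80.4)
)

m <- leaflet(toy) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addCircleMarkers(
    lng = ~long,
    lat = ~lat,
    radius = 6,
    color = "#002654",
    fillOpacity = 0.6,
    # label is displayed on mouseover
    label = ~paste0(Short_Question_Text, ": ", Data_Value, "%")
  ) %>%
  setView(lng = -78.8784, lat = 42.8864, zoom = 12)

m
```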
The visualizations of Buffalo’s 2017 health indicators combine several views of community health behavior. The bar plot compares prevalence across indicators such as annual checkups and cholesterol screenings, showing where public health engagement is stronger and where more outreach might be needed. The geographic scatter plot maps these indicators across Buffalo neighborhoods, showing how prevalence varies by location. The interactive map goes a step further by adding neighborhood names and specific data in a mouseover tooltip, making it easier to identify areas with higher or lower engagement. Together, these tools give a picture of Buffalo’s health landscape and help pinpoint where resources could have the greatest impact.