library(tidyverse)
library(tidyr)
setwd("C:/Users/tosin/Downloads")
cities500 <- read_csv("500CitiesLocalHealthIndicators.cdc.csv")
# read_csv() above already creates cities500; data() only loads datasets bundled with packages.

Healthy Cities GIS Assignment
Load the libraries and set the working directory
The GeoLocation variable has (lat, long) format
Split GeoLocation (lat, long) into two columns: lat and long
latlong <- cities500 |>
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)
head(latlong)
# A tibble: 6 × 25
Year StateAbbr StateDesc CityName GeographicLevel DataSource Category
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 CA California Hawthorne Census Tract BRFSS Health Outcom…
2 2017 CA California Hawthorne City BRFSS Unhealthy Beh…
3 2017 CA California Hayward City BRFSS Health Outcom…
4 2017 CA California Hayward City BRFSS Unhealthy Beh…
5 2017 CA California Hemet City BRFSS Prevention
6 2017 CA California Indio Census Tract BRFSS Health Outcom…
# ℹ 18 more variables: UniqueID <chr>, Measure <chr>, Data_Value_Unit <chr>,
# DataValueTypeID <chr>, Data_Value_Type <chr>, Data_Value <dbl>,
# Low_Confidence_Limit <dbl>, High_Confidence_Limit <dbl>,
# Data_Value_Footnote_Symbol <chr>, Data_Value_Footnote <chr>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
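As a quick sanity check on the split above, the same two steps can be run on a tiny stand-in tibble. This is only a sketch: the `toy` data frame, its city labels, and its coordinates are hypothetical and do not come from the CDC file. It shows that stripping the parentheses and calling `separate()` with `convert = TRUE` yields numeric `lat` and `long` columns.

```r
library(tidyverse)

# Hypothetical rows in the same "(lat, long)" text format as GeoLocation
toy <- tibble(
  CityName = c("A", "B"),
  GeoLocation = c("(33.9, -118.3)", "(37.6, -122.1)")
)

toy_split <- toy |>
  # Drop the surrounding parentheses
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  # Split on the comma; convert = TRUE turns the pieces into numbers
  separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)

toy_split
```

If either column came back as character, the map code further down would not be able to place the markers, so this check is worth running before filtering.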
Filter chunk here (you may need multiple chunks)
latlong_ca_aap <- latlong |>
  filter(
    StateAbbr == "CA",
    Data_Value_Type == "Age-adjusted prevalence",
    Year == 2017,
    Measure == "Coronary heart disease among adults aged >=18 Years"
  ) |>
  filter(!is.na(lat) & !is.na(long) & !is.na(Data_Value))

top_cities <- latlong_ca_aap |>
  count(CityName, sort = TRUE) |>
  slice_head(n = 5) |>
  pull(CityName)

Create a histogram plot to explore patterns in your filtered dataset.
ggplot(latlong_ca_aap, aes(x = Data_Value)) +
  geom_histogram(binwidth = 0.5, fill = "royalblue", color = "white", alpha = 0.7) +
  labs(
    title = "Distribution of Coronary Heart Disease Prevalence\nCalifornia Cities (2017)",
    x = "Prevalence (%)",
    y = "Number of Observations"
  ) +
  theme_classic()

Create a map to visualize your filtered dataset.
library(leaflet)
leaflet(latlong_ca_aap) |>
  addProviderTiles("Esri.WorldStreetMap") |>
  addCircleMarkers(
    lng = ~long,
    lat = ~lat,
    radius = 5,
    color = "orchid",
    fillColor = "orchid",
    fillOpacity = 0.7,
    stroke = TRUE
  ) |>
  setView(
    lng = mean(latlong_ca_aap$long, na.rm = TRUE),
    lat = mean(latlong_ca_aap$lat, na.rm = TRUE),
    zoom = 6.5
  )

Refined Map with mouseclick
library(leaflet)
leaflet(latlong_ca_aap) |>
  addProviderTiles("Esri.WorldStreetMap") |>
  addCircleMarkers(
    lng = ~long,
    lat = ~lat,
    radius = 5,
    color = "orchid",
    fillColor = "orchid",
    fillOpacity = 0.7,
    stroke = TRUE,
    popup = ~paste0(
      "<strong>City: </strong>", CityName, "<br>",
      "<strong>CHD Prevalence: </strong>", round(Data_Value, 2), "%"
    )
  ) |>
  setView(
    lng = mean(latlong_ca_aap$long, na.rm = TRUE),
    lat = mean(latlong_ca_aap$lat, na.rm = TRUE),
    zoom = 6.5
  )

5. Write a paragraph
The histogram gives an overview of how coronary heart disease prevalence varies among California cities, showing whether most cities have similar rates or whether the values are widely spread. This lets the audience quickly spot patterns, such as clusters of cities with higher or lower prevalence, and identify unusual cases that stand out from the rest. The interactive Leaflet map builds on this analysis by adding a geographical component: each city is represented by an orchid marker placed at its actual location on the map. When a user clicks on a marker, a popup appears showing the city's name and its specific prevalence value. This interactivity makes it easy to explore the data spatially, helping users identify regional patterns, clusters, or outliers that might not be as obvious from the histogram alone.
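The spread that the histogram shows visually can also be stated in numbers. The sketch below is a minimal example under stated assumptions: `toy_prev` and its values are hypothetical stand-ins for `latlong_ca_aap`, not results from the CDC data. It computes the count, minimum, median, and maximum of `Data_Value`.

```r
library(tidyverse)

# Hypothetical prevalence values standing in for latlong_ca_aap$Data_Value
toy_prev <- tibble(
  CityName = c("A", "B", "C", "D"),
  Data_Value = c(4.8, 5.5, 6.1, 7.4)
)

# One-row summary of the distribution plotted in the histogram
prev_summary <- toy_prev |>
  summarise(
    n = n(),
    min = min(Data_Value),
    median = median(Data_Value),
    max = max(Data_Value)
  )

prev_summary
```

Swapping `toy_prev` for `latlong_ca_aap` should give the corresponding figures for the filtered dataset, which could then be quoted alongside the histogram.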