Healthy Cities GIS Assignment

Author

Charanpreet Singh

Loading the libraries & accessing CSV

library(tidyverse)  # attaches readr, dplyr, tidyr, stringr, and ggplot2
cities500 <- read_csv("500CitiesLocalHealthIndicators.cdc.csv")

Cleaning the dataset

latlong <- cities500 |>
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)
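To make the cleaning step easier to follow, here is a minimal sketch on made-up rows. It assumes `GeoLocation` holds strings of the form `"(lat, long)"`; the `toy` tibble and its values are hypothetical stand-ins, not rows from the CDC file.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical rows standing in for the real GeoLocation column
toy <- tibble(
  CityName = c("A", "B"),
  GeoLocation = c("(39.2904, -76.6122)", "(39.3000, -76.6000)")
)

toy_latlong <- toy |>
  # Drop the surrounding parentheses
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  # Split on the comma; convert = TRUE turns the two pieces into numbers
  separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)

str(toy_latlong)
```

Checking the result with `str()` confirms that `lat` and `long` are numeric, which both `ggplot()` and `leaflet()` need for plotting coordinates.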

Filtering the data to use

latlong_clean <- latlong |>
  filter(StateDesc != "United States") |>
  filter(Data_Value_Type == "Crude prevalence") |>
  filter(Year == 2017) |>
  filter(StateAbbr == "MD") |>
  filter(Category == "Health Outcomes") |>
  filter(Measure == "Arthritis among adults aged >=18 Years") |>
  select(CityName, StateAbbr, Measure, Data_Value, GeographicLevel, lat, long)

Checking the number of observations

nrow(latlong_clean)
[1] 201

201 observations remain, which is below the 900 mark.

Scatterplot Spread of Arthritis in MD

# Non-map plot: point size encodes crude prevalence
ggplot(latlong_clean, aes(x = long, y = lat, size = Data_Value)) +
  geom_point(alpha = 0.7, color = "darkgreen") +
  labs(title = "Geographic Spread of Arthritis Crude Prevalence in MD (2017)",
       x = "Longitude", y = "Latitude", size = "Prevalence (%)") +
  theme_minimal()
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).

This scatterplot shows the geographic spread of arthritis prevalence across Maryland cities in 2017, with larger points representing higher rates.
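The warning above reports that one row was dropped because of a missing or out-of-range value. A minimal sketch of how such a row could be found and removed before plotting, assuming the cause is an `NA` in `Data_Value`, `lat`, or `long`; the `toy_clean` tibble is a made-up stand-in for `latlong_clean`.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for latlong_clean; the values are made up
toy_clean <- tibble(
  CityName = c("A", "B", "C"),
  Data_Value = c(24.1, NA, 30.5),
  lat = c(39.29, 39.30, 39.31),
  long = c(-76.61, -76.60, -76.62)
)

# Rows that would be dropped: any missing coordinate or size value
dropped <- toy_clean |>
  filter(is.na(Data_Value) | is.na(lat) | is.na(long))

# Keep only complete rows so the plot has nothing to remove
toy_complete <- toy_clean |>
  drop_na(Data_Value, lat, long)

dropped
nrow(toy_complete)
```

Inspecting `dropped` shows which location lacks a value, so the omission can be reported rather than silently ignored.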

Map of the subsetted data

library(leaflet)
# Define a color palette for prevalence values
pal3 <- colorNumeric(palette = "YlGnBu", domain = latlong_clean$Data_Value)

leaflet(latlong_clean) |>
  addProviderTiles("CartoDB.Positron") |>
  addCircleMarkers(
    lng = ~long,
    lat = ~lat,
    radius = ~Data_Value / 2,
    color = ~pal3(Data_Value),
    stroke = TRUE,
    weight = 1,
    fillOpacity = 0.6,
    label = ~paste0(CityName, ": ", Data_Value, "%")
  ) |>
  addLegend(
    position = "bottomleft",
    pal = pal3,
    values = ~Data_Value,
    title = "Arthritis Prevalence (%)",
    opacity = 1
  )

This basic interactive map shows arthritis prevalence by city, with circle size and color both indicating how common the condition is. Hovering over a circle shows its label.
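As a small illustration of how the palette works, `colorNumeric()` returns a function that maps each number in its domain to a hex color. The values below are made up, and `pal_demo` is a hypothetical name used only for this sketch.

```r
library(leaflet)

# Made-up prevalence values, including one missing value
vals <- c(18.5, 24.0, 31.2, NA)

# Build a palette function over the range of the non-missing values
pal_demo <- colorNumeric(palette = "YlGnBu", domain = vals)

# Each value becomes a hex color; NA falls back to the palette's na.color
cols <- pal_demo(vals)
cols
```

Because the palette is a plain function, the same object can be passed to both `addCircleMarkers()` and `addLegend()` so the legend matches the circles.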

Filtering data for smoking prevalence

Filtering the data again to keep only the 2017 crude prevalence of current smoking among adults in MD

smoking_data <- latlong |>
  filter(StateDesc != "United States") |>
  filter(Data_Value_Type == "Crude prevalence") |>
  filter(Year == 2017) |>
  filter(StateAbbr == "MD") |>
  filter(Category == "Unhealthy Behaviors") |>
  filter(Measure == "Current smoking among adults aged >=18 Years") |>
  select(CityName, Data_Value, lat, long)
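The smoking filter repeats most of the arthritis filter, with only the category and measure changed. One possible way to avoid the repetition is a small helper; `filter_md_measure()` is a hypothetical function that is not part of the original analysis, and the `toy` rows are made up to mirror the column names used above.

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): applies the shared
# filters once so only the category and measure change between subsets.
# Comma-separated conditions inside filter() are combined with AND,
# which matches chaining several filter() calls.
filter_md_measure <- function(data, category_value, measure_value) {
  data |>
    filter(
      StateDesc != "United States",
      Data_Value_Type == "Crude prevalence",
      Year == 2017,
      StateAbbr == "MD",
      Category == category_value,
      Measure == measure_value
    )
}

smoke <- "Current smoking among adults aged >=18 Years"
arth <- "Arthritis among adults aged >=18 Years"

# Made-up rows with the same column names as the source data
toy <- tibble(
  StateDesc = c("Maryland", "Maryland", "Maryland", "Virginia"),
  StateAbbr = c("MD", "MD", "MD", "VA"),
  Year = c(2017, 2017, 2017, 2017),
  Data_Value_Type = c("Crude prevalence", "Crude prevalence",
                      "Age-adjusted prevalence", "Crude prevalence"),
  Category = c("Health Outcomes", "Unhealthy Behaviors",
               "Unhealthy Behaviors", "Unhealthy Behaviors"),
  Measure = c(arth, smoke, smoke, smoke),
  Data_Value = c(25.0, 20.0, 19.0, 18.0)
)

smoking_toy <- filter_md_measure(toy, "Unhealthy Behaviors", smoke)
arthritis_toy <- filter_md_measure(toy, "Health Outcomes", arth)
```

Each call returns only the Maryland crude-prevalence rows for the requested measure, and a `select()` step can follow as in the original code.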

Creating a click-interactive map of smoking prevalence in Baltimore

pal4 <- colorNumeric(palette = "YlOrRd", domain = smoking_data$Data_Value)

leaflet(smoking_data) |>
  addProviderTiles("CartoDB.Positron") |>
  addCircleMarkers(
    lng = ~long,
    lat = ~lat,
    radius = ~Data_Value / 2,
    color = ~pal4(Data_Value),
    stroke = FALSE,
    fillOpacity = 0.85,
    popup = ~paste0("<b>", CityName,
                    "</b><br>Smoking Rate: ", Data_Value, "%")
  ) |>
  addLegend(
    position = "bottomleft",
    pal = pal4,
    values = ~Data_Value,
    title = "Current Smoking Prevalence (%)",
    opacity = 1
  )

The map highlights cities where smoking rates are higher, helping identify areas where unhealthy behaviors may be more common. Some areas show noticeably elevated smoking rates compared to others, suggesting potential links to broader health challenges. Since smoking is a known risk factor for chronic conditions like arthritis, this map provides useful context alongside the arthritis prevalence map.

Analysis

In this project, I analyzed public health data from Maryland cities in 2017 to visualize the geographic distribution of two health indicators: arthritis prevalence and current smoking rates. The scatterplot provided a spatial overview of arthritis, with larger points marking locations where a larger share of adults have arthritis. The first interactive map focused on arthritis prevalence, using proportional circles and a yellow-green-blue (YlGnBu) color scale to highlight areas with higher rates. The second map visualized smoking prevalence with a yellow-orange-red (YlOrRd) scale; smoking is a key behavioral risk factor for many chronic diseases, including arthritis. By presenting two separate but related measures, the visualizations show where health problems are most concentrated and offer context on potentially contributing behaviors.
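To compare the two measures more directly than by eye, the subsets could be joined on their coordinates and summarized with a correlation. This is only a sketch on made-up values: the tibbles `arthritis_toy` and `smoking_toy` are hypothetical, and joining on `lat` and `long` assumes both subsets carry identical coordinates for the same location.

```r
library(dplyr)

# Made-up values; each row is one location
arthritis_toy <- tibble(
  lat = c(39.29, 39.30, 39.31, 39.32),
  long = c(-76.61, -76.60, -76.62, -76.63),
  arthritis = c(22.0, 27.5, 30.1, 25.4)
)

smoking_toy <- tibble(
  lat = c(39.29, 39.30, 39.31, 39.32),
  long = c(-76.61, -76.60, -76.62, -76.63),
  smoking = c(15.2, 21.0, 24.8, 17.9)
)

# Match locations by coordinates, keeping only those present in both
combined <- inner_join(arthritis_toy, smoking_toy, by = c("lat", "long"))

# Pearson correlation between the two prevalence measures
r <- cor(combined$arthritis, combined$smoking, use = "complete.obs")
r
```

A correlation across locations would describe how the two rates move together geographically; it would not show that smoking causes arthritis in individuals.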

Sources

https://loading.io/color/feature/YlOrRd-5/
For color palettes