Healthy Cities GIS Assignment

Author

Asher Scott

Load the libraries and set the working directory

library(tidyverse)  # attaches tidyr, readr, dplyr, stringr, and ggplot2
setwd("/Users/asherscott/Desktop/Data 110")
cities500 <- read_csv("500CitiesLocalHealthIndicators.cdc.csv")

The GeoLocation variable has (lat, long) format

Split GeoLocation (lat, long) into two columns: lat and long

latlong <- cities500 |>
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)
head(latlong)
# A tibble: 6 × 25
   Year StateAbbr StateDesc  CityName  GeographicLevel DataSource Category      
  <dbl> <chr>     <chr>      <chr>     <chr>           <chr>      <chr>         
1  2017 CA        California Hawthorne Census Tract    BRFSS      Health Outcom…
2  2017 CA        California Hawthorne City            BRFSS      Unhealthy Beh…
3  2017 CA        California Hayward   City            BRFSS      Health Outcom…
4  2017 CA        California Hayward   City            BRFSS      Unhealthy Beh…
5  2017 CA        California Hemet     City            BRFSS      Prevention    
6  2017 CA        California Indio     Census Tract    BRFSS      Health Outcom…
# ℹ 18 more variables: UniqueID <chr>, Measure <chr>, Data_Value_Unit <chr>,
#   DataValueTypeID <chr>, Data_Value_Type <chr>, Data_Value <dbl>,
#   Low_Confidence_Limit <dbl>, High_Confidence_Limit <dbl>,
#   Data_Value_Footnote_Symbol <chr>, Data_Value_Footnote <chr>,
#   PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
#   MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
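
The split above can be tried on a small example first. This is only a sketch: the `toy` tibble and its coordinate values are made up for illustration, and the separator is written as `",\\s*"` (an assumption, not the original code) so that any space after the comma is dropped before the numeric conversion.

```r
library(tidyverse)

# Toy data: city names and coordinates are invented for illustration
toy <- tibble(
  CityName    = c("A", "B"),
  GeoLocation = c("(33.75, -84.39)", "(39.29, -76.61)")
)

toy_split <- toy |>
  # Strip the surrounding parentheses
  mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
  # Split on the comma (plus any following whitespace);
  # convert = TRUE turns the two pieces into numeric columns
  separate(GeoLocation, into = c("lat", "long"), sep = ",\\s*", convert = TRUE)

toy_split
```

Checking `is.numeric(toy_split$lat)` is a quick way to confirm that the conversion worked before mapping.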

Filter the dataset

Remove the rows where StateDesc is "United States", keep only the Prevention category (the category of interest), keep only the crude prevalence measure type, and keep only 2017.

latlong_clean <- latlong |>
  filter(StateDesc != "United States") |>
  filter(Category == "Prevention") |>
  filter(Data_Value_Type == "Crude prevalence") |>
  filter(Year == 2017)
head(latlong_clean)
# A tibble: 6 × 25
   Year StateAbbr StateDesc  CityName   GeographicLevel DataSource Category  
  <dbl> <chr>     <chr>      <chr>      <chr>           <chr>      <chr>     
1  2017 AL        Alabama    Montgomery City            BRFSS      Prevention
2  2017 CA        California Concord    City            BRFSS      Prevention
3  2017 CA        California Concord    City            BRFSS      Prevention
4  2017 CA        California Fontana    City            BRFSS      Prevention
5  2017 CA        California Richmond   Census Tract    BRFSS      Prevention
6  2017 FL        Florida    Davie      Census Tract    BRFSS      Prevention
# ℹ 18 more variables: UniqueID <chr>, Measure <chr>, Data_Value_Unit <chr>,
#   DataValueTypeID <chr>, Data_Value_Type <chr>, Data_Value <dbl>,
#   Low_Confidence_Limit <dbl>, High_Confidence_Limit <dbl>,
#   Data_Value_Footnote_Symbol <chr>, Data_Value_Footnote <chr>,
#   PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
#   MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>

What variables are included? (can any of them be removed?)

names(latlong_clean)
 [1] "Year"                       "StateAbbr"                 
 [3] "StateDesc"                  "CityName"                  
 [5] "GeographicLevel"            "DataSource"                
 [7] "Category"                   "UniqueID"                  
 [9] "Measure"                    "Data_Value_Unit"           
[11] "DataValueTypeID"            "Data_Value_Type"           
[13] "Data_Value"                 "Low_Confidence_Limit"      
[15] "High_Confidence_Limit"      "Data_Value_Footnote_Symbol"
[17] "Data_Value_Footnote"        "PopulationCount"           
[19] "lat"                        "long"                      
[21] "CategoryID"                 "MeasureId"                 
[23] "CityFIPS"                   "TractFIPS"                 
[25] "Short_Question_Text"       

Remove the variables that will not be used in the assignment

prevention <- latlong_clean |>
  select(
    -DataSource, -Data_Value_Unit, -DataValueTypeID,
    -Low_Confidence_Limit, -High_Confidence_Limit,
    -Data_Value_Footnote_Symbol, -Data_Value_Footnote
  )
head(prevention)
# A tibble: 6 × 18
   Year StateAbbr StateDesc  CityName  GeographicLevel Category UniqueID Measure
  <dbl> <chr>     <chr>      <chr>     <chr>           <chr>    <chr>    <chr>  
1  2017 AL        Alabama    Montgome… City            Prevent… 151000   Choles…
2  2017 CA        California Concord   City            Prevent… 616000   Visits…
3  2017 CA        California Concord   City            Prevent… 616000   Choles…
4  2017 CA        California Fontana   City            Prevent… 624680   Visits…
5  2017 CA        California Richmond  Census Tract    Prevent… 0660620… Choles…
6  2017 FL        Florida    Davie     Census Tract    Prevent… 1216475… Choles…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
#   PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
#   MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
md <- prevention |>
  filter(StateAbbr == "MD")
head(md)
# A tibble: 6 × 18
   Year StateAbbr StateDesc CityName  GeographicLevel Category  UniqueID Measure
  <dbl> <chr>     <chr>     <chr>     <chr>           <chr>     <chr>    <chr>  
1  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Chole…
2  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Visit…
3  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Visit…
4  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Curre…
5  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Curre…
6  2017 MD        Maryland  Baltimore Census Tract    Preventi… 2404000… "Visit…
# ℹ 10 more variables: Data_Value_Type <chr>, Data_Value <dbl>,
#   PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
#   MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
unique(md$CityName)
[1] "Baltimore"

The new dataset, prevention, is now a manageable size.

For your assignment, work with a cleaned dataset.

1. Once you run the above code, filter this dataset one more time for any particular subset with no more than 900 observations.

Filter chunk here

prevention2 <- latlong_clean |>
  select(
    -DataSource, -Data_Value_Unit,
    -Low_Confidence_Limit, -High_Confidence_Limit,
    -Data_Value_Footnote_Symbol, -Data_Value_Footnote, -UniqueID
  )
ATL <- prevention2 %>%
  filter(CityName == "Atlanta")
head(ATL)
# A tibble: 6 × 18
   Year StateAbbr StateDesc CityName GeographicLevel Category   Measure         
  <dbl> <chr>     <chr>     <chr>    <chr>           <chr>      <chr>           
1  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Cholesterol sc…
2  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Current lack o…
3  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Current lack o…
4  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Visits to doct…
5  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Current lack o…
6  2017 GA        Georgia   Atlanta  Census Tract    Prevention "Cholesterol sc…
# ℹ 11 more variables: DataValueTypeID <chr>, Data_Value_Type <chr>,
#   Data_Value <dbl>, PopulationCount <dbl>, lat <dbl>, long <dbl>,
#   CategoryID <chr>, MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>,
#   Short_Question_Text <chr>
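
The assignment asks for a subset of no more than 900 observations, so it may help to check the row count explicitly. The sketch below uses a made-up `toy` tibble as a stand-in for the filtered data; the same `count()` and `nrow()` calls should work on the real data frame.

```r
library(tidyverse)

# Toy stand-in for the filtered data; the rows are invented for illustration
toy <- tibble(
  CityName   = c("Atlanta", "Atlanta", "Baltimore"),
  Data_Value = c(75.1, 24.3, 80.2)
)

# Rows per city, largest first, to help pick a subset of a suitable size
city_counts <- toy |> count(CityName, sort = TRUE)
city_counts

atl_toy <- toy |> filter(CityName == "Atlanta")

# Stops with an error if the subset is larger than the assignment allows
stopifnot(nrow(atl_toy) <= 900)
```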

2. Based on the GIS tutorial (Japan earthquakes), create one plot about something in your subsetted dataset.

First plot chunk here

ggplot(ATL, aes(x = Data_Value, fill = Short_Question_Text)) +
  geom_density(alpha = 0.6) +
  labs(
    title = "Atlanta Short Question Responses",
    x = "Data Value (%)",
    fill = "Short Question Text"
  ) +
  scale_fill_manual(values = c(
    "Taking BP Medication" = "hotpink",
    "Annual Checkup" = "yellow",
    "Cholesterol Screening" = "steelblue",
    "Health Insurance" = "darkorange"
  ))
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_density()`).
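
The warning above means that some rows have a missing Data_Value and were dropped from the density estimate. One way to make that explicit, sketched here on made-up data (the `toy` tibble is hypothetical), is to count the missing values and filter them out before plotting.

```r
library(tidyverse)

# Toy data with one missing value; the numbers are invented for illustration
toy <- tibble(
  Short_Question_Text = c("Annual Checkup", "Annual Checkup", "Health Insurance"),
  Data_Value          = c(74.2, NA, 22.8)
)

# How many rows would be dropped by the plot?
n_missing <- sum(is.na(toy$Data_Value))
n_missing

# Keep only rows with a recorded value
toy_complete <- toy |> filter(!is.na(Data_Value))
```

Plotting the filtered data frame should then avoid the warning, at the cost of hiding how many rows were missing.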

3. Now create a map of your subsetted dataset.

First map chunk here

library(leaflet)
leaflet(ATL) %>%
  setView(lng = -84.3885, lat = 33.7501, zoom = 6) %>%
  addProviderTiles("Esri.WorldStreetMap") %>%
  addCircles(
    data = ATL,
    radius = ATL$Data_Value,
    color = "red")
Assuming "long" and "lat" are longitude and latitude, respectively

4. Refine your map to include a mouse-click tooltip

Refined map chunk here

popATL <- paste0(
  "<b>Year: </b>", ATL$Year, "<br>",
  "<b>Population: </b>", ATL$PopulationCount, "<br>",
  "<b>Health Percentage: </b>", ATL$Data_Value, "<br>",
  "<b>Short Response Text: </b>", ATL$Short_Question_Text, "<br>")
leaflet(ATL) %>%
  setView(lng = -84.3885, lat = 33.7501, zoom = 6) %>%
  addProviderTiles("Esri.WorldStreetMap") %>%
  addCircles(
    data = ATL,
    radius = ATL$Data_Value,
    color = "red",
    popup = popATL)
Assuming "long" and "lat" are longitude and latitude, respectively

5. Write a paragraph

In a paragraph, describe the plots you created and what they show.

My first graph is a density plot showing the distribution of Data_Value (a percentage) for each Short_Question_Text measure in the city of Atlanta, with Data_Value on the x-axis. Annual Checkup, Cholesterol Screening, and Taking BP Medication all cluster around 75%, while Health Insurance clusters around 25%. I chose contrasting fill colors so the four measures are easy to tell apart. My second graph is a map of Atlanta, Georgia. After looking up the city's coordinates and plugging them in, I added my dataset, which displays the locations in the city where the data was collected. I changed the circle color to red because I thought it stands out more. My final graph is a continuation of the map. In that chunk I created a popup that shows the year, population, health percentage (Data_Value), and Short_Question_Text. I wanted to add the Measure column as well, but only an error message would come up. I also struggled to color the circles by Short_Question_Text.
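
One possible way to handle the two struggles mentioned above is sketched below. It assumes leaflet's `colorFactor()` is used to map each Short_Question_Text value to a color and that the Measure column is pasted into the popup the same way as the other columns. The `toy` data frame, its coordinates, and its Measure strings are made up for illustration, and the zoom level and radius scaling are arbitrary choices, so this is not necessarily how the real data will behave.

```r
library(leaflet)

# Toy data: coordinates, values, and measure text are invented for illustration
toy <- data.frame(
  lat  = c(33.75, 33.76, 33.77),
  long = c(-84.39, -84.40, -84.38),
  Data_Value = c(75.1, 24.3, 78.9),
  Measure = c("Example measure A", "Example measure B", "Example measure C"),
  Short_Question_Text = c("Annual Checkup", "Health Insurance",
                          "Cholesterol Screening")
)

# Palette function: maps each category to one color
pal <- colorFactor(
  palette = c("yellow", "steelblue", "darkorange"),
  domain  = toy$Short_Question_Text
)

# Popup text that includes the Measure column
toy_popup <- paste0(
  "<b>Measure: </b>", toy$Measure, "<br>",
  "<b>Health Percentage: </b>", toy$Data_Value
)

toy_map <- leaflet(toy) |>
  setView(lng = -84.3885, lat = 33.7501, zoom = 11) |>
  addProviderTiles("Esri.WorldStreetMap") |>
  addCircles(
    lng = ~long, lat = ~lat,
    radius = ~Data_Value * 10,           # radius is in meters; scaled up to be visible
    color = ~pal(Short_Question_Text),   # one color per category
    popup = toy_popup
  ) |>
  addLegend(pal = pal, values = ~Short_Question_Text, title = "Measure")

toy_map
```

Passing `lng` and `lat` explicitly also avoids the message about leaflet guessing which columns hold the coordinates.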