500 Healthy Cities GIS Assignment
Load the libraries and set the working directory
library(tidyverse)
library(plotly)
library(leaflet)
library(ggrepel)
library(ggthemes)
library(viridis)
setwd("/Users/nhi.vu/Desktop/DATA110")
cities500 <- read_csv("500CitiesLocalHealthIndicators.cdc.csv")
data(cities500)
The GeoLocation variable has (lat, long) format
Split GeoLocation (lat, long) into two columns: lat and long
latlong <- cities500 |>
mutate(GeoLocation = str_replace_all(GeoLocation, "[()]", "")) |>
separate(GeoLocation, into = c("lat", "long"), sep = ",", convert = TRUE)
head(latlong)
# A tibble: 6 × 25
Year StateAbbr StateDesc CityName GeographicLevel DataSource Category
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 2017 CA California Hawthorne Census Tract BRFSS Health Outcom…
2 2017 CA California Hawthorne City BRFSS Unhealthy Beh…
3 2017 CA California Hayward City BRFSS Health Outcom…
4 2017 CA California Hayward City BRFSS Unhealthy Beh…
5 2017 CA California Hemet City BRFSS Prevention
6 2017 CA California Indio Census Tract BRFSS Health Outcom…
# ℹ 18 more variables: UniqueID <chr>, Measure <chr>, Data_Value_Unit <chr>,
# DataValueTypeID <chr>, Data_Value_Type <chr>, Data_Value <dbl>,
# Low_Confidence_Limit <dbl>, High_Confidence_Limit <dbl>,
# Data_Value_Footnote_Symbol <chr>, Data_Value_Footnote <chr>,
# PopulationCount <dbl>, lat <dbl>, long <dbl>, CategoryID <chr>,
# MeasureId <chr>, CityFIPS <dbl>, TractFIPS <dbl>, Short_Question_Text <chr>
For your assignment, work with a cleaned dataset.
1. Once you run the above code and learn how to filter in this format, filter this dataset however you choose so that you have a subset with no more than 900 observations.
Filter chunk here
obesity1 <- latlong |>
filter(StateAbbr == "CA") |>
filter(Data_Value_Type == "Age-adjusted prevalence") |>
filter(Category == "Unhealthy Behaviors") |>
filter(Short_Question_Text %in% c("Obesity", "Binge Drinking", "Current Smoking")) |>
filter(Year == 2017) |>
select(Year, StateAbbr, CityName, PopulationCount, Data_Value, Short_Question_Text, lat, long)
head(obesity1)
# A tibble: 6 × 8
Year StateAbbr CityName PopulationCount Data_Value Short_Question_Text lat
<dbl> <chr> <chr> <dbl> <dbl> <chr> <dbl>
1 2017 CA Indio 76036 17.7 Binge Drinking 33.7
2 2017 CA Corona 152374 26.8 Obesity 33.9
3 2017 CA Fullerton 135161 18.4 Binge Drinking 33.9
4 2017 CA Fullerton 135161 20.5 Obesity 33.9
5 2017 CA San Diego 1307402 22.4 Obesity 32.8
6 2017 CA Tracy 82922 20 Binge Drinking 37.7
# ℹ 1 more variable: long <dbl>
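The assignment asks for a subset of no more than 900 observations, which can be confirmed with a row count. The sketch below uses a small made-up tibble (`toy`, a hypothetical stand-in for `latlong`) and also shows that several conditions can go in a single `filter()` call, which behaves like the chained filters above.

```r
library(dplyr)

# Hypothetical stand-in for latlong; the values are made up for illustration
toy <- tibble(
  StateAbbr = c("CA", "CA", "TX", "CA"),
  Year = c(2017, 2016, 2017, 2017),
  Short_Question_Text = c("Obesity", "Obesity", "Obesity", "Sleep")
)

# Conditions separated by commas are combined with AND
subset_toy <- toy |>
  filter(
    StateAbbr == "CA",
    Year == 2017,
    Short_Question_Text %in% c("Obesity", "Binge Drinking", "Current Smoking")
  )

# Number of observations in the subset
n_obs <- nrow(subset_toy)

# Stops with an error if the subset exceeds the 900-row limit
stopifnot(n_obs <= 900)
```

Running the same `nrow()` check on `obesity1` would show whether the real subset meets the limit.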
top15 <- obesity1 |>
arrange(desc(PopulationCount)) |>
head(15)
2. Based on the GIS tutorial (Japan earthquakes), create one plot about something in your subsetted dataset.
First plot chunk here
p1 <- ggplot(top15, aes(x = CityName,
y = Data_Value,
fill = Short_Question_Text,
text = paste0("City: ", CityName, # to add text for the tool tips
"<br>Population Count: ", PopulationCount,
"<br>Prevalence (%): ", Data_Value, "%"))) +
geom_col(position = "dodge") +
coord_flip()+
labs(title = "Obesity vs Smoking vs Drinking in Top 5 most populated CA Cities (2017)",
x = "City",
y = "Prevalence (%)",
fill = "Health Behavior") +
theme_grey()+
scale_fill_wsj()
p2 <- ggplotly(p1, tooltip = "text") # make the plot interactive so the tooltips show on hover
p2
You can hover over the bars for the exact percentage!
3. Now create a map of your subsetted dataset.
First map chunk here
leaflet() |>
setView(lng = -119.4, lat = 36.7, zoom = 5.6) |> # Put in the coords for CA to have it set as the view
addProviderTiles("Esri.WorldStreetMap") |> # this is for the design of the map
addCircles(
data = obesity1, # this is to plot the circles on the map
radius = obesity1$PopulationCount / 100,
color = "green")
Assuming "long" and "lat" are longitude and latitude, respectively
4. Refine your map to include a mouse-click tooltip
Refined map chunk here
popup1 <- paste0("<b>City: </b>", obesity1$CityName, "<br>", #adding the text for the tool tip
"<b>Population: </b>", obesity1$PopulationCount, "<br>",
"<b>Unhealthy Behavior: </b>", obesity1$Short_Question_Text, "<br>",
"<b>Prevalence Rate: </b>", obesity1$Data_Value, "%", "<br>")
leaflet() |>
setView(lng = -119.4, lat = 36.7, zoom = 5.6) |> # Put in the coords for CA to have it set as the view
addProviderTiles("Esri.WorldStreetMap") |> # this is for the design of the map
addCircles(
data = obesity1, # this is to plot the circles on the map
radius = obesity1$PopulationCount / 100,
color = "green",
popup = popup1)
Assuming "long" and "lat" are longitude and latitude, respectively
You can now click on each circle to see its information!
Next I’m adding colors for 3 different types of unhealthy behaviors
colors1 <- colorFactor(palette = c("blue", "brown", "deeppink"), domain = obesity1$Short_Question_Text, levels = c("Binge Drinking", "Current Smoking", "Obesity"))
map1 <- leaflet(obesity1) |>
setView(lng = -119, lat = 36, zoom = 6.2) |>
addProviderTiles("Esri.WorldStreetMap") |>
addCircles(
fillColor = ~colors1(Short_Question_Text),
fillOpacity = 1,
data = obesity1,
radius = obesity1$PopulationCount / 100,
color = ~colors1(Short_Question_Text),
popup = popup1)
Assuming "long" and "lat" are longitude and latitude, respectively
map1 |>
addLegend("bottomleft", pal = colors1, values = obesity1$Short_Question_Text,
title = "Unhealthy Behaviors", opacity = 1)
5. Write a paragraph
In a paragraph, describe the plots you created and what they show.
I first loaded all of the packages that I needed. I then cleaned and filtered the data set to keep only the information I needed. For the project, I wanted to make a bar graph of the prevalence rates of smoking, binge drinking, and obesity in CA in 2017. I originally wanted to include the 15 most populated cities in California, but the filtering took me down to the top 5 instead. To make the city names easier to read, I used coord_flip(). The bar graph shows that obesity was the most common issue in most of these cities, although in San Francisco binge drinking was the most prevalent. I added tooltips that display the city name, population, and exact percentage when the user hovers over any of the bars.
Following the bar chart, I created an interactive map of California with the same data set. The map shows the California cities in my subset, with circles colored by unhealthy behavior. I followed the steps from the earthquake mapping tutorial: I set the coordinates, selected a map design, and placed circle markers for each city. Each circle marker has a click tooltip that displays the city name, population, unhealthy behavior, and the prevalence rate of that behavior. In addition, I used color coding and added a legend, which made the map much easier to read.
On the map, we can again see that obesity stands out as a public health issue for a number of the California cities in the data set. Overall, the project suggests that obesity is a widespread public health issue in California.
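The drop from 15 cities to 5 described in the first paragraph most likely happens because `head(15)` keeps 15 rows, and each city has one row per behavior (three rows per city). A minimal sketch of one way to pick the top cities instead, using a small made-up tibble (`toy`, hypothetical, standing in for `obesity1`): rank the distinct cities first, then keep every row that belongs to them.

```r
library(dplyr)

# Hypothetical stand-in for obesity1: one row per city per behavior (values made up)
toy <- tibble(
  CityName = rep(c("A", "B", "C", "D"), each = 3),
  PopulationCount = rep(c(400, 300, 200, 100), each = 3),
  Short_Question_Text = rep(c("Binge Drinking", "Current Smoking", "Obesity"), times = 4),
  Data_Value = c(17, 12, 25, 18, 11, 24, 16, 13, 27, 19, 14, 22)
)

# head() counts rows, so 6 rows cover only 2 cities here
by_rows <- toy |>
  arrange(desc(PopulationCount)) |>
  head(6)

# Rank the distinct cities by population, then keep all rows for the chosen cities
top_cities <- toy |>
  distinct(CityName, PopulationCount) |>
  slice_max(PopulationCount, n = 3, with_ties = FALSE)

by_city <- toy |>
  semi_join(top_cities, by = "CityName")

n_distinct(by_rows$CityName) # 2 cities
n_distinct(by_city$CityName) # 3 cities
```

With the real data, setting `n = 15` in `slice_max()` would be expected to give 15 cities with three rows each, assuming every city has all three behaviors recorded.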