The rising burden of obesity is a serious public health concern in the United States. In this blog post, we will explore how to visualize obesity rates by state using publicly available data from the CDC and create informative maps using ggplot2.

Data Sources

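We will use the CDC’s dataset on national obesity rates (National Obesity Data), which we download as a CSV file from the URL shown in the import step below.

We will also need geographical data for U.S. states. The map_data() function in ggplot2 returns state boundaries from the maps package as a data frame of polygon coordinates.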

We’ll start by loading the necessary libraries and importing data.

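library(ggplot2) # plotting, plus map_data() for the state polygons
library(dplyr)   # glimpse(), left_join() and the %>% pipe
library(sf)      # loaded here, but not needed for the code in this post
library(maps)    # state boundary data used by map_data("state")
library(ggmap)   # loaded here, but not needed for the code in this post
library(rio)     # import() reads the CSV straight from a URL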



Importing the obesity data

df <- import('https://data-lakecountyil.opendata.arcgis.com/api/download/v1/items/3e0c1eb04e5c48b3be9040b0589d3ccf/csv?layers=8')



#### Here is a quick glimpse of the data

glimpse(df)
## Rows: 52
## Columns: 5
## $ OBJECTID      <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
## $ NAME          <chr> "Texas", "California", "Kentucky", "Georgia", "Wisconsin…
## $ Obesity       <dbl> 32.4, 24.2, 34.6, 30.7, 30.7, 30.1, 29.2, 33.8, 36.2, 25…
## $ Shape__Area   <dbl> 7.672329e+12, 5.327809e+12, 1.128830e+12, 1.652980e+12, …
## $ Shape__Length <dbl> 15408322, 14518698, 6346699, 5795596, 6806782, 7976011, …

Next, we can access the state boundaries:

us_states <- map_data("state")


df$NAME <- tolower(df$NAME) # Convert the 'NAME' column to lowercase to match with map_data



Data Preparation

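Now, we’ll join the obesity data with the U.S. state boundary data. A left join from the boundary data keeps every polygon point and attaches the matching state's obesity rate to it, so each state can be filled with its own value. States that appear in the obesity data but not in the boundary data are dropped from the map.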

obesity_map_data <- us_states %>%
    left_join(df, by = c("region" = "NAME"))
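
The glimpse above reports 52 rows, while map_data("state") only covers the contiguous states and the District of Columbia, so it is worth checking which names fail to match before plotting. The following is a minimal sketch of that check, not part of the original workflow. It uses a small stand-in table (`toy`, a hypothetical name) with three values copied from the glimpse output and an Alaska row whose value is left as NA because it is not shown above; in practice you would run the same anti_join() on `df`.

```r
library(dplyr)
library(ggplot2)  # map_data() needs the maps package to be installed

# Stand-in for the imported obesity table (assumption: same column names as df).
# Alaska's value is a placeholder NA, since it is not visible in the glimpse.
toy <- data.frame(
  NAME = c("Texas", "California", "Kentucky", "Alaska"),
  Obesity = c(32.4, 24.2, 34.6, NA),
  stringsAsFactors = FALSE
)
toy$NAME <- tolower(toy$NAME)

us_states <- map_data("state")

# Rows of the obesity table that have no polygon in the boundary data
unmatched <- anti_join(toy, us_states, by = c("NAME" = "region"))
print(unmatched$NAME)
# [1] "alaska"
```

Any name listed here will not appear on the map, either because the boundary data does not include it or because the spelling differs between the two sources.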



With our data prepared, it’s time to create our map with ggplot2.

ggplot(obesity_map_data, aes(x = long, y = lat, group = group, fill = Obesity)) +
    geom_polygon(color = "black") +
    scale_fill_gradient(low = "lightblue", high = "darkred",
                        limits = c(10, 50), # Values outside 10 to 50 are drawn with na.value
                        na.value = "grey90") +
    labs(title = "Obesity Rates in the United States",
         fill = "Obesity (%)") +
    theme_minimal() +
    theme(axis.text = element_blank(),
          axis.ticks = element_blank(),
          panel.grid = element_blank(),
          legend.position = "bottom",
          legend.direction = "horizontal",
          legend.box = "horizontal",
          # legend.title.align is deprecated in recent ggplot2 releases;
          # centering the title through element_text() has the same effect
          legend.title = element_text(hjust = 0.5)) +
    guides(fill = guide_colorbar(barwidth = 20, barheight = 1,
                                 title.position = "top",
                                 title.hjust = 0.5))
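
One possible refinement: the plot above uses ggplot2's default Cartesian coordinates, so the shape of the country stretches with the size of the plotting window. A minimal sketch of one way to handle this is to add coord_quickmap(), which sets an approximate aspect ratio suited to the latitude range of the data. The example below draws only the state outlines so that it runs on its own; the object name `p` is just for illustration, and the same layer could be added to the obesity map.

```r
library(ggplot2)  # map_data() needs the maps package to be installed

us_states <- map_data("state")

# Outline map with an approximate map aspect ratio
p <- ggplot(us_states, aes(x = long, y = lat, group = group)) +
    geom_polygon(fill = "grey90", color = "black") +
    coord_quickmap() +
    theme_minimal()

print(p)
```

coord_quickmap() is only an approximation that works reasonably well for regions that do not span too wide a range of latitudes. A true projection via coord_map() or coord_sf() is an alternative, at the cost of extra dependencies.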