For this project, I looked at fast food restaurants, obesity levels, and poverty levels by state. Because studies show a correlation between obesity and poverty rates, I wanted to test whether it holds across states. Furthermore, because obesity tends to correlate with fast food, I wanted to see whether these two variables relate to each other at the state level. I used data on the obesity rate, the number of fast food restaurants, the number of fast food restaurants per capita, and the poverty rate in each state.
When beginning this project, I loaded the following packages:
library(sf)
library(ggplot2)
# library(maptools)  # not used in the code below; maptools has been retired from CRAN
library(ggthemes)
library(tibble)
library(viridis)
library(tidyverse)
I used the following basic theme in my plots:
theme(panel.grid.major = element_line(colour = 'transparent'),
plot.title = element_text(hjust = 0.5),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
legend.title = element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
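Since the same theme is repeated in every map, it could be stored once and added to each plot. The sketch below is only an illustration: the name `map_theme` is not part of the original code, and a throwaway scatter plot stands in for the maps.

```r
library(ggplot2)

# Hypothetical name: map_theme is introduced here for illustration only.
map_theme <- theme(
  panel.grid.major = element_line(colour = "transparent"),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border     = element_blank(),
  plot.background  = element_blank(),
  plot.title       = element_text(hjust = 0.5),
  legend.title     = element_blank(),
  axis.title       = element_blank(),  # covers axis.title.x and axis.title.y
  axis.text        = element_blank(),  # covers axis.text.x and axis.text.y
  axis.ticks       = element_blank()   # covers axis.ticks.x and axis.ticks.y
)

# Stand-in plot built from a tiny made-up data frame.
demo <- ggplot(data.frame(x = 1:3, y = 1:3), aes(x, y)) +
  geom_point() +
  ggtitle("Demo") +
  map_theme
```

Any of the maps could then end with `+ map_theme` in place of the full `theme(...)` call.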
I began by loading a map of the United States into R and filtering it to include only the 48 contiguous states.
usa <- st_read("cb_2017_us_state_20m/cb_2017_us_state_20m.shp")
usa_48 <- usa %>%
filter(!(NAME %in% c("Alaska", "District of Columbia", "Hawaii", "Puerto Rico")))
I used the following code to load the obesity data set and map the obesity rate in each state:
obesitymap <- st_read("obesity data.csv")
obesitymap %>% rename("NAME" = field_1) %>% rename("Obesity" = field_2) ->obesityclean
obesityclean %>%
inner_join(usa_48) -> mergedobesity
# Obesity rates (%) in the row order of mergedobesity, defined once and reused
obesityrates <- c(24.2, 20.2, 28.6, 30.8, 32.1, 34.6, 36.2, 28.9, 26.1, 32.4, 25, 30.1, 33.8, 32.4, 29.2, 30.7, 35.6, 28.4, 34.5, 31.3, 34.2, 30, 25.3, 29.7, 30.7, 31.7, 30.4, 24.3, 31.2, 35.6, 31.4, 26.7, 26.3, 25.6, 28.8, 30.1, 31, 26, 29.8, 33.9, 30, 26.8, 23.6, 24.5, 25.1, 26.4, 35.6, 29)
mergedobesitydiscrete <- mergedobesity %>% data.frame(group = as.factor(obesityrates), x = obesityrates)
mergedobesitydiscrete %>%
ggplot() +
geom_sf(aes(fill=x)) +
coord_sf(xlim = c(-130,-60), ylim = c(20,50))+ ggtitle("Obesity Rates per State")+
theme(panel.grid.major = element_line(colour = 'transparent'), axis.title.x=element_blank(),
plot.title = element_text(hjust = 0.5),
axis.text.x=element_blank(),
legend.title=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
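Retyping the rates by hand only works if the vector is in exactly the same order as the joined rows. One alternative is to take the fill value from the joined column itself. The sketch below uses two toy data frames in place of the real inputs (`obesity_demo`, `states_demo`, and the `abbr` column are made up for illustration; the two rates are the Colorado and Louisiana values quoted later in the text), and it assumes the rate column arrives as text.

```r
library(dplyr)

# Toy stand-ins for obesityclean and usa_48 (no geometry, for brevity).
obesity_demo <- data.frame(
  NAME    = c("Colorado", "Louisiana"),
  Obesity = c("20.2", "36.2"),          # assumed to be read in as text
  stringsAsFactors = FALSE
)
states_demo <- data.frame(
  NAME = c("Louisiana", "Colorado", "Texas"),
  abbr = c("LA", "CO", "TX"),           # hypothetical extra column
  stringsAsFactors = FALSE
)

merged_demo <- obesity_demo %>%
  inner_join(states_demo, by = "NAME") %>%
  mutate(x = as.numeric(Obesity))       # each value stays attached to its state
```

With this approach `x` cannot drift out of step with `NAME`, whatever order the join returns.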
The following code was used to load the poverty rate data set and analyze the poverty levels in each state:
povertymap <- st_read("poverty data.csv")
povertymap %>% rename("NAME" = field_1) %>% rename("Poverty" = field_2) ->povertyclean
merge(povertyclean, usa_48, by="NAME") -> mergedpoverty
# Poverty rates (%) in the alphabetical state order produced by merge(), defined once and reused
povertyrates <- c(16.2, 16.1, 16, 13.9, 8.5, 9.8, 11.6, 13, 15.4, 11.1, 12.1, 11.8, 9.8, 11.2, 15.2, 20.2, 12.7, 7.1, 9.6, 11.1, 8.7, 21.1, 13, 11.7, 9.6, 10.1, 6.4, 9.4, 17.8, 11.9, 13.6, 11.1, 13.7, 14.6, 11.8, 11.1, 11.4, 14.1, 14.5, 14.9, 13.8, 8.6, 9.6, 11.4, 11, 18, 10.7, 10.9)
mergedpovertydiscrete <- mergedpoverty %>% data.frame(group = as.factor(povertyrates), x = povertyrates)
mergedpovertydiscrete %>%
ggplot() +
geom_sf(aes(fill=x)) +
coord_sf(xlim = c(-130,-60), ylim = c(20,50)) + ggtitle("Poverty by State") +
theme(panel.grid.major = element_line(colour = 'transparent'),
plot.title = element_text(hjust = 0.5),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
legend.title = element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
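The poverty code joins with `merge()` while the obesity code uses `inner_join()`, and the two return rows in different orders. That matters when values are attached by position afterwards. A small sketch with made-up data frames (`rates` and `shapes` are placeholders, not the real files):

```r
library(dplyr)

rates <- data.frame(
  NAME = c("Texas", "Alabama", "Maine"),
  rate = c(1, 2, 3),
  stringsAsFactors = FALSE
)
shapes <- data.frame(
  NAME = c("Maine", "Texas", "Alabama"),
  id   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

by_merge <- merge(rates, shapes, by = "NAME")       # sorted by NAME: Alabama, Maine, Texas
by_join  <- inner_join(rates, shapes, by = "NAME")  # order of `rates`: Texas, Alabama, Maine
```

Either function is fine on its own; the point is only that a hand-typed vector has to follow the order of whichever one was used.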
Additionally, the following code was used to load the data sets for fast food locations and fast food per capita in each state, and to map the results:
fastfood <- read.csv("fast-food-restaurants/FastFoodRestaurants.csv")
fastfood %>%
filter(country %in% "US") -> fastfoodus
fastfoodus %>%
group_by(province) %>%
add_tally() -> fastfoodprovince
fastfoodnumbers <- ggplot() +
geom_sf(data=usa_48) +
geom_point(data=fastfoodprovince, aes(longitude, latitude, alpha = n), color = "red") +
coord_sf(xlim = c(-130,-60), ylim = c(20,50)) + ggtitle("Fast Food Restaurant Locations")+
theme(panel.grid.major = element_line(colour = 'transparent'), axis.title.x=element_blank(),
legend.position="none",
plot.title = element_text(hjust= 0.5),
axis.text.x=element_blank(),
legend.title=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
fastfoodmap <- st_read("fast food data.csv")
fastfoodmap %>% rename("NAME" = field_1) %>% rename("ffpc" = field_2) ->fastfoodclean
fastfoodclean %>%
inner_join(usa_48) -> mergedfastfood
# Fast food restaurants per 10,000 people, in the row order of mergedfastfood.
# Defined once so that group and x always hold the same 48 values.
ffpcvalues <- c(1.9, 2, 2.1, 2.1, 2.4, 2.5, 3.1, 3.1, 3.1, 3.2, 3.2, 3.3, 3.6, 3.6, 3.6, 3.6, 3.7, 3.8, 3.8, 3.9, 4, 4, 4, 4.1, 4.1, 4.3, 4.3, 4.3, 4.3, 4.4, 4.5, 4.5, 4.6, 4.7, 4.7, 4.7, 4.7, 4.7, 4.8, 4.9, 4.9, 4.9, 5, 5.2, 5.3, 5.3, 5.4, 6.3)
mergedfastfooddiscrete <- mergedfastfood %>% data.frame(group = as.factor(ffpcvalues), x = ffpcvalues)
fastfoodplot <- mergedfastfooddiscrete %>%
ggplot() +
geom_sf(aes(fill=x)) +
coord_sf(xlim = c(-130,-60), ylim = c(20,50))+ ggtitle("Fast Food per Capita by State")+
theme(panel.grid.major = element_line(colour = 'transparent'), axis.title.x=element_blank(),
plot.title = element_text(hjust= 0.5),
axis.text.x=element_blank(),
legend.title=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
From this code, I was able to generate the following maps:
The first factor that I analyzed was the obesity rate per state, which is the percentage of residents in each state who are considered obese. The rate ranges from 20.2% in Colorado, at the low end of the spectrum, to 36.2% in Louisiana, the state with the highest rate of obesity. Obesity typically means that someone is at least 20 percent over their ideal body weight. These people also have a BMI of 30 or higher.
Fast food per capita compares the number of fast food restaurants to every 10,000 people in each state. It ranges from Vermont, with 1.9 fast food restaurants per 10,000 people, to Alabama, with 6.3 fast food restaurants per 10,000 people.
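The per capita figure is a simple rate. As a sketch of the arithmetic, with a hypothetical helper `per_10k()` and made-up inputs that do not come from the data set:

```r
# Restaurants per 10,000 residents = restaurants / population * 10,000.
# per_10k() is a hypothetical helper, not part of the original code.
per_10k <- function(restaurants, population) {
  restaurants / population * 10000
}

# Made-up example: 500 restaurants in a state of 2,000,000 residents.
example_rate <- per_10k(500, 2e6)
round(example_rate, 1)  # 2.5
```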
This map plots each fast food restaurant by its latitude and longitude across the United States. It is useful for visualizing where fast food restaurants are concentrated. Points are drawn more opaque in states with more restaurants, and overlapping points darken further, so darker red indicates a higher concentration of fast food restaurants.
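In the plotting code, the opacity of each point comes from `add_tally()`, which attaches the number of rows in each `province` group to every row of that group. A minimal sketch with three made-up restaurants (the coordinates are placeholders, not rows from FastFoodRestaurants.csv):

```r
library(dplyr)

# Made-up rows standing in for the restaurant file.
restaurants <- data.frame(
  province  = c("AL", "AL", "VT"),
  longitude = c(-86.8, -86.3, -72.6),
  latitude  = c(33.5, 32.4, 44.3),
  stringsAsFactors = FALSE
)

tallied <- restaurants %>%
  group_by(province) %>%
  add_tally() %>%   # adds n = number of rows in the group; rows are not collapsed
  ungroup()
```

Every restaurant keeps its own row and coordinates, so it can still be plotted as a point, while `n` carries the state-level count used for `alpha`.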
The poverty rate by state map shows the percentage of people who fall below the poverty line in each state. The state poverty levels range from 6.4 percent of residents (New Hampshire) to 21.1 percent of residents (Mississippi). As of 2018, the qualifications for being considered at or below poverty level are as follows:
Looking first at obesity levels by state, it is evident that obesity is high across the southeastern portion of the United States and into the southern portion of the Midwest. Regionally, there seem to be lower levels of obesity towards the west coast of the United States.
Furthermore, the fast food restaurant map is consistent with the hypothesis: restaurants are highly concentrated in the southeastern portion of the United States, along with California and Texas. Similarly, the highest poverty levels fall in the same region of the United States.
While the maps are helpful for gaining a general overview of areas where there is a correlation between obesity, poverty, and fast food, I looked further into the specific states with the highest and lowest rates in all three categories. When looking at tabular results for the states with the highest obesity rates, amount of fast food per capita, and poverty rate, I received the following results:
Obesity Rate  | Fast Food per Capita | Poverty Rate
------------- | -------------------- | -------------
Louisiana     | Alabama              | Mississippi
West Virginia | Nebraska             | Louisiana
Mississippi   | West Virginia        | West Virginia
Alabama       | Oklahoma             | New Mexico
Kentucky      | Tennessee            | Alabama
Arkansas      | Indiana              | Arizona
Kansas        | Washington           | Arkansas
Oklahoma      | Georgia              | Georgia
Tennessee     | Missouri             | Kentucky
Texas         | South Carolina       | Tennessee
These results helped to further solidify my hypothesis, as a majority of the states recur across two, if not three, categories. The overlap between poverty rate and obesity rate is high: 7 of the 10 states with the highest obesity rates are also among the 10 states with the highest poverty rates. Furthermore, 3 states (Alabama, Tennessee, and West Virginia) fall in the top 10 for all three categories.
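The overlap between the columns can be counted directly rather than by eye. The sketch below copies the three top-ten lists from the table above into vectors (the vector names are introduced here for illustration) and intersects them with base R:

```r
obesity_top  <- c("Louisiana", "West Virginia", "Mississippi", "Alabama", "Kentucky",
                  "Arkansas", "Kansas", "Oklahoma", "Tennessee", "Texas")
fastfood_top <- c("Alabama", "Nebraska", "West Virginia", "Oklahoma", "Tennessee",
                  "Indiana", "Washington", "Georgia", "Missouri", "South Carolina")
poverty_top  <- c("Mississippi", "Louisiana", "West Virginia", "New Mexico", "Alabama",
                  "Arizona", "Arkansas", "Georgia", "Kentucky", "Tennessee")

# States that appear in both the obesity and poverty lists.
obesity_and_poverty <- intersect(obesity_top, poverty_top)

# States that appear in all three lists.
all_three <- Reduce(intersect, list(obesity_top, fastfood_top, poverty_top))

length(obesity_and_poverty)
all_three
```

The same two calls work for the lowest-ten lists once their vectors are swapped in.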
When looking at tabular results for the states with the lowest obesity rates, amount of fast food per capita, and poverty rate, I received the following results:
Obesity Rate  | Fast Food per Capita | Poverty Rate
------------- | -------------------- | -------------
Colorado      | Vermont              | New Hampshire
Montana       | New Jersey           | Maryland
California    | New York             | Colorado
Massachusetts | Mississippi          | Utah
Utah          | Connecticut          | Minnesota
New York      | Rhode Island         | New Jersey
Vermont       | Massachusetts        | Massachusetts
Connecticut   | Maine                | Nebraska
New Jersey    | Washington           | Vermont
Rhode Island  | Pennsylvania         | Connecticut
This table helped to further solidify my results as, again, a majority of the states with low obesity rates, poverty rates, and fast food per capita appeared across multiple categories. In fact, 6 out of 10 states with the lowest obesity rates also appeared to have the lowest poverty rates. Furthermore, 4 states appeared in the bottom 10 across all three categories.
In conclusion, I found that my hypothesis was supported. Obesity rate and poverty rate were closely associated across states, and fast food per capita also tracked obesity rates. Overall, similar areas had similar shading across all three maps. Additionally, the states listed in the tables often repeated across two (if not three) categories.
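The maps and tables show overlap, but the strength of the relationship could also be put into a number with a correlation coefficient. The sketch below only illustrates the call: the five placeholder states and all of their values are made up, and the real columns would first need to be aligned by state (for example, by joining on `NAME`).

```r
# Entirely made-up values for five placeholder states. Replace them with the
# real, state-aligned columns before drawing any conclusion.
toy <- data.frame(
  NAME    = c("State A", "State B", "State C", "State D", "State E"),
  Obesity = c(22, 25, 29, 33, 36),
  Poverty = c(8, 11, 10, 16, 19)
)

# Pearson measures linear association; Spearman uses ranks only.
pearson  <- cor(toy$Obesity, toy$Poverty, method = "pearson")
spearman <- cor(toy$Obesity, toy$Poverty, method = "spearman")
```

A value near 1 would back up the visual impression from the maps, while a value near 0 would suggest that the top-ten overlap overstates the relationship.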
To further analyze my results, I would look into the effects of different regions of the United States. Because a large share of the states in the southeast show high levels of obesity, fast food, and poverty, I think it would be interesting to analyze this segment of the United States further with respect to these variables. By focusing on a specific segment of the United States, I could compare a state with high obesity levels to a state with low obesity levels by county and city.