Assignment: Investigating the Relationship Between Healthcare Spending and Adult Obesity
Public health outcomes often reflect differences in policy priorities and resource allocation. This assignment uses state-level adult obesity prevalence data to explore whether differences in healthcare spending correlate with obesity rates.
Do differences in per capita healthcare spending correlate with adult obesity prevalence?
Your task is to analyze the provided obesity data and source additional data on healthcare spending to address this question.
Data Collection:
1. Obesity Data:
The state-level adult obesity prevalence dataset is provided in the attached file, Obesity 2024.
2. Per Capita Healthcare Spending Data:
# Adult obesity prevalence by state
url <- "https://raw.githubusercontent.com/Aconrard/DATA608/refs/heads/main/adult_obesity.csv"
df_1 <- read.csv(url)
# Healthcare expenditures by state (reported in millions of dollars)
url2 <- "https://raw.githubusercontent.com/Aconrard/DATA608/refs/heads/main/raw_data.csv"
df_2 <- read.csv(url2)
# Scraped table of 2024 state population estimates
url3 <- "https://raw.githubusercontent.com/Aconrard/DATA608/refs/heads/main/scraped_table.csv"
df_3 <- read.csv(url3)
1. Merge the Datasets:
library(dplyr)  # provides mutate()

# State names arrive as row names in df_2; move them into a State column
df_2$State <- rownames(df_2)
rownames(df_2) <- 1:nrow(df_2)
df_2 <- df_2[, c("State", setdiff(names(df_2), "State"))]
# Drop the first three and last seven rows (presumably header and footnote
# lines in the raw file rather than state records)
df_2 <- df_2[-(1:3), ]
colnames(df_2)[2] <- "Expenditures_Millions"
df_2 <- df_2[1:(nrow(df_2) - 7), ]
colnames(df_3)[1] <- "State"
# Outer joins on State keep every row from all three tables
combined_df_1 <- merge(df_1, df_2, by = "State", all = TRUE)
combined_df <- merge(combined_df_1, df_3, by = "State", all = TRUE)
# Strip "$" and "," so the values can be converted to numeric
combined_df$Expenditures_Millions <- gsub("[$,]", "", combined_df$Expenditures_Millions)
combined_df$Population_Estimate.2024 <- gsub(",", "", combined_df$Population_Estimate.2024)
combined_df$Expenditures_Millions <- as.numeric(combined_df$Expenditures_Millions)
combined_df$Population_Estimate.2024 <- as.numeric(combined_df$Population_Estimate.2024)
combined_df <- subset(combined_df, State != "Puerto Rico")
# Per capita spending in dollars: convert millions to dollars, divide by population
combined_df <- combined_df |>
  mutate(per_capita_exp = round((Expenditures_Millions * 1000000) / Population_Estimate.2024))
# Cut points at the 20th, 40th, 60th, and 80th percentiles of per capita spending
q1 <- quantile(combined_df$per_capita_exp, 0.2)
q2 <- quantile(combined_df$per_capita_exp, 0.4)
q3 <- quantile(combined_df$per_capita_exp, 0.6)
q4 <- quantile(combined_df$per_capita_exp, 0.8)
# Assign each state to a spending quintile (upper bounds are inclusive)
combined_df$Quintile <- ifelse(combined_df$per_capita_exp <= q1, "Lowest 20%",
                        ifelse(combined_df$per_capita_exp <= q2, "Second Lowest 20%",
                        ifelse(combined_df$per_capita_exp <= q3, "Middle 20%",
                        ifelse(combined_df$per_capita_exp <= q4, "Second Highest 20%",
                               "Highest 20%"))))
# Order the factor levels from lowest to highest spending for plotting
combined_df$Quintile <- factor(combined_df$Quintile,
                               levels = c("Lowest 20%",
                                          "Second Lowest 20%",
                                          "Middle 20%",
                                          "Second Highest 20%",
                                          "Highest 20%"))
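The nested `ifelse()` calls above could also be written with `cut()`, which bins a numeric vector at given break points. The following is a minimal sketch on made-up spending values, not the assignment data; the names `spend`, `breaks`, and `quintile_labels` are illustrative only.

```r
# Toy per capita spending values (made up for illustration only)
spend <- c(7200, 8100, 8800, 9400, 9900, 10300, 10800, 11500, 12300, 13900)

# Break points at the 0%, 20%, ..., 100% quantiles
breaks <- quantile(spend, probs = seq(0, 1, by = 0.2))

quintile_labels <- c("Lowest 20%", "Second Lowest 20%", "Middle 20%",
                     "Second Highest 20%", "Highest 20%")

# Intervals are closed on the right, like the <= comparisons in the
# nested ifelse(); include.lowest = TRUE keeps the minimum in the first bin
quintile <- cut(spend, breaks = breaks, labels = quintile_labels,
                include.lowest = TRUE)

# Count how many values fall in each bin
print(table(quintile))
```

Because `cut()` returns a factor whose levels follow the order of the labels, the separate `factor()` call would not be needed with this approach. It does require the break points to be distinct, which may not hold if many states share the same spending value.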
2. Focus Metric:
## [1] "Mean Obesity Prevalence by Spending Quintile:"
## Quintile Obesity..
## 1 Lowest 20% 33.10
## 2 Second Lowest 20% 35.70
## 3 Middle 20% 34.34
## 4 Second Highest 20% 32.86
## 5 Highest 20% 32.24
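The code that produced this summary is not shown in the report. One way a table of this shape could be computed is with `aggregate()`, sketched here on a small made-up data frame (`toy` and its values are hypothetical and stand in for `combined_df`).

```r
# Made-up data standing in for combined_df (values are illustrative only)
toy <- data.frame(
  Quintile = factor(c("Lowest 20%", "Lowest 20%", "Highest 20%", "Highest 20%"),
                    levels = c("Lowest 20%", "Highest 20%")),
  Obesity.. = c(30, 34, 28, 32)
)

# Mean obesity prevalence within each spending quintile
means <- aggregate(Obesity.. ~ Quintile, data = toy, FUN = mean)
means$Obesity.. <- round(means$Obesity.., 2)

print("Mean Obesity Prevalence by Spending Quintile:")
print(means)
```

The formula interface drops rows with missing values by default, so states lacking either an obesity rate or a quintile would be silently excluded from the means.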
1. Create a Single Visualization in R (ggplot2):
2. Accessibility Requirements:
library(ggplot2)

# Boxplot of obesity prevalence within each spending quintile
ggplot(combined_df, aes(x = Quintile, y = Obesity.., fill = Quintile)) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Per Capita Healthcare Spending Quintiles",
       y = "Obesity Prevalence (%)",
       title = "Obesity Rates by Per Capita Spending Quintiles") +
  theme_minimal() +
  # Angle the x-axis labels so they do not overlap; the legend duplicates the axis
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none", plot.title = element_text(hjust = 0.5))
Obesity rates across healthcare spending quintiles show a complex relationship. The second-lowest spending quintile has the highest median obesity rate, at approximately 36%, while the second-highest and highest quintiles have lower rates, which is consistent with higher healthcare spending having some protective effect. The pattern is not monotonic, however. Hawaii is an extreme outlier in the middle quintile (26.1%), and when it is excluded, the middle quintile has the highest obesity prevalence. The substantial overlap in obesity rates across all quintiles indicates that higher healthcare spending does not necessarily translate into lower obesity rates. States, particularly those in the second-lowest spending quintile, may therefore need to examine their healthcare strategies beyond simply increasing expenditure to address obesity effectively.
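The sensitivity of the group ranking to a single outlier can be checked by recomputing the group means with that state removed. The sketch below uses made-up values and hypothetical state labels ("A" through "E"), not the assignment data, to show how one low value can change which group ranks highest.

```r
# Made-up data: state "E" is a low outlier in the Middle group
toy <- data.frame(
  State   = c("A", "B", "C", "D", "E"),
  Group   = c("Low", "Low", "Middle", "Middle", "Middle"),
  Obesity = c(35, 36, 36, 37, 26)
)

# Group means with every state included
mean_all <- tapply(toy$Obesity, toy$Group, mean)

# Group means after dropping the outlier
trimmed <- subset(toy, State != "E")
mean_trimmed <- tapply(trimmed$Obesity, trimmed$Group, mean)

# Name of the group with the highest mean in each case
top_all <- names(mean_all)[which.max(mean_all)]
top_trimmed <- names(mean_trimmed)[which.max(mean_trimmed)]

print(mean_all)
print(mean_trimmed)
```

Applied to the real data, the same comparison would use `combined_df`, `Quintile`, and `Obesity..` in place of the toy names, with `State != "Hawaii"` as the filter.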