This report examines whether higher per capita healthcare spending is associated with lower adult obesity prevalence at the state level. The goal is to see whether healthcare investment correlates with better obesity-related public health outcomes.
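library(readxl)   # read_excel()
library(dplyr)    # rename(), select(), inner_join(), mutate(), ntile()
library(ggplot2)  # ggplot(), geom_boxplot()
library(viridis)  # scale_fill_viridis()
# Load state-level healthcare spending and adult obesity data
Hc_Spend <- read_excel("HcSpend.xlsx")
adult_obesity <- read_excel("adult obesity.xlsx")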
head(Hc_Spend)
## # A tibble: 6 × 2
## Location `2020__Health Spending per Capita`
## <chr> <dbl>
## 1 United States 10191
## 2 Alabama 9280
## 3 Alaska 13642
## 4 Arizona 8756
## 5 Arkansas 9338
## 6 California 10299
head(adult_obesity)
## # A tibble: 6 × 3
## Rank State `Obesity %`
## <dbl> <chr> <dbl>
## 1 1 West Virginia 0.412
## 2 2 Louisiana 0.399
## 3 2 Mississippi 0.401
## 4 3 Arkansas 0.4
## 5 5 Alabama 0.392
## 6 6 Oklahoma 0.387
str(Hc_Spend)
## tibble [52 × 2] (S3: tbl_df/tbl/data.frame)
## $ Location : chr [1:52] "United States" "Alabama" "Alaska" "Arizona" ...
## $ 2020__Health Spending per Capita: num [1:52] 10191 9280 13642 8756 9338 ...
Hc_Spend_clean <- Hc_Spend %>%
  rename(Per_Capita_Spend = `2020__Health Spending per Capita`) %>%
  select(Location, Per_Capita_Spend)
str(adult_obesity$`Obesity %`)
## num [1:51] 0.412 0.399 0.401 0.4 0.392 0.387 0.378 0.378 0.376 0.366 ...
adult_obesity <- adult_obesity %>%
  rename(
    Location = State,
    Obesity_Percent = `Obesity %`
  )
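# Join obesity and spending data on state name (rows without a match in both tables are dropped)
merged_data <- inner_join(adult_obesity, Hc_Spend_clean, by = "Location")
# Convert obesity from a proportion (0-1) to a percentage if it is not one already
if (max(merged_data$Obesity_Percent, na.rm = TRUE) <= 1) {
  merged_data <- merged_data %>%
    mutate(Obesity_Percent = Obesity_Percent * 100)
}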
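Because `inner_join()` silently drops rows that lack a match, it can help to list the unmatched `Location` values on each side. The sketch below is illustrative only: the two small data frames, `spend_toy` and `obesity_toy`, are hypothetical stand-ins built from the `head()` output shown earlier, so most of their rows fail to match simply because the two previews list different states. On the full data, the 52-row spending table and the 51-row merged result suggest that only one spending row goes unmatched.

```r
library(dplyr)

# Hypothetical stand-ins copied from the head() output above (not the full data)
spend_toy <- data.frame(
  Location = c("United States", "Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  Per_Capita_Spend = c(10191, 9280, 13642, 8756, 9338, 10299)
)
obesity_toy <- data.frame(
  Location = c("West Virginia", "Louisiana", "Mississippi", "Arkansas", "Alabama", "Oklahoma"),
  Obesity_Percent = c(0.412, 0.399, 0.401, 0.400, 0.392, 0.387)
)

# Rows in one table with no matching Location in the other
unmatched_spend <- anti_join(spend_toy, obesity_toy, by = "Location")
unmatched_obesity <- anti_join(obesity_toy, spend_toy, by = "Location")

print(unmatched_spend$Location)
print(unmatched_obesity$Location)
```

In the report itself, the same two `anti_join()` calls could be run on `Hc_Spend_clean` and `adult_obesity` before the merge.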
# Assign each state to a spending quintile
merged_data <- merged_data %>%
  mutate(
    Spending_Quintile = ntile(Per_Capita_Spend, 5),
    Spending_Quintile = factor(
      Spending_Quintile,
      levels = 1:5,
      labels = c("Lowest 20%", "Low", "Middle", "High", "Highest 20%")
    )
  )
# Inspect final merged data
head(merged_data)
## # A tibble: 6 × 5
## Rank Location Obesity_Percent Per_Capita_Spend Spending_Quintile
## <dbl> <chr> <dbl> <dbl> <fct>
## 1 1 West Virginia 41.2 12769 Highest 20%
## 2 2 Louisiana 39.9 10515 High
## 3 2 Mississippi 40.1 9394 Low
## 4 3 Arkansas 40 9338 Low
## 5 5 Alabama 39.2 9280 Low
## 6 6 Oklahoma 38.7 9444 Low
print(merged_data)
## # A tibble: 51 × 5
## Rank Location Obesity_Percent Per_Capita_Spend Spending_Quintile
## <dbl> <chr> <dbl> <dbl> <fct>
## 1 1 West Virginia 41.2 12769 Highest 20%
## 2 2 Louisiana 39.9 10515 High
## 3 2 Mississippi 40.1 9394 Low
## 4 3 Arkansas 40 9338 Low
## 5 5 Alabama 39.2 9280 Low
## 6 6 Oklahoma 38.7 9444 Low
## 7 7 Indiana 37.8 10517 High
## 8 7 Iowa 37.8 9789 Low
## 9 8 Tennessee 37.6 9336 Low
## 10 9 Nebraska 36.6 10514 Middle
## # ℹ 41 more rows
ggplot(merged_data, aes(x = Spending_Quintile, y = Obesity_Percent, fill = Spending_Quintile)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, option = "D") +
  labs(
    title = "Obesity Prevalence by Healthcare Spending Quintile",
    x = "Healthcare Spending Quintile",
    y = "Adult Obesity Rate (%)",
    fill = "Spending Quintile"
  ) +
  theme_minimal(base_size = 14)
The boxplot suggests a nonlinear relationship between healthcare spending and obesity rates. States in the top spending quintile have a lower median obesity rate and a broader range of outcomes, the middle quintiles have similar obesity levels, and the "Low" quintile has an unexpectedly higher median obesity rate than the "Lowest 20%" quintile. These patterns suggest that, while healthcare spending may play a role, it does not explain obesity prevalence on its own, and other social and economic factors likely have a substantial influence.
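The visual reading of the boxplot can be checked numerically with per-quintile medians and a rank correlation. The following is a minimal sketch using base R only. The helper `summarise_spending()` is a hypothetical name, and `toy` contains just the ten rows printed above, which are the highest-obesity states and therefore not representative. It is there only to make the example runnable; the real check would pass `merged_data` instead.

```r
# Hypothetical helper: per-quintile median obesity plus the Spearman correlation
# between spending and obesity. Expects the column names used in merged_data.
summarise_spending <- function(df) {
  medians <- aggregate(Obesity_Percent ~ Spending_Quintile, data = df, FUN = median)
  rho <- cor(df$Per_Capita_Spend, df$Obesity_Percent, method = "spearman")
  list(medians = medians, spearman_rho = rho)
}

# Illustrative subset: the ten rows shown in print(merged_data) above
toy <- data.frame(
  Location = c("West Virginia", "Louisiana", "Mississippi", "Arkansas", "Alabama",
               "Oklahoma", "Indiana", "Iowa", "Tennessee", "Nebraska"),
  Obesity_Percent = c(41.2, 39.9, 40.1, 40.0, 39.2, 38.7, 37.8, 37.8, 37.6, 36.6),
  Per_Capita_Spend = c(12769, 10515, 9394, 9338, 9280, 9444, 10517, 9789, 9336, 10514),
  Spending_Quintile = c("Highest 20%", "High", "Low", "Low", "Low",
                        "Low", "High", "Low", "Low", "Middle")
)

result <- summarise_spending(toy)
print(result$medians)
print(result$spearman_rho)
```

Spearman's coefficient is used here as an assumption-light choice because the boxplot hints that the relationship is not linear; a value near zero on the full data would be consistent with the interpretation above.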