Let’s be honest about food and weight

This post looks at eating-habit data from 460 real people to see which habits are actually linked to being overweight.

Everyone says “eat less, exercise more” but which specific foods and habits really matter? And which ones are just noise?

Turns out some things we worry about don’t matter much, but a few things matter A LOT.

The math behind this

This study uses a chi-square test, a statistical test that compares the counts you actually observed with the counts you’d expect if a habit had nothing to do with weight. Think of it like this:

\[\chi^2 = \sum \frac{(\text{what actually happened} - \text{what we'd expect by chance})^2}{\text{what we'd expect by chance}}\]

If the number is big enough, the pattern is unlikely to be just random luck.

p-value less than 0.05 = “Yeah, this is probably real”
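
To make the formula concrete, here is a small sketch in base R. The counts are made up for illustration (they are not from the study): two groups of 100 people, with 30 and 10 overweight respectively. It computes the statistic by hand, then checks it against `chisq.test()`, with `correct = FALSE` so R skips the continuity correction it applies to 2×2 tables by default.

```r
# Hypothetical 2x2 table (made-up counts, NOT from the study)
observed <- matrix(
  c(30, 70,
    10, 90),
  nrow = 2, byrow = TRUE,
  dimnames = list(group  = c("A", "B"),
                  status = c("overweight", "not_overweight"))
)

# What we'd expect by chance: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# The formula above, term by term
chi_sq_by_hand <- sum((observed - expected)^2 / expected)

# Same test using R's built-in function (no continuity correction)
result <- chisq.test(observed, correct = FALSE)

cat("By hand:     ", chi_sq_by_hand, "\n")
cat("chisq.test():", unname(result$statistic), "\n")
cat("p-value < 0.05?", result$p.value < 0.05, "\n")
```

With these made-up counts both approaches give 12.5, well past the 3.84 cutoff that corresponds to p < 0.05 for a 2×2 table.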

Loading the data

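# Load packages: tidyverse for read_csv() and the pipe, knitr for kable()
library(tidyverse)
library(knitr)

# Read the study data
data <- read_csv("obesity_habits_data.csv")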

# Take a quick look
head(data, 6)
## # A tibble: 6 × 8
##   study_variable  category       overweight_count overweight_percent
##   <chr>           <chr>                     <dbl>              <dbl>
## 1 hungry          never                        65               14.1
## 2 hungry          yes                          25                5.4
## 3 main_diet       rice                         86               18.7
## 4 main_diet       others                        4                0.9
## 5 meal_preference vegetarian                    6                1.3
## 6 meal_preference non_vegetarian               84               18.3
## # ℹ 4 more variables: non_overweight_count <dbl>, non_overweight_percent <dbl>,
## #   chi_square <dbl>, p_value <dbl>
cat("Total habits tested:", length(unique(data$study_variable)))
## Total habits tested: 18
cat("\nTotal people in study:", sum(data$overweight_count[1:2] + data$non_overweight_count[1:2]))
## 
## Total people in study: 460
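
The total above relies on the first two rows being the two categories of a single habit. A slightly more defensive check, sketched here in base R, sums the counts per habit and confirms that every habit adds up to the same number of people. The small data frame is a stand-in with the same column names as the file: the `overweight_count` values come from the output above, but the `non_overweight_count` values are placeholders chosen to sum to 460, not real data.

```r
# Stand-in for the real file, using the same column names.
# overweight_count values come from the head() output above;
# non_overweight_count values are PLACEHOLDERS, not real data.
demo <- data.frame(
  study_variable       = c("hungry", "hungry", "main_diet", "main_diet"),
  category             = c("never", "yes", "rice", "others"),
  overweight_count     = c(65, 25, 86, 4),
  non_overweight_count = c(250, 120, 350, 20)
)

# People per habit = overweight + non-overweight, summed over its categories
people_per_habit <- tapply(
  demo$overweight_count + demo$non_overweight_count,
  demo$study_variable,
  sum
)
print(people_per_habit)

# Every habit should describe the same group of people
stopifnot(length(unique(people_per_habit)) == 1)
```

On the real file, swapping `demo` for `data` would run the same check across all 18 habits.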

What actually matters

# Find what's actually significant
significant_stuff <- data %>%
  filter(p_value < 0.05) %>%
  select(study_variable, chi_square, p_value) %>%
  distinct() %>%
  arrange(p_value) %>%
  head(5)

# Make it readable
significant_stuff %>%
  mutate(
    "What people do" = case_when(
      study_variable == "fruit_consumption" ~ "How much fruit they eat",
      study_variable == "junk_food_frequency" ~ "How often they eat junk", 
      study_variable == "fast_food_restaurant" ~ "Fast food frequency",
      study_variable == "advertisement_exposure" ~ "Seeing food ads",
      study_variable == "reason_skipping" ~ "Why they skip meals",
      TRUE ~ study_variable
    ),
    "How much it matters" = round(chi_square, 1),
    # Rough readability label only: 1 - p is not a formal confidence level
    "Confidence level" = ifelse(p_value < 0.001, "99.9%+", paste0(round((1-p_value)*100), "%"))
  ) %>%
  select(-study_variable, -chi_square, -p_value) %>%
  kable()
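| What people do          | How much it matters | Confidence level |
|:------------------------|--------------------:|:-----------------|
| How much fruit they eat |                29.4 | 100%             |
| How often they eat junk |                12.9 | 100%             |
| Seeing food ads         |                 9.9 | 100%             |
| Why they skip meals     |                 5.9 | 98%              |
| Fast food frequency     |                 4.2 | 96%              |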

Only 5 of the 18 habits tested showed a statistically significant link to weight. The rest? Could easily be random noise.

Fruit: The big winner

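Here’s the most striking finding. Among people who eat fruit less than once a day, 27.5% are overweight. Among people who eat fruit at least once a day, it’s 6.8%.

27% vs 7%. That’s not a small difference.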
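
To put a number on “not a small difference”, here is a quick sketch of the arithmetic in R, using the two percentages printed by the code later in this post (27.5% and 6.8%). The variable names are just illustrative.

```r
# Overweight rates by fruit habit, as reported in this post
pct_less_than_daily <- 27.5  # eats fruit less than once a day
pct_daily_or_more   <- 6.8   # eats fruit once a day or more

# Gap in percentage points, and how many times higher the rate is
gap_points <- pct_less_than_daily - pct_daily_or_more
rate_ratio <- pct_less_than_daily / pct_daily_or_more

cat("Gap:", round(gap_points, 1), "percentage points\n")
cat("Ratio:", round(rate_ratio, 1), "times higher\n")
```

That works out to a gap of about 21 percentage points, or roughly four times the overweight rate.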

Junk food: Actually matters too

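This one surprised me because there are so many normal-weight people who eat junk daily. But look at the percentages: about 25% of daily junk eaters are overweight, compared with about 11% of people who eat it less often.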

Still, lots of people eat junk daily and stay normal weight. It’s not a guarantee, just higher odds.
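
Since “higher odds” has a specific meaning, here is a small sketch of how odds and an odds ratio could be computed in R. It uses the rounded figures from the summary further down (25% vs 11%), and it assumes the second number describes everyone who does not eat junk daily; that reading is an assumption, not something the data output confirms.

```r
# Rounded overweight rates (assumed reading of the post's 25% vs 11%)
p_daily     <- 0.25  # daily junk food eaters
p_not_daily <- 0.11  # everyone else (assumption)

# Odds = chance it happens / chance it doesn't
odds <- function(p) p / (1 - p)

odds_ratio <- odds(p_daily) / odds(p_not_daily)
cat("Odds ratio:", round(odds_ratio, 1), "\n")

# Most daily junk eaters are still not overweight
cat("Daily junk eaters NOT overweight:", (1 - p_daily) * 100, "%\n")
```

Under that reading the odds are roughly 2.7 times higher, yet three out of four daily junk eaters are still not overweight.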

The 3D view: Everything together

People in the far corner (high on all three) are much more likely to be overweight.

What doesn’t matter (surprisingly)

Vegetables didn’t show up as significant. Neither did dairy, soda, or even high-fat foods.

The actual code

# Here's literally all you need to test this stuff:

# Load the tidyverse (for read_csv, filter, and %>%) and your data
library(tidyverse)
data <- read_csv("obesity_habits_data.csv")

# Pick one habit to test
fruit_habit <- data %>% filter(study_variable == "fruit_consumption")

# Look at the percentages, not just raw counts
for (i in seq_len(nrow(fruit_habit))) {
  total <- fruit_habit$overweight_count[i] + fruit_habit$non_overweight_count[i]
  pct_overweight <- round((fruit_habit$overweight_count[i] / total) * 100, 1)
  cat(fruit_habit$category[i], ":", pct_overweight, "% overweight\n")
}
## less_1_per_day : 27.5 % overweight
## 1_or_more_per_day : 6.8 % overweight
# The p-value tells you if it's real
cat("p-value =", fruit_habit$p_value[1])
## p-value = 0.001
if(fruit_habit$p_value[1] < 0.05) cat(" <- This matters!")
##  <- This matters!
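
The file already stores the chi-square value, but it can also be recomputed from counts. The sketch below does that in base R for the fruit habit. The four counts are an assumption: they were back-calculated from the reported percentages (27.5% and 6.8%), the 460 total, and the 90 overweight people in the `hungry` rows, so treat them as illustrative rather than as the file’s actual contents.

```r
# Fruit counts (ASSUMED: back-calculated from the reported percentages,
# not read from obesity_habits_data.csv)
fruit_counts <- matrix(
  c(78, 206,   # less_1_per_day:    overweight, not overweight
    12, 164),  # 1_or_more_per_day: overweight, not overweight
  nrow = 2, byrow = TRUE,
  dimnames = list(
    category = c("less_1_per_day", "1_or_more_per_day"),
    status   = c("overweight", "non_overweight")
  )
)

# Percent overweight in each category
pct <- round(fruit_counts[, "overweight"] / rowSums(fruit_counts) * 100, 1)
print(pct)

# Chi-square test without the continuity correction
test <- chisq.test(fruit_counts, correct = FALSE)
cat("chi-square =", round(unname(test$statistic), 1), "\n")
cat("p-value < 0.05?", test$p.value < 0.05, "\n")
```

With these assumed counts the statistic comes out at 29.4, the same value reported for fruit in the table earlier in the post.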

What I actually learned

The stuff that matters:
- Eating fruit daily: 27% → 7% overweight rate
- Eating junk daily: 25% vs 11% overweight rate
- Fast food 3+ times/day: 14% vs 22% overweight rate (wait, that’s backwards?)

The stuff that doesn’t:
- Vegetables, dairy, soda, high-fat foods, meal skipping

Honest takeaways:
- Fruit is, oddly, an important indicator for obesity
- Daily junk food does go with higher odds
- A lot of “common sense” nutrition advice didn’t show up in the data
- Sample size was 460 people, so take it with a grain of salt

Real talk

This is just one study. But it’s interesting that fruit came out so strong while vegetables didn’t matter.

Maybe it’s because:
- People who eat fruit daily have other healthy habits?
- Fruit replaces other snacks?
- The fiber and vitamins do something special?

Or maybe it’s just this particular group of 460 people.

Bottom line: If you’re going to change one thing, try trading that processed snack for a piece of fruit. Seems to be the biggest bang for your buck based on this data.