Obesity is a complex health issue influenced by demographic and psychographic factors. Demographic characteristics such as age, gender, and socioeconomic status play a significant role in obesity prevalence, while psychographic factors such as dietary habits, physical activity levels, and lifestyle choices further shape an individual’s risk. Understanding how these factors interact is crucial for developing effective public health interventions. This study aims to answer the question: How do demographic and psychographic factors influence obesity rates, and is transportation method a key lifestyle habit associated with higher obesity levels?
To explore this, we analyze the dataset Estimation of Obesity Levels Based on Eating Habits and Physical Condition from the UCI Machine Learning Repository. The dataset includes variables such as age, gender, eating frequency, physical activity levels, alcohol consumption, and transportation habits.
The dataset includes the following key variables:
Gender
Age
family_history_with_overweight
FAVC (Do you eat high-caloric food frequently?)
NCP (How many main meals do you have daily?)
CAEC (Do you eat any food between meals?)
CH2O (How much water do you drink daily?)
FAF (How often do you have physical activity?)
CALC (How often do you drink alcohol?)
MTRANS (Which transportation do you usually use?)
NObeyesdad (Obesity level)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
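# ggplot2, tidyr, and dplyr are already attached by library(tidyverse)
library(RColorBrewer)  # ColorBrewer palettes used by scale_fill_brewer()
# Machine-specific working directory; adjust to wherever the CSV is stored
setwd("~/Desktop/DATA/Data 101/Midterm/v2")
# Load the raw UCI obesity dataset
obesity_data <- read.csv('ObesityDataSet_raw_and_data_sinthetic.csv')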
head(obesity_data)
## Gender Age Height Weight family_history_with_overweight FAVC FCVC NCP
## 1 Female 21 1.62 64.0 yes no 2 3
## 2 Female 21 1.52 56.0 yes no 3 3
## 3 Male 23 1.80 77.0 yes no 2 3
## 4 Male 27 1.80 87.0 no no 3 3
## 5 Male 22 1.78 89.8 no no 2 1
## 6 Male 29 1.62 53.0 no yes 2 3
## CAEC SMOKE CH2O SCC FAF TUE CALC MTRANS
## 1 Sometimes no 2 no 0 1 no Public_Transportation
## 2 Sometimes yes 3 yes 3 0 Sometimes Public_Transportation
## 3 Sometimes no 2 no 2 1 Frequently Public_Transportation
## 4 Sometimes no 2 no 2 0 Frequently Walking
## 5 Sometimes no 2 no 0 0 Sometimes Public_Transportation
## 6 Sometimes no 2 no 0 0 Sometimes Automobile
## NObeyesdad
## 1 Normal_Weight
## 2 Normal_Weight
## 3 Normal_Weight
## 4 Overweight_Level_I
## 5 Overweight_Level_II
## 6 Normal_Weight
“This dataset include data for the estimation of obesity levels in individuals from the countries of Mexico, Peru and Colombia, based on their eating habits and physical condition” (UCI Machine Learning Repository).
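# Keep only the variables analyzed below and drop rows with missing values
data_clean <- obesity_data %>%
  select(Gender, Age, family_history_with_overweight, FAVC, NCP, CAEC, CH2O, FAF, CALC, MTRANS, NObeyesdad) %>%
  drop_na()
# Confirm that no missing values remain
sum(is.na(data_clean))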
## [1] 0
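A total of zero from `sum(is.na(...))` only rules out values that R already treats as `NA`. With its default settings, `read.csv()` leaves blank text fields as empty strings, which `drop_na()` does not remove. The sketch below shows one way to count both kinds of gaps per column. It uses a small made-up data frame (`toy`, hypothetical and not taken from the dataset) so that it runs on its own; in this report the same calls would be applied to `obesity_data`.

```r
library(dplyr)

# Toy rows for illustration only (hypothetical, not taken from the dataset).
toy <- data.frame(
  Age = c(21, NA, 23),
  MTRANS = c("Walking", "", "Automobile"),
  stringsAsFactors = FALSE
)

# Missing values per column, as R sees them.
na_per_column <- colSums(is.na(toy))

# Empty strings per column, which is.na() does not flag.
blank_per_column <- sapply(toy, function(col) sum(!is.na(col) & as.character(col) == ""))

# Convert empty strings in character columns to NA so drop_na() can remove them.
toy_fixed <- toy %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

print(na_per_column)
print(blank_per_column)
```

If both counts are zero for every column of the real data, the cleaning step above is sufficient as written.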
ggplot(data_clean, aes(x = NObeyesdad, fill = NObeyesdad)) +
  geom_bar() +
  labs(title = "Distribution of Obesity Levels",
       x = "Obesity Level",
       y = "Count",
       fill = "Obesity Level") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set2")
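Because `NObeyesdad` is read in as text, ggplot2 arranges the bars alphabetically, which places the obesity types ahead of normal weight. A minimal sketch of one way to impose a severity order follows. The seven category names are the ones used in the dataset; the `toy` data frame and the object names are hypothetical and exist only to make the example self-contained.

```r
library(dplyr)
library(ggplot2)

# Severity order of the NObeyesdad categories, lowest to highest.
obesity_order <- c(
  "Insufficient_Weight", "Normal_Weight",
  "Overweight_Level_I", "Overweight_Level_II",
  "Obesity_Type_I", "Obesity_Type_II", "Obesity_Type_III"
)

# Toy rows for illustration only (hypothetical, not taken from the dataset).
toy <- data.frame(
  NObeyesdad = c("Obesity_Type_I", "Normal_Weight", "Overweight_Level_II",
                 "Insufficient_Weight", "Normal_Weight"),
  stringsAsFactors = FALSE
)

# Convert the column to a factor whose levels follow the severity order.
toy_ordered <- toy %>%
  mutate(NObeyesdad = factor(NObeyesdad, levels = obesity_order))

# geom_bar() now draws the bars in severity order instead of alphabetically;
# drop = FALSE keeps categories that have no rows on the axis.
p_ordered <- ggplot(toy_ordered, aes(x = NObeyesdad, fill = NObeyesdad)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

Applying the same `mutate()` step to `data_clean` before plotting would order every chart in this report from lowest to highest weight category.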
##Demographic Factors
ggplot(data_clean, aes(x = Age, fill = NObeyesdad)) +
  geom_histogram(binwidth = 5, alpha = 0.7, position = "identity") +
  facet_wrap(~Gender) +
  labs(title = "Obesity Levels by Age and Gender",
       x = "Age",
       y = "Count",
       fill = "Obesity Level") +
  scale_fill_brewer(palette = "Set3")
Demographic variables such as age and gender significantly influence obesity rates. Research indicates that obesity prevalence tends to increase with age, particularly in middle-aged adults, due to metabolic slowdowns and sedentary lifestyles (Ogden et al., 2020). Gender differences also play a role, as men and women experience obesity differently due to biological and hormonal variations. Studies suggest that men are more likely to accumulate visceral fat, whereas women may experience higher obesity rates due to hormonal changes during pregnancy and menopause (Flegal et al., 2016).
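The age and gender patterns described above can also be summarized numerically rather than read off a histogram. The sketch below groups respondents into age bands and computes the share in any of the three obesity categories for each gender. The age cut points (25 and 40), the `toy` data frame, and the column names `age_band` and `share_obese` are assumptions made for illustration and do not come from the dataset documentation.

```r
library(dplyr)

# Toy rows for illustration only (hypothetical, not taken from the dataset).
toy <- data.frame(
  Gender = c("Female", "Female", "Male", "Male", "Female", "Male"),
  Age = c(19, 34, 22, 41, 52, 28),
  NObeyesdad = c("Normal_Weight", "Obesity_Type_I", "Overweight_Level_I",
                 "Obesity_Type_II", "Obesity_Type_III", "Normal_Weight"),
  stringsAsFactors = FALSE
)

age_summary <- toy %>%
  mutate(
    # Right-closed bands: (0, 25], (25, 40], (40, Inf). Cut points are an assumption.
    age_band = cut(Age, breaks = c(0, 25, 40, Inf),
                   labels = c("25 and under", "over 25 to 40", "over 40")),
    # TRUE for Obesity_Type_I, II, and III; assumes NObeyesdad is a character column.
    is_obese = startsWith(NObeyesdad, "Obesity_Type")
  ) %>%
  group_by(Gender, age_band) %>%
  summarise(n = n(), share_obese = mean(is_obese), .groups = "drop")

print(age_summary)
```

Reporting `n` alongside each share matters here, because some age bands contain very few respondents and their shares are correspondingly unstable.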
Beyond demographic influences, psychographic factors such as eating behaviors, physical activity, alcohol consumption, and transportation method are crucial determinants of obesity. Frequent consumption of high-calorie foods and excessive snacking contribute significantly to weight gain. Individuals who consume processed foods high in sugar and fats tend to have a higher body mass index (BMI) (Malik et al., 2013). Additionally, a sedentary lifestyle, characterized by minimal physical activity and prolonged screen time, is strongly correlated with obesity risk (Booth et al., 2017). Moreover, alcohol consumption has been linked to obesity, as excessive drinking increases calorie intake and disrupts metabolic processes (Traversy & Chaput, 2015).
max(data_clean$NCP, na.rm = TRUE)
## [1] 4
ggplot(data_clean, aes(x = NObeyesdad, y = NCP, fill = NObeyesdad)) +
  geom_boxplot() +
  labs(title = "Influence of Eating Frequency on Obesity",
       x = "Obesity Level",
       y = "Meals per Day",
       fill = "Obesity Level") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set3")
ggplot(data_clean, aes(x = NObeyesdad, y = FAF, fill = NObeyesdad)) +
  geom_violin() +
  labs(title = "Distribution of Physical Activity Frequency by Obesity Level",
       x = "Obesity Level",
       y = "Physical Activity Frequency (times per week)",
       fill = "Obesity Level") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set3")
ggplot(data_clean, aes(x = as.factor(CALC), fill = NObeyesdad)) +
  geom_bar(position = "dodge") +
  labs(title = "Alcohol Consumption and Obesity Levels",
       x = "Alcohol Consumption Frequency",
       y = "Count",
       fill = "Obesity Level") +
  scale_fill_brewer(palette = "Set3")
ggplot(data_clean, aes(x = MTRANS, fill = NObeyesdad)) +
  geom_bar(position = "dodge") +
  labs(title = "Transportation Method and Obesity Levels",
       x = "Transportation Method",
       y = "Count",
       fill = "Obesity Level") +
  scale_fill_brewer(palette = "Set3") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
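Raw counts make transportation modes hard to compare when one mode has far more respondents than the others. A rough sketch of two complementary checks follows: the share of each obesity level within each transportation mode, and a chi-squared test of independence between the two variables. The `toy` data frame is made up for illustration (hypothetical, not taken from the dataset), and the simulated p-value is used only because the toy table is tiny; with the full data the default test may be adequate.

```r
library(dplyr)

# Toy rows for illustration only (hypothetical, not taken from the dataset).
toy <- data.frame(
  MTRANS = c(rep("Automobile", 4), rep("Walking", 2), rep("Public_Transportation", 4)),
  NObeyesdad = c("Obesity_Type_I", "Obesity_Type_I", "Normal_Weight", "Overweight_Level_I",
                 "Normal_Weight", "Normal_Weight",
                 "Obesity_Type_I", "Normal_Weight", "Normal_Weight", "Overweight_Level_I"),
  stringsAsFactors = FALSE
)

# Share of each obesity level within each transportation mode.
mode_shares <- toy %>%
  count(MTRANS, NObeyesdad, name = "n") %>%
  group_by(MTRANS) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

# Chi-squared test of independence between transportation mode and obesity level.
# simulate.p.value = TRUE uses Monte Carlo resampling, so the p-value varies slightly per run.
mode_table <- table(toy$MTRANS, toy$NObeyesdad)
mode_test <- chisq.test(mode_table, simulate.p.value = TRUE, B = 2000)

print(mode_shares)
print(mode_test$p.value)
```

For a visual version of the same comparison, `geom_bar(position = "fill")` in place of `position = "dodge"` rescales each transportation mode to proportions. A small p-value would indicate an association between the two variables, not that transportation choice causes obesity.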
In conclusion, both demographic and psychographic factors are associated with obesity levels in this dataset, and transportation method appears to be a key lifestyle habit among them. Demographically, age and gender show distinct patterns, with certain age groups more susceptible than others. Psychographically, eating frequency and caloric intake are linked to higher obesity levels, while physical activity is important for obesity prevention. The relationship between alcohol consumption and obesity varies across consumption categories and needs further investigation. Reliance on passive transportation (e.g., driving) is associated with more sedentary behavior, which contributes to higher obesity rates, whereas active transportation such as walking and cycling is linked to lower obesity rates through increased physical activity. Public health strategies should therefore target at-risk demographic groups and encourage healthier lifestyle habits, including active transportation, to reduce obesity levels. Future research should further explore the role of socioeconomic factors, mental health, and the long-term effects of lifestyle changes on obesity.
Estimation of Obesity Levels Based On Eating Habits and Physical Condition. (2019). UCI Machine Learning Repository. https://doi.org/10.24432/C5H31Z
Flegal, K. M., Kruszon-Moran, D., Carroll, M. D., Fryar, C. D., & Ogden, C. L. (2016). Trends in obesity among adults in the United States, 2005 to 2014. JAMA, 315(21), 2284–2291. https://doi.org/10.1001/jama.2016.6458
Malik, V. S., Pan, A., Willett, W. C., & Hu, F. B. (2013). Sugar-sweetened beverages and weight gain in children and adults: A systematic review and meta-analysis. The American Journal of Clinical Nutrition, 98(4), 1084–1102. https://doi.org/10.3945/ajcn.113.058362
Traversy, G., & Chaput, J.-P. (2015). Alcohol consumption and obesity: An update. Current Obesity Reports, 4(1), 122–130. https://doi.org/10.1007/s13679-014-0129-4
Ogden, C. L., Fryar, C. D., Martin, C. B., et al. (2020). Trends in obesity prevalence by race and Hispanic origin—1999-2000 to 2017-2018. JAMA, 324(12), 1208–1210. https://doi.org/10.1001/jama.2020.14590