Obesity is a major public health concern linked to conditions such as heart disease and diabetes. Understanding how its prevalence has changed over time can help guide targeted health interventions. In particular, identifying differences in obesity trends between males and females can support more effective and personalized health strategies. Biological, social, and psychological factors differ between the sexes, so a one-size-fits-all approach to obesity treatment and prevention is often less effective. In this project, we explore the prevalence of obesity from the late 1980s to the late 2010s using a dataset from the National Center for Health Statistics (NCHS). The dataset reports the percentage of adults (aged 20 and over) who are normal weight, overweight, or obese for selected multi-year survey periods, broken down by demographic characteristics such as sex and race. This project focuses specifically on the “obese” category and compares men and women over time to answer our main research question: How have obesity rates among U.S. adults changed over the past few decades, and do these trends differ between men and women?
The dataset for this project is labeled “Normal_weight__overweight__and_obesity_among_adults_aged_20_and_over__by_selected_characteristics__United_States.csv”. Here are the key variables that will be used:
YEAR: The time period of the survey data. It uses multi-year ranges (e.g., “1988-1994” or “2015-2018”), each representing combined years of a national survey. We will use YEAR to examine trends chronologically.
PANEL: The weight status category. The dataset has three main categories: Normal weight, Overweight, and Obesity (defined by body mass index, BMI, cutoffs). For example, Obesity corresponds to BMI ≥ 30.0. We will filter the data to this category to specifically study obesity trends.
STUB_NAME: A descriptor for the demographic grouping. For instance, “Sex” or “Race and Hispanic origin” appear here; it indicates what kind of subgroup the data row pertains to. We will primarily use rows where STUB_NAME is “Sex”, as our question compares men and women.
STUB_LABEL: The specific subgroup within the STUB_NAME category. For STUB_NAME “Sex,” the STUB_LABEL is either Male or Female. For STUB_NAME “Total,” STUB_LABEL might be “20 years and over” (i.e., the entire adult population). We will use STUB_LABEL to distinguish between male and female data.
ESTIMATE: The numeric value of the estimate for that row, typically the percentage of the population in the given weight category for the specified group and time. For example, an ESTIMATE of 30 means 30% of that group were obese in that time period. We will analyze these percentages.
UNIT: The unit or description of the estimate. In this dataset the values are percentages of the population, and most are age-adjusted (“Percent of population, age-adjusted”). All our analyses use the age-adjusted percentages. Note: age adjustment accounts for changes in the population's age distribution over time (the U.S. adult population was younger on average in the 1990s than it is today), so that trends can be compared fairly.
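To make the idea of age adjustment concrete, here is a minimal sketch of direct standardization in base R. Every number in it is hypothetical and chosen only for illustration: the three age groups, the rates, and the population shares are assumptions, not NCHS values, and the real standard population uses a different breakdown.

```r
# Direct age standardization: a toy example with HYPOTHETICAL numbers.
# Age-specific obesity rates (percent) are held identical in both periods,
# so any difference in the crude rate comes only from the age mix.
age_groups  <- c("20-39", "40-59", "60+")
rate_by_age <- setNames(c(20, 30, 35), age_groups)  # hypothetical rates (%)

# Hypothetical share of the adult population in each age group
pop_share_early <- c(0.50, 0.30, 0.20)  # younger population
pop_share_late  <- c(0.35, 0.35, 0.30)  # older population

# A fixed reference ("standard") age distribution, also hypothetical
standard_share <- c(0.40, 0.35, 0.25)

# Crude rate: weight by each period's own age distribution
crude_early <- sum(rate_by_age * pop_share_early)
crude_late  <- sum(rate_by_age * pop_share_late)

# Age-adjusted rate: weight both periods by the same standard distribution
adjusted_early <- sum(rate_by_age * standard_share)
adjusted_late  <- sum(rate_by_age * standard_share)

c(crude_early = crude_early, crude_late = crude_late,
  adjusted_early = adjusted_early, adjusted_late = adjusted_late)
```

In this toy setup the crude rate rises only because the later population is older, while the adjusted rate is the same in both periods. That is the kind of distortion age adjustment is meant to remove from a trend comparison.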
# Loading Necessary packages
library(dplyr) # for data manipulation
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2) # for visualizing data
# Read the CSV file (our data)
data <- read.csv("Normal_weight__overweight__and_obesity_among_adults_aged_20_and_over__by_selected_characteristics__United_States.csv")
# Examining data structure and the first few rows
str(data) # structure of the dataset (including columns and types)
## 'data.frame': 3360 obs. of 16 variables:
## $ INDICATOR : chr "Normal weight, overweight, and obesity among adults aged 20 and over" "Normal weight, overweight, and obesity among adults aged 20 and over" "Normal weight, overweight, and obesity among adults aged 20 and over" "Normal weight, overweight, and obesity among adults aged 20 and over" ...
## $ PANEL : chr "Normal weight (BMI from 18.5 to 24.9)" "Normal weight (BMI from 18.5 to 24.9)" "Normal weight (BMI from 18.5 to 24.9)" "Normal weight (BMI from 18.5 to 24.9)" ...
## $ PANEL_NUM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ UNIT : chr "Percent of population, age-adjusted" "Percent of population, age-adjusted" "Percent of population, age-adjusted" "Percent of population, age-adjusted" ...
## $ UNIT_NUM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ STUB_NAME : chr "Total" "Total" "Total" "Total" ...
## $ STUB_NAME_NUM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ STUB_LABEL : chr "20 years and over" "20 years and over" "20 years and over" "20 years and over" ...
## $ STUB_LABEL_NUM: num 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 ...
## $ YEAR : chr "1988-1994" "1999-2002" "2001-2004" "2003-2006" ...
## $ YEAR_NUM : int 1 2 3 4 5 6 7 8 9 10 ...
## $ AGE : chr "20 years and over" "20 years and over" "20 years and over" "20 years and over" ...
## $ AGE_NUM : num 1 1 1 1 1 1 1 1 1 1 ...
## $ ESTIMATE : num 41.6 33 32.3 31.6 30.8 29.8 29.6 28.9 27.7 26 ...
## $ SE : num 0.8 0.8 0.7 0.8 0.7 0.7 0.9 0.8 0.9 1 ...
## $ FLAG : chr "" "" "" "" ...
head(data) # displaying the first 6 rows of data
## INDICATOR
## 1 Normal weight, overweight, and obesity among adults aged 20 and over
## 2 Normal weight, overweight, and obesity among adults aged 20 and over
## 3 Normal weight, overweight, and obesity among adults aged 20 and over
## 4 Normal weight, overweight, and obesity among adults aged 20 and over
## 5 Normal weight, overweight, and obesity among adults aged 20 and over
## 6 Normal weight, overweight, and obesity among adults aged 20 and over
## PANEL PANEL_NUM
## 1 Normal weight (BMI from 18.5 to 24.9) 1
## 2 Normal weight (BMI from 18.5 to 24.9) 1
## 3 Normal weight (BMI from 18.5 to 24.9) 1
## 4 Normal weight (BMI from 18.5 to 24.9) 1
## 5 Normal weight (BMI from 18.5 to 24.9) 1
## 6 Normal weight (BMI from 18.5 to 24.9) 1
## UNIT UNIT_NUM STUB_NAME STUB_NAME_NUM
## 1 Percent of population, age-adjusted 1 Total 1
## 2 Percent of population, age-adjusted 1 Total 1
## 3 Percent of population, age-adjusted 1 Total 1
## 4 Percent of population, age-adjusted 1 Total 1
## 5 Percent of population, age-adjusted 1 Total 1
## 6 Percent of population, age-adjusted 1 Total 1
## STUB_LABEL STUB_LABEL_NUM YEAR YEAR_NUM AGE AGE_NUM
## 1 20 years and over 1.1 1988-1994 1 20 years and over 1
## 2 20 years and over 1.1 1999-2002 2 20 years and over 1
## 3 20 years and over 1.1 2001-2004 3 20 years and over 1
## 4 20 years and over 1.1 2003-2006 4 20 years and over 1
## 5 20 years and over 1.1 2005-2008 5 20 years and over 1
## 6 20 years and over 1.1 2007-2010 6 20 years and over 1
## ESTIMATE SE FLAG
## 1 41.6 0.8
## 2 33.0 0.8
## 3 32.3 0.7
## 4 31.6 0.8
## 5 30.8 0.7
## 6 29.8 0.7
For our analysis, we only need the Obesity category and the grouping by Sex (Male/Female). We will filter the data as follows, keeping only the relevant columns.
# Filtering the dataset for only Obesity category and Sex (M/F) grouping
obesity_by_sex <- data |>
filter( # filtering values based on necessary data for question
PANEL == "Obesity (BMI greater than or equal to 30.0)",
STUB_NAME == "Sex",
UNIT == "Percent of population, age-adjusted"
) |>
select(YEAR, STUB_LABEL, ESTIMATE) |> # selecting only necessary columns
rename(Sex = STUB_LABEL) # renaming STUB_LABEL for clarity
Using our new dataset, obesity_by_sex, we will confirm the distinct survey periods and preview the first rows:
# Checking for distinct years and previewing the first rows of the data
unique(obesity_by_sex$YEAR)
## [1] "1988-1994" "1999-2002" "2001-2004" "2003-2006" "2005-2008" "2007-2010"
## [7] "2009-2012" "2011-2014" "2013-2016" "2015-2018"
head(obesity_by_sex)
## YEAR Sex ESTIMATE
## 1 1988-1994 Male 20.2
## 2 1999-2002 Male 27.5
## 3 2001-2004 Male 29.5
## 4 2003-2006 Male 32.4
## 5 2005-2008 Male 32.7
## 6 2007-2010 Male 33.9
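Because YEAR is stored as text, it can help to derive numeric values from it before looking at trends chronologically. The following is a sketch in base R rather than part of the original pipeline; the names `start_year`, `end_year`, `mid_year`, and `year_factor` are hypothetical helpers introduced here for illustration.

```r
# Survey periods exactly as they appear in the YEAR column
year_ranges <- c("1988-1994", "1999-2002", "2001-2004", "2003-2006",
                 "2005-2008", "2007-2010", "2009-2012", "2011-2014",
                 "2013-2016", "2015-2018")

# Split each "start-end" label into numeric start and end years
parts      <- strsplit(year_ranges, "-", fixed = TRUE)
start_year <- as.integer(sapply(parts, `[`, 1))
end_year   <- as.integer(sapply(parts, `[`, 2))

# Midpoint of each period, usable as a numeric x-axis value
mid_year <- (start_year + end_year) / 2

# Ordered factor so periods sort chronologically by their start year
year_factor <- factor(year_ranges,
                      levels = year_ranges[order(start_year)],
                      ordered = TRUE)

data.frame(YEAR = year_ranges, start_year, end_year, mid_year)
```

Note that adjacent periods overlap (for example, 1999-2002 and 2001-2004 share the years 2001-2002), so neighboring estimates are not based on fully separate survey years.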
We will now generate a summary table to compare obesity rates for men and women in the earliest and latest survey periods, along with two intermediate periods. The table shows the percentage for each sex from 1988–1994 to 2015–2018, as well as the change in percentage points relative to the earliest period.
# Summarize average obesity percentage across selected years, including midpoints
obesity_summary <- obesity_by_sex |>
filter(YEAR %in% c("1988-1994", "2001-2004", "2007-2010", "2015-2018")) |>
group_by(Sex, YEAR) |>
summarise(avg_percent = mean(ESTIMATE), .groups = "drop") # drop grouping explicitly
# Add a column to show the change from the earliest year for each sex
obesity_summary <- obesity_summary |>
group_by(Sex) |>
mutate(change = avg_percent - first(avg_percent)) |>
ungroup()
# View the updated summary table
obesity_summary
## # A tibble: 8 × 4
## Sex YEAR avg_percent change
## <chr> <chr> <dbl> <dbl>
## 1 Female 1988-1994 25.5 0
## 2 Female 2001-2004 33.2 7.7
## 3 Female 2007-2010 35.5 10
## 4 Female 2015-2018 41.5 16
## 5 Male 1988-1994 20.2 0
## 6 Male 2001-2004 29.5 9.3
## 7 Male 2007-2010 33.9 13.7
## 8 Male 2015-2018 40.5 20.3
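The table gives each sex's change from its own baseline, but not the gap between the sexes in each period. As a sketch, that gap can be computed in base R from the values printed above; the names `obesity_summary_tbl` and `gap_pp` are hypothetical and introduced only for this illustration.

```r
# Age-adjusted obesity prevalence (%) copied from the summary table above
obesity_summary_tbl <- data.frame(
  Sex  = rep(c("Female", "Male"), each = 4),
  YEAR = rep(c("1988-1994", "2001-2004", "2007-2010", "2015-2018"), times = 2),
  avg_percent = c(25.5, 33.2, 35.5, 41.5, 20.2, 29.5, 33.9, 40.5)
)

female <- subset(obesity_summary_tbl, Sex == "Female")
male   <- subset(obesity_summary_tbl, Sex == "Male")

# Align the two sexes on YEAR, then take the female-minus-male
# difference in percentage points
gap <- merge(female, male, by = "YEAR", suffixes = c("_female", "_male"))
gap$gap_pp <- gap$avg_percent_female - gap$avg_percent_male

gap[, c("YEAR", "avg_percent_female", "avg_percent_male", "gap_pp")]
```

With these values, the female-minus-male gap narrows from 5.3 percentage points in 1988-1994 to 1.0 in 2015-2018.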
We will now create a bar chart to visualize the summary table above.
# Create a bar chart comparing obesity rates by sex and year
ggplot(obesity_summary, aes(x = Sex, y = avg_percent, fill = YEAR)) +
geom_col(position = "dodge") +
labs(title = "Obesity Rates by Sex, Selected Survey Periods (1988–1994 to 2015–2018)",
x = "Sex",
y = "Obesity Prevalence (% of Adults)",
fill = "Survey Period") +
theme_minimal()
The results above show a clear upward trend in obesity among U.S. adults between 1988–1994 and 2015–2018. In 1988–1994, about 20 percent of men and 25 percent of women were obese. By 2015–2018, both figures had reached roughly 40 percent (40.5 percent of men and 41.5 percent of women). Women had higher rates in the earlier periods, but men have since nearly caught up, leaving the two sexes with almost equal prevalence. These findings answer our research question (How have obesity rates among U.S. adults changed over the past few decades, and do these trends differ between men and women?): obesity prevalence has risen steadily for both sexes, and men experienced the larger increase, 20.3 percentage points compared with 16.0 for women.
To further support this conclusion, the summary table includes two intermediate periods (2001–2004 and 2007–2010), which show that the rise in obesity has been gradual and consistent rather than isolated to two points in time. This highlights how public health concerns around obesity have grown steadily over multiple decades.
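One rough way to put a number on "gradual and consistent" is to fit a straight line per sex through the four periods in the summary table. The sketch below does this in base R. It is a descriptive check under simplifying assumptions (each period represented by its midpoint year, only four points per sex, no use of the standard errors), not a formal trend test, and `trend` and `slopes` are hypothetical names.

```r
# Values from the summary table; mid_year is the midpoint of each survey period
trend <- data.frame(
  Sex = rep(c("Female", "Male"), each = 4),
  mid_year = rep(c(1991, 2002.5, 2008.5, 2016.5), times = 2),
  avg_percent = c(25.5, 33.2, 35.5, 41.5, 20.2, 29.5, 33.9, 40.5)
)

# Fit a separate straight line for each sex; the slope is the average
# change in percentage points per year
slopes <- sapply(split(trend, trend$Sex), function(d) {
  coef(lm(avg_percent ~ mid_year, data = d))[["mid_year"]]
})

slopes
```

A positive slope for both sexes is consistent with a steady rise, and a larger slope for men is consistent with their larger overall increase.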
Future analyses could examine how these trends differ by race, ethnicity, income level, or location. Combining sex-specific trends with other demographic factors could identify populations that are especially vulnerable and in greater need of targeted health interventions. Such analyses could support more effective and inclusive health strategies moving forward.
National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention. Normal weight, overweight, and obesity among adults aged 20 and over, by selected characteristics: United States. Retrieved from https://data.cdc.gov/National-Center-for-Health-Statistics/Normal-weight-overweight-and-obesity-among-adults-/3nzu-udr9 (accessed October 11, 2025).