Project1 Homework

Author

Mamokotjo Letjama

Introduction

Topic: SHIP Adolescents Who Have Obesity 2010, 2013-2014, 2016, 2018, 2021

This dataset was published by the Maryland Department of Health, and it tracks the percentage of adolescents diagnosed with obesity across Maryland jurisdictions from 2010 to 2021. It provides data for selected years in that period, which allows for the analysis of long-term trends in adolescent obesity and highlights disparities based on race and ethnicity. My objective is to explore the dataset to uncover how obesity rates have shifted over time and where significant disparities persist. Such insights are essential for shaping public health interventions and policies that aim to reduce adolescent obesity in Maryland. The primary measure of interest is the percentage of “Adolescents Who Have Obesity”, which is recorded in the Value column in percentage points. The dataset has nine Race/Ethnicity groups. The geographic area is recorded in the jurisdiction column.

Source: Maryland Department of Health https://opendata.maryland.gov/Health-and-Human-Services/SHIP-Adolescents-Who-Have-Obesity-2010-2013-2014-2/hedp-3fxm/about_data

Load tidyverse and RColorBrewer

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)

Load Adolescents Obesity dataset

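The data comes from the Maryland State Health Improvement Process (SHIP) and was downloaded from the Maryland open data website listed under Source above.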

setwd("C:/Users/tmats/OneDrive/DATA110/Working Directories")
obesity <- read_csv("obesity.csv")
Rows: 658 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): jurisdiction, Race ethnicity, Measure
dbl (2): value, Year

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Data cleaning and inspection for the obesity dataset

Convert all column names to lowercase for consistency and to avoid case-sensitivity issues in later analysis. Replace spaces in column names with underscores, making them easier to reference in code. Then preview the structure and values of the dataset.

names(obesity) <- tolower(names(obesity))
names(obesity) <- gsub(" ","_",names(obesity))
head(obesity)
# A tibble: 6 × 5
  jurisdiction     value race_ethnicity                       year measure      
  <chr>            <dbl> <chr>                               <dbl> <chr>        
1 State             12.6 All races/ ethnicities (aggregated)  2016 Adolescents …
2 Allegany          16.1 All races/ ethnicities (aggregated)  2016 Adolescents …
3 Anne Arundel      13   All races/ ethnicities (aggregated)  2016 Adolescents …
4 Baltimore City    19   All races/ ethnicities (aggregated)  2016 Adolescents …
5 Baltimore County  14.7 All races/ ethnicities (aggregated)  2016 Adolescents …
6 Calvert           11.3 All races/ ethnicities (aggregated)  2016 Adolescents …
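The two renaming steps above can also be written as a single pipeline step. The following is a minimal sketch, not the method used in this report: it assumes a toy tibble with the same mixed-case column names that read_csv() reported, and the names toy and toy_clean are made up for illustration.

```r
library(tidyverse)

# Toy stand-in with the same column names as the raw CSV (values are illustrative)
toy <- tibble(
  jurisdiction     = "State",
  `Race ethnicity` = "All races/ ethnicities (aggregated)",
  Measure          = "Adolescents Who Have Obesity",
  value            = 12.6,
  Year             = 2016
)

# Lowercase every name and replace spaces with underscores in one step
toy_clean <- toy |>
  rename_with(~ gsub(" ", "_", tolower(.x)))

names(toy_clean)
```

rename_with() applies the function to every column name by default, so the result should match what the two names() assignments produce.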

Remove the NAs in the value column

This ensures the obesity dataset only includes rows with a recorded value.

Obesity_new <- obesity |>
  filter(!is.na(value)) 
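In this report, read_csv() loaded 658 rows and 551 remain after the filter, so 107 rows had no recorded value. A minimal sketch of how that count could be checked before dropping rows, using a made-up toy tibble (toy, n_missing, and toy_complete are hypothetical names, and the values are illustrative only):

```r
library(tidyverse)

# Toy stand-in for the obesity data; two of the four values are missing
toy <- tibble(
  jurisdiction = c("State", "Allegany", "Calvert", "Cecil"),
  value        = c(12.6, NA, 11.3, NA),
  year         = c(2016, 2016, 2016, 2016)
)

# Count the missing values before removing them
n_missing <- sum(is.na(toy$value))

# Keep only rows with a recorded value (same filter as above)
toy_complete <- toy |>
  filter(!is.na(value))

n_missing
nrow(toy_complete)
```

Comparing n_missing with the difference in row counts is a quick way to confirm that only the intended rows were removed.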

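Rename columns

Rename value to percentage for clarity, making it easier to interpret the data as representing obesity percentages. Note that the result is only printed here and not assigned back to Obesity_new, so the code that follows still refers to the column as value.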

Obesity_new |>
  rename(percentage=value)
# A tibble: 551 × 5
   jurisdiction     percentage race_ethnicity                       year measure
   <chr>                 <dbl> <chr>                               <dbl> <chr>  
 1 State                  12.6 All races/ ethnicities (aggregated)  2016 Adoles…
 2 Allegany               16.1 All races/ ethnicities (aggregated)  2016 Adoles…
 3 Anne Arundel           13   All races/ ethnicities (aggregated)  2016 Adoles…
 4 Baltimore City         19   All races/ ethnicities (aggregated)  2016 Adoles…
 5 Baltimore County       14.7 All races/ ethnicities (aggregated)  2016 Adoles…
 6 Calvert                11.3 All races/ ethnicities (aggregated)  2016 Adoles…
 7 Caroline               16   All races/ ethnicities (aggregated)  2016 Adoles…
 8 Carroll                 9.4 All races/ ethnicities (aggregated)  2016 Adoles…
 9 Cecil                  16.3 All races/ ethnicities (aggregated)  2016 Adoles…
10 Charles                13   All races/ ethnicities (aggregated)  2016 Adoles…
# ℹ 541 more rows
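Because dplyr verbs return a new tibble instead of modifying their input, a rename only persists if the result is assigned. A minimal sketch of the difference, using a toy tibble (toy and toy_renamed are hypothetical names):

```r
library(tidyverse)

toy <- tibble(
  jurisdiction = c("State", "Allegany"),
  value        = c(12.6, 16.1)
)

# Without assignment, the renamed tibble is printed and then discarded
toy |>
  rename(percentage = value)
names(toy)

# Assigning the result keeps the new column name
toy_renamed <- toy |>
  rename(percentage = value)
names(toy_renamed)
```

If the renamed version of Obesity_new were kept this way, the later plots would need y = percentage instead of y = value.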

Column chart with a sequential RColorBrewer palette

Use the sequential RColorBrewer palette “OrRd” to create a colored column chart that displays adolescent obesity percentages by year and race/ethnicity.

ggplot(Obesity_new, aes(x=year, y=value, fill=race_ethnicity)) +
  geom_col(alpha=0.9,color="white")+
  scale_fill_brewer(palette ="OrRd") + 
  labs(title="Adolescent Obesity by Race/Ethnicity",
       x = "Year", y = "Obesity Percentage", fill = "Race/Ethnicity",
       caption="Source: Maryland State Health Improvement Process") +
  theme_minimal()
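One thing to keep in mind when reading the chart above: geom_col() stacks every row, so the height of each bar is the sum of the values across all jurisdictions and groups for that year, not a percentage. A minimal sketch of one possible alternative is to average the values per year and group first and then draw the bars side by side. This assumes an unweighted mean across jurisdictions is acceptable for a quick overview (it ignores population size). The toy data and the names yearly_means and mean_value are made up for illustration.

```r
library(tidyverse)

# Toy stand-in: two jurisdictions, two groups, two years (illustrative values)
toy <- tribble(
  ~jurisdiction, ~value, ~race_ethnicity, ~year,
  "Allegany",    16,     "Group A",       2016,
  "Calvert",     12,     "Group A",       2016,
  "Allegany",    10,     "Group B",       2016,
  "Calvert",      8,     "Group B",       2016,
  "Allegany",    18,     "Group A",       2018,
  "Calvert",     14,     "Group A",       2018,
  "Allegany",    11,     "Group B",       2018,
  "Calvert",      9,     "Group B",       2018
)

# One mean per year and group, so bar heights stay on the percentage scale
yearly_means <- toy |>
  group_by(year, race_ethnicity) |>
  summarise(mean_value = mean(value), .groups = "drop")

# Dodged bars place the groups next to each other instead of stacking them
ggplot(yearly_means, aes(x = factor(year), y = mean_value, fill = race_ethnicity)) +
  geom_col(position = "dodge", color = "white") +
  scale_fill_brewer(palette = "OrRd") +
  labs(x = "Year", y = "Mean Obesity Percentage", fill = "Race/Ethnicity") +
  theme_minimal()
```

Treating year as a factor keeps the bars evenly spaced even though the survey years are not consecutive.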

Create a faceted visualization to display adolescent obesity percentages over time, with a separate panel for each racial and ethnic group.

ggplot(Obesity_new, aes(x=year, y=value, fill=race_ethnicity)) +
  geom_col(alpha=0.9,color="white")+
  scale_fill_brewer(palette ="OrRd") + 
  labs(title="Adolescent Obesity by Race/Ethnicity",
       x = "Year", y = "Obesity Percentage", fill = "Race/Ethnicity",
       caption="Source: Maryland State Health Improvement Process") +
  facet_wrap(~race_ethnicity) +
  theme_minimal()
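A line chart restricted to a single jurisdiction may be easier to read in facets than stacked columns. The sketch below is one possible approach, not the chart used in this report: it assumes that the statewide rows (jurisdiction equal to "State", as seen in the preview of the data) are available for each group, and it uses made-up toy values. The names toy and state_trend are hypothetical.

```r
library(tidyverse)

# Toy stand-in with statewide rows for two groups plus one county row
toy <- tribble(
  ~jurisdiction, ~value, ~race_ethnicity, ~year,
  "State",       11.0,   "Group A",       2013,
  "State",       12.0,   "Group A",       2016,
  "State",       13.5,   "Group A",       2021,
  "State",        8.0,   "Group B",       2013,
  "State",        8.5,   "Group B",       2016,
  "State",        9.0,   "Group B",       2021,
  "Allegany",    16.0,   "Group A",       2016
)

# Keep only the statewide rows so each panel shows one value per year
state_trend <- toy |>
  filter(jurisdiction == "State")

# One panel per group, with a line and points tracing the trend over time
ggplot(state_trend, aes(x = year, y = value)) +
  geom_line() +
  geom_point() +
  facet_wrap(~race_ethnicity) +
  labs(x = "Year", y = "Obesity Percentage") +
  theme_minimal()
```

With one value per year in each panel, the y-axis stays on the percentage scale and the direction of the trend is visible at a glance.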

Summary statistics

by_race_ethnicity <- Obesity_new |>
  group_by(race_ethnicity) |>
  summarise(value=mean(value))
head(Obesity_new)
# A tibble: 6 × 5
  jurisdiction     value race_ethnicity                       year measure      
  <chr>            <dbl> <chr>                               <dbl> <chr>        
1 State             12.6 All races/ ethnicities (aggregated)  2016 Adolescents …
2 Allegany          16.1 All races/ ethnicities (aggregated)  2016 Adolescents …
3 Anne Arundel      13   All races/ ethnicities (aggregated)  2016 Adolescents …
4 Baltimore City    19   All races/ ethnicities (aggregated)  2016 Adolescents …
5 Baltimore County  14.7 All races/ ethnicities (aggregated)  2016 Adolescents …
6 Calvert           11.3 All races/ ethnicities (aggregated)  2016 Adolescents …
by_race_ethnicity |>
  ggplot(aes(race_ethnicity, value, fill = race_ethnicity)) +
  geom_bar(stat = "identity") +
  labs(title="Adolescent Obesity by Race/Ethnicity",
       x = "Race/Ethnicity", y = "Obesity Percentage", fill = "Race/Ethnicity",
       caption="Source: Maryland State Health Improvement Process") +
  theme_minimal()
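Sorting the bars by their mean can make the ranking of the groups easier to see. The following is a minimal sketch under the assumption that a summary table like by_race_ethnicity is available. The toy values and the names by_group and by_group_ordered are made up for illustration.

```r
library(tidyverse)

# Toy stand-in for a per-group summary table (illustrative means)
by_group <- tibble(
  race_ethnicity = c("Group A", "Group B", "Group C"),
  value          = c(14, 9, 11)
)

# Reorder the factor levels by the mean value, lowest first
by_group_ordered <- by_group |>
  mutate(race_ethnicity = fct_reorder(race_ethnicity, value))

# Horizontal bars leave room for long group labels
ggplot(by_group_ordered, aes(x = race_ethnicity, y = value, fill = race_ethnicity)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  labs(x = "Race/Ethnicity", y = "Mean Obesity Percentage") +
  theme_minimal()
```

The legend is hidden here because the axis labels already name each group.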

Brief Essay

The graphs suggest a notable upward trend in adolescent obesity across most racial and ethnic groups in Maryland from 2010 to 2021. Black or African American Non-Hispanic/Latino and Hispanic/Latino adolescents consistently show the highest obesity rates, indicating they are disproportionately affected. The overall rates have climbed across nearly all groups, suggesting a broad public health concern. According to the State of Childhood Obesity organization, the obesity rate among high school students in Maryland was 12.8% between 2021 and 2022. However, some racial groups, such as Asian or Asian/Pacific Islander Non-Hispanic/Latino, appear to have relatively lower obesity rates, which could prompt further investigation into protective factors or reporting differences. In summary, adolescent obesity is rising, but it is not rising equally. This points to the importance of culturally tailored interventions and sustained public health efforts.

The main challenge was converting the point values to percentages. To isolate trends for each group, I tried using facet_wrap(), but the resulting view was not clear.