library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Summary:

The main question, goal, or purpose for your project

  1. Which variables have the strongest influence on an individual's obesity level?

  2. How do the different variables affect an individual's obesity level?

Visualizations for at least two interesting aspects of the data worth further investigation, and a short explanation for why

obdf <- read.csv("~/Downloads/ObesityDataSet_raw_and_data_sinthetic.csv", header=TRUE)
# Count respondents and take the median weight per gender and obesity level
gas <- obdf |>
  group_by(Gender, NObeyesdad) |>
  summarize(Median_Weight = median(Weight), count = n())
## `summarise()` has grouped output by 'Gender'. You can override using the
## `.groups` argument.
ggplot(gas, aes(x = NObeyesdad, y = count, fill = Gender)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Physical Activity Frequency (FAF) vs. Obesity Level (NObeyesdad)

ggplot(obdf, aes(x = NObeyesdad, y = FAF, fill = NObeyesdad)) +
  geom_boxplot() +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Your plan moving forward (i.e., what is on your to-do list?)

  1. Investigate correlations between the variables and obesity levels (and, if possible, among the other variables).

  2. Test some of the hypotheses below and try to draw meaningful conclusions from the dataset.
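
One possible starting point for the correlation step is sketched below. It treats NObeyesdad as an ordinal rank and computes a Spearman correlation against every numeric column. The helper name `obesity_rank_cor` and the ordering in `obesity_levels` are assumptions made for illustration, not part of the analysis so far, and Spearman is only one reasonable choice for an ordinal outcome.

```r
# Assumed ordering of the NObeyesdad categories, lowest to highest weight class.
obesity_levels <- c(
  "Insufficient_Weight", "Normal_Weight",
  "Overweight_Level_I", "Overweight_Level_II",
  "Obesity_Type_I", "Obesity_Type_II", "Obesity_Type_III"
)

# Hypothetical helper: Spearman correlation between each numeric column
# and the obesity level converted to an integer rank (1 = lowest class).
obesity_rank_cor <- function(df, level_col = "NObeyesdad") {
  level_rank <- as.integer(factor(df[[level_col]], levels = obesity_levels))
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  rho <- vapply(
    numeric_cols,
    function(col) {
      cor(df[[col]], level_rank, method = "spearman", use = "complete.obs")
    },
    numeric(1)
  )
  # Strongest positive associations first, strongest negative last
  sort(rho, decreasing = TRUE)
}
```

Calling `obesity_rank_cor(obdf)` would return a named vector with one coefficient per numeric variable. Weight and Height feed directly into the obesity classification, so their coefficients are expected to be large and are less informative than those of the lifestyle variables.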

Initial Findings

at least two hypotheses (no need to test them yet; words are fine)

  1. Higher vegetable consumption (FCVC) along with higher physical activity frequency (FAF) is associated with healthier obesity levels.

  2. Does age play a role in obesity levels? For example, are younger people more likely than older people to fall in the normal-weight category?

one visualization for each hypothesis

  1. Higher vegetable consumption (FCVC) along with higher physical activity frequency (FAF) is associated with healthier (lower) obesity levels.
ggplot(obdf, aes(x = FCVC, y = FAF, color = NObeyesdad)) +
  geom_point(alpha = 0.7)  +
  theme_minimal() +
  scale_color_brewer(palette = "Set1") + facet_wrap(~ NObeyesdad)
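
A numeric summary can complement the faceted scatter plot, since overlapping points make the panels hard to compare by eye. The sketch below averages FCVC and FAF within each obesity level using base R. The helper name `mean_habits_by_level` and the level ordering are assumptions made for illustration.

```r
# Assumed ordering of the NObeyesdad categories, lowest to highest weight class.
obesity_levels <- c(
  "Insufficient_Weight", "Normal_Weight",
  "Overweight_Level_I", "Overweight_Level_II",
  "Obesity_Type_I", "Obesity_Type_II", "Obesity_Type_III"
)

# Hypothetical helper: mean vegetable consumption (FCVC) and mean physical
# activity frequency (FAF) for each obesity level present in the data.
mean_habits_by_level <- function(df) {
  df$NObeyesdad <- factor(df$NObeyesdad, levels = obesity_levels)
  aggregate(cbind(FCVC, FAF) ~ NObeyesdad, data = df, FUN = mean)
}
```

If the hypothesis holds, `mean_habits_by_level(obdf)` should show both means falling as the obesity level rises. A flat or mixed pattern would suggest the relationship is weaker than expected.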

  2. Does age play a role in obesity levels? For example, are younger people more likely than older people to fall in the normal-weight category?
ggplot(obdf, aes(x = NObeyesdad, y = Age)) +
  geom_boxplot(fill = "lightblue")  +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
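
When the time comes to test the age hypothesis, one option is a Kruskal-Wallis rank-sum test, which asks whether the age distributions differ across obesity levels without assuming normality. This is a sketch of one possible test, not a result. The helper name `age_by_level_test` is hypothetical.

```r
# Hypothetical helper: Kruskal-Wallis rank-sum test of Age across obesity levels.
# Assumes df has a numeric Age column and a categorical NObeyesdad column.
age_by_level_test <- function(df) {
  df$NObeyesdad <- factor(df$NObeyesdad)
  kruskal.test(Age ~ NObeyesdad, data = df)
}
```

`age_by_level_test(obdf)$p.value` would give the p-value. A small value only says that at least one level differs in age distribution, so the boxplot above is still needed to see which levels skew younger or older.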