Loading the needed packages
library(tidyverse)
library(scales)
library(patchwork)
library(statsExpressions)
library(DT)
library(ggstatsplot)
Loading the dataset
# Read the survey data and keep only the first eight columns
infer <- read_csv("infer.csv") %>% select(1:8)
Athleticism: checking the proportions/percentages between levels
and testing for equality

In this case, I need to sample more “nonathletic” respondents to
make the levels of this variable more equal.
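A test for equal proportions across the levels of a single variable can be sketched in base R as a chi-squared goodness-of-fit test. This is only a minimal illustration: the counts below are made up and are not taken from infer.csv, the name `athletic_counts` is hypothetical, and the plots in this report may have been produced with a different function.

```r
# Hypothetical counts for illustration only; NOT taken from infer.csv.
athletic_counts <- c(athletic = 70, nonathletic = 30)

# Observed share of each level
athletic_props <- prop.table(athletic_counts)

# Chi-squared goodness-of-fit test; with no `p` argument,
# chisq.test() tests against equal proportions for all levels.
gof <- chisq.test(athletic_counts)

print(athletic_props)
print(gof)
```

A small p-value here would indicate that the levels are not equally represented, which is the situation that motivates sampling more respondents from the smaller level.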
Gender

Age

I need to sample more respondents from the underrepresented levels,
such as 40-44, 35-39, and 45-49.
Frequency of physical exercise

Duration of physical exercise

I need to sample more respondents from the “> 2 hours” level.
Sport type

My sample is highly unbalanced. I need to sample more respondents
from underrepresented sports.
ANALYSIS OF QUESTIONNAIRE ANSWERS IN THE CONTEXT OF EXERCISE FREQUENCY
AND DURATION
I have decided that the most important variables to dig into are
exercise frequency and duration, so my attention turns to them. I will
look for associations between these two variables and the respondents’
answers. I will also test whether the answers to each question occur in
equal proportions within each level of these two variables.
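The two checks described above can be sketched in base R: a chi-squared test of independence for the association, and a goodness-of-fit test within each level for equal proportions. The data frame `toy`, its column names `frequency` and `answer`, and all values are assumptions made up for illustration; they do not come from infer.csv.

```r
# Hypothetical toy data; column names and values are assumptions.
toy <- data.frame(
  frequency = rep(c("rarely", "weekly", "daily"), times = c(40, 40, 40)),
  answer = c(
    rep(c("agree", "disagree"), times = c(10, 30)),  # rarely
    rep(c("agree", "disagree"), times = c(20, 20)),  # weekly
    rep(c("agree", "disagree"), times = c(30, 10))   # daily
  )
)

# Contingency table: rows are frequency levels, columns are answers
tab <- table(toy$frequency, toy$answer)

# Association between the two variables (chi-squared test of independence)
indep <- chisq.test(tab)

# Cramer's V as an effect size for the association
cramers_v <- sqrt(unname(indep$statistic) / (sum(tab) * (min(dim(tab)) - 1)))

# Equal proportions of answers within each frequency level
# (one goodness-of-fit test per row of the table)
within_level_p <- apply(tab, 1, function(counts) chisq.test(counts)$p.value)

print(indep)
print(cramers_v)
print(within_level_p)
```

The same pattern would apply to exercise duration by swapping the grouping column.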



The answers show that exercise frequency is significantly associated
with respondents’ mood. The more frequently respondents exercise, the
more strongly they agree with the question statement.
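The direction of such a trend can be checked by comparing the share of agreement across the ordered frequency levels. The sketch below uses made-up counts and hypothetical level names ("rarely", "weekly", "daily"); it is not the analysis behind the plot in this report.

```r
# Hypothetical counts; rows are ordered exercise-frequency levels.
tab <- matrix(
  c(10, 30,
    20, 20,
    30, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    frequency = c("rarely", "weekly", "daily"),
    answer = c("agree", "disagree")
  )
)

# Share of each answer within every frequency level (rows sum to 1)
row_props <- prop.table(tab, margin = 1)

# Share of "agree" from the least to the most frequent level
agree_share <- row_props[, "agree"]

# TRUE when agreement rises with every step up in frequency
is_increasing <- all(diff(agree_share) > 0)

print(round(row_props, 2))
print(is_increasing)
```

A rising share of agreement only describes the pattern in the sample; it does not by itself show that exercise causes the change in mood.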
