Disclaimer: This work is not really about student mental health; its purpose is to practice piping in R, from the raw dataset through to a ggplot visualization.
A STATISTICAL RESEARCH ON THE EFFECTS OF MENTAL HEALTH ON STUDENTS’ CGPA dataset
This dataset was collected through a Google Forms survey of university students in order to examine their current academic situation and mental health.
For more details about the Student Mental Health data set see Link.
Loading package
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.3
## Warning: package 'ggplot2' was built under R version 4.2.3
## Warning: package 'tibble' was built under R version 4.2.3
## Warning: package 'tidyr' was built under R version 4.2.3
## Warning: package 'readr' was built under R version 4.2.3
## Warning: package 'purrr' was built under R version 4.2.3
## Warning: package 'dplyr' was built under R version 4.2.3
## Warning: package 'stringr' was built under R version 4.2.3
## Warning: package 'forcats' was built under R version 4.2.3
## Warning: package 'lubridate' was built under R version 4.2.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.1 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Loading Dataset
StudentMentahealth <- read.csv("StudentMentahealth.csv")
head(StudentMentahealth) # To check the first six rows of the dataset
## Timestamp Choose.your.gender Age What.is.your.course.
## 1 08/07/2020 12:02 Female 18 Engineering
## 2 08/07/2020 12:04 Male 21 Islamic education
## 3 08/07/2020 12:05 Male 19 BIT
## 4 08/07/2020 12:06 Female 22 Laws
## 5 08/07/2020 12:13 Male 23 Mathemathics
## 6 08/07/2020 12:31 Male 19 Engineering
## Your.current.year.of.Study What.is.your.CGPA. Marital.status
## 1 year 1 3.00 - 3.49 No
## 2 year 2 3.00 - 3.49 No
## 3 Year 1 3.00 - 3.49 No
## 4 year 3 3.00 - 3.49 Yes
## 5 year 4 3.00 - 3.49 No
## 6 Year 2 3.50 - 4.00 No
## Do.you.have.Depression. Do.you.have.Anxiety. Do.you.have.Panic.attack.
## 1 Yes No Yes
## 2 No Yes No
## 3 Yes Yes Yes
## 4 Yes No No
## 5 No No No
## 6 No No Yes
## Did.you.seek.any.specialist.for.a.treatment.
## 1 No
## 2 No
## 3 No
## 4 No
## 5 No
## 6 No
Explore and manipulate the dataset
glimpse(StudentMentahealth) # To check the variable types
## Rows: 101
## Columns: 11
## $ Timestamp <chr> "08/07/2020 12:02", "08/0…
## $ Choose.your.gender <chr> "Female", "Male", "Male",…
## $ Age <int> 18, 21, 19, 22, 23, 19, 2…
## $ What.is.your.course. <chr> "Engineering", "Islamic e…
## $ Your.current.year.of.Study <chr> "year 1", "year 2", "Year…
## $ What.is.your.CGPA. <chr> "3.00 - 3.49", "3.00 - 3.…
## $ Marital.status <chr> "No", "No", "No", "Yes", …
## $ Do.you.have.Depression. <chr> "Yes", "No", "Yes", "Yes"…
## $ Do.you.have.Anxiety. <chr> "No", "Yes", "Yes", "No",…
## $ Do.you.have.Panic.attack. <chr> "Yes", "No", "Yes", "No",…
## $ Did.you.seek.any.specialist.for.a.treatment. <chr> "No", "No", "No", "No", "…
names(StudentMentahealth) # To check the column names
## [1] "Timestamp"
## [2] "Choose.your.gender"
## [3] "Age"
## [4] "What.is.your.course."
## [5] "Your.current.year.of.Study"
## [6] "What.is.your.CGPA."
## [7] "Marital.status"
## [8] "Do.you.have.Depression."
## [9] "Do.you.have.Anxiety."
## [10] "Do.you.have.Panic.attack."
## [11] "Did.you.seek.any.specialist.for.a.treatment."
str(StudentMentahealth) # To check the structure of the dataset
## 'data.frame': 101 obs. of 11 variables:
## $ Timestamp : chr "08/07/2020 12:02" "08/07/2020 12:04" "08/07/2020 12:05" "08/07/2020 12:06" ...
## $ Choose.your.gender : chr "Female" "Male" "Male" "Female" ...
## $ Age : int 18 21 19 22 23 19 23 18 19 18 ...
## $ What.is.your.course. : chr "Engineering" "Islamic education" "BIT" "Laws" ...
## $ Your.current.year.of.Study : chr "year 1" "year 2" "Year 1" "year 3" ...
## $ What.is.your.CGPA. : chr "3.00 - 3.49" "3.00 - 3.49" "3.00 - 3.49" "3.00 - 3.49" ...
## $ Marital.status : chr "No" "No" "No" "Yes" ...
## $ Do.you.have.Depression. : chr "Yes" "No" "Yes" "Yes" ...
## $ Do.you.have.Anxiety. : chr "No" "Yes" "Yes" "No" ...
## $ Do.you.have.Panic.attack. : chr "Yes" "No" "Yes" "No" ...
## $ Did.you.seek.any.specialist.for.a.treatment.: chr "No" "No" "No" "No" ...
unique(StudentMentahealth$What.is.your.CGPA.) # To check the unique values of the CGPA variable
## [1] "3.00 - 3.49" "3.50 - 4.00" "3.50 - 4.00 " "2.50 - 2.99" "2.00 - 2.49"
## [6] "0 - 1.99"
class(StudentMentahealth$What.is.your.CGPA.) # To check the class of the CGPA variable
## [1] "character"
Converting the character variable to a factor variable
StudentMentahealth$What.is.your.CGPA. <- as.factor(StudentMentahealth$What.is.your.CGPA.)
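The unique() output above lists "3.50 - 4.00" twice, once with a trailing space, so as.factor() on the raw column would treat them as two separate levels. A minimal sketch of one way to handle this, assuming the stray whitespace is a data-entry artifact: trim the strings first, then build an ordered factor. The vector `cgpa` and the names `cgpa_clean`, `cgpa_levels`, and `cgpa_factor` are illustrative stand-ins, not part of the original analysis.

```r
# Toy vector mimicking the values printed by unique() above (not the real data)
cgpa <- c("3.00 - 3.49", "3.50 - 4.00", "3.50 - 4.00 ", "2.50 - 2.99",
          "2.00 - 2.49", "0 - 1.99")

# Strip leading/trailing whitespace so "3.50 - 4.00 " and "3.50 - 4.00" match
cgpa_clean <- trimws(cgpa)

# Ordered factor: levels listed from the lowest to the highest CGPA band
cgpa_levels <- c("0 - 1.99", "2.00 - 2.49", "2.50 - 2.99",
                 "3.00 - 3.49", "3.50 - 4.00")
cgpa_factor <- factor(cgpa_clean, levels = cgpa_levels, ordered = TRUE)

nlevels(cgpa_factor)  # number of levels after trimming
table(cgpa_factor)    # counts per CGPA band
```

Listing the levels explicitly also means that plots and tables follow the natural low-to-high order of the bands rather than alphabetical order.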
Reviewing the selected data to work with, and changing marital status from “Yes” to “Married” and “No” to “Single”
StudentMentahealth %>%
  filter(What.is.your.course. %in% c("Engineering","BIT","BCS","Laws","Kirkhs",
                                     "Pendidikan Islam","Biomedical science","koe") &
           Your.current.year.of.Study > "Year 2") %>%
  select(What.is.your.course.,Your.current.year.of.Study,
         Choose.your.gender,Age,What.is.your.CGPA., Marital.status) %>%
  mutate(Marital.status = recode(Marital.status,
                                 "Yes" = "Married",
                                 "No" = "Single")) %>%
  arrange(-Age)
## What.is.your.course. Your.current.year.of.Study Choose.your.gender Age
## 1 Engineering Year 3 Female 24
## 2 BCS Year 3 Male 24
## 3 BIT Year 3 Female 24
## 4 BCS year 4 Female 24
## 5 BIT Year 3 Female 24
## 6 BCS year 3 Female 24
## 7 BIT Year 3 Male 24
## 8 BCS Year 3 Female 23
## 9 Kirkhs Year 3 Male 23
## 10 Pendidikan Islam year 4 Female 23
## 11 Laws year 3 Female 22
## 12 Engineering year 4 Female 22
## 13 koe year 3 Female 20
## 14 Biomedical science year 3 Female 19
## 15 BIT Year 3 Female 19
## 16 Laws Year 3 Female 18
## 17 Engineering year 4 Female 18
## What.is.your.CGPA. Marital.status
## 1 3.50 - 4.00 Married
## 2 3.50 - 4.00 Single
## 3 3.50 - 4.00 Married
## 4 3.50 - 4.00 Single
## 5 3.00 - 3.49 Single
## 6 3.50 - 4.00 Single
## 7 3.50 - 4.00 Single
## 8 3.50 - 4.00 Single
## 9 3.50 - 4.00 Single
## 10 3.50 - 4.00 Single
## 11 3.00 - 3.49 Married
## 12 3.50 - 4.00 Single
## 13 3.00 - 3.49 Married
## 14 3.00 - 3.49 Single
## 15 3.00 - 3.49 Married
## 16 3.50 - 4.00 Single
## 17 3.50 - 4.00 Single
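The filter above compares year labels as strings (`> "Year 2"`), and the column mixes "year 3" and "Year 3", so the result depends on how the current locale orders upper- and lower-case letters. A hedged alternative, assuming every label contains exactly one number, is to extract that number and compare numerically. The data frame `toy` and the columns `course`, `year`, and `year_num` are hypothetical names used only for illustration.

```r
library(dplyr)
library(readr)

# Toy rows (hypothetical, not the survey data) with the mixed capitalisation seen above
toy <- data.frame(
  course = c("Engineering", "BIT", "Laws", "BCS", "koe"),
  year   = c("year 1", "Year 2", "year 3", "Year 3", "year 4"),
  stringsAsFactors = FALSE
)

senior <- toy %>%
  mutate(year_num = parse_number(year)) %>%  # "year 3" and "Year 3" both become 3
  filter(year_num >= 3)                      # numeric comparison, independent of locale

senior
```

In the real pipeline the same idea would amount to replacing the string condition with `parse_number(Your.current.year.of.Study) >= 3`.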
Plotting a bar chart to see which of the selected courses have the most students
StudentMentahealth %>%
  filter(What.is.your.course. %in% c("Engineering","BIT","BCS","Laws","Kirkhs",
                                     "Pendidikan Islam","Biomedical science","koe") &
           Your.current.year.of.Study > "Year 2") %>%
  select(What.is.your.course.,Your.current.year.of.Study,
         Choose.your.gender,Age,What.is.your.CGPA., Marital.status) %>%
  mutate(Marital.status = recode(Marital.status,
                                 "Yes" = "Married",
                                 "No" = "Single")) %>%
  arrange(-Age) %>%
  ggplot(aes(fct_infreq(What.is.your.course.)))+
  geom_bar(fill = "#97B3C6")+
  coord_flip()+ # To swap the x and y axes so the bars are horizontal
  theme_bw()+
  labs(title = "Student Vs Course")
Plotting a point graph to check student CGPA by course, for students in year 3 and above in the selected courses
StudentMentahealth %>%
  filter(What.is.your.course. %in% c("Engineering","BIT","BCS","Laws","Kirkhs",
                                     "Pendidikan Islam","Biomedical science","koe") &
           Your.current.year.of.Study > "Year 2") %>%
  select(What.is.your.course.,Your.current.year.of.Study,
         Choose.your.gender,Age,What.is.your.CGPA., Marital.status) %>%
  mutate(Marital.status = recode(Marital.status,
                                 "Yes" = "Married",
                                 "No" = "Single")) %>%
  arrange(-Age) %>%
  ggplot(aes(What.is.your.course.,What.is.your.CGPA.))+
  geom_point(color = "brown", size = 5)+
  geom_line()+
  theme_bw()+
  labs(title = "Course Vs CGPA")
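The same filter, select, and mutate steps are repeated before each plot. One possible way to shorten this, sketched here on made-up data, is to run the preparation pipe once, store the result, and pipe that object into each ggplot call. The data frame `selected` and its columns `course`, `cgpa`, and `marital` are hypothetical stand-ins for the filtered survey rows, and the line layer is left out of the second plot for brevity.

```r
library(dplyr)
library(ggplot2)
library(forcats)

# Hypothetical stand-in for the filtered survey rows (not the real data)
selected <- data.frame(
  course  = c("BIT", "BIT", "BCS", "Laws", "Engineering", "BCS"),
  cgpa    = c("3.50 - 4.00", "3.00 - 3.49", "3.50 - 4.00",
              "3.00 - 3.49", "3.50 - 4.00", "3.50 - 4.00"),
  marital = c("Yes", "No", "No", "Yes", "No", "No"),
  stringsAsFactors = FALSE
) %>%
  mutate(marital = recode(marital, "Yes" = "Married", "No" = "Single"))

# Bar chart of students per course, reusing the prepared object
course_plot <- selected %>%
  ggplot(aes(fct_infreq(course))) +
  geom_bar(fill = "#97B3C6") +
  coord_flip() +
  theme_bw() +
  labs(title = "Student Vs Course")

# Point graph of CGPA band by course, reusing the same object
cgpa_plot <- selected %>%
  ggplot(aes(course, cgpa)) +
  geom_point(color = "brown", size = 5) +
  theme_bw() +
  labs(title = "Course Vs CGPA")

course_plot
cgpa_plot
```

Keeping the preparation in one place means a change to the filter only has to be made once, while each plot remains a short pipe of its own.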