EXPLORATORY DATA ANALYSIS ON BRFSS
Research Problems:
Research question 1: What do summary statistics show for the variables “sleptim1” (average hours of sleep in a 24-hour period) and “addepev2” (ever told you have a depressive disorder)?
1.1 What are their statistics using the function summary in R?
1.2 What are their statistics using the function summary after removing NA’s, and after restricting sleptim1 to at most 10 hours of sleep?
Research question 2: Among respondents who sleep less than 6 hours on average, what share have ever been told they have a depressive disorder (addepev2)?
Research question 3: How does the share of respondents ever told they have a depressive disorder (addepev2) compare between those who sleep less than 6 hours and those who sleep 6 to 10 hours on average?
Answers:
Research question 1:
load("brfss2013.Rdata")
library("magrittr")
library("dplyr")
library("ggplot2")
Using the function summary in R on the observations of the sleptim1 variable
summary(brfss2013$sleptim1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##   0.000   6.000   7.000   7.052   8.000 450.000    7387
Using the function summary in R without the NA’s
# Keep only respondents who reported their hours of sleep
withoutNA <- brfss2013 %>%
  filter(!is.na(sleptim1))
summary(withoutNA$sleptim1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   0.000   6.000   7.000   7.052   8.000 450.000
Using the function summary in R on sleptim1 restricted to at most 10 hours of sleep on average
# Keep only respondents who reported 10 hours of sleep or fewer
atmost10hrs <- withoutNA %>%
  filter(sleptim1 <= 10)
summary(atmost10hrs$sleptim1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   0.000   6.000   7.000   6.976   8.000  10.000
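Based on the results above, the quartiles and the median are unchanged; only the maximum (450 versus 10) and the mean (7.052 versus 6.976) differ. This implies that only a few respondents reported more than 10 hours of sleep on average. The original maximum of 450 hours cannot be a valid answer for a 24-hour period, so it is most likely a data-entry error.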
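The claim that few respondents report more than 10 hours can be checked by counting them directly. The following is a minimal base R sketch on a small made-up vector; `sleep_hours` and its values are hypothetical stand-ins for `brfss2013$sleptim1`, not the real data.

```r
# Hypothetical toy values standing in for brfss2013$sleptim1
sleep_hours <- c(5, 6, 7, 7, 8, 8, 9, 12, 450, NA)

# Drop missing answers before counting
valid <- sleep_hours[!is.na(sleep_hours)]

# Number and share of respondents reporting more than 10 hours
n_over_10 <- sum(valid > 10)
share_over_10 <- mean(valid > 10)

# Answers above 24 hours are impossible for a 24-hour period
n_impossible <- sum(valid > 24)

print(c(over_10 = n_over_10, impossible = n_impossible))
```

On the real data, replacing the toy vector with `brfss2013$sleptim1` would show how many rows the 10-hour cutoff removes.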
Using the function summary in R on the observation of the addepev2 variable
summary(brfss2013$addepev2)
##    Yes     No   NA's
##  95779 393707   2289
Using the function summary in R without the NA’s
# Keep only respondents who answered the depressive disorder question
withoutNA1 <- brfss2013 %>%
  filter(!is.na(addepev2))
summary(withoutNA1$addepev2)
##    Yes     No
##  95779 393707
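Because addepev2 is a factor, counts are easier to read as percentages. A minimal sketch, assuming the Yes and No counts reported above are typed in by hand (the vector `counts` is introduced here only for illustration):

```r
# Counts copied from the summary output above
counts <- c(Yes = 95779, No = 393707)

# Convert counts to percentages of all non-missing answers
pct <- prop.table(counts) * 100

print(round(pct, 1))
```

On the full data set, `prop.table(table(withoutNA1$addepev2))` should give the same shares without retyping the counts.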
Research question 2:
# Count Yes/No answers among respondents sleeping less than 6 hours
dep_dis <- withoutNA %>%
  filter(!is.na(sleptim1), !is.na(addepev2), sleptim1 < 6) %>%
  group_by(addepev2) %>%
  summarise(count = n())
dep_dis
## # A tibble: 2 x 2
##   addepev2 count
##   <fct>    <int>
## 1 Yes      17828
## 2 No       34275
ggplot(data = dep_dis, aes(x = addepev2, y = count)) +
  geom_bar(stat = "identity", fill = "blue") +
  xlab("Ever told they have a depressive disorder (less than 6 hours of sleep on average)") +
  ylab("Number of respondents")
(17828/(17828+34275))*100
## [1] 34.21684
The result above shows that 34.22% of the respondents who sleep less than 6 hours on average reported having been told they have a depressive disorder. In other words, about 1 out of 3 of those who sleep less than 6 hours on average has been told they have a depressive disorder.
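The percentage can also be derived from the count table instead of retyping the numbers. A minimal base R sketch, assuming a small data frame (here called `dep_dis_counts`, a hypothetical name) that mirrors the counts printed above:

```r
# Counts copied from the dep_dis output above
dep_dis_counts <- data.frame(
  addepev2 = c("Yes", "No"),
  count = c(17828, 34275)
)

# Share of each answer among respondents sleeping less than 6 hours
dep_dis_counts$percent <- 100 * dep_dis_counts$count / sum(dep_dis_counts$count)

print(dep_dis_counts)
```

Computing the share from the table avoids transcription mistakes if the filter or the data changes.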
Research question 3:
# Count Yes/No answers among respondents sleeping 6 to 10 hours
dep_dis1 <- atmost10hrs %>%
  filter(!is.na(sleptim1), !is.na(addepev2), sleptim1 > 5) %>%
  group_by(addepev2) %>%
  summarise(count = n())
dep_dis1
## # A tibble: 2 x 2
##   addepev2  count
##   <fct>     <int>
## 1 Yes       73771
## 2 No       350259
ggplot(data = dep_dis1, aes(x = addepev2, y = count)) +
  geom_bar(stat = "identity", fill = "red") +
  xlab("Ever told they have a depressive disorder (6 to 10 hours of sleep on average)") +
  ylab("Number of respondents")
(73771 /(73771 + 350259)) * 100
## [1] 17.39759
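The two sleep groups used in research questions 2 and 3 could also be produced in a single pass by binning sleptim1. The following is a minimal base R sketch on made-up rows; the data frame `toy` and the column `sleep_group` are hypothetical and only illustrate the idea.

```r
# Hypothetical toy rows standing in for brfss2013
toy <- data.frame(
  sleptim1 = c(4, 5, 5, 6, 7, 8, 10, 12, NA, 7),
  addepev2 = factor(
    c("Yes", "No", "Yes", "No", "No", "Yes", "No", "No", "Yes", NA),
    levels = c("Yes", "No")
  )
)

# Drop missing answers and reports above 10 hours
keep <- !is.na(toy$sleptim1) & !is.na(toy$addepev2) & toy$sleptim1 <= 10
kept <- toy[keep, ]

# Bin hours into the two groups: up to 5 hours, and 6 to 10 hours
kept$sleep_group <- cut(
  kept$sleptim1,
  breaks = c(-Inf, 5, 10),
  labels = c("under 6 h", "6 to 10 h")
)

# Cross-tabulate, then convert each row to percentages
counts <- table(kept$sleep_group, kept$addepev2)
pct <- prop.table(counts, margin = 1) * 100

print(counts)
print(round(pct, 1))
```

Each row of `pct` is the Yes/No split within one sleep group, which is the quantity the two separate filters compute one group at a time.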
Based on the results of questions 2 and 3, 17.40% of the respondents who sleep 6 to 10 hours on average reported having been told they have a depressive disorder, compared with 34.22% of those who sleep less than 6 hours on average. The share in the 6-to-10-hour group is therefore about half the share in the group that sleeps less than 6 hours. This suggests that sleeping 6 to 10 hours on average is associated with a lower prevalence of depressive disorder. Because BRFSS is an observational survey, this comparison alone does not show that more sleep lowers the risk.
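The size of the gap between the two groups can be expressed as a ratio of the two percentages. The sketch below uses the counts reported above. The two-sample proportion test (`prop.test` from base R's stats package) is an addition for illustration, not part of the original analysis, and it treats the counts as a simple unweighted sample, ignoring BRFSS survey weights.

```r
# Counts copied from the dep_dis and dep_dis1 outputs above
yes <- c(under_6 = 17828, six_to_10 = 73771)
total <- c(under_6 = 17828 + 34275, six_to_10 = 73771 + 350259)

# Percentage told they have a depressive disorder in each sleep group
pct <- 100 * yes / total

# How many times larger the share is in the short-sleep group
ratio <- pct[["under_6"]] / pct[["six_to_10"]]

# Unweighted test of equal proportions (assumption: simple random sample)
test <- prop.test(x = yes, n = total)

print(round(pct, 2))
print(round(ratio, 2))
print(test$p.value)
```

A ratio close to 2 matches the statement that the share in the 6-to-10-hour group is about half that of the short-sleep group.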