# Descriptive statistics for age
summary(Durias$age)
Min. 1st Qu. Median Mean 3rd Qu. Max.
30.00 43.00 50.00 49.27 57.00 60.00
# Dichotomize age at 50 and tabulate counts and percentages
Durias %>%
  group_by(age = if_else(age <= 50, "At most 50", "More than 50")) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Alternative split, counting age 50 in the upper group
Durias %>%
  group_by(age = if_else(age >= 50, "At least 50", "Less than 50")) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
library(dplyr)
# Gender breakdown with counts and percentages
Durias %>%
  group_by(Gender) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Copy the imported Education column (Education...72) and inspect the raw categories
Durias$Education <- Durias$Education...72
table(Durias$Education)
Colege graduate College graduate College level
1 21 16
Elementary graduate Elementary level Ementary level
17 21 1
High chool graduate High schoo graduate High school graduate
1 1 22
High School graduate High school level High scool level
4 37 1
High sschool graduate Highschool level
1 1
library(dplyr)
# Education breakdown before cleaning the category labels
Durias %>%
  group_by(Education) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Collapse the misspelled education categories into a cleaned Educationcode variable
Durias1 <- Durias %>%
  mutate(Educationcode = recode(Education,
    "Colege graduate"       = "College graduate",
    "Ementary level"        = "Elementary level",
    "High chool graduate"   = "High school graduate",
    "High schoo graduate"   = "High school graduate",
    "High School graduate"  = "High school graduate",
    "High sschool graduate" = "High school graduate",
    "High scool level"      = "High school level",
    "Highschool level"      = "High school level",
    "Highschool level "     = "High school level"))
# reorder_levels() comes from the rstatix package
library(rstatix)
Durias1 <- Durias1 %>%
  reorder_levels(Educationcode,
                 order = c("Elementary level", "Elementary graduate",
                           "High school level", "High school graduate",
                           "College level", "College graduate"))
Durias1
library(dplyr)
# Education breakdown after cleaning
Durias1 %>%
  group_by(Educationcode) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Copy the imported Income column (Income...76) and treat it as a factor
Durias1$Income <- Durias1$Income...76
Durias1$Income <- as.factor(Durias1$Income)
library(dplyr)
# Income breakdown
Durias1 %>%
  group_by(Income) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Inspect the raw stress categories
table(Durias$Stress)
Extremely severe mild Mild Moderate
2 1 41 24
Normal Severe
75 2
# Recode the lowercase "mild" entry and order the stress levels
Durias1 <- Durias1 %>%
  mutate(Stress1 = recode(Stress, "mild" = "Mild"))
Durias1 <- Durias1 %>%
  reorder_levels(Stress1, order = c("Normal", "Mild", "Moderate",
                                    "Severe", "Extremely severe"))
Durias1
# Stress level breakdown
Durias1 %>%
  group_by(Stress1) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Same cleaning for anxiety: recode "mild" and order the levels
Durias1 <- Durias1 %>%
  mutate(Anxiety1 = recode(Anxiety, "mild" = "Mild"))
Durias1 <- Durias1 %>%
  reorder_levels(Anxiety1, order = c("Normal", "Mild", "Moderate",
                                     "Severe", "Extremely severe"))
# Anxiety level breakdown
Durias1 %>%
  group_by(Anxiety1) %>%
  summarise(count = n()) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2))
# Reshape the coping subscale means into long format; gather() is from tidyr,
# convert_as_factor() and get_summary_stats() are from rstatix
library(tidyr)
Durias2 <- Durias1 %>%
  gather(key = "Coping", value = "Score",
         ReappraisalMean, SocialSupportMean, ProbSolvingMean, RelMean,
         TolMean, Emomean, OveracMean, RelaxMean, Subsmean) %>%
  convert_as_factor(Coping)
Durias2
# Summary statistics per coping strategy
Durias2 %>%
  group_by(Coping) %>%
  get_summary_stats(Score, type = "mean_sd")
# Multiple linear regression: total stress score on the nine coping subscale means
multiple <- lm(StTotal ~ ReappraisalMean + SocialSupportMean + ProbSolvingMean +
                 RelMean + TolMean + Emomean + OveracMean + RelaxMean + Subsmean,
               data = Durias1)
summary(multiple)
Call:
lm(formula = StTotal ~ ReappraisalMean + SocialSupportMean +
ProbSolvingMean + RelMean + TolMean + Emomean + OveracMean +
RelaxMean + Subsmean, data = Durias1)
Residuals:
Min 1Q Median 3Q Max
-12.2452 -4.1688 0.1868 3.0768 20.7605
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.6781 4.8332 1.589 0.1145
ReappraisalMean 0.5264 1.1958 0.440 0.6605
SocialSupportMean 1.1563 1.0585 1.092 0.2766
ProbSolvingMean -2.5310 1.3180 -1.920 0.0569 .
RelMean 0.8649 1.2815 0.675 0.5009
TolMean -0.3901 0.9907 -0.394 0.6943
Emomean 1.9311 1.3611 1.419 0.1583
OveracMean 0.8203 1.3734 0.597 0.5513
RelaxMean -1.2134 1.2422 -0.977 0.3304
Subsmean 3.9883 1.6741 2.382 0.0186 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.868 on 135 degrees of freedom
Multiple R-squared: 0.1216, Adjusted R-squared: 0.063
F-statistic: 2.076 on 9 and 135 DF, p-value: 0.03584
As shown in the results above, the model performs better than an intercept-only model: the overall F-test indicates that at least one coefficient β is significantly different from 0 (p-value = 0.03584). Among the predictors, only substance use significantly predicts stress (p-value = 0.0186). Its coefficient is 3.9883, which means that higher substance-use scores are associated with higher stress levels: on average, a one-unit increase in the substance-use score increases the predicted stress score by 3.9883 points, holding the other coping strategies constant.
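To put an interval around that point estimate and to illustrate the one-unit interpretation, the fitted model can be queried directly. A minimal sketch, assuming the `multiple` model object fitted above is still in the workspace:
# 95% confidence interval for the substance-use coefficient
confint(multiple, "Subsmean", level = 0.95)
# Illustration of the slope: predicted stress at the mean of every predictor,
# versus the same profile with the substance-use mean increased by one unit
base <- as.data.frame(t(colMeans(model.frame(multiple)[, -1])))
plus_one <- base
plus_one$Subsmean <- plus_one$Subsmean + 1
predict(multiple, newdata = rbind(base, plus_one))
The difference between the two predicted values reproduces the 3.9883 coefficient reported in the regression summary.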
library(performance)
check_model(multiple)
*Posterior Predictive Check - The model's predictions should closely resemble the observed data. Simulated data are generated from the model and compared with the real observations to verify that the model adequately captures the uncertainty involved in predicting new observations.
*Linearity - The relationship between the independent variables and the dependent variable should be linear, meaning that a straight line should fit the data reasonably well.
*Homogeneity of Variance - The variance of the error terms should be constant across all levels of the independent variables; the scatter of the points around the regression line should be roughly the same for all fitted values.
*Influential Observations - Individual data points can have a disproportionate impact on the model's parameters and predictions. Outliers or high-leverage points can distort the estimates, so they should be identified and, if necessary, handled; in the plot, points should fall inside the contour lines.
*Collinearity - Two or more independent variables that are highly correlated with each other inflate the uncertainty of the parameter estimates. The independent variables should be only minimally correlated so that their individual effects on the dependent variable can be estimated accurately.
*Normality of Residuals - The error terms should be normally distributed; the histogram of the residuals should be roughly bell-shaped.
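Each of these diagnostics can also be run individually rather than through the combined check_model() panel. A minimal sketch using functions from the same performance package (the choice of checks shown here is illustrative):
library(performance)
check_normality(multiple)          # tests the residuals for normality
check_heteroscedasticity(multiple) # tests for non-constant error variance
check_collinearity(multiple)       # variance inflation factors for the predictors
check_outliers(multiple)           # flags potentially influential observations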
cor.test(Durias1$StTotal, Durias$AnTotal)
Pearson's product-moment correlation
data: Durias1$StTotal and Durias$AnTotal
t = 9.743, df = 143, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.5224234 0.7204693
sample estimates:
cor
0.6316421
Based on the results above, there is a positive correlation between stress and anxiety, with a correlation coefficient of 0.6316421. The relationship is statistically significant, with a p-value smaller than 2.2e-16, that is, smaller than 0.00000000000000022.
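A scatter plot of the two total scores with a fitted line makes this positive association visible. A minimal sketch in base R graphics, assuming the same StTotal and AnTotal columns used in the correlation test:
# Scatter plot of anxiety against stress with a fitted regression line
plot(Durias1$StTotal, Durias$AnTotal,
     xlab = "Stress total score (StTotal)",
     ylab = "Anxiety total score (AnTotal)")
abline(lm(Durias$AnTotal ~ Durias1$StTotal), col = "blue")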