What is the minimum age?

Answer: 30

summary(Durias$age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  30.00   43.00   50.00   49.27   57.00   60.00 

Group the “age” variable into two groups: at most 50 years old and more than 50 years old.

Durias%>%
  group_by(age = if_else(age<=50, "At most 50", "More than 50"))%>%
  summarise(count = n())%>%
  mutate(Percentage =round(count/sum(count)*100, 2))

How many of them are at least 50 years old?

Answer: 74

Durias%>%
  group_by(age = if_else(age>=50, "At least 50", "Less than 50"))%>%
  summarise(count = n())%>%
  mutate(Percentage =round(count/sum(count)*100, 2))

Socio-Demographic Profile

Gender

library(dplyr)
Durias%>%
  group_by(Gender)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))

Education

Durias$Education<-Durias$Education...72
table(Durias$Education)

      Colege graduate      College graduate         College level 
                    1                    21                    16 
  Elementary graduate      Elementary level        Ementary level 
                   17                    21                     1 
  High chool graduate   High schoo graduate  High school graduate 
                    1                     1                    22 
 High School graduate     High school level      High scool level 
                    4                    37                     1 
High sschool graduate      Highschool level 
                    1                     1 
library(dplyr)
Durias%>%
  group_by(Education)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))
Durias1<-Durias%>%
  mutate(Educationcode = recode(`Education`,
                           "Colege graduate" = "College graduate",
                           "Ementary level" = "Elementary level",
                           "High chool graduate" = "High school graduate",
                           "High schoo graduate" = "High school graduate",
                           "High School graduate" = "High school graduate",
                           "High sschool graduate" = "High school graduate",
                           "High scool level" = "High school level",
                           "Highschool level" = "High school level",
                           "Highschool level " = "High school level"))
Durias1<-Durias1%>%
   reorder_levels(Educationcode, order = c("Elementary level", "Elementary graduate", "High school level", "High school graduate","College level", "College graduate"))
Durias1
library(dplyr)
Durias1%>%
  group_by(Educationcode)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))

Income

Durias1$Income<-Durias1$Income...76
Durias1$Income<-as.factor(Durias1$Income)
library(dplyr)
Durias1%>%
  group_by(Income)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))

1. What is the respondents’ level of stress and anxiety as measured by the Depression, Anxiety, and Stress Scale (DASS-21)?

table(Durias$Stress)

Extremely severe             mild             Mild         Moderate 
               2                1               41               24 
          Normal           Severe 
              75                2 
Durias1<-Durias%>%
  mutate(Stress1 = recode(`Stress`,
                           "mild" = "Mild"))
Durias1<-Durias1%>%
   reorder_levels(Stress1, order = c("Normal", "Mild", "Moderate", "Severe","Extremely severe"))
Durias1
Durias1%>%
  group_by(Stress1)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))
Durias1<-Durias%>%
  mutate(Anxiety1 = recode(`Anxiety`,
                           "mild" = "Mild"))
Durias1<-Durias1%>%
   reorder_levels(Anxiety1, order = c("Normal", "Mild", "Moderate", "Severe","Extremely severe"))
Durias1%>%
  group_by(Anxiety1)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))
Durias2<-Durias1%>%
  gather(key = "Coping", value = "Score", ReappraisalMean, SocialSupportMean, ProbSolvingMean, RelMean, TolMean, Emomean, OveracMean, RelaxMean,Subsmean) %>%
  convert_as_factor(Coping)
Durias2
#Summary statistics
Durias2%>%
  group_by(Coping) %>%
   get_summary_stats(Score, type = "mean_sd")

3.1 Is there a significant relationship between respondents’ level of stress and coping mechanisms?

multiple <- lm(StTotal ~ ReappraisalMean + SocialSupportMean +ProbSolvingMean + RelMean + TolMean + Emomean + OveracMean + RelaxMean + Subsmean, data = Durias1)
summary(multiple)

Call:
lm(formula = StTotal ~ ReappraisalMean + SocialSupportMean + 
    ProbSolvingMean + RelMean + TolMean + Emomean + OveracMean + 
    RelaxMean + Subsmean, data = Durias1)

Residuals:
     Min       1Q   Median       3Q      Max 
-12.2452  -4.1688   0.1868   3.0768  20.7605 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)  
(Intercept)         7.6781     4.8332   1.589   0.1145  
ReappraisalMean     0.5264     1.1958   0.440   0.6605  
SocialSupportMean   1.1563     1.0585   1.092   0.2766  
ProbSolvingMean    -2.5310     1.3180  -1.920   0.0569 .
RelMean             0.8649     1.2815   0.675   0.5009  
TolMean            -0.3901     0.9907  -0.394   0.6943  
Emomean             1.9311     1.3611   1.419   0.1583  
OveracMean          0.8203     1.3734   0.597   0.5513  
RelaxMean          -1.2134     1.2422  -0.977   0.3304  
Subsmean            3.9883     1.6741   2.382   0.0186 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.868 on 135 degrees of freedom
Multiple R-squared:  0.1216,    Adjusted R-squared:  0.063 
F-statistic: 2.076 on 9 and 135 DF,  p-value: 0.03584

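As shown in the results above, the overall F-test (F = 2.076 on 9 and 135 degrees of freedom, p-value = 0.03584) indicates that the model fits better than a model with only the intercept, that is, at least one coefficient β is significantly different from 0. The model explains about 12% of the variance in stress (multiple R-squared = 0.1216, adjusted R-squared = 0.063). Among the coping mechanisms, only substance use significantly predicts stress at the 0.05 level, with a p-value of 0.0186. The coefficient of substance use is 3.9883, which means that a higher substance-use score is associated with a higher stress level. On average, holding the other coping scores constant, a one-unit increase in the substance-use score is associated with an increase of 3.9883 in the stress score.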
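As a rough check on how the figures in the summary relate to one another, the t value and p-value for Subsmean and the p-value of the overall F-test can be recomputed by hand. This is only a sketch: the numbers are copied from the rounded output above, and the variable names (est, se, df_resid, and so on) are illustrative, so the results should agree with the printed values only up to rounding.

```r
# Values copied from the printed summary(multiple) output (rounded).
est      <- 3.9883   # Subsmean estimate
se       <- 1.6741   # Subsmean standard error
df_resid <- 135      # residual degrees of freedom

# t statistic and two-sided p-value for H0: coefficient = 0
t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df = df_resid)

# 95% confidence interval for the coefficient
ci <- est + c(-1, 1) * qt(0.975, df = df_resid) * se

# p-value of the overall F-test (F = 2.076 on 9 and 135 DF)
p_f <- pf(2.076, df1 = 9, df2 = 135, lower.tail = FALSE)

print(c(t = t_val, p = p_val, p_F = p_f))
print(ci)
```

If the confidence interval excludes 0, that agrees with the coefficient being significant at the 0.05 level.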

Checking of Assumptions

To check the model assumptions, install and load the package “performance”.

library(performance)
check_model(multiple)

Explain each assumption in using multiple regression.

Answer:

    *Posterior Predictive Check - Data simulated from the fitted model should closely resemble the observed data. The check generates simulated outcomes from the model and compares their distribution with that of the observed outcome; large discrepancies suggest that the model does not represent the data adequately.

    *Linearity - The relationship between the independent and dependent variables should be linear. This means that a straight line should be able to fit the data reasonably well.

    *Homogeneity of Variance - The variance of the error terms should be constant across all levels of the independent variables. This means that the scatter of the points around the regression line should be the same for all values of the independent variables.

    *Influential Observations - Certain data points can strongly affect the regression model’s parameter estimates and predictions. Outliers or high-leverage points can distort the estimates, so they should be identified and, if necessary, handled. In the plot, points should fall inside the contour lines.

    *Collinearity - This refers to the situation when two or more independent variables in the regression model are highly correlated with each other, which may inflate the uncertainty of the parameter estimates. Independent variables should be minimally correlated with each other to allow for accurate estimation of their individual effects on the dependent variable.

    *Normality of Residuals - The error terms should be normally distributed. This means that the histogram of the error terms should be bell-shaped.
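Several of the assumptions listed above can also be inspected numerically with base R. The following is a minimal sketch on simulated data, not on the Durias data: the variables x1, x2, and y and the model fit are hypothetical stand-ins, and the checks shown (a hand-computed variance inflation factor, the Shapiro-Wilk test, and Cook's distance) are common choices rather than the exact procedures used by check_model().

```r
# Simulated data standing in for the survey data (hypothetical values).
set.seed(1)
n  <- 145
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

# Collinearity: variance inflation factor of x1,
# computed as 1 / (1 - R^2) from regressing x1 on the other predictor.
vif_x1 <- 1 / (1 - summary(lm(x1 ~ x2))$r.squared)

# Normality of residuals: Shapiro-Wilk test.
sw <- shapiro.test(residuals(fit))

# Influential observations: Cook's distance for every observation.
cooks <- cooks.distance(fit)

# Linearity and homogeneity of variance: residuals against fitted values
# should scatter evenly around the horizontal zero line.
plot(fitted(fit), residuals(fit))
abline(h = 0, lty = 2)

print(vif_x1)
print(sw$p.value)
print(max(cooks))
```

A variance inflation factor close to 1 suggests little collinearity, and a large Shapiro-Wilk p-value gives no evidence against normally distributed residuals.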

4. Is there a significant relationship between stress and anxiety?

cor.test(Durias1$StTotal, Durias$AnTotal)

    Pearson's product-moment correlation

data:  Durias1$StTotal and Durias$AnTotal
t = 9.743, df = 143, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5224234 0.7204693
sample estimates:
      cor 
0.6316421 

Based on the results above, there is a positive correlation between stress and anxiety, with a correlation coefficient of 0.6316421 (95% confidence interval: 0.5224234 to 0.7204693). The relationship between anxiety and stress is statistically significant, with a p-value below 2.2e-16, that is, below 0.00000000000000022.
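To see where the t statistic and the confidence interval in the output come from, they can be recomputed from the correlation coefficient and the degrees of freedom alone. This is a sketch using the printed values; the variable names are illustrative, and the confidence interval uses the Fisher z-transformation, which is the standard approach for a Pearson correlation.

```r
# Values copied from the printed cor.test() output.
r  <- 0.6316421   # sample correlation
df <- 143         # degrees of freedom (n - 2)
n  <- df + 2      # number of complete pairs

# t statistic and two-sided p-value for H0: true correlation = 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
p_val  <- 2 * pt(-abs(t_stat), df = df)

# 95% confidence interval via the Fisher z-transformation
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(t_stat)
print(p_val)
print(ci)
```

R prints the p-value as "< 2.2e-16" because the exact value is smaller than the machine precision it uses as a reporting limit, not because the p-value equals 2.2e-16.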