1 Loading Libraries

library(psych)
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
library(sjPlot)

2 Importing Data

d <- read.csv(file="Data/projectdata.csv", header=TRUE)

3 State Your Hypothesis

We hypothesize that perceived stress and social support will significantly predict generalized anxiety.

4 Check Your Variables

str(d)
## 'data.frame':    612 obs. of  7 variables:
##  $ X                  : int  520 2814 3146 3295 717 6056 4753 5365 2044 1965 ...
##  $ education          : chr  "1 equivalent to not completing high school" "prefer not to say" "2 equivalent to high school completion" "prefer not to say" ...
##  $ relationship_status: chr  "Single, never married" "Single, never married" "Prefer not to say" "Single, never married" ...
##  $ exercise           : chr  "1 less than 1 hour" "1 less than 1 hour" "1 less than 1 hour" "1 less than 1 hour" ...
##  $ pss                : num  2.75 2.25 3 2 1.75 2 1 1.25 3 1.25 ...
##  $ support            : num  2.83 3 4 4 3.67 ...
##  $ gad                : num  1.14 1.29 1 1 1.14 ...
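# Keep only the three study variables and drop rows with missing values
cont <- na.omit(subset(d, select=c(gad, pss, support)))
cont$row_id <- seq_len(nrow(cont))

# Standardize both predictors (mean 0, SD 1); the outcome stays in raw units
cont$pss <- scale(cont$pss, center=TRUE, scale=TRUE)
cont$support <- scale(cont$support, center=TRUE, scale=TRUE)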

describe(cont)
##         vars   n   mean     sd median trimmed    mad   min    max  range  skew
## gad        1 612   2.17   0.93   2.00    2.11   1.06  1.00   4.00   3.00  0.47
## pss        2 612   0.00   1.00   0.13    0.02   1.16 -2.23   1.96   4.19 -0.15
## support    3 612   0.00   1.00   0.03    0.04   1.04 -2.59   1.60   4.19 -0.30
## row_id     4 612 306.50 176.81 306.50  306.50 226.84  1.00 612.00 611.00  0.00
##         kurtosis   se
## gad        -1.00 0.04
## pss        -0.78 0.04
## support    -0.64 0.04
## row_id     -1.21 7.15
hist(cont$pss)

hist(cont$support)

hist(cont$gad)

plot(cont$pss, cont$gad)

plot(cont$support, cont$gad)

plot(cont$pss, cont$support)
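The standardization step above relies on scale(), which returns a one-column matrix rather than a plain vector. The sketch below uses simulated scores (hypothetical values, not the project data) to show what scale() returns and that it matches z-scores computed by hand.

```r
# Simulated scores stand in for the real pss column (hypothetical values).
set.seed(1)
pss_raw <- round(runif(50, min = 1, max = 4), 2)

# scale() returns a one-column matrix carrying centering/scaling attributes.
z_matrix <- scale(pss_raw, center = TRUE, scale = TRUE)

# as.numeric() drops the matrix structure, leaving a plain numeric vector.
z_vector <- as.numeric(z_matrix)

# The same z-scores computed by hand: (x - mean) / sd.
z_manual <- (pss_raw - mean(pss_raw)) / sd(pss_raw)

is.matrix(z_matrix)
all.equal(z_vector, z_manual)
round(c(mean = mean(z_vector), sd = sd(z_vector)), 2)
```

Wrapping the call in as.numeric() is optional here; it mainly keeps the column a plain vector for later functions that do not expect a matrix.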

5 View Your Correlations

corr_output_m <- corr.test(cont)
corr_output_m
## Call:corr.test(x = cont)
## Correlation matrix 
##           gad   pss support row_id
## gad      1.00  0.74   -0.43   0.58
## pss      0.74  1.00   -0.49   0.61
## support -0.43 -0.49    1.00  -0.51
## row_id   0.58  0.61   -0.51   1.00
## Sample Size 
## [1] 612
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##         gad pss support row_id
## gad       0   0       0      0
## pss       0   0       0      0
## support   0   0       0      0
## row_id    0   0       0      0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
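The matrix above includes row_id, which is a bookkeeping column rather than a study variable, so its correlations reflect row order in the file and are not part of the hypothesis. A minimal sketch of dropping it before correlating is shown below, using simulated data (hypothetical values) and base R's cor() so it runs without extra packages.

```r
# Simulated stand-in for the cont data frame (hypothetical values).
set.seed(2)
n <- 100
demo <- data.frame(gad = rnorm(n), pss = rnorm(n), support = rnorm(n))
demo$row_id <- seq_len(nrow(demo))

# Drop the identifier column so only study variables are correlated.
study_vars <- subset(demo, select = -row_id)
corr_demo <- cor(study_vars)
round(corr_demo, 2)

# With psych loaded, the equivalent call would be corr.test(study_vars).
```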

6 Run a Multiple Linear Regression

reg_model <- lm(gad ~ pss + support, data = cont)

7 Check Your Assumptions

7.1 Multiple Linear Regression Assumptions

Assumptions we’ve discussed previously:

  • Observations should be independent
  • Variables should be continuous and normally distributed
  • Outliers should be identified and removed
  • Relationship between the variables should be linear
  • Homogeneity of variance [NOTE: We are skipping this here]
  • Residuals should be normal and have constant variance

New assumptions:

  • Number of cases should be adequate (N ≥ 80 + 8*m, where m is the number of independent variables)
  • Independent variables should not be too correlated (aka multicollinearity)

7.2 Count Number of Cases

needed <- 80 + 8*2
nrow(cont) >= needed
## [1] TRUE
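The same check can be written so the number of predictors is read from the model formula instead of being typed by hand. This is only a sketch: cases_needed is a hypothetical helper name, and 612 is the sample size reported in section 4.

```r
# Rule of thumb from section 7.1: N >= 80 + 8 * m.
cases_needed <- function(m) 80 + 8 * m

# Count predictors from the formula: all variables minus the outcome.
model_formula <- gad ~ pss + support
m <- length(all.vars(model_formula)) - 1

needed <- cases_needed(m)
needed
612 >= needed
```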

7.3 Check for multicollinearity

vif(reg_model)
##      pss  support 
## 1.311145 1.311145
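Both VIF values are identical because, with only two predictors, each VIF reduces to 1 / (1 - r^2), where r is the correlation between the predictors. Plugging in the rounded r = -0.49 from section 5 gives 1 / (1 - 0.49^2) ≈ 1.32, close to the reported 1.31. The sketch below checks this identity on simulated data (hypothetical values, not the project data).

```r
# Simulated predictors with a moderate negative correlation (hypothetical).
set.seed(3)
n <- 200
x1 <- rnorm(n)
x2 <- -0.5 * x1 + rnorm(n)

# VIF for x1: regress it on the other predictor(s) and use 1 / (1 - R^2).
r2_x1 <- summary(lm(x1 ~ x2))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

# With two predictors, R^2 is just the squared correlation between them,
# so the same value applies to x2 as well.
vif_from_r <- 1 / (1 - cor(x1, x2)^2)

c(vif_x1 = vif_x1, vif_from_r = vif_from_r)
```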

7.4 Check linearity with Residuals vs Fitted plot

plot(reg_model, 1)

7.5 Check for outliers using Cook’s distance and a Residuals vs Leverage plot

plot(reg_model, 4)

plot(reg_model, 5)

7.6 Check normality of residuals with a Q-Q plot

plot(reg_model, 2)

7.7 Issues with My Data

Before interpreting our results, we assessed whether our variables met the assumptions for a multiple linear regression. We detected slight issues with linearity in a Residuals vs Fitted plot. We did not detect any outliers (by visually analyzing Cook’s Distance and Residuals vs Leverage plots) or issues with the normality of our residuals (by visually analyzing a Q-Q plot), nor were there any issues of multicollinearity between our two independent variables (both VIFs = 1.31).
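The outlier check above was visual. As a possible numeric complement, Cook's distance can be extracted and compared with a cutoff. The sketch below uses simulated data (hypothetical values), and the 4 / n threshold is a common rule of thumb assumed here, not a cutoff taken from this report.

```r
# Simulated stand-in for the model in section 6 (hypothetical values).
set.seed(4)
n <- 100
demo <- data.frame(pss = rnorm(n), support = rnorm(n))
demo$gad <- 2 + 0.6 * demo$pss - 0.1 * demo$support + rnorm(n, sd = 0.6)
demo_model <- lm(gad ~ pss + support, data = demo)

# One Cook's distance per observation.
cooks <- cooks.distance(demo_model)

# Assumed rule-of-thumb threshold; other cutoffs (e.g. 1) are also used.
cutoff <- 4 / nrow(demo)
flagged <- which(cooks > cutoff)

length(flagged)
head(sort(cooks, decreasing = TRUE), 3)
```

Flagged cases are candidates for a closer look, not automatic removals; the plots remain the primary check used in this report.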

8 View Test Output

summary(reg_model)
## 
## Call:
## lm(formula = gad ~ pss + support, data = cont)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.79836 -0.40806 -0.02024  0.39685  2.72301 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.17274    0.02515  86.390  < 2e-16 ***
## pss          0.64300    0.02882  22.309  < 2e-16 ***
## support     -0.08604    0.02882  -2.985  0.00295 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6222 on 609 degrees of freedom
## Multiple R-squared:  0.5517, Adjusted R-squared:  0.5502 
## F-statistic: 374.7 on 2 and 609 DF,  p-value: < 2.2e-16

Effect size, based on the absolute value of the regression β (beta estimate) in our output

  • Trivial: less than 0.10 (|β| < 0.10)
  • Small: 0.10–0.29 (0.10 ≤ |β| < 0.30)
  • Medium: 0.30–0.49 (0.30 ≤ |β| < 0.50)
  • Large: 0.50 or greater (|β| ≥ 0.50)
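The bands above can be applied mechanically with a small helper. This is a sketch only: classify_beta is a hypothetical name, and it works on the absolute value of the estimate so that negative coefficients are labeled by magnitude.

```r
# Label an estimate by magnitude using the bands listed above.
classify_beta <- function(beta) {
  cut(abs(beta),
      breaks = c(0, 0.10, 0.30, 0.50, Inf),
      labels = c("trivial", "small", "medium", "large"),
      right = FALSE)  # intervals are [lower, upper)
}

# Estimates for pss and support from the output above (rounded).
as.character(classify_beta(c(0.64, -0.09)))
# → "large" "trivial"
```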

9 Write Up Results

To test our hypothesis that perceived stress and social support would significantly predict generalized anxiety, we used a multiple regression to model the associations between these variables. Our data met the assumptions of a multiple linear regression, apart from slight issues with linearity.

Our hypothesis was supported. The model was statistically significant, Adj. R² = 0.55, F(2, 609) = 374.70, p < .001. Perceived stress positively predicted generalized anxiety with a large effect size (β = 0.64; per Cohen, 1988), while social support negatively predicted generalized anxiety with a trivial effect size (β = -0.09). Because both predictors were standardized before modeling, these estimates mean that generalized anxiety increases by 0.64 points for every one standard deviation increase in perceived stress and decreases by 0.09 points for every one standard deviation increase in social support, holding the other predictor constant. Full output from the regression model is reported in Table 1.
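Because pss and support were z-scored in section 4 while gad was left in raw units, the estimates are in anxiety points per standard deviation of each predictor. If a fully standardized coefficient were wanted, one option is to z-score the outcome as well. The sketch below uses simulated data (hypothetical values) to show that this is equivalent to dividing each slope by the standard deviation of the outcome.

```r
# Simulated stand-in for the project data (hypothetical values).
set.seed(5)
n <- 200
demo <- data.frame(pss = rnorm(n, mean = 2.5, sd = 0.7))
demo$support <- 3.5 - 0.4 * demo$pss + rnorm(n, sd = 0.6)
demo$gad <- 1 + 0.8 * demo$pss - 0.1 * demo$support + rnorm(n, sd = 0.6)

# Predictors z-scored, outcome left raw (the setup used in this report).
demo$pss_z <- as.numeric(scale(demo$pss))
demo$support_z <- as.numeric(scale(demo$support))
fit_partial <- lm(gad ~ pss_z + support_z, data = demo)

# Outcome z-scored as well: fully standardized coefficients.
demo$gad_z <- as.numeric(scale(demo$gad))
fit_full <- lm(gad_z ~ pss_z + support_z, data = demo)

# Dividing the partial slopes by sd(gad) reproduces the full betas.
b_partial <- coef(fit_partial)[-1]
beta_full <- coef(fit_full)[-1]
all.equal(unname(b_partial / sd(demo$gad)), unname(beta_full))
```

Which version to report is a judgment call; the report as written uses the coefficients from the model with the raw outcome.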

Table 1: Multiple Regression Model Predicting Generalized Anxiety

                     Generalized Anxiety
Predictors           Estimates   SE     CI               p
Intercept            2.17        0.03   2.12 – 2.22      <0.001
Perceived Stress     0.64        0.03   0.59 – 0.70      <0.001
Social Support       -0.09       0.03   -0.14 – -0.03    0.003
Observations         612
R² / R² adjusted     0.552 / 0.550
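Table 1 has the layout of sjPlot output (the package is loaded in section 1), though the call that produced it is not shown. As a base-R cross-check, the same columns can be assembled from summary() and confint(). The sketch below runs on simulated data (hypothetical values), so its numbers will not match Table 1.

```r
# Simulated stand-in for the model in section 6 (hypothetical values).
set.seed(6)
n <- 150
demo <- data.frame(pss = rnorm(n), support = rnorm(n))
demo$gad <- 2 + 0.6 * demo$pss - 0.1 * demo$support + rnorm(n, sd = 0.6)
demo_model <- lm(gad ~ pss + support, data = demo)

# Estimates, standard errors, and p-values from the coefficient matrix.
est <- summary(demo_model)$coefficients

# 95% confidence intervals for each coefficient.
ci <- confint(demo_model, level = 0.95)

table1 <- data.frame(
  Estimate = est[, "Estimate"],
  SE       = est[, "Std. Error"],
  CI_low   = ci[, 1],
  CI_high  = ci[, 2],
  p        = est[, "Pr(>|t|)"]
)
round(table1, 3)
```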

References

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge Academic.