The relationship between depression and anxiety

Researchers sought to investigate whether there is a relationship between depression and anxiety. The data are saved in “affect2.csv”.

Column names	Description
depression	Depression scores
anxiety	Anxiety scores

Exercise 1

1.1. Import and inspect the data contained in “affect2.csv” (2 marks)

affect2 <- read.csv("affect2.csv")

head(affect2)
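Beyond head(), a few extra checks can confirm that both columns were read as numeric and contain no missing values. The following is a minimal sketch, assuming the file sits in the working directory; the object name dat and the fallback values are made up for illustration so that the code also runs when the file is absent.

```r
# Use the real file when it is available; otherwise fall back to made-up
# stand-in data (hypothetical values, for illustration only).
if (file.exists("affect2.csv")) {
  dat <- read.csv("affect2.csv")
} else {
  set.seed(1)
  dat <- data.frame(depression = rnorm(10, mean = 30, sd = 10),
                    anxiety    = rnorm(10, mean = 40, sd = 10))
}

str(dat)             # column names and types
summary(dat)         # range, quartiles and mean of each column
colSums(is.na(dat))  # number of missing values per column
```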

1.2. Calculate the correlation between depression and anxiety (2 marks)

cor(affect2$depression, affect2$anxiety)

## [1] 0.6817462

1.3. Describe the relationship between depression and anxiety (i.e. direction and strength) (2 marks)

There is a strong positive relationship between depression and anxiety (r = 0.68): higher depression scores tend to go with higher anxiety scores.

1.4. Report on the significance of this correlation and what that means in terms of the relationship between these variables? (2 marks)

Note. The p-value is < 2.2e-16, and alpha = 0.05.

Because the p-value is far smaller than the alpha of 0.05, the null hypothesis of no correlation is rejected. This means that the positive relationship seen in this sample is unlikely to have occurred by chance alone. The two variables are therefore related and tend to co-occur.
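cor() returns only the coefficient, so a p-value for the correlation has to come from a significance test. One way to obtain it is cor.test(), sketched here on made-up stand-in data; the data frame demo and its values are hypothetical and are not the assignment data.

```r
# Made-up stand-in data (hypothetical); with the assignment data the call
# would be cor.test(affect2$depression, affect2$anxiety).
set.seed(1)
demo <- data.frame(anxiety = rnorm(160, mean = 40, sd = 10))
demo$depression <- 2 * demo$anxiety - 40 + rnorm(160, sd = 20)

# Pearson correlation test: the null hypothesis is a population correlation of zero
ct <- cor.test(demo$depression, demo$anxiety)

ct$estimate   # sample correlation r
ct$parameter  # degrees of freedom, n - 2
ct$p.value    # two-sided p-value, compared against alpha = 0.05
```

In a simple regression with one predictor, this test and the t-test on the slope give the same p-value.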

Exercise 2

The researchers posit that high levels of anxiety will result in higher levels of depression. To understand if this is true they conduct a simple linear regression.

2.1. State the predictor and outcome variables, and formulate a research question. (3 marks)

The predictor here is anxiety scores and the outcome is depression scores.

Do higher levels of anxiety predict higher levels of depression?

2.2. Run a simple regression of depression on anxiety, and save the output under “mod1” (2 marks)

Reminder. Use the lm() command to run the regression. Use the help file (“?lm()”) if you need a refresher on what arguments go into the function.

Note. Remember the order of the variables is important when creating a regression model.

mod1 <- lm(depression ~ anxiety, data = affect2)

2.3. Inspect “mod1” using the summary() command (1 mark)

summary(mod1)

## 
## Call:
## lm(formula = depression ~ anxiety, data = affect2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -51.033 -11.239  -1.298   7.781  69.781 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -43.5734     6.5777  -6.624 5.19e-10 ***
## anxiety       1.8921     0.1615  11.713  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.47 on 158 degrees of freedom
## Multiple R-squared:  0.4648, Adjusted R-squared:  0.4614 
## F-statistic: 137.2 on 1 and 158 DF,  p-value: < 2.2e-16
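The numbers needed for the later questions can also be pulled out of a fitted model directly instead of being read off the printed summary. The following is a minimal sketch on made-up stand-in data; demo and fit are hypothetical names, and with the assignment data the same calls would be applied to mod1.

```r
# Made-up stand-in data (hypothetical, for illustration only)
set.seed(1)
demo <- data.frame(anxiety = rnorm(160, mean = 40, sd = 10))
demo$depression <- 2 * demo$anxiety - 40 + rnorm(160, sd = 20)

# Regress the outcome (depression) on the predictor (anxiety)
fit <- lm(depression ~ anxiety, data = demo)

coef(fit)                   # intercept and slope
summary(fit)$r.squared      # proportion of variance explained
confint(fit, level = 0.95)  # 95% confidence intervals for both coefficients
```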

2.4. Use the values from the output in 2.3 to present the regression equation (2 marks)

Reminder. The base equation looks like this: y = ax + b

y = 1.8921x - 43.5734, where y is the predicted depression score and x is the anxiety score.
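To see what the equation means in practice, it can be applied to a single anxiety score. The following sketch uses the coefficients from the summary output; the anxiety score of 40 is a hypothetical value chosen only for illustration.

```r
# Coefficients copied from the summary(mod1) output
intercept <- -43.5734
slope     <- 1.8921

# Hypothetical anxiety score, chosen for illustration
anxiety_score <- 40

# Predicted depression score: intercept + slope * anxiety
predicted_depression <- intercept + slope * anxiety_score
predicted_depression  # 32.1106
```

With the fitted model available, predict(mod1, newdata = data.frame(anxiety = 40)) should give the same value up to rounding of the coefficients.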

2.5. What percentage of variance in the outcome variable is explained by our predictor? (3 marks)

46.48% of the variance in depression scores is explained by anxiety scores (Multiple R-squared = 0.4648).
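As a quick check, in a simple regression with one predictor the R-squared equals the square of the correlation between the two variables. The sketch below squares the correlation reported in 1.2; the variable names r and r2 are introduced only for this illustration.

```r
# Correlation between depression and anxiety, as reported in 1.2
r <- 0.6817462

# Squaring it gives the proportion of variance explained
r2 <- r^2

round(r2, 4)        # 0.4648, matching Multiple R-squared
round(100 * r2, 2)  # 46.48, the percentage of variance explained
```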

2.6. Plot the relationship between depression and anxiety on a scatterplot and include the regression line (i.e., the line of best fit). (4 marks)

Be sure to label your axes appropriately.

Reminder. Use the help file for “?plot()” and “?abline()” to find out more information about what kind of arguments go into these functions.

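# Predictor (anxiety) on the x-axis, outcome (depression) on the y-axis
plot(affect2$anxiety, affect2$depression, xlab = "Anxiety Scores", ylab = "Depression Scores", main = "Relationship Between Anxiety and Depression")
# Add the line of best fit from the regression of depression on anxiety
abline(lm(depression ~ anxiety, data = affect2), col = "blue")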

2.7. Looking at the summary output in 2.3, the regression equation in 2.4, and the plot in 2.6, what can you say about the relationship between depression and anxiety? (i.e., answer the research question you formulated in 2.1) (4 marks)

Consider the model’s overall p-value, what the regression equation actually means in terms of these variables, and how the scatter plot visually represents this relationship.

The findings show that higher anxiety scores predict higher depression scores. The overall model is statistically significant (F(1, 158) = 137.2, p < 2.2e-16), and the slope is positive: each one-point increase in anxiety is associated with a predicted increase of about 1.89 points in depression. Anxiety accounts for 46.48% of the variance in depression, so it is a substantial predictor, although more than half of the variance remains unexplained. The scatter plot visually supports this, showing a positive trend around the line of best fit.

BEFORE YOU SUBMIT YOUR COMPLETED RMARKDOWN, DID YOU?

Edit the YAML header?
Ensure you have read through each part of the question?
Run all code chunks once you completed everything?
Save your work?

TO SUBMIT

Exit this
Zip your entire tutorial assignment folder
Edit the name of the zip to something appropriate that includes your student number, and the tutorial assignment number (e.g. ABCDEF001_tutassignment3)
Submit zipped folder on Amathuba under the relevant “Assignments” tab

Free-form tutorial 7: Correlation and regression

Oyena Qwabe

29/04/2026

Total /25