Load the Required Packages

library(readxl)
library(ggpubr)
## Loading required package: ggplot2

Import Datasets

Dataset6.3 <- read_excel("D:/SLU/AdvAppliedAnalytics/Dataset6.3.xlsx")

Calculate mean and standard deviation

  1. Stress_Pre
mean(Dataset6.3$Stress_Pre)
## [1] 65.86954
sd(Dataset6.3$Stress_Pre)
## [1] 9.496524

The variable Stress_Pre had a mean of 65.87 and a standard deviation of 9.50.

  2. Stress_Post
mean(Dataset6.3$Stress_Post)
## [1] 57.90782
sd(Dataset6.3$Stress_Post)
## [1] 10.1712

The variable Stress_Post had a mean of 57.91 and a standard deviation of 10.17.
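The two pairs of mean() and sd() calls above can also be run in one pass. The following is a minimal sketch, not the method used in this report: `demo` is a simulated stand-in for Dataset6.3 (the real file is not reproduced here), and the name `descriptives` is illustrative.

```r
# Simulated placeholder data (NOT the real Dataset6.3); replace `demo`
# with the imported data frame to reproduce the values reported above.
set.seed(42)
demo <- data.frame(
  Stress_Pre  = rnorm(35, mean = 66, sd = 9.5),
  Stress_Post = rnorm(35, mean = 58, sd = 10)
)

# Apply mean() and sd() to every column. The result is a 2 x 2 matrix
# with the statistics in rows and the variables in columns.
descriptives <- sapply(demo, function(x) c(mean = mean(x), sd = sd(x)))
print(round(descriptives, 2))
```

Because the data are simulated, the printed numbers will not match the means and standard deviations reported above.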

Create Histograms for Stress_Pre

hist(Dataset6.3$Stress_Pre,
     main = "Stress_Pre",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

The variable “Stress_Pre” appears approximately normally distributed. The histogram is roughly symmetrical, with most values clustered in the middle and fewer toward each tail, which gives it a bell shape.

Create Histograms for Stress_Post

hist(Dataset6.3$Stress_Post,
     main = "Stress_Post",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

The variable “Stress_Post” appears approximately normally distributed. The histogram is roughly symmetrical, with most values clustered in the middle and fewer toward each tail, which gives it a bell shape.
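A normal Q-Q plot is a common companion to a histogram when judging normality by eye: points that fall close to the reference line suggest an approximately normal distribution. The sketch below uses base R's qqnorm() and qqline() on simulated placeholder values, since the real data are not reproduced here; `stress_demo` is a hypothetical name.

```r
# Simulated placeholder values (NOT the real Stress_Pre or Stress_Post data).
set.seed(42)
stress_demo <- rnorm(35, mean = 66, sd = 9.5)

# qqnorm() plots sample quantiles against theoretical normal quantiles
# and invisibly returns the plotted coordinates.
qq <- qqnorm(stress_demo, main = "Normal Q-Q plot (simulated data)")

# qqline() adds a reference line through the first and third quartiles.
qqline(stress_demo, col = "red")
```

To check the real variables, pass Dataset6.3$Stress_Pre or Dataset6.3$Stress_Post in place of `stress_demo`.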

Conduct Shapiro–Wilk tests to check the normality of each variable

  1. Stress_Pre
shapiro.test(Dataset6.3$Stress_Pre)
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.3$Stress_Pre
## W = 0.98426, p-value = 0.8855

The Shapiro-Wilk p-value for the Stress_Pre normality test is greater than 0.05 (p = 0.89), so there is no evidence of a departure from normality and the variable is treated as normally distributed.

  2. Stress_Post
shapiro.test(Dataset6.3$Stress_Post)
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.3$Stress_Post
## W = 0.96865, p-value = 0.4072

The Shapiro-Wilk p-value for the Stress_Post normality test is greater than 0.05 (p = 0.41), so there is no evidence of a departure from normality and the variable is treated as normally distributed.
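When several variables need the same check, the W statistic and p-value can be collected into one table. This is a sketch on simulated placeholder data, with `demo` and `normality` as illustrative names. It relies only on the `statistic` and `p.value` elements that shapiro.test() returns.

```r
# Simulated placeholder data (NOT the real Dataset6.3).
set.seed(42)
demo <- data.frame(
  Stress_Pre  = rnorm(35, mean = 66, sd = 9.5),
  Stress_Post = rnorm(35, mean = 58, sd = 10)
)

# Run the Shapiro-Wilk test on each column and keep W and the p-value.
# t() puts one variable per row.
normality <- t(sapply(demo, function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p_value = res$p.value)
}))
print(round(normality, 3))
```

A p-value above 0.05 in this table is read the same way as above: the test gives no evidence against normality.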

Correlation Analysis

cor.test(Dataset6.3$Stress_Pre, Dataset6.3$Stress_Post, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  Dataset6.3$Stress_Pre and Dataset6.3$Stress_Post
## t = 1.5355, df = 33, p-value = 0.1342
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.08208415  0.54460746
## sample estimates:
##      cor 
## 0.258226

The Pearson correlation test was selected because both variables were approximately normally distributed according to the histograms and the Shapiro-Wilk tests. The p-value (probability value) is 0.13, which is above .05, so the result is not statistically significant and we fail to reject the null hypothesis of no correlation. The 95% confidence interval (-0.08 to 0.54) includes zero, which is consistent with this conclusion. The correlation value is 0.26. The correlation is positive, which means that in this sample Stress_Post tends to increase as Stress_Pre increases. The correlation value is less than 0.30, which means the relationship is weak.
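The t statistic, p-value, and confidence interval in the cor.test() output can be recovered from the correlation and the degrees of freedom alone. The sketch below uses the reported values as inputs; the variable names are illustrative, and the interval uses the Fisher z transformation with a standard error of 1/sqrt(n - 3), which is the approach cor.test() documents for Pearson correlations.

```r
r  <- 0.258226  # sample correlation reported by cor.test()
df <- 33        # degrees of freedom, n - 2
n  <- df + 2    # sample size

# Test statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z transformation
z_crit <- qnorm(0.975)
ci <- tanh(atanh(r) + c(-1, 1) * z_crit / sqrt(n - 3))

print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
```

The printed values should agree with the cor.test() output above up to rounding of the inputs.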

Scatterplots

ggscatter(
  Dataset6.3,
  x = "Stress_Pre",
  y = "Stress_Post",
  add = "reg.line",
  xlab = "Stress_Pre",
  ylab = "Stress_Post"
)

The line of best fit slopes upward toward the top right. This means the direction of the relationship is positive: as Stress_Pre increases, Stress_Post tends to increase. The dots only loosely hug the line, which means the relationship between the variables is weak. The dots show no obvious curve, which suggests the relationship is roughly linear. There may be a possible outlier (a student with relatively low Stress_Pre around the high 50s but very low Stress_Post around the mid-30s). However, the point does not appear to drastically change the overall trend and does not strongly affect the relationship between the variables.

Report the Results

Stress_Pre (M = 65.87, SD = 9.50) was not significantly correlated with Stress_Post (M = 57.91, SD = 10.17), r(33) = .26, p = .134. The relationship was positive and weak. As Stress_Pre increased, Stress_Post tended to increase, although this trend was not statistically significant.
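The r(df) = value, p = value string in the report can be generated from the test output instead of being typed by hand, which avoids transcription slips. The helper below is hypothetical (it is not part of readxl, ggpubr, or base R) and only sketches the usual convention of dropping the leading zero from r and p.

```r
# Hypothetical helper: format a Pearson correlation result in APA style.
format_apa_r <- function(r, df, p) {
  # Round to `digits` decimals and drop the leading zero (0.26 becomes .26).
  strip_zero <- function(x, digits) {
    sub("^(-?)0\\.", "\\1.", sprintf(paste0("%.", digits, "f"), x))
  }
  # Very small p-values are reported as an inequality.
  p_text <- if (p < 0.001) "p < .001" else paste0("p = ", strip_zero(p, 3))
  paste0("r(", df, ") = ", strip_zero(r, 2), ", ", p_text)
}

# Values taken from the cor.test() output in this report
print(format_apa_r(0.258226, 33, 0.1342))
```

With a saved test object, for example `res <- cor.test(...)`, the same call could take `res$estimate`, `res$parameter`, and `res$p.value` as inputs.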