Open the Required Packages

library(readxl)
library(ggpubr)
## Loading required package: ggplot2

Import Datasets

Dataset6.4 <- read_excel("D:/SLU/AdvAppliedAnalytics/Dataset6.4.xlsx")

Calculate mean and standard deviation

  1. Stress_Pre
mean(Dataset6.4$Stress_Pre)
## [1] 51.53601
sd(Dataset6.4$Stress_Pre)
## [1] 17.21906

The variable Stress_Pre had a mean of 51.54 and a standard deviation of 17.22.

  2. Stress_Post
mean(Dataset6.4$Stress_Post)
## [1] 41.4913
sd(Dataset6.4$Stress_Post)
## [1] 18.88901

The variable Stress_Post had a mean of 41.49 and a standard deviation of 18.89.
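
The same descriptives can be computed for both variables in one pass. The sketch below is only an illustration: `demo` is a small made-up data frame (hypothetical values, not Dataset6.4) so that the code runs without the Excel file. With the real data, `demo` would be replaced by `Dataset6.4`.

```r
# Hypothetical stand-in data; swap `demo` for Dataset6.4 in the real analysis
demo <- data.frame(
  Stress_Pre  = c(30, 45, 50, 62, 80),
  Stress_Post = c(22, 35, 41, 50, 71)
)

# Mean (M) and standard deviation (SD) of every column, as a 2 x 2 matrix
descriptives <- sapply(demo, function(x) c(M = mean(x), SD = sd(x)))

# Round to two decimals, the precision used in the write-up
round(descriptives, 2)
```

Computing both columns together and rounding from the printed matrix reduces the chance of copying a value incorrectly into the interpretation.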

Create Histograms for Stress_Pre

hist(Dataset6.4$Stress_Pre,
     main = "Stress_Pre",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

For Stress_Pre, the data appear positively skewed (not normal). In terms of kurtosis, the distribution also does not appear bell-shaped (not normal).

Create Histograms for Stress_Post

hist(Dataset6.4$Stress_Post,
     main = "Stress_Post",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

The variable Stress_Post appears approximately normally distributed. The data look slightly positively skewed (somewhat more of the data sit on the left), but the histogram still has a proper bell-shaped curve.
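
A visual judgment of skew can be backed up with a number. Base R has no built-in skewness function, so the sketch below defines a small helper, `skewness`, using the simple moment-based formula. This is an assumption for illustration, since packages may use slightly different estimators, and the example values are hypothetical rather than taken from Dataset6.4.

```r
# Moment-based sample skewness:
# mean cubed deviation divided by (mean squared deviation)^(3/2)
skewness <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  d <- x - mean(x)           # deviations from the mean
  mean(d^3) / mean(d^2)^(3 / 2)
}

# Hypothetical values with a long right tail
right_skewed <- c(10, 12, 13, 15, 16, 18, 40, 65)

# A positive result indicates positive (right) skew; a value near 0 suggests symmetry
skewness(right_skewed)
```

With the real data, the call would be `skewness(Dataset6.4$Stress_Pre)` and `skewness(Dataset6.4$Stress_Post)`.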

Conduct Shapiro–Wilk tests to check the normality of each variable

  1. Stress_Pre
shapiro.test(Dataset6.4$Stress_Pre)
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.4$Stress_Pre
## W = 0.91878, p-value = 0.01315

The Shapiro-Wilk p-value for the Stress_Pre normality test is less than 0.05 (0.01), so the data is not normally distributed.

  2. Stress_Post
shapiro.test(Dataset6.4$Stress_Post) 
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.4$Stress_Post
## W = 0.97307, p-value = 0.5328

The Shapiro-Wilk p-value for the Stress_Post normality test is greater than 0.05 (0.53), so the data can be treated as normally distributed.
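
The decision rule used above (p below 0.05 means normality is rejected) can be applied to both variables at once. The following is a minimal sketch on simulated data; `demo` and its values are hypothetical and only stand in for Dataset6.4.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical stand-in data: one skewed column, one normal column
demo <- data.frame(
  Stress_Pre  = rexp(35, rate = 1 / 50),        # skewed by construction
  Stress_Post = rnorm(35, mean = 40, sd = 18)   # normal by construction
)

# Run the Shapiro-Wilk test on each column and keep only the p-value
p_values <- sapply(demo, function(x) shapiro.test(x)$p.value)

# Flag the columns where normality is rejected at the 0.05 level
data.frame(
  p_value          = round(p_values, 4),
  reject_normality = p_values < 0.05
)
```

If any column is flagged, a rank-based method such as Spearman's correlation is the usual choice for the correlation step.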

Correlation Analysis

cor.test(Dataset6.4$Stress_Pre, Dataset6.4$Stress_Post, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  Dataset6.4$Stress_Pre and Dataset6.4$Stress_Post
## t = 9.8294, df = 33, p-value = 2.487e-11
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7442522 0.9292540
## sample estimates:
##       cor 
## 0.8633669

The output above is from the Pearson test (r = 0.86, p = 2.487e-11). Because one of the variables (Stress_Pre) was not normally distributed according to the histograms and the Shapiro-Wilk tests, the Spearman correlation test was selected instead. The Spearman p-value (probability value) is 7.834e-08, which is below .05. This means the results are statistically significant, and the alternative hypothesis is supported. The rho value is 0.85.
The correlation is positive, which means that as Stress_Pre increases, Stress_Post increases. The correlation value is greater than 0.50, which means the relationship is strong.
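
The Spearman test described above is requested through the `method` argument of `cor.test`. The sketch below runs it on simulated vectors (`pre` and `post` are hypothetical, not the Dataset6.4 columns) and pulls out the two numbers that are reported, rho and the p-value.

```r
set.seed(42)  # make the simulated data reproducible

# Hypothetical stand-in data: a skewed predictor and a strongly related outcome
pre  <- rexp(35, rate = 1 / 50)
post <- 0.8 * pre + rnorm(35, sd = 10)

# Spearman's rank correlation instead of Pearson's
result <- cor.test(pre, post, method = "spearman")

result$estimate  # rho, the rank correlation coefficient
result$p.value   # compared against .05 to judge significance
```

For the real data, the equivalent call would be `cor.test(Dataset6.4$Stress_Pre, Dataset6.4$Stress_Post, method = "spearman")`. If the data contain tied values, R warns that an exact p-value cannot be computed; passing `exact = FALSE` requests the approximate p-value explicitly.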

Scatterplots

ggscatter(
  Dataset6.4,
  x = "Stress_Pre",
  y = "Stress_Post",
  add = "reg.line",
  xlab = "Stress_Pre",
  ylab = "Stress_Post"
)

The line of best fit points toward the top right. This means the direction of the relationship is positive: as Stress_Pre increases, Stress_Post increases. The dots closely hug the line, which means there is a strong relationship between the variables. The dots form a straight-line pattern, which means the relationship is linear. There may be an outlier (around Stress_Pre in the 40s–50s with low Stress_Post). However, the point does not appear to drastically change the overall trend and does not strongly affect the relationship between the variables.
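
Whether a suspected outlier affects the relationship can be checked by computing the correlation with and without that point. This is a rough sketch on simulated data with one planted outlier; the vectors and the index `suspect` are hypothetical, and in practice the row would have to be identified from the plot or the data.

```r
set.seed(7)  # make the simulated data reproducible

# Hypothetical stand-in data with a strong positive relationship
pre  <- runif(35, min = 20, max = 90)
post <- 0.8 * pre + rnorm(35, sd = 8)

# Plant one outlier: mid-range Stress_Pre with a very low Stress_Post
pre[1]  <- 45
post[1] <- 5

suspect <- 1  # assumed row index of the suspected outlier

# Spearman's rho with all points, then with the suspected point removed
rho_all     <- cor(pre, post, method = "spearman")
rho_trimmed <- cor(pre[-suspect], post[-suspect], method = "spearman")

c(all_points = rho_all, without_suspect = rho_trimmed)
```

If the two values are close, the point has little influence on the result. Because Spearman's rho is rank-based, a single extreme value usually shifts it less than it would shift Pearson's r.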

Report the Results

The Stress_Pre variable (M = 51.54, SD = 17.22) was significantly correlated with the Stress_Post variable (M = 41.49, SD = 18.89), ρ = .85, p < .001. The relationship was positive and strong. As Stress_Pre increased, Stress_Post increased.