library(readxl)
library(ggpubr)
## Loading required package: ggplot2
Dataset6.4 <- read_excel("D:/SLU/AdvAppliedAnalytics/Dataset6.4.xlsx")
mean(Dataset6.4$Stress_Pre)
## [1] 51.53601
sd(Dataset6.4$Stress_Pre)
## [1] 17.21906
The variable Stress_Pre had a mean of 51.54 and a standard deviation of 17.22.
mean(Dataset6.4$Stress_Post)
## [1] 41.4913
sd(Dataset6.4$Stress_Post)
## [1] 18.88901
The variable Stress_Post had a mean of 41.49 and a standard deviation of 18.89.
hist(Dataset6.4$Stress_Pre,
main = "Stress_Pre",
breaks = 20,
col = "lightblue",
border = "white",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
For the variable Stress_Pre, the data appear positively skewed (not normal). The kurtosis also does not appear bell-shaped (not normal).
hist(Dataset6.4$Stress_Post,
main = "Stress_Post",
breaks = 20,
col = "lightblue",
border = "white",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
The variable Stress_Post appears approximately normally distributed. The data look slightly positively skewed (most values sit toward the left, with a short tail to the right), but the overall shape is close to a proper bell curve.
shapiro.test(Dataset6.4$Stress_Pre)
##
## Shapiro-Wilk normality test
##
## data: Dataset6.4$Stress_Pre
## W = 0.91878, p-value = 0.01315
The Shapiro-Wilk p-value for the Stress_Pre normality test is less than 0.05 (p = 0.01), so the data are not normally distributed.
shapiro.test(Dataset6.4$Stress_Post)
##
## Shapiro-Wilk normality test
##
## data: Dataset6.4$Stress_Post
## W = 0.97307, p-value = 0.5328
The Shapiro-Wilk p-value for the Stress_Post normality test is greater than 0.05 (p = 0.53), so the data are consistent with a normal distribution.
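The two Shapiro-Wilk results feed into the choice of correlation method. A minimal sketch of that decision rule follows, assuming the usual 0.05 cutoff. `choose_cor_method` is a hypothetical helper written for illustration, not a function from any package, and the p-values are copied from the output above.

```r
# Hypothetical helper: pick a correlation method from Shapiro-Wilk p-values.
# Assumption: a variable is treated as non-normal when its p-value is <= alpha.
choose_cor_method <- function(p_values, alpha = 0.05) {
  if (all(p_values > alpha)) "pearson" else "spearman"
}

# p-values copied from the Shapiro-Wilk output above
shapiro_p <- c(Stress_Pre = 0.01315, Stress_Post = 0.5328)

method <- choose_cor_method(shapiro_p)
print(method)
```

With these values the helper returns "spearman", because Stress_Pre falls below the cutoff even though Stress_Post does not.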
cor.test(Dataset6.4$Stress_Pre, Dataset6.4$Stress_Post, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: Dataset6.4$Stress_Pre and Dataset6.4$Stress_Post
## t = 9.8294, df = 33, p-value = 2.487e-11
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7442522 0.9292540
## sample estimates:
## cor
## 0.8633669
The Spearman correlation test was selected because one of the
variables was not normally distributed according to the histograms and the
Shapiro-Wilk tests. The p-value (probability value) is 7.834e-08, which
is below .05. This means the results are statistically significant. The
alternative hypothesis is supported. The rho-value is 0.85. Note that the
output above comes from the Pearson test (r = 0.86, p = 2.487e-11), which
points to the same conclusion.
The correlation is positive, which means as Stress_Pre increases,
Stress_Post increases. The correlation value is greater than 0.50, which
means the relationship is strong.
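For reference, a Spearman test can be requested from `cor.test()` by setting `method = "spearman"`. The sketch below is illustrative only: it runs on simulated stand-in data (the names `pre` and `post`, the seed, and the generating values are all assumptions, not the contents of Dataset6.4), so its numbers will not match the rho and p-value reported above. It also applies the 0.50 cutoff for a strong relationship used in the text.

```r
# Simulated stand-in data (hypothetical; NOT the values in Dataset6.4)
set.seed(1)
pre  <- rnorm(35, mean = 50, sd = 17)
post <- pre - 10 + rnorm(35, sd = 9)

# Spearman rank correlation; exact = FALSE requests the asymptotic p-value,
# which avoids the warning cor.test() gives when exact p-values meet ties
res <- cor.test(pre, post, method = "spearman", exact = FALSE)

rho <- unname(res$estimate)                 # the estimate is named "rho"
direction <- if (rho > 0) "positive" else "negative"
strong <- abs(rho) > 0.50                   # cutoff taken from the text above

cat(sprintf("rho = %.2f, p = %.3g, direction = %s, strong = %s\n",
            rho, res$p.value, direction, strong))
```

On the real data, the same call would take `Dataset6.4$Stress_Pre` and `Dataset6.4$Stress_Post` as its first two arguments.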
ggscatter(
Dataset6.4,
x = "Stress_Pre",
y = "Stress_Post",
add = "reg.line",
xlab = "Stress_Pre",
ylab = "Stress_Post"
)
The line of best fit is pointing to the top right. This means the direction of the data is positive. As Stress_Pre increases, Stress_Post increases. The dots closely hug the line. This means there is a strong relationship between the variables. The dots form a straight-line pattern. This means the data is linear. There may be a possible outlier (around Stress_Pre in the 40s–50s with low Stress_Post). However, the point does not appear to drastically change the overall trend and does not strongly affect the relationship between the variables.
The Stress_Pre variable (M = 51.54, SD = 17.22) was significantly correlated with the Stress_Post variable (M = 41.49, SD = 18.89), ρ = .85, p < .001. The relationship was positive and strong. As Stress_Pre increased, Stress_Post increased.