library(readxl)
library(ggpubr)
## Loading required package: ggplot2
Dataset6.3 <- read_excel("D:/SLU/AdvAppliedAnalytics/Dataset6.3.xlsx")
mean(Dataset6.3$Stress_Pre)
## [1] 65.86954
sd(Dataset6.3$Stress_Pre)
## [1] 9.496524
The variable Stress_Pre had a mean of 65.87 and a standard deviation of 9.50.
mean(Dataset6.3$Stress_Post)
## [1] 57.90782
sd(Dataset6.3$Stress_Post)
## [1] 10.1712
The variable Stress_Post had a mean of 57.91 and a standard deviation of 10.17.
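The four calls above can also be collapsed into one summary table. The sketch below is only an illustration: `demo` is a hypothetical, simulated stand-in for Dataset6.3 (the Excel file is not available here), so its numbers will not match the values reported above.

```r
# Hypothetical stand-in for Dataset6.3; these values are simulated, not the real data.
set.seed(1)
demo <- data.frame(
  Stress_Pre  = rnorm(35, mean = 66, sd = 9.5),
  Stress_Post = rnorm(35, mean = 58, sd = 10)
)

# Apply mean() and sd() to every column; the result is a matrix with
# one row per statistic (M, SD) and one column per variable.
summary_table <- sapply(demo, function(x) c(M = mean(x), SD = sd(x)))

# Round to two decimals for reporting.
round(summary_table, 2)
```

With the real data, replacing `demo` with `Dataset6.3[, c("Stress_Pre", "Stress_Post")]` should give the same means and standard deviations as the separate calls.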
hist(Dataset6.3$Stress_Pre,
     main = "Stress_Pre",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
The variable “Stress_Pre” appears normally distributed. The data look roughly symmetrical, with most values in the middle, and the histogram follows a bell-shaped curve.
hist(Dataset6.3$Stress_Post,
     main = "Stress_Post",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
The variable “Stress_Post” appears normally distributed. The data look roughly symmetrical, with most values in the middle, and the histogram follows a bell-shaped curve.
shapiro.test(Dataset6.3$Stress_Pre)
##
## Shapiro-Wilk normality test
##
## data: Dataset6.3$Stress_Pre
## W = 0.98426, p-value = 0.8855
The Shapiro-Wilk p-value for the Stress_Pre normality test is greater than 0.05 (p = 0.89), so there is no evidence that the data depart from normality and the variable can be treated as normal.
shapiro.test(Dataset6.3$Stress_Post)
##
## Shapiro-Wilk normality test
##
## data: Dataset6.3$Stress_Post
## W = 0.96865, p-value = 0.4072
The Shapiro-Wilk p-value for the Stress_Post normality test is greater than 0.05 (p = 0.41), so there is no evidence that the data depart from normality and the variable can be treated as normal.
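The decision rule used for both variables (compare the Shapiro-Wilk p-value to 0.05) can be applied to both columns in one step. This is a minimal sketch on simulated data: `demo` and `alpha` are hypothetical names, and the p-values it prints are not the ones reported above.

```r
# Hypothetical stand-in for Dataset6.3; simulated values only.
set.seed(2)
demo <- data.frame(
  Stress_Pre  = rnorm(35, mean = 66, sd = 9.5),
  Stress_Post = rnorm(35, mean = 58, sd = 10)
)

alpha <- 0.05  # significance level used in the report

# shapiro.test() returns a list-like "htest" object; $p.value holds the p-value.
p_values <- sapply(demo, function(x) shapiro.test(x)$p.value)

# A p-value above alpha means normality is not rejected.
data.frame(
  variable         = names(p_values),
  p                = round(p_values, 3),
  reject_normality = p_values < alpha,
  row.names        = NULL
)
```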
cor.test(Dataset6.3$Stress_Pre, Dataset6.3$Stress_Post, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: Dataset6.3$Stress_Pre and Dataset6.3$Stress_Post
## t = 1.5355, df = 33, p-value = 0.1342
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08208415 0.54460746
## sample estimates:
## cor
## 0.258226
The Pearson correlation test was selected because both variables were normally distributed according to the histograms and the Shapiro-Wilk tests. The p-value (probability value) is 0.13, which is above .05. This means the result is not statistically significant, so we fail to reject the null hypothesis of no correlation. The correlation value is 0.26. The correlation is positive, which means that as Stress_Pre increases, Stress_Post tends to increase. The correlation value is less than 0.30, which means the relationship is weak.
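As a check on the output above, the t statistic, p-value, and confidence interval can be recomputed from the correlation alone. The sketch below uses only numbers printed by cor.test(): r = 0.258226 and df = 33, which implies n = 35. The variable names are illustrative, and the confidence interval uses the standard Fisher z-transformation.

```r
r  <- 0.258226   # sample correlation from the cor.test() output
df <- 33         # degrees of freedom from the cor.test() output
n  <- df + 2     # sample size implied by df = n - 2

# t statistic for a Pearson correlation: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z-transformation:
# z = atanh(r), standard error = 1 / sqrt(n - 3), then back-transform with tanh().
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

These values should agree with the reported t = 1.5355, p-value = 0.1342, and interval from -0.082 to 0.545, up to rounding. Because the interval includes 0, it is consistent with the non-significant result.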
ggscatter(
  Dataset6.3,
  x = "Stress_Pre",
  y = "Stress_Post",
  add = "reg.line",
  xlab = "Stress_Pre",
  ylab = "Stress_Post"
)
The line of best fit is pointing to the top right. This means the direction of the data is positive. As Stress_Pre increases, Stress_Post increases. The dots loosely hug the line. This means there is a weak relationship between the variables. The dots form a straight-line pattern. This means the data is linear. There may be a possible outlier (a student with relatively low Stress_Pre around the high 50s but very low Stress_Post around the mid-30s). However, the point does not appear to drastically change the overall trend and does not strongly affect the relationship between the variables.
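One way to back up the claim that a possible outlier does not drive the trend is to recompute the correlation with that point removed. The sketch below is an assumption-laden illustration, not the analysis performed above: `demo` is simulated, a low point is planted by hand, and the suspect is simply taken to be the row with the lowest Stress_Post.

```r
# Hypothetical stand-in for Dataset6.3; simulated values only.
set.seed(3)
demo <- data.frame(Stress_Pre = rnorm(35, mean = 66, sd = 9.5))
demo$Stress_Post <- 40 + 0.25 * demo$Stress_Pre + rnorm(35, mean = 0, sd = 9)

# Plant one unusually low point, loosely mimicking the one described in the scatterplot.
demo$Stress_Pre[1]  <- 58
demo$Stress_Post[1] <- 35

# Flag the row with the lowest Stress_Post as the suspect point.
suspect <- which.min(demo$Stress_Post)

# Correlation with all rows, then with the suspect row dropped.
r_all     <- cor(demo$Stress_Pre, demo$Stress_Post)
r_trimmed <- cor(demo$Stress_Pre[-suspect], demo$Stress_Post[-suspect])

round(c(with_point = r_all, without_point = r_trimmed), 3)
```

If the two correlations are close in size and have the same sign, the point has little influence on the relationship. A large change would suggest reporting both versions.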
Stress_Pre (M = 65.87, SD = 9.50) was not significantly correlated with Stress_Post (M = 57.91, SD = 10.17), r(33) = .26, p = .134. The relationship was positive and weak. As the independent variable increased, the dependent variable tended to increase.