library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(effectsize)
library(rstatix)
## 
## Attaching package: 'rstatix'
## The following objects are masked from 'package:effectsize':
## 
##     cohens_d, eta_squared
## The following object is masked from 'package:stats':
## 
##     filter
  1. Dataset Information
Dataset6.4 <- read_excel("/Users/alexiaprudencio/Desktop/Applied Analytics 1/Assingment 6/Dataset6.4.xlsx")
  1. Data Separated by Condition
Before <- Dataset6.4$Stress_Pre
After <- Dataset6.4$Stress_Post

Differences <- After - Before
  1. Descriptive Statistics for Each Group
mean(Before, na.rm = TRUE)
## [1] 51.53601
median(Before, na.rm = TRUE)
## [1] 47.24008
sd(Before, na.rm = TRUE)
## [1] 17.21906
mean(After, na.rm = TRUE)
## [1] 41.4913
median(After, na.rm = TRUE)
## [1] 40.84836
sd(After, na.rm = TRUE)
## [1] 18.88901
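
The six separate calls above can be collected into one small table, which makes the two conditions easier to compare side by side. This is only a sketch: the `before` and `after` vectors are hypothetical stand-ins for `Stress_Pre` and `Stress_Post`, and `describe` is a helper name introduced here for illustration.

```r
# Hypothetical scores standing in for Stress_Pre and Stress_Post
before <- c(52, 47, 61, 39, 70, NA, 45)
after  <- c(44, 41, 55, 40, 58, 37, NA)

# Apply the same three summaries to one numeric vector, ignoring NAs
describe <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE))
}

# One column per condition, one row per statistic
descriptives <- sapply(list(Before = before, After = after), describe)
print(round(descriptives, 2))
```

With the real data, `list(Before = Before, After = After)` would take the place of the hypothetical vectors.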
  1. Histogram of the Difference Scores
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

The histogram of the difference scores appears negatively skewed: most values cluster on the right side, with a longer tail to the left. The distribution also looks leptokurtic, with a tall, thin peak.
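
The visual impression of skew and peakedness can be checked numerically. The sketch below computes moment-based skewness and excess kurtosis in base R. The vector `d` is hypothetical data with a long left tail, not the assignment's difference scores, and `moment_shape` is a helper name introduced here for illustration.

```r
# Hypothetical difference scores with a long left tail
d <- c(-30, -18, -12, -10, -9, -8, -8, -7, -6, -5)

# Moment-based skewness and excess kurtosis (population-moment form)
moment_shape <- function(x) {
  x  <- x[!is.na(x)]
  z  <- x - mean(x)
  m2 <- mean(z^2)
  c(skewness        = mean(z^3) / m2^1.5,
    excess_kurtosis = mean(z^4) / m2^2 - 3)
}

shape <- moment_shape(d)
print(round(shape, 3))
```

A negative skewness value corresponds to a longer left tail, and a positive excess kurtosis corresponds to a sharper peak and heavier tails than a normal distribution. Calling `moment_shape(Differences)` would apply the same check to the real scores.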

  1. Boxplot of the Difference Scores
boxplot(Differences,
        main = "Distribution of Score Differences (After - Before)",
        ylab = "Difference in Scores",
        col = "lightblue",
        border = "darkblue")

There are outliers: two points fall outside the boxplot, beyond the whiskers and reasonably far from them. This suggests the difference scores are not normally distributed, so a Wilcoxon signed-rank test would most likely be used.
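
The points drawn beyond the whiskers can also be listed by value instead of being read off the plot. A minimal sketch using base R's `boxplot.stats()`, which applies the same 1.5 × IQR whisker rule that `boxplot()` uses by default; the vector `d` is hypothetical and only stands in for `Differences`.

```r
# Hypothetical difference scores; two values sit far below the rest
d <- c(-40, -35, -12, -11, -10, -10, -9, -8, -8, -7, -6, -5)

# boxplot.stats() returns the five-number summary and the flagged points
stats    <- boxplot.stats(d)
outliers <- stats$out
print(outliers)

# Positions of the flagged observations in the original vector
outlier_idx <- which(d %in% outliers)
print(outlier_idx)
```

Running `boxplot.stats(Differences)$out` on the real data should return the same two points that appear as dots in the plot.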

  1. Shapiro-Wilk Test of Normality
shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.87495, p-value = 0.0008963

Because p = 0.0008963 < .05, the Shapiro-Wilk test rejects the hypothesis that the difference scores are normally distributed, so a Wilcoxon signed-rank test will be used instead of a paired t-test.
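
The decision rule described here can be written out explicitly. The sketch below is one way to do it, not a required procedure: the paired scores are simulated, `alpha` is set to the conventional .05, and the choice between the two tests rests only on the Shapiro-Wilk p-value.

```r
set.seed(1)

# Simulated paired scores (hypothetical, not the assignment data)
before <- rnorm(30, mean = 50, sd = 10)
after  <- before - rexp(30, rate = 1 / 8)   # skewed reductions
d      <- after - before

alpha <- 0.05
sw    <- shapiro.test(d)

# Non-normal differences: signed-rank test; otherwise: paired t-test
if (sw$p.value < alpha) {
  result <- wilcox.test(before, after, paired = TRUE)
} else {
  result <- t.test(before, after, paired = TRUE)
}

print(result$method)
print(result$p.value)
```

In practice the histogram and boxplot should be weighed alongside the Shapiro-Wilk result, since the test is sensitive to sample size.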

  1. Inferential Test - Wilcoxon Signed-Rank
wilcox.test(Before, After, paired = TRUE)
## 
##  Wilcoxon signed rank exact test
## 
## data:  Before and After
## V = 620, p-value = 2.503e-09
## alternative hypothesis: true location shift is not equal to 0
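
To see where V comes from: `wilcox.test(x, y, paired = TRUE)` ranks the absolute values of `x - y` and sums the ranks that belong to positive differences. The sketch below reproduces that by hand on a small hypothetical sample (not the assignment data) and compares it with the function's own statistic.

```r
# Hypothetical paired scores with no ties and no zero differences
before <- c(52, 47, 61, 39, 70, 45, 58)
after  <- c(44, 41, 56, 40, 58, 38, 49)

# The paired test works on x - y, here Before - After
d <- before - after
d <- d[d != 0]                 # zero differences are dropped

r        <- rank(abs(d))       # rank the absolute differences
V_manual <- sum(r[d > 0])      # sum of ranks for positive differences

V_test <- unname(wilcox.test(before, after, paired = TRUE)$statistic)

print(c(manual = V_manual, wilcox.test = V_test))
```

For the reported result, and assuming all 35 pairs entered the test with no zero differences, the largest possible V is 35 × 36 / 2 = 630. V = 620 therefore means that almost all of the rank sum comes from pairs where the Before score is higher than the After score.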
  1. The Effect Size - Wilcoxon r (Z/√N)
# Reshape to long format: one row per participant per time point
df_long <- data.frame(
  id = rep(seq_along(Before), 2),
  time = rep(c("Before", "After"), each = length(Before)),
  score = c(Before, After)
)

wilcox_effsize(df_long, score ~ time, paired = TRUE)
## # A tibble: 1 × 7
##   .y.   group1 group2 effsize    n1    n2 magnitude
## * <chr> <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 score After  Before   0.844    35    35 large
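
The `effsize` value reported by `wilcox_effsize()` is r = Z/√N, which is not the same statistic as the matched-pairs rank-biserial correlation. The sketch below computes both by hand in base R so the difference is visible. The data are hypothetical, and the Z used here is the plain normal approximation without continuity or tie correction, so it may differ slightly from what a package reports.

```r
# Hypothetical paired scores with no ties and no zero differences
before <- c(52, 47, 61, 39, 70, 45, 58)
after  <- c(44, 41, 56, 40, 58, 38, 49)

d <- before - after
d <- d[d != 0]
r <- rank(abs(d))

w_pos <- sum(r[d > 0])   # rank sum of positive differences (this is V)
w_neg <- sum(r[d < 0])   # rank sum of negative differences

# Matched-pairs rank-biserial correlation
r_rb <- (w_pos - w_neg) / (w_pos + w_neg)

# r = |Z| / sqrt(N), with Z from the normal approximation to V
n   <- length(d)
z   <- (w_pos - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
r_z <- abs(z) / sqrt(n)

print(round(c(rank_biserial = r_rb, r_from_z = r_z), 3))
```

Applying the same arithmetic to the reported output, and assuming 35 complete pairs with no zero differences, V = 620 leaves a negative rank sum of 630 − 620 = 10, so the rank-biserial correlation would be (620 − 10) / 630 ≈ 0.968, while the Z-based formula gives roughly 0.84, in line with the tibble above. The already loaded effectsize package also provides `rank_biserial()`, for example `effectsize::rank_biserial(Before, After, paired = TRUE)`; its sign convention and confidence interval should be checked against its documentation.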
  1. Report the Results

There was a significant difference in stress scores between Before (Stress_Pre, Mdn = 47.24) and After (Stress_Post, Mdn = 40.85), V = 620, p < .001, with lower scores after than before. The effect size was large (r = 0.844).