Load Libraries

library(readxl)
library(ggpubr)

## Loading required package: ggplot2

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Import Dataset

data <- read_excel("C:/Users/Admin/Downloads/Dataset6.4.xlsx")

Check data

head(data)

## # A tibble: 6 × 3
##   Student_ID Stress_Pre Stress_Post
##        <dbl>      <dbl>       <dbl>
## 1          1       53.5       45.5 
## 2          2       37.4       33.9 
## 3          3       35.8        9.49
## 4          4       89.0       82.8 
## 5          5       30.5       26.8 
## 6          6       42.5       26.9

str(data)

## tibble [35 × 3] (S3: tbl_df/tbl/data.frame)
##  $ Student_ID : num [1:35] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Stress_Pre : num [1:35] 53.5 37.4 35.8 89 30.5 ...
##  $ Stress_Post: num [1:35] 45.48 33.92 9.49 82.77 26.82 ...

Create Variables

Stress_Pre  <- data$Stress_Pre
Stress_Post <- data$Stress_Post

Differences <- Stress_Post - Stress_Pre
print(Differences)

##  [1]  -8.044361120  -3.529426078 -26.350753567  -6.276567859  -3.701374633
##  [6] -15.552084134  -4.356073741  -3.647770620  -0.008788193  -5.302543769
## [11]  -8.806996646  -3.200139192  -1.385585292 -19.179181634   3.899627158
## [16]  -6.612429249  -2.533382079  -1.317457905  -0.782853924 -32.441286182
## [21]  -4.132848699  -6.536275900  -6.896524983  -5.238701383 -36.697172771
## [26] -28.957287115  -9.817187881 -10.508880209 -19.386322836 -10.616491219
## [31] -13.700632408  -7.433337532 -20.087719095  -7.326665368 -15.099150128

Calculate descriptive statistics for each time point

cat("Pre-Test Mean: ", mean(Stress_Pre, na.rm = TRUE), "\n")

## Pre-Test Mean:  51.53601

cat("Pre-Test Median: ", median(Stress_Pre, na.rm = TRUE), "\n")

## Pre-Test Median:  47.24008

cat("Pre-Test SD: ", sd(Stress_Pre, na.rm = TRUE), "\n\n")

## Pre-Test SD:  17.21906

cat("Post-Test Mean: ", mean(Stress_Post, na.rm = TRUE), "\n")

## Post-Test Mean:  41.4913

cat("Post-Test Median: ", median(Stress_Post, na.rm = TRUE), "\n")

## Post-Test Median:  40.84836

cat("Post-Test SD: ", sd(Stress_Post, na.rm = TRUE), "\n")

## Post-Test SD:  18.88901
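The six separate cat() calls above can also be collapsed into one summary table. The sketch below is only an illustration: it uses the six rows printed by head(data), which are rounded, so its numbers will not match the full-sample statistics reported above. The names stress and desc are introduced here for the example.

```r
# First six rows as printed by head(data); values are rounded, so these
# results will differ from the full-sample statistics (n = 35).
stress <- data.frame(
  Stress_Pre  = c(53.5, 37.4, 35.8, 89.0, 30.5, 42.5),
  Stress_Post = c(45.5, 33.9, 9.49, 82.8, 26.8, 26.9)
)

# One column per time point, one row per statistic
desc <- sapply(stress, function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE))
})

print(round(desc, 2))
```

With the full dataset, the same sapply() call could be applied to data[, c("Stress_Pre", "Stress_Post")].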

Histogram of Difference Scores

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Difference (Post - Pre)",
     col = "lightblue",
     border = "black",
     breaks = 10)

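Interpretation

The histogram displays the distribution of the difference scores (Post - Pre). A roughly symmetric, bell-shaped histogram is consistent with the normality assumption; a skewed or irregular one suggests the assumption may be violated. Here, the printed difference scores include several large negative values (down to about -36.7) against a single small positive value (about 3.9), which points to a left-skewed distribution.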
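As a numeric complement to reading the histogram by eye, the skew of the difference scores can be checked directly. This is a minimal sketch, not part of the original analysis: the Differences vector is copied from the printed output above, and skewness() is a small helper defined here (a moment-based estimate), not a base R function.

```r
# Difference scores (Post - Pre), copied from the printed output above
Differences <- c(
  -8.044361120, -3.529426078, -26.350753567, -6.276567859, -3.701374633,
  -15.552084134, -4.356073741, -3.647770620, -0.008788193, -5.302543769,
  -8.806996646, -3.200139192, -1.385585292, -19.179181634, 3.899627158,
  -6.612429249, -2.533382079, -1.317457905, -0.782853924, -32.441286182,
  -4.132848699, -6.536275900, -6.896524983, -5.238701383, -36.697172771,
  -28.957287115, -9.817187881, -10.508880209, -19.386322836, -10.616491219,
  -13.700632408, -7.433337532, -20.087719095, -7.326665368, -15.099150128
)

# Hypothetical helper: moment-based sample skewness.
# Negative values indicate a longer left tail.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

skew <- skewness(Differences)
print(skew)

# A mean below the median is another sign of left skew
print(c(mean = mean(Differences), median = median(Differences)))
```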

Boxplot

boxplot(Differences,
        main = "Boxplot of Differences",
        col = "lightgreen",
        border = "black")

Interpretation

The boxplot helps identify potential outliers and shows the overall spread of the difference scores. Data points beyond the whiskers are potential outliers, and several of them, especially on one side, suggest that the normality assumption may be violated. The boxplot therefore suggests that the difference scores are not normally distributed.
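The points drawn beyond the whiskers can also be listed rather than read off the plot. The sketch below assumes the default whisker rule (1.5 times the interquartile range) and uses boxplot.stats() from base R's grDevices package; the Differences vector is copied from the printed output above.

```r
# Difference scores (Post - Pre), copied from the printed output above
Differences <- c(
  -8.044361120, -3.529426078, -26.350753567, -6.276567859, -3.701374633,
  -15.552084134, -4.356073741, -3.647770620, -0.008788193, -5.302543769,
  -8.806996646, -3.200139192, -1.385585292, -19.179181634, 3.899627158,
  -6.612429249, -2.533382079, -1.317457905, -0.782853924, -32.441286182,
  -4.132848699, -6.536275900, -6.896524983, -5.238701383, -36.697172771,
  -28.957287115, -9.817187881, -10.508880209, -19.386322836, -10.616491219,
  -13.700632408, -7.433337532, -20.087719095, -7.326665368, -15.099150128
)

# boxplot.stats() applies the same 1.5 * IQR rule that boxplot() uses by default
bs <- boxplot.stats(Differences)

print(bs$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(bs$out)    # observations lying beyond the whiskers
```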

Shapiro–Wilk Test of Normality

shapiro.test(Differences)

## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.87495, p-value = 0.0008963

Interpretation

The Shapiro–Wilk test evaluates whether the difference scores are normally distributed. If p > .05, there is no evidence against normality; if p < .05, normality is rejected. Here W = 0.875, p < .001, so the difference scores are not normally distributed.

Select the Correct Test

If the Shapiro–Wilk test is not significant (p > .05), a paired-samples t-test is appropriate. If it is significant (p < .05), a Wilcoxon signed-rank test is appropriate. Because the test was significant here, the Wilcoxon signed-rank test is used.
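The decision rule above can be written as a small function. This is only a sketch: choose_paired_test() and its alpha argument are hypothetical names introduced for illustration, and the Differences vector is copied from the printed output above.

```r
# Difference scores (Post - Pre), copied from the printed output above
Differences <- c(
  -8.044361120, -3.529426078, -26.350753567, -6.276567859, -3.701374633,
  -15.552084134, -4.356073741, -3.647770620, -0.008788193, -5.302543769,
  -8.806996646, -3.200139192, -1.385585292, -19.179181634, 3.899627158,
  -6.612429249, -2.533382079, -1.317457905, -0.782853924, -32.441286182,
  -4.132848699, -6.536275900, -6.896524983, -5.238701383, -36.697172771,
  -28.957287115, -9.817187881, -10.508880209, -19.386322836, -10.616491219,
  -13.700632408, -7.433337532, -20.087719095, -7.326665368, -15.099150128
)

# Hypothetical helper: names the test suggested by the Shapiro-Wilk result
choose_paired_test <- function(d, alpha = 0.05) {
  p <- shapiro.test(d)$p.value
  if (p < alpha) "Wilcoxon signed-rank test" else "paired-samples t-test"
}

print(choose_paired_test(Differences))
```

A rule like this is a convenience, not a substitute for looking at the histogram and boxplot as well.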

Statistical Test

wilcox.test(Stress_Pre, Stress_Post, paired = TRUE)

## 
##  Wilcoxon signed rank exact test
## 
## data:  Stress_Pre and Stress_Post
## V = 620, p-value = 2.503e-09
## alternative hypothesis: true location shift is not equal to 0
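To see where V comes from, the statistic can be rebuilt by hand: rank the absolute differences, then sum the ranks that belong to positive differences. The sketch below takes the differences as Pre - Post, which is the direction implied by the argument order in the call above, and uses the difference scores copied from the earlier printed output. It should reproduce the reported V; with 35 pairs the largest possible value is 35 × 36 / 2 = 630.

```r
# Difference scores (Post - Pre), copied from the printed output above
Differences <- c(
  -8.044361120, -3.529426078, -26.350753567, -6.276567859, -3.701374633,
  -15.552084134, -4.356073741, -3.647770620, -0.008788193, -5.302543769,
  -8.806996646, -3.200139192, -1.385585292, -19.179181634, 3.899627158,
  -6.612429249, -2.533382079, -1.317457905, -0.782853924, -32.441286182,
  -4.132848699, -6.536275900, -6.896524983, -5.238701383, -36.697172771,
  -28.957287115, -9.817187881, -10.508880209, -19.386322836, -10.616491219,
  -13.700632408, -7.433337532, -20.087719095, -7.326665368, -15.099150128
)

d <- -Differences        # Pre - Post, matching wilcox.test(Stress_Pre, Stress_Post, ...)
d <- d[d != 0]           # zero differences would be dropped (there are none here)

ranks <- rank(abs(d))    # rank the absolute differences
V <- sum(ranks[d > 0])   # sum of ranks for positive differences
print(V)

# Cross-check against the built-in one-sample signed-rank test on d
print(unname(wilcox.test(d)$statistic))
```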

Calculate the Effect Size (r = Z / √N for the Wilcoxon Signed-Rank Test)

library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

# Reshape to long format: one row per student per time point
df_long <- data.frame(
  id = rep(seq_along(Stress_Pre), 2),
  time = rep(c("Pre", "Post"), each = length(Stress_Pre)),
  stress = c(Stress_Pre, Stress_Post)
)

wilcox_effsize(df_long, stress ~ time, paired = TRUE, id = id)

## # A tibble: 1 × 7
##   .y.    group1 group2 effsize    n1    n2 magnitude
## * <chr>  <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 stress Post   Pre      0.844    35    35 large
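The rstatix documentation describes this effect size as r = Z / √N, where for paired data N is the number of pairs. The sketch below recomputes it by hand under a stated assumption: Z is taken from the normal approximation to the signed-rank statistic, with no continuity or tie correction, which may differ from how rstatix obtains Z internally. With the difference scores copied from the earlier printed output, the result should land close to the value in the table above.

```r
# Difference scores (Post - Pre), copied from the printed output above
Differences <- c(
  -8.044361120, -3.529426078, -26.350753567, -6.276567859, -3.701374633,
  -15.552084134, -4.356073741, -3.647770620, -0.008788193, -5.302543769,
  -8.806996646, -3.200139192, -1.385585292, -19.179181634, 3.899627158,
  -6.612429249, -2.533382079, -1.317457905, -0.782853924, -32.441286182,
  -4.132848699, -6.536275900, -6.896524983, -5.238701383, -36.697172771,
  -28.957287115, -9.817187881, -10.508880209, -19.386322836, -10.616491219,
  -13.700632408, -7.433337532, -20.087719095, -7.326665368, -15.099150128
)

d <- -Differences                 # Pre - Post
n <- length(d)                    # number of pairs
V <- sum(rank(abs(d))[d > 0])     # signed-rank statistic

# Normal approximation (assumption: no continuity or tie correction)
mu    <- n * (n + 1) / 4
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)
z     <- (V - mu) / sigma

r <- abs(z) / sqrt(n)             # effect size r = Z / sqrt(N)
print(round(r, 3))
```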

Interpretation

Stress scores differed significantly between the pre-test (Mdn = 47.24) and the post-test (Mdn = 40.85), V = 620, p < .001, with lower stress at post-test. The effect size was large (r = 0.84).

RQ - 4

Ismail Khan Mohammed

2026-02-18
