DEPENDENT T-TEST & WILCOXON SIGNED-RANK

Used to test whether there is a difference between Before (Pre training) and After (Post training) scores by comparing the means.

NULL HYPOTHESIS (H0)

There is no difference between the Pre training and Post training scores.

ALTERNATE HYPOTHESIS (H1)

There is a difference between the Pre training and Post training scores.

#install.packages("readxl")
library(readxl)
dataset <- read_excel("C:\\Users\\navya\\Downloads\\A6R3.xlsx")

CALCULATE THE DIFFERENCE SCORES

Before <- dataset$PreTraining
After <- dataset$PostTraining

Differences <- After - Before

HISTOGRAM

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

# DIRECTIONS:

#QUESTION 1: Is the histogram symmetrical, positively skewed, or negatively skewed?
#ANSWER: The histogram appears positively skewed.

#QUESTION 2: Did the histogram look too flat, too tall, or did it have a proper bell curve?
#ANSWER: The histogram appears irregular and flattened, so it does not form a proper bell curve.
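
The skew and flatness questions above can also be checked numerically. The sketch below is a minimal illustration using moment-based skewness and excess kurtosis, which is only one of several common definitions. The helper names `skewness` and `excess_kurtosis` and the two small example vectors are assumptions made for illustration; they are not part of the original analysis or the A6R3 data.

```r
# Moment-based skewness and excess kurtosis (hypothetical helpers, base R only)
skewness <- function(x) {
  deviations <- x - mean(x)
  mean(deviations^3) / mean(deviations^2)^1.5
}
excess_kurtosis <- function(x) {
  deviations <- x - mean(x)
  mean(deviations^4) / mean(deviations^2)^2 - 3
}

# Small made-up vectors, chosen only to show the sign of each statistic
symmetric_scores    <- c(1, 2, 3, 4, 5)
right_tailed_scores <- c(1, 1, 1, 2, 10)

skewness(symmetric_scores)         # zero: deviations cancel out
skewness(right_tailed_scores)      # positive: long tail to the right
excess_kurtosis(symmetric_scores)  # negative: flatter than a bell curve
```

Applied to the real data, `skewness(Differences)` above zero would support a positive-skew reading, and a negative `excess_kurtosis(Differences)` would support a "too flat" reading.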

SHAPIRO-WILK TEST

shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.98773, p-value = 0.21
# DIRECTIONS: Answer the questions below directly in your code.

# QUESTION 1: Was the data normally distributed or abnormally distributed?
# If p > 0.05 (P-value is GREATER than .05) this means the data is NORMAL (continue with Dependent t-test).
# If p < 0.05 (P-value is LESS than .05) this means the data is NOT normal (switch to Wilcoxon Signed-Rank).
# ANSWER:
# Since the p-value is 0.21 (p > 0.05), there is no evidence that the difference scores depart from normality, so we can continue with the dependent t-test.
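
The decision rule above can be written as a small function. This is a minimal sketch, not part of the original assignment: `choose_paired_test` is a hypothetical helper, the 0.05 cutoff is taken from the directions, and the scores are simulated rather than read from the A6R3 file.

```r
# Hypothetical helper: picks the paired test from the Shapiro-Wilk p-value
# of the difference scores, following the directions above.
choose_paired_test <- function(before, after, alpha = 0.05) {
  differences <- after - before
  normality_p <- shapiro.test(differences)$p.value
  if (normality_p > alpha) {
    # No evidence against normality: use the dependent t-test
    result <- t.test(before, after, paired = TRUE)
  } else {
    # Normality rejected: use the Wilcoxon signed-rank test
    result <- wilcox.test(before, after, paired = TRUE)
  }
  list(normality_p = normality_p, test = result)
}

# Simulated scores for illustration only (not the A6R3 data)
set.seed(1)
sim_before <- rnorm(150, mean = 60, sd = 8)
sim_after  <- sim_before + rnorm(150, mean = 9.5, sd = 5)

outcome <- choose_paired_test(sim_before, sim_after)
outcome$test$method   # name of the test that was actually run
```

With the real data, the call would be `choose_paired_test(Before, After)`.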

BOXPLOT

boxplot(Differences,
        main = "Distribution of Score Differences (After - Before)",
        ylab = "Difference in Scores",
        col = "blue",
        border = "darkblue")

# DIRECTIONS: Answer the questions below directly in your code.

#QUESTION 1: How many dots are in your boxplot?
#A) No dots.
#B) One or two dots. 
#C) Many dots.
#ANSWER: B

#QUESTION 2: Where are the dots in your boxplot?
#A) There are no dots.
#B) Very close to the whiskers (lines of the boxplot).
#C) Far from the whiskers (lines of the boxplot).
#ANSWER: B

#QUESTION 3: Based on the dots and their location, is the data normal?
#ANSWER: Only one dot falls outside the whiskers and it sits close to them, so the data can be treated as normal.
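
The dots in a boxplot can be counted instead of judged by eye. The sketch below uses base R's `boxplot.stats()`, whose `$out` element holds the points drawn beyond the whiskers. The data are simulated, with two extreme values planted on purpose, so the counts say nothing about the A6R3 results.

```r
# Simulated difference scores for illustration only (not the A6R3 data);
# 40 and -25 are planted as deliberate outliers.
set.seed(7)
sim_differences <- c(rnorm(148, mean = 9.5, sd = 5), 40, -25)

box_summary <- boxplot.stats(sim_differences)

# Points plotted as dots beyond the whiskers
outliers <- box_summary$out
length(outliers)

# Whisker ends, useful for judging how far each dot sits from the whiskers
whiskers <- box_summary$stats[c(1, 5)]
```

For the real data, `boxplot.stats(Differences)$out` would list the dots behind the answers to Questions 1 and 2.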

DESCRIPTIVE STATISTICS

DESCRIPTIVES FOR BEFORE SCORES

mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150

DESCRIPTIVES FOR AFTER SCORES

mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150

DEPENDENT T-TEST

t.test(Before, After, paired = TRUE)
## 
##  Paired t-test
## 
## data:  Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -10.313424  -8.699909
## sample estimates:
## mean difference 
##       -9.506667
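
The reported numbers can be cross-checked by hand, since the paired t statistic is the mean difference divided by its standard error. The sketch below is an arithmetic check only: it copies the rounded values from the output above, so small rounding differences are expected, and the variable names are chosen for illustration.

```r
# Values copied from the t.test() output above
mean_diff <- -9.506667
t_stat    <- -23.285
n         <- 150

# t = mean_diff / (sd_diff / sqrt(n)), so the standard error is mean_diff / t
se_diff <- mean_diff / t_stat
sd_diff <- se_diff * sqrt(n)

# 95% confidence interval: mean_diff +/- critical t * standard error
t_crit <- qt(0.975, df = n - 1)
ci <- mean_diff + c(-1, 1) * t_crit * se_diff

round(se_diff, 3)
round(sd_diff, 2)
round(ci, 2)   # should land close to the reported interval
```

The recovered standard deviation of the differences is also the denominator of the paired Cohen's d, assuming d is computed as the mean difference divided by that standard deviation.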

DETERMINE STATISTICAL SIGNIFICANCE

# The p-value (p < 2.2e-16) is LESS than .05, so the difference between Before and After scores is statistically significant.

EFFECT SIZE FOR DEPENDENT T-TEST

#install.packages("effectsize")

LOAD THE PACKAGE

library(effectsize)

CALCULATE COHEN’S D

cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d |         95% CI
## --------------------------
## -1.90     | [-2.17, -1.63]
# QUESTIONS
#QUESTION 1: What is the size of the effect?
#Since the absolute value of Cohen's d is 1.90 (> 1.30), the effect size is very large.

#QUESTION 2: Which group had the higher average score?
#YOUR ANSWER: The after scores were higher. The effect size is negative because it was computed as Before minus After, and the After mean (69.24) is higher than the Before mean (59.73).
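
The sign of Cohen's d follows the order of the arguments, so it helps to see what happens when they are swapped. The sketch below assumes the paired d is the mean of the differences divided by their standard deviation, which is consistent with the values reported above. `paired_d` is a hypothetical base R helper rather than the `effectsize` function, and the scores are simulated.

```r
# Paired Cohen's d by hand: mean of the differences divided by their
# standard deviation (hypothetical helper, base R only)
paired_d <- function(x, y) {
  differences <- x - y
  mean(differences) / sd(differences)
}

# Simulated scores for illustration only (not the A6R3 data)
set.seed(42)
sim_before <- rnorm(150, mean = 60, sd = 8)
sim_after  <- sim_before + rnorm(150, mean = 9.5, sd = 5)

d_before_first <- paired_d(sim_before, sim_after)  # negative: After is higher
d_after_first  <- paired_d(sim_after, sim_before)  # same size, positive sign
```

A negative d with Before listed first therefore means the second group, After, has the higher average.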

Research Report on Results: Dependent t-test

#A dependent t-test was conducted to compare pre-training and post-training scores among the 150 participants. Post-training scores (M = 69.24, SD = 9.48) were significantly higher than pre-training scores (M = 59.73, SD = 7.97), t(149) = -23.29, p < .001, with an average increase of 9.51 points from before to after training. The effect size was very large (Cohen's d = -1.90, where the negative sign reflects the Before minus After ordering), indicating a strong and meaningful improvement in participant scores after training.