R Markdown

# DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST
# Used to test whether there is a difference between Before scores and After scores (comparing the means).

# NULL HYPOTHESIS (H0)
# The null hypothesis is ALWAYS used.
# There is no difference between the Before scores and After scores.

# ALTERNATE HYPOTHESIS (H1)
# After scores are higher than Before scores.

# Choose ONE of the three options below (based on your research scenario):

# 1) NON-DIRECTIONAL ALTERNATE: There is a difference between the Before scores and After scores.

# 2) DIRECTIONAL ALTERNATE HYPOTHESIS ONE: Before scores are higher than After scores.
# 3) DIRECTIONAL ALTERNATE HYPOTHESIS TWO: After scores are higher than Before scores.
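
# OPTIONAL EXAMPLE: MATCHING THE TEST TO YOUR ALTERNATE HYPOTHESIS
# A minimal sketch, not part of the original assignment. It assumes the
# `alternative` argument of t.test() is used to express the three options above.
# demo_before and demo_after are made-up scores for illustration only.
# Because the call is t.test(before, after, ...), R tests the mean of
# (before minus after), so "less" corresponds to After scores being higher.

demo_before <- c(52, 60, 58, 65, 61, 57, 63, 59)
demo_after  <- c(58, 65, 61, 72, 70, 59, 71, 63)

# 1) Non-directional: there is a difference
two_sided <- t.test(demo_before, demo_after, paired = TRUE,
                    alternative = "two.sided")

# 2) Directional: Before scores are higher than After scores
before_higher <- t.test(demo_before, demo_after, paired = TRUE,
                        alternative = "greater")

# 3) Directional: After scores are higher than Before scores
after_higher <- t.test(demo_before, demo_after, paired = TRUE,
                       alternative = "less")

# Compare the three p-values side by side
c(two_sided = two_sided$p.value,
  before_higher = before_higher$p.value,
  after_higher = after_higher$p.value)
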


# IMPORT EXCEL FILE
# Purpose: Import your Excel dataset into R to conduct analyses.

# INSTALL REQUIRED PACKAGE
# If never installed, remove the hashtag before the install code.
# If previously installed, leave the hashtag in front of the code.

#install.packages("readxl")

# LOAD THE PACKAGE
# Always reload the package you want to use. 

library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO
# Download the Excel file from OneDrive and save it to your desktop.
# Right-click the Excel file and click “Copy as path” from the menu.
# In RStudio, replace the example path below with your actual path.
# Replace backslashes \ with forward slashes / or double them \\:
# ✘ WRONG   "C:\Users\Joseph\Desktop\mydata.xlsx"
# ✔ CORRECT "C:/Users/Joseph/Desktop/mydata.xlsx"
# ✔ CORRECT "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"
# Replace A6R3 with the name of your Excel dataset (without the .xlsx) and update the path to match.

A6R3 <- read_excel("C:/Users/Sindhu/Downloads/A6R3.xlsx")
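
# OPTIONAL EXAMPLE: CHECKING A FILE PATH BEFORE IMPORTING
# A minimal sketch using base R only. The path is the example path from the
# comments above, not a real file. The raw string syntax r"(...)" is an
# assumption about your setup: it requires R 4.0.0 or later.

doubled_path <- "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"   # doubled backslashes
raw_path     <- r"(C:\Users\Joseph\Desktop\mydata.xlsx)"    # backslashes kept as typed

# Both spellings produce the same string
identical(doubled_path, raw_path)

# Convert backslashes to forward slashes
forward_path <- gsub("\\", "/", doubled_path, fixed = TRUE)
forward_path

# file.exists() returns FALSE if the path is mistyped or the file is missing
file.exists(forward_path)
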


# CALCULATE THE DIFFERENCE SCORES
# Calculate the difference between the Before scores and the After scores (After minus Before).

# RENAME THE VARIABLES
# Replace A6R3 with your dataset name (without .xlsx).
# Replace PreTraining with the name of your variable for before scores.
# Replace PostTraining with the name of your variable for after scores.

Before <- A6R3$PreTraining
After <- A6R3$PostTraining

Differences <- After - Before
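
# OPTIONAL EXAMPLE: DIFFERENCE SCORES WITH MISSING VALUES
# A minimal sketch with made-up scores (demo_pre and demo_post are hypothetical).
# A difference is NA whenever either score in the pair is missing, so the
# number of usable pairs can be smaller than the number of rows.

demo_pre  <- c(50, 62, NA, 71)
demo_post <- c(55, 60, 68, NA)

demo_differences <- demo_post - demo_pre    # After minus Before
demo_differences

# Count the pairs where both scores are present
usable_pairs <- sum(complete.cases(demo_pre, demo_post))
usable_pairs
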


# HISTOGRAM
# Create a histogram for difference scores to visually check skewness and kurtosis.


# CREATE THE HISTOGRAMS
# You do not need to edit this code.

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)
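
# OPTIONAL EXAMPLE: PUTTING NUMBERS ON SKEWNESS AND KURTOSIS
# A minimal sketch in base R, not part of the original assignment. skew_kurt()
# is a hypothetical helper that uses the simple moment-based formulas; packages
# that apply small-sample corrections may give slightly different values.
# Skewness near 0 suggests symmetry. Excess kurtosis near 0 suggests a
# bell-shaped peak (negative = flatter, positive = taller).

skew_kurt <- function(x) {
  x <- x[!is.na(x)]                                   # drop missing values
  z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))    # standardize with the population SD
  c(skewness = mean(z^3), excess_kurtosis = mean(z^4) - 3)
}

skew_kurt(c(-3, -1, 0, 1, 3))    # symmetric made-up data
skew_kurt(c(1, 1, 2, 2, 10))     # made-up data with a long right tail

# To apply it to this script's data: skew_kurt(Differences)
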

# WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Is the histogram symmetrical, positively skewed, or negatively skewed?
# Negatively skewed.
# Q2) Did the histogram look too flat, too tall, or did it have a proper bell curve?
# Proper bell curve.


# SHAPIRO-WILK TEST
# Check the normality of the difference scores.

# CONDUCT SHAPIRO-WILK TEST
# You do not need to edit the code.

shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.98773, p-value = 0.21
# QUESTIONS
# Answer the questions below as a comment within the R script:
# Q1) Was the data normally distributed or not normally distributed?
# If p > 0.05 (p-value is GREATER than .05), the data is NORMAL (continue with the Dependent t-test).
# If p < 0.05 (p-value is LESS than .05), the data is NOT normal (switch to the Wilcoxon Signed-Rank test).
# Answer: Normal (W = 0.99, p = .21), so continue with the Dependent t-test.
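
# OPTIONAL EXAMPLE: TURNING THE SHAPIRO-WILK RESULT INTO A DECISION
# A minimal sketch of the rule above. choose_test() is a hypothetical helper,
# and the two demo vectors are artificial (evenly spaced quantiles), not real scores.

choose_test <- function(differences, alpha = 0.05) {
  p <- shapiro.test(differences)$p.value
  if (p > alpha) "Dependent t-test" else "Wilcoxon Signed-Rank"
}

demo_normal <- qnorm(ppoints(50), mean = 9, sd = 5)   # bell-shaped by construction
demo_skewed <- qexp(ppoints(50))                      # strongly right-skewed

choose_test(demo_normal)
choose_test(demo_skewed)

# To apply it to this script's data: choose_test(Differences)
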

# BOXPLOT
# Check for any outliers impacting the mean. 

# CREATE THE BOXPLOT
# You do not need to edit this code

boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))
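
# OPTIONAL EXAMPLE: LISTING THE OUTLIERS SHOWN AS DOTS
# A minimal sketch with made-up scores (demo_scores is hypothetical).
# boxplot.stats() flags values more than 1.5 box lengths beyond the box,
# which is the default rule boxplot() uses to draw the dots.

demo_scores <- c(55, 58, 60, 61, 62, 63, 65, 95)

demo_outliers <- boxplot.stats(demo_scores)$out   # values that would be drawn as dots
demo_outliers

# A large gap between the mean and the median suggests the outliers
# are pulling the mean
mean(demo_scores)
median(demo_scores)

# To apply it to this script's data: boxplot.stats(Before)$out and boxplot.stats(After)$out
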

# QUESTIONS
# Answer the questions below as a comment within the R script:
# Q1) Were there any dots outside of the boxplots? These dots represent participants with extreme scores.
# Yes.
# Q2) If there are outliers, are they changing the mean so much that the mean no longer accurately represents the average score?
# No.
# Q3) Make a decision. If the outliers are extreme, you will need to switch to a Wilcoxon Signed-Rank test.
# If there are no outliers, or the outliers are not extreme, continue with the Dependent t-test.
# Decision: Stay with the Dependent t-test.
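
# OPTIONAL EXAMPLE: WILCOXON SIGNED-RANK TEST
# A minimal sketch of the alternative test, using base R's wilcox.test().
# demo_before and demo_after are made-up scores for illustration only.

demo_before <- c(52, 60, 58, 65, 61, 57, 63, 59)
demo_after  <- c(58, 65, 61, 72, 70, 59, 71, 63)

demo_wilcoxon <- wilcox.test(demo_before, demo_after, paired = TRUE)
demo_wilcoxon

# To apply it to this script's data: wilcox.test(Before, After, paired = TRUE)
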

# DESCRIPTIVE STATISTICS
# Calculate the mean, median, SD, and sample size for each group.
# Before scores: mean = 59.73, median = 60, SD = 7.97, sample size = 150.
# After scores: mean = 69.24, median = 69.5, SD = 9.48, sample size = 150.

# DESCRIPTIVES FOR BEFORE SCORES
# You do not need to edit this code

mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150
# DESCRIPTIVES FOR AFTER SCORES
# You do not need to edit this code

mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150
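
# OPTIONAL EXAMPLE: ALL FOUR DESCRIPTIVES IN ONE CALL
# A minimal sketch, not part of the original assignment. describe_scores() is a
# hypothetical helper. Note that length() counts missing values, so the helper
# counts only non-missing scores for the sample size.

describe_scores <- function(x) {
  c(M   = mean(x, na.rm = TRUE),
    Mdn = median(x, na.rm = TRUE),
    SD  = sd(x, na.rm = TRUE),
    n   = sum(!is.na(x)))
}

demo_result <- describe_scores(c(50, 60, NA, 70))   # made-up scores with one missing value
round(demo_result, 2)

# To apply it to this script's data: round(describe_scores(Before), 2)
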
# DEPENDENT T-TEST
# Note: The Dependent t-test is also called the Paired Samples t-test.
# You do not need to edit this code.
# Note: t.test(Before, After, ...) tests Before minus After, so a negative t-value and mean difference mean the After scores were higher.

t.test(Before, After, paired = TRUE)
## 
##  Paired t-test
## 
## data:  Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -10.313424  -8.699909
## sample estimates:
## mean difference 
##       -9.506667
# DETERMINE STATISTICAL SIGNIFICANCE
# If results were statistically significant (p < .05), continue to effect size section below.
# If results were NOT statistically significant (p > .05), skip to reporting section below.
# NOTE: Getting results that are not statistically significant does NOT mean you switch to the Wilcoxon Signed-Rank test.
# The Wilcoxon Signed-Rank test is for difference scores that are not normally distributed or have extreme outliers; the choice is not based on outcome significance.


# EFFECT SIZE FOR DEPENDENT T-TEST

# Purpose: Determine how big of a difference there was between the group means.

# INSTALL REQUIRED PACKAGE
# If never installed, remove the hashtag before the install code.
# If previously installed, leave the hashtag in front of the code.

#install.packages("effectsize")

# LOAD THE PACKAGE
# Always reload the package you want to use. 

library(effectsize)

# CALCULATE COHEN’S D
# You do not need to edit the code.

cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d |         95% CI
## --------------------------
## -1.90     | [-2.17, -1.63]
# QUESTIONS
# Answer the questions below as a comment within the R script:
#
# Q1) What is the size of the effect?
# The effect means how big or small was the difference between the group averages.
# ± 0.00 to 0.19 = ignore
# ± 0.20 to 0.49 = small
# ± 0.50 to 0.79 = moderate
# ± 0.80 to 1.29 = large
# ± 1.30 and above = very large
# Examples:
# A Cohen's d of 0.10 indicates the difference between the group averages was not truly meaningful. There was no effect.
# A Cohen's d of 0.22 indicates the difference between the group averages was small.
# Answer: d = -1.90, so there is a very large difference between the Before and After scores.
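
# OPTIONAL EXAMPLE: COHEN'S D BY HAND AND ITS LABEL
# A minimal sketch, not part of the original assignment. It assumes the paired
# Cohen's d is the mean of the differences divided by their SD, which is
# consistent with the output above (t / sqrt(n) = -23.285 / sqrt(150), about -1.90).
# paired_d() and label_d() are hypothetical helpers; the cut points are the
# ones listed in the table above. The demo scores are made up.

paired_d <- function(x, y) {
  d <- x - y                  # same order as cohens_d(x, y): x minus y
  d <- d[!is.na(d)]
  mean(d) / sd(d)
}

label_d <- function(d) {
  as.character(cut(abs(d),
                   breaks = c(0, 0.20, 0.50, 0.80, 1.30, Inf),
                   labels = c("ignore", "small", "moderate", "large", "very large"),
                   right = FALSE))
}

demo_before <- c(52, 60, 58, 65, 61, 57, 63, 59)
demo_after  <- c(58, 65, 61, 72, 70, 59, 71, 63)

demo_d <- paired_d(demo_before, demo_after)
demo_d
label_d(demo_d)
label_d(-1.90)    # the value reported in the output above
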

# Q2) Which group had the higher average score?
# cohens_d(Before, After) and t.test(Before, After) both compute Before minus After.
# If the value is negative, it means the After scores were higher.
# If it is positive, it means the Before scores were higher.
# You can also easily look at the means and tell which scores were higher.

# After scores were higher than Before scores (M = 69.24 vs. M = 59.73).


# Research Report on Results: Dependent t-test
# Goal: Write a paragraph summarizing your findings

# Directions:

# For your results summary, you should report the following information:
# 1. The name of the inferential test used (Dependent t-test or Paired Samples t-test)
# 2. The names of the two related conditions or time points you analyzed (use proper labels)
# 3. The sample size (n)
# 4. Whether the test was statistically significant (p < .05) or not (p > .05)
# 5. The mean (M) and standard deviation (SD) for each condition
# 6. Whether scores significantly increased, decreased, or stayed the same across time/conditions
# 7. Degrees of freedom (df)
# 8. t-value
# 9. EXACT p-value to three decimals. NOTE: If p > .05, just report p > .05. If p < .001, just report p < .001.
# 10. If there was a significant difference, report the effect size (Cohen’s d) and interpretation (small, moderate, large, very large)

# Example:
# A dependent t-test was conducted to compare driving anxiety levels before and after taking a defensive driving course 
# among 30 participants. Results showed that post-course anxiety scores (M = 3.12, SD = 0.85) were significantly lower 
# than pre-course scores (M = 4.20, SD = 1.01), t(29) = 4.57, p < .001. The effect size was Cohen’s d = 0.83, indicating 
# a large effect. These results suggest that the defensive driving course significantly reduced driving anxiety.
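
# OPTIONAL EXAMPLE: FORMATTING THE P-VALUE FOR THE REPORT
# A minimal sketch of rule 9 above. format_p() is a hypothetical helper, and
# dropping the leading zero is an assumption based on the style of the example
# paragraph.

format_p <- function(p) {
  if (p < .001) {
    "p < .001"
  } else if (p > .05) {
    "p > .05"
  } else {
    # exact value to three decimals, without the leading zero
    paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
  }
}

format_p(0.0312)       # made-up p-values for illustration
format_p(0.0000004)
format_p(0.21)
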