Independent t-test

This analysis is for Research Scenario 1 from Assignment 6. The purpose of this analysis was to determine whether there is a statistically significant difference in the mean number of headache days between two independent groups: participants taking Medication A and participants taking Medication B.

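Hypotheses

Null hypothesis (H0): There is no difference in the mean number of headache days between participants taking Medication A and participants taking Medication B.

Alternative hypothesis (H1): There is a difference in the mean number of headache days between participants taking Medication A and participants taking Medication B.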

Final Report

An independent-samples t-test was conducted to compare the number of headache days between participants taking Medication A (n = 50) and participants taking Medication B (n = 50). Participants taking Medication A reported significantly fewer headache days (M = 8.10, SD = 2.81) than those taking Medication B (M = 12.60, SD = 3.59), t(98) = -6.99, p < .001, 95% CI of the mean difference [-5.78, -3.22]. The effect size was very large (Cohen’s d = -1.40, 95% CI [-1.83, -0.96]), indicating that Medication A was substantially more effective at reducing headache frequency than Medication B. On average, participants taking Medication A reported 4.5 fewer headache days than those taking Medication B.

R code and Analysis

Import Excel File

Purpose: Import the Excel dataset into R to conduct the analyses.

# INSTALL REQUIRED PACKAGE

# install.packages("readxl")

# LOAD THE PACKAGE

library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO

A6R1 <- read_excel("C:/Users/konifade/Downloads/A6R1.xlsx")

head(A6R1)
## # A tibble: 6 × 3
##   ParticipantID Medication HeadacheDays
##           <dbl> <chr>             <dbl>
## 1             1 A                     6
## 2             2 A                     7
## 3             3 A                    13
## 4             4 A                     8
## 5             5 A                     8
## 6             6 A                    13

Calculate The Descriptive Statistics

# install.packages("dplyr")

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
A6R1 %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50

Check Normal Distribution

# HISTOGRAMS
# Purpose: Visually check the normality of the scores for each group.

hist(A6R1$HeadacheDays[A6R1$Medication == "A"],
     main = "Histogram of Headache Days: Medication A",
     xlab = "Headache Days",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(A6R1$HeadacheDays[A6R1$Medication == "B"],
     main = "Histogram of Headache Days: Medication B",
     xlab = "Headache Days",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# SHAPIRO-WILK TEST
# Purpose: Statistically check the normality of each group's scores.
 
shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "A"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R1$HeadacheDays[A6R1$Medication == "A"]
## W = 0.97852, p-value = 0.4913
shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "B"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R1$HeadacheDays[A6R1$Medication == "B"]
## W = 0.98758, p-value = 0.8741
# BOXPLOT
# Purpose: Check for any outliers impacting the mean for each group's scores.

# install.packages("ggplot2")

# install.packages("ggpubr")

library(ggplot2)

library(ggpubr)

ggboxplot(A6R1, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")

Histogram Interpretation

  • Q1: The histogram for Medication A appears symmetrical, with no noticeable skewness.

  • Q2: The histogram for Medication A is roughly bell-shaped, suggesting an approximately normal distribution.

  • Q3: The histogram for Medication B appears symmetrical, with no noticeable skewness.

  • Q4: The histogram for Medication B is nearly bell-shaped, suggesting approximate normality.

Shapiro-Wilk Normality Test Interpretation

  • Both groups had Shapiro-Wilk p-values greater than .05 (Medication A: W = 0.979, p = .491; Medication B: W = 0.988, p = .874), so there is no evidence that headache days in either group depart from a normal distribution. The assumption of normality is therefore met, and an Independent t-test is appropriate.
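
The per-group checks above can also be collected into a single table. What follows is a minimal sketch, not the code used for this report: `normality_by_group` is a hypothetical helper, and `demo` is simulated data standing in for A6R1, so the values it prints will not match the results above.

```r
# Simulated stand-in for A6R1 (illustrative values only, not the real data)
set.seed(123)
demo <- data.frame(
  Medication   = rep(c("A", "B"), each = 50),
  HeadacheDays = c(rnorm(50, mean = 8, sd = 3), rnorm(50, mean = 12.5, sd = 3.5))
)

# Hypothetical helper: run shapiro.test() on each group, one row per group
normality_by_group <- function(data, outcome, group) {
  pieces <- split(data[[outcome]], data[[group]])
  rows <- lapply(names(pieces), function(g) {
    res <- shapiro.test(pieces[[g]])
    data.frame(
      group   = g,
      W       = unname(res$statistic),
      p_value = res$p.value,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

normality_table <- normality_by_group(demo, "HeadacheDays", "Medication")
print(normality_table)
```

With the real data loaded, the equivalent call would be `normality_by_group(A6R1, "HeadacheDays", "Medication")`.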

Boxplot Interpretation

  • The boxplot showed two outliers, both close to the whiskers. Because they are not extreme, they are unlikely to distort the group means or the overall distribution.
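
To see which scores a boxplot would mark as outliers, the usual 1.5 × IQR rule can be applied directly. This is a sketch under assumptions: `flag_outliers` is a hypothetical helper, the rule is assumed to match the one used by the plot, and `example_scores` is made-up data rather than values from A6R1.

```r
# Hypothetical helper: return values lying more than 1.5 * IQR beyond the quartiles
flag_outliers <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x[x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr]
}

# Made-up scores for illustration only (not taken from A6R1)
example_scores <- c(6, 7, 8, 8, 9, 9, 10, 25)
example_outliers <- flag_outliers(example_scores)
print(example_outliers)
```

With the real data, `tapply(A6R1$HeadacheDays, A6R1$Medication, flag_outliers)` would list the flagged scores for each medication group.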

Independent t-test

Purpose:

Test if there was a difference between the means of the two groups.

t.test(HeadacheDays ~ Medication, data = A6R1, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B 
##             8.1            12.6
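# DETERMINE STATISTICAL SIGNIFICANCE
# p = 3.431e-10 is less than .05, so the difference between the group means is statistically significant.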
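
As a cross-check on the output above, the pooled-variance t statistic can be recomputed from the group means, standard deviations, and sample sizes alone. This is a minimal sketch: `pooled_t_from_summary` is a hypothetical helper, and because it uses the rounded values from the descriptive statistics table, its results should land close to, but not exactly on, the `t.test()` output.

```r
# Hypothetical helper: pooled-variance (Student) t-test from summary statistics
pooled_t_from_summary <- function(m1, s1, n1, m2, s2, n2, conf_level = 0.95) {
  df <- n1 + n2 - 2
  # Pooled variance weights each group's variance by its degrees of freedom
  pooled_var <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
  se <- sqrt(pooled_var * (1 / n1 + 1 / n2))
  diff <- m1 - m2
  t_stat <- diff / se
  # Two-sided p-value, matching the "not equal to 0" alternative
  p_value <- 2 * pt(-abs(t_stat), df)
  margin <- qt(1 - (1 - conf_level) / 2, df) * se
  list(
    t = t_stat,
    df = df,
    p_value = p_value,
    conf_int = c(diff - margin, diff + margin)
  )
}

# Rounded values from the descriptive statistics table (Medication A first, then B)
t_check <- pooled_t_from_summary(8.10, 2.81, 50, 12.60, 3.59, 50)
print(t_check)
```

The helper puts Medication A first, so the sign of the difference matches the A minus B ordering in the `t.test()` output.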

Effect Size

Purpose:

Determine how big of a difference there was between the group means.

# install.packages("effectsize")

library(effectsize)


cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = A6R1, pooled_sd = TRUE)
print(cohens_d_result)
## Cohen's d |         95% CI
## --------------------------
## -1.40     | [-1.83, -0.96]
## 
## - Estimated using pooled SD.

Effect Size Interpretation

  • A Cohen’s d of -1.40 indicates a very large effect size: the group means differ by about 1.4 pooled standard deviations. The sign is negative because the Medication A mean is lower than the Medication B mean.
  • Medication B had the higher mean number of headache days (M = 12.60, compared with M = 8.10 for Medication A), which suggests that Medication A was more effective at reducing headaches than Medication B.
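
Cohen's d with a pooled standard deviation can also be approximated by hand from the summary statistics. This is a rough check rather than a replacement for `cohens_d()`: `cohens_d_from_summary` is a hypothetical helper, and it uses the rounded means and standard deviations reported above.

```r
# Hypothetical helper: Cohen's d using the pooled standard deviation
cohens_d_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  pooled_sd <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
  (m1 - m2) / pooled_sd
}

# Rounded values from the descriptive statistics table (Medication A first, then B)
d_check <- cohens_d_from_summary(8.10, 2.81, 50, 12.60, 3.59, 50)
print(round(d_check, 2))
```

Swapping the group order changes only the sign of d, not its size, so the direction should always be read alongside the group means.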