Research Scenario 1

A medical research team created a new medication to reduce headaches (Medication A). They want to determine whether Medication A is more effective at reducing headaches than the current medication on the market (Medication B). Participants were randomly assigned to take either Medication A or Medication B. Data were collected for 30 days through an app, and participants reported each day whether or not they had a headache. Was there a difference in the number of headache days between the groups?


Hypotheses

Null Hypothesis (H₀)

There is no difference in the number of headache days between participants taking Medication A and those taking Medication B.

Alternate Hypothesis (H₁)

There is a difference in the number of headache days between participants taking Medication A and those taking Medication B.


Results Paragraph

An independent-samples t-test was conducted to compare the number of headache days between participants taking Medication A and participants taking Medication B. Participants taking Medication A (M = 8.10, SD = 2.81, n = 50) reported significantly fewer headache days over the 30-day period than participants taking Medication B (M = 12.60, SD = 3.59, n = 50), t(98) = -6.99, p < .001, 95% CI for the mean difference [-5.78, -3.22]. The null hypothesis was therefore rejected. The effect size was very large (Cohen’s d = -1.40, 95% CI [-1.83, -0.96]), indicating that participants taking Medication A had substantially fewer headache days than those taking Medication B.
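
The reported t statistic and Cohen's d can be cross-checked by hand from the group means, standard deviations, and sample sizes. The sketch below is only a sanity check, not part of the original analysis: the variable names are illustrative, and because it uses the rounded summary values rather than the raw data, the results may differ from the reported figures in the second decimal place.

```r
# Summary statistics as reported in the results paragraph (rounded values)
m_a <- 8.10;  sd_a <- 2.81; n_a <- 50   # Medication A
m_b <- 12.60; sd_b <- 3.59; n_b <- 50   # Medication B

# Degrees of freedom for the equal-variance (Student's) t-test
df_total <- n_a + n_b - 2

# Pooled standard deviation across the two groups
sd_pooled <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) / df_total)

# t statistic: mean difference divided by its standard error
t_stat <- (m_a - m_b) / (sd_pooled * sqrt(1 / n_a + 1 / n_b))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_total)

# Cohen's d: mean difference in pooled-SD units
cohens_d_check <- (m_a - m_b) / sd_pooled

print(round(c(t = t_stat, df = df_total, d = cohens_d_check), 2))
print(p_value)
```

With 50 participants per group, the standard error is the pooled SD multiplied by sqrt(2/50) = 0.2, which is why t is exactly five times d in this design.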


R Code

# INSTALL REQUIRED PACKAGE
# install.packages("readxl")

# LOAD THE PACKAGE
library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO
A6R1 <- read_excel("C:/Users/Nithin Kumar Adki/Downloads/A6R1.xlsx")

# INSTALL REQUIRED PACKAGE
# install.packages("dplyr")

# LOAD THE PACKAGE
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# CALCULATE THE DESCRIPTIVE STATISTICS
A6R1 %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50
# CREATE THE HISTOGRAMS 

# Histogram of headache days for Medication A
hist(A6R1$HeadacheDays[A6R1$Medication == "A"],
     main = "Histogram of Headache Days: Medication A",
     xlab = "Headache days (out of 30)",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

# Histogram of headache days for Medication B
hist(A6R1$HeadacheDays[A6R1$Medication == "B"],
     main = "Histogram of Headache Days: Medication B",
     xlab = "Headache days (out of 30)",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# QUESTIONS

# Q1) Check the SKEWNESS of the VARIABLE 1 histogram.
#     The histogram for Medication A looks fairly symmetrical.

# Q2) Check the KURTOSIS of the VARIABLE 1 histogram. 
#     It appears close to a normal bell curve.

# Q3) Check the SKEWNESS of the VARIABLE 2 histogram.
#     The histogram for Medication B is slightly positively skewed.

# Q4) Check the KURTOSIS of the VARIABLE 2 histogram.
#     It looks slightly tall, meaning mildly leptokurtic, but still close to normal.

# SHAPIRO-WILK TEST

shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "A"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R1$HeadacheDays[A6R1$Medication == "A"]
## W = 0.97852, p-value = 0.4913
shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "B"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R1$HeadacheDays[A6R1$Medication == "B"]
## W = 0.98758, p-value = 0.8741
# QUESTION

# Was the data normally distributed for Variable 1?
# Variable 1 (Medication A): Yes. W = 0.979, p = .491; because p > .05, normality is not rejected.

# Was the data normally distributed for Variable 2?
# Variable 2 (Medication B): Yes. W = 0.988, p = .874; because p > .05, normality is not rejected.

# INSTALL REQUIRED PACKAGE
# install.packages("ggplot2")
# install.packages("ggpubr")

# LOAD THE PACKAGE
library(ggplot2)
library(ggpubr)

# CREATE THE BOXPLOT
ggboxplot(A6R1, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")

# QUESTION

# Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot (check if there are any dots past the lines on the boxes) or are they very far away?
# Medication A: possibly one mild outlier, close to the whisker.
# Medication B: possibly one or two mild outliers, also close to the whiskers.

# No points lie far beyond the whiskers, so there are no extreme outliers and the independent t-test is acceptable.

# INDEPENDENT T-TEST 
t.test(HeadacheDays ~ Medication, data = A6R1, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B 
##             8.1            12.6
# EFFECT-SIZE
# install.packages("effectsize")

# LOAD THE PACKAGE
library(effectsize)

# CALCULATE COHEN’S D
cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = A6R1, pooled_sd = TRUE)
print(cohens_d_result)
## Cohen's d |         95% CI
## --------------------------
## -1.40     | [-1.83, -0.96]
## 
## - Estimated using pooled SD.
# QUESTIONS

# Q1) What is the size of the effect?
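# Cohen's d = -1.40, which is a VERY LARGE effect size (|d| is well above the 0.80 benchmark for a large effect).
# The negative sign only reflects the direction of the difference: the mean for group A is lower than the mean for group B.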

# Q2) Which group had the higher average score?
# Medication B had the higher average score (Mean = 12.6 vs 8.1).
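
The independent t-test above was run with var.equal = TRUE, which assumes the two groups have similar variances. The analysis does not check this explicitly, so the following is a minimal sketch of one way to do so: an F ratio of the two group variances, computed from the rounded SDs in the descriptive statistics. The variable names are illustrative, and the commented-out calls at the end show the built-in equivalents that could be run on the raw data, assuming A6R1 is loaded as in the code above.

```r
# Group SDs and sizes as reported in the descriptive statistics (rounded)
sd_a <- 2.81; n_a <- 50   # Medication A
sd_b <- 3.59; n_b <- 50   # Medication B

# Ratio of the two sample variances (B over A)
f_ratio <- sd_b^2 / sd_a^2

# Cumulative probability of the observed ratio under H0: equal variances
p_lower <- pf(f_ratio, df1 = n_b - 1, df2 = n_a - 1)

# Two-sided p-value: double the smaller tail
p_two_sided <- 2 * min(p_lower, 1 - p_lower)

print(round(c(F = f_ratio, p = p_two_sided), 3))

# With the raw data, the same checks are available directly:
# var.test(HeadacheDays ~ Medication, data = A6R1)  # F test of equal variances
# t.test(HeadacheDays ~ Medication, data = A6R1)    # Welch t-test (var.equal = FALSE is the default)
```

If the two-sided p-value is above .05, the equal-variance assumption is not rejected and the pooled t-test is reasonable. The Welch version is a useful comparison either way, since it does not rely on that assumption.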