This analysis is for Research Scenario 1 from Assignment 6. The purpose of this analysis was to determine whether there is a statistically significant difference in the mean number of headache days between two independent groups: participants taking Medication A and participants taking Medication B.
H0 (Null): There is no difference in the mean number of headache days between participants taking Medication A and Medication B.
H1 (Alternative): There is a difference in the mean number of headache days between participants taking Medication A and Medication B.
An independent-samples t-test was conducted to compare the number of headache days between participants taking Medication A (n = 50) and Medication B (n = 50). Participants taking Medication A reported significantly fewer headache days (M = 8.10, SD = 2.81) than those taking Medication B (M = 12.60, SD = 3.59), t(98) = -6.99, p < .001, 95% CI of the mean difference [-5.78, -3.22]. The null hypothesis was therefore rejected. The effect size was very large (Cohen's d = -1.40), indicating that Medication A was considerably more effective at reducing headache days than Medication B.
Purpose: Import the Excel dataset into R to conduct the analyses.
# INSTALL REQUIRED PACKAGE
# install.packages("readxl")
# LOAD THE PACKAGE
library(readxl)
# IMPORT EXCEL FILE INTO R STUDIO
A6R1 <- read_excel("C:/Users/konifade/Downloads/A6R1.xlsx")
head(A6R1)
## # A tibble: 6 × 3
##   ParticipantID Medication HeadacheDays
##           <dbl> <chr>             <dbl>
## 1             1 A                     6
## 2             2 A                     7
## 3             3 A                    13
## 4             4 A                     8
## 5             5 A                     8
## 6             6 A                    13
# install.packages("dplyr")
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
A6R1 %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50
# HISTOGRAMS
# Purpose: Visually check the normality of the scores for each group.
hist(A6R1$HeadacheDays[A6R1$Medication == "A"],
     main = "Histogram of Headache Days: Medication A",
     xlab = "Headache Days",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
hist(A6R1$HeadacheDays[A6R1$Medication == "B"],
     main = "Histogram of Headache Days: Medication B",
     xlab = "Headache Days",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
# SHAPIRO-WILK TEST
# Purpose: Check the normality for each group's score statistically.
shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "A"])
##
## Shapiro-Wilk normality test
##
## data: A6R1$HeadacheDays[A6R1$Medication == "A"]
## W = 0.97852, p-value = 0.4913
shapiro.test(A6R1$HeadacheDays[A6R1$Medication == "B"])
##
## Shapiro-Wilk normality test
##
## data: A6R1$HeadacheDays[A6R1$Medication == "B"]
## W = 0.98758, p-value = 0.8741
# BOXPLOT
# Purpose: Check for any outliers impacting the mean for each group's scores.
# install.packages("ggplot2")
# install.packages("ggpubr")
library(ggplot2)
library(ggpubr)
ggboxplot(A6R1, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")
Q1: The histogram for Medication A appears symmetrical, indicating no noticeable skewness.
Q2: The histogram for Medication A has a proper bell-shaped curve, suggesting a normal distribution. The Shapiro-Wilk test is consistent with this (W = 0.979, p = .491).
Q3: The histogram for Medication B appears symmetrical, indicating no noticeable skewness.
Q4: The histogram for Medication B has a nearly bell-shaped curve, suggesting approximate normality. The Shapiro-Wilk test is consistent with this (W = 0.988, p = .874).
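The t-test in this analysis is run with var.equal = TRUE, which assumes the two groups have similar variances, and the normality checks do not speak to that assumption. The following is a minimal sketch of one way to examine it, using base R's var.test() (an F test of the variance ratio) alongside a Welch t-test as a sensitivity check. The helper name check_variance_assumption and the simulated demo data frame are hypothetical and are not part of the assignment; the demo data only make the sketch runnable on its own.

```r
# Hypothetical helper (not part of the original analysis): compares the
# group variances and reruns the comparison without assuming equal variances.
check_variance_assumption <- function(data) {
  list(
    # F test of the ratio of the two group variances
    f_test = var.test(HeadacheDays ~ Medication, data = data),
    # Welch's t-test, which does not assume equal variances
    welch = t.test(HeadacheDays ~ Medication, data = data, var.equal = FALSE)
  )
}

# Simulated stand-in data (NOT the assignment dataset); the means and SDs
# only loosely mirror the descriptive statistics reported for the groups.
set.seed(1)
demo <- data.frame(
  Medication = rep(c("A", "B"), each = 50),
  HeadacheDays = c(rnorm(50, mean = 8.1, sd = 2.8),
                   rnorm(50, mean = 12.6, sd = 3.6))
)

res <- check_variance_assumption(demo)
res$f_test$p.value  # a small p-value would suggest unequal variances
res$welch$p.value   # compare with the pooled-variance t-test result
```

With the real data, the call would be check_variance_assumption(A6R1). If the Welch result leads to the same conclusion as the pooled test, the equal-variance assumption is unlikely to be driving the finding.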
Purpose: Test whether the mean number of headache days differed between the two groups.
t.test(HeadacheDays ~ Medication, data = A6R1, var.equal = TRUE)
##
## Two Sample t-test
##
## data: HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B
## 8.1 12.6
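As a cross-check, the pooled-variance t statistic can be recomputed from the rounded group summaries alone (n = 50 per group, M = 8.10 and 12.60, SD = 2.81 and 3.59). This is a sketch with illustrative variable names, not the method t.test() uses internally. Because the SDs are rounded to two decimals, the result (about -6.98) differs slightly from the -6.9862 reported by t.test().

```r
# Group summaries copied from the descriptive statistics (rounded values)
n_a <- 50; m_a <- 8.10;  sd_a <- 2.81
n_b <- 50; m_b <- 12.60; sd_b <- 3.59

# Degrees of freedom for the pooled-variance (Student) t-test
df <- n_a + n_b - 2

# Pooled standard deviation and standard error of the mean difference
sp <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) / df)
se <- sp * sqrt(1 / n_a + 1 / n_b)

# t statistic, two-sided p-value, and 95% CI for the mean difference
t_stat  <- (m_a - m_b) / se
p_value <- 2 * pt(-abs(t_stat), df)
ci      <- (m_a - m_b) + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_stat, df = df, ci_lower = ci[1], ci_upper = ci[2]), 2)
p_value
```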
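# DETERMINE STATISTICAL SIGNIFICANCE
The p-value (3.431e-10) is well below the conventional .05 threshold, so the difference between the group means is statistically significant.
# EFFECT SIZE
Purpose: Determine how large the difference between the group means was.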
# install.packages("effectsize")
library(effectsize)
cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = A6R1, pooled_sd = TRUE)
print(cohens_d_result)
## Cohen's d |         95% CI
## --------------------------
## -1.40     | [-1.83, -0.96]
##
## - Estimated using pooled SD.
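The pooled-SD estimate can be checked by hand from the rounded group summaries. The worked calculation below uses the standard pooled-SD form of Cohen's d; the symbols M_A, M_B, s_A, s_B, and s_p are notation introduced here for illustration.

```latex
s_p = \sqrt{\frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}}
    = \sqrt{\frac{49(2.81)^2 + 49(3.59)^2}{98}}
    \approx 3.22

d = \frac{M_A - M_B}{s_p}
  = \frac{8.10 - 12.60}{3.22}
  \approx -1.40
```

The negative sign reflects only the order of subtraction (A minus B): the Medication A mean lies about 1.4 pooled standard deviations below the Medication B mean.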