Scenario 1: Medication A vs Medication B
A medical research team created a new medication to reduce headaches (Medication A). They want to determine whether Medication A is more effective at reducing headaches than the current medication on the market (Medication B). Participants were randomly assigned to take either Medication A or Medication B. Data were collected for 30 days through an app, with participants reporting each day whether or not they had a headache. Was there a difference in the number of headache days between the groups?
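HYPOTHESIS TESTED
Null hypothesis (H0): There is no difference in the mean number of headache days between Medication A and Medication B.
Alternative hypothesis (H1): There is a difference in the mean number of headache days between Medication A and Medication B.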
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("readxl")
library(readxl)
dataset <- read_excel("C:\\Users\\N Geetha Shivani\\Downloads\\A6R1.xlsx")
install.packages("dplyr")
library(dplyr)
# Descriptive statistics for headache days, by medication group
dataset %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50
# Histogram of headache days for Medication A
hist(dataset$HeadacheDays[dataset$Medication == "A"],
     main = "Histogram of Headache Days: Medication A",
     xlab = "Headache days",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
# Histogram of headache days for Medication B
hist(dataset$HeadacheDays[dataset$Medication == "B"],
     main = "Histogram of Headache Days: Medication B",
     xlab = "Headache days",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
Purpose: Check the normality of each group's scores statistically. The Shapiro-Wilk test is sensitive to departures from normality in both skewness and kurtosis. It asks: "Is this variable consistent with normal data (null hypothesis) or different from normal data (alternative hypothesis)?" If p is greater than .05 (p > .05), there is no evidence against normality and the data are treated as normal. If p is less than .05 (p < .05), the data are treated as not normal.
shapiro.test(dataset$HeadacheDays[dataset$Medication == "A"])
##
## Shapiro-Wilk normality test
##
## data: dataset$HeadacheDays[dataset$Medication == "A"]
## W = 0.97852, p-value = 0.4913
shapiro.test(dataset$HeadacheDays[dataset$Medication == "B"])
##
## Shapiro-Wilk normality test
##
## data: dataset$HeadacheDays[dataset$Medication == "B"]
## W = 0.98758, p-value = 0.8741
A) Yes, the data are normally distributed for both groups: Medication A (W = 0.979, p = .491) and Medication B (W = 0.988, p = .874).
NOTE: If p > .05 (the p-value is greater than .05), the data are treated as normal; continue to the box plot below. If p < .05 (the p-value is less than .05), the data are treated as not normal; switch to the Mann-Whitney U test.
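The decision rule in the note can be written as a small function. This is only a sketch: `choose_test` is a hypothetical helper name, not something from the original analysis, and the p-values are copied by hand from the Shapiro-Wilk output above.

```r
# Hypothetical helper (not part of the original analysis): apply the
# p > .05 rule to every group's Shapiro-Wilk p-value and name the next test.
choose_test <- function(p_values, alpha = 0.05) {
  if (all(p_values > alpha)) "independent t-test" else "Mann-Whitney U"
}

# p-values copied from the shapiro.test() output above
shapiro_p <- c(A = 0.4913, B = 0.8741)
next_test <- choose_test(shapiro_p)
print(next_test)

# With the data loaded, the p-values could be computed per group instead:
# shapiro_p <- tapply(dataset$HeadacheDays, dataset$Medication,
#                     function(x) shapiro.test(x)$p.value)
```

Because both p-values exceed .05, the sketch returns "independent t-test", which matches the path this report follows.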
Purpose: Check for any outliers impacting the mean for each group’s scores.
install.packages("ggplot2")
install.packages("ggpubr")
library(ggplot2)
library(ggpubr)
# Box plot of headache days by medication, with individual points overlaid
ggboxplot(dataset, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")
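The box plot shows outliers visually. A numeric check using the common 1.5 × IQR convention can be sketched as follows. `find_outliers` is a hypothetical helper and the example values are made up for illustration; they are not the study data. The quartiles here come from `quantile()` with its default settings, which may differ slightly from the hinges some plotting functions use.

```r
# Hypothetical helper: return the values that fall outside the
# 1.5 * IQR fences below the first or above the third quartile.
find_outliers <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]
}

# Made-up values for illustration only (not from the study data)
example_days <- c(5, 6, 7, 8, 8, 9, 10, 11, 25)
flagged <- find_outliers(example_days)
print(flagged)

# Applied to the real data, one group at a time:
# find_outliers(dataset$HeadacheDays[dataset$Medication == "A"])
```

In the made-up example the quartiles are 7 and 10, so the fences sit at 2.5 and 14.5 and only the value 25 is flagged.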
t.test(HeadacheDays ~ Medication, data = dataset, var.equal = TRUE)
##
## Two Sample t-test
##
## data: HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B
## 8.1 12.6
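As a cross-check, the pooled-variance t statistic can be rebuilt from the group summary statistics reported earlier. This sketch uses the rounded means and standard deviations, so it reproduces the `t.test()` result only approximately; the variable names are illustrative.

```r
# Summary statistics copied from the group summary table above
m_a <- 8.1;  sd_a <- 2.81; n_a <- 50
m_b <- 12.6; sd_b <- 3.59; n_b <- 50

# Pooled standard deviation (assumes equal population variances,
# matching var.equal = TRUE in the t.test() call)
deg_free  <- n_a + n_b - 2
sd_pooled <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) / deg_free)

# Standard error of the mean difference, t statistic, two-sided p-value
se_diff <- sd_pooled * sqrt(1 / n_a + 1 / n_b)
t_stat  <- (m_a - m_b) / se_diff
p_value <- 2 * pt(-abs(t_stat), deg_free)

print(round(c(t = t_stat, df = deg_free), 2))
```

The hand calculation gives t of about -6.98 on 98 degrees of freedom, in line with the reported t = -6.9862. The small gap comes from rounding in the summary table.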
install.packages("effectsize")
library(effectsize)
cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = dataset, pooled_sd = TRUE)
print(cohens_d_result)
## Cohen's d | 95% CI
## --------------------------
## -1.40 | [-1.83, -0.96]
##
## - Estimated using pooled SD.
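Cohen's d with a pooled SD is the mean difference divided by the pooled standard deviation. The sketch below recomputes it from the rounded group summary statistics reported earlier, so it should agree with the `cohens_d()` output only to about two decimal places; the variable names are illustrative.

```r
# Summary statistics copied from the group summary table above
m_a <- 8.1;  sd_a <- 2.81; n_a <- 50
m_b <- 12.6; sd_b <- 3.59; n_b <- 50

# Pooled standard deviation across the two groups
sd_pooled <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) /
                    (n_a + n_b - 2))

# Cohen's d: standardized mean difference (A minus B)
d <- (m_a - m_b) / sd_pooled
print(round(d, 2))
```

The result is about -1.40, matching the reported estimate. The negative sign reflects that Medication A has the lower mean number of headache days.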