W6 Research Question 1

What are the null and alternate hypotheses for YOUR research scenario?

Null Hypothesis: There is no difference in the number of headache days between participants taking Medication A and participants taking Medication B.

Alternate Hypothesis: There is a difference in the number of headache days between participants taking Medication A and participants taking Medication B.

Result:

An independent-samples t-test was conducted to test whether the number of headache days differed between participants taking Medication A and participants taking Medication B (N = 100). Participants randomly assigned to Medication A (M = 8.1, SD = 2.81) reported significantly fewer headache days than participants randomly assigned to Medication B (M = 12.6, SD = 3.59), t(98) = -6.99, p < .001, 95% CI for the mean difference [-5.78, -3.22]. The effect size was very large (Cohen’s d = -1.40, 95% CI [-1.83, -0.96]), indicating a very large difference between Medication A and Medication B. The negative sign only reflects that Medication A had the lower mean. Because fewer headache days is the better outcome, participants taking Medication A fared better than participants taking Medication B. The null hypothesis is therefore rejected, and the alternate hypothesis that the number of headache days differs between Medication A and Medication B is supported.

#install.packages("readxl")

library(readxl)

dataset <- read_excel("~/Downloads/A6R1.xlsx")

#install.packages("dplyr")

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

dataset %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )

## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50

hist(dataset$HeadacheDays[dataset$Medication == "A"],
     main = "Histogram of A Headachedays",
     xlab = "Headache Days",
     ylab = "Number of Participants",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$HeadacheDays[dataset$Medication == "B"],
     main = "Histogram of B Headachedays",
     xlab = "Headache Days",
     ylab = "Number of Participants",
     col = "lightgreen",
     border = "black",
     breaks = 20)

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

The VARIABLE 1 histogram looks symmetrical.

Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The VARIABLE 1 histogram looks too tall compared with a proper bell curve.

Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

The VARIABLE 2 histogram looks symmetrical.

Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The VARIABLE 2 histogram looks too flat compared with a proper bell curve.
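The visual judgments above can be cross-checked numerically. The following is a minimal sketch in base R, assuming moment-based definitions of skewness and excess kurtosis; the helper names `sample_skewness` and `excess_kurtosis` are hypothetical and were not part of the original script.

```r
# Hypothetical helpers (not in the original script): moment-based
# sample skewness and excess kurtosis using base R only.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m3 <- mean((x - m)^3)   # third central moment
  m3 / m2^1.5
}

excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m4 <- mean((x - m)^4)   # fourth central moment
  m4 / m2^2 - 3           # 0 for a normal distribution
}

# Usage with the data frame from this report, if it has been loaded.
if (exists("dataset")) {
  print(tapply(dataset$HeadacheDays, dataset$Medication, sample_skewness))
  print(tapply(dataset$HeadacheDays, dataset$Medication, excess_kurtosis))
}
```

Skewness near 0 suggests a symmetrical histogram, while positive or negative values suggest a longer right or left tail. Excess kurtosis above 0 suggests a taller, heavier-tailed shape than a bell curve, and values below 0 suggest a flatter one.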

shapiro.test(dataset$HeadacheDays[dataset$Medication == "A"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$HeadacheDays[dataset$Medication == "A"]
## W = 0.97852, p-value = 0.4913

shapiro.test(dataset$HeadacheDays[dataset$Medication == "B"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$HeadacheDays[dataset$Medication == "B"]
## W = 0.98758, p-value = 0.8741

Was the data normally distributed for Variable 1?

Yes. The Shapiro-Wilk p-value is greater than .05, so the data for Variable 1 do not depart significantly from a normal distribution.

Was the data normally distributed for Variable 2?

Yes. The Shapiro-Wilk p-value is greater than .05, so the data for Variable 2 do not depart significantly from a normal distribution.

#install.packages("ggplot2")
#install.packages("ggpubr")

library(ggplot2)
library(ggpubr)

ggboxplot(dataset, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")

Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot or are they very far away?

If there are a few dots (two or less), and they are close to the whiskers, continue with the Independent t-test.
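The "two or less" rule above can also be checked by counting the points that fall beyond the whiskers. This is a rough sketch using base R's `boxplot.stats()`, which applies the usual 1.5 × IQR rule; the helper name `count_boxplot_outliers` is hypothetical, and because `ggboxplot()` may compute the box edges slightly differently, borderline points could be classified differently in the plot.

```r
# Hypothetical helper (not in the original script): number of values
# lying beyond the boxplot whiskers (1.5 * IQR rule) for one group.
count_boxplot_outliers <- function(x) {
  x <- x[!is.na(x)]
  length(boxplot.stats(x)$out)
}

# Usage with the data frame from this report, if it has been loaded.
if (exists("dataset")) {
  print(tapply(dataset$HeadacheDays, dataset$Medication,
               count_boxplot_outliers))
}
```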

t.test(HeadacheDays ~ Medication, data = dataset, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B 
##             8.1            12.6
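As a sanity check on the output above, the pooled-variance t-test can be approximately reproduced from the rounded group summaries (M, SD, N). This is a sketch only: the function name `t_from_summary` is hypothetical, and because the SDs are rounded to two decimals the result will match the `t.test()` output only to about two decimal places.

```r
# Hypothetical helper (not in the original script): Student's
# two-sample t-test with pooled variance, from summary statistics.
t_from_summary <- function(m1, s1, n1, m2, s2, n2, conf = 0.95) {
  df <- n1 + n2 - 2                                   # pooled-variance df
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df) # pooled SD
  se <- sp * sqrt(1 / n1 + 1 / n2)                    # SE of the difference
  t_stat <- (m1 - m2) / se
  p <- 2 * pt(-abs(t_stat), df)                       # two-sided p-value
  half <- qt(1 - (1 - conf) / 2, df) * se             # CI half-width
  list(t = t_stat, df = df, p = p,
       ci = c(m1 - m2 - half, m1 - m2 + half))
}

# Rounded summaries from the group table: A (8.1, 2.81, 50), B (12.6, 3.59, 50)
res <- t_from_summary(8.1, 2.81, 50, 12.6, 3.59, 50)
print(res)
```

With `var.equal = TRUE` the degrees of freedom are N1 + N2 - 2 = 98, a whole number. A fractional df would only arise from the Welch test (`var.equal = FALSE`), which this script did not run.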

#install.packages("effectsize")

library(effectsize)

cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = dataset, pooled_sd = TRUE)
print(cohens_d_result)

## Cohen's d |         95% CI
## --------------------------
## -1.40     | [-1.83, -0.96]
## 
## - Estimated using pooled SD.
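The effect size can be checked by hand in the same way. The sketch below assumes the standard pooled-SD definition of Cohen's d (mean difference divided by the pooled standard deviation) and uses the rounded group summaries; `cohens_d_from_summary` is a hypothetical helper name, not a function from the effectsize package.

```r
# Hypothetical helper (not in the original script): Cohen's d with
# pooled SD, computed from summary statistics.
cohens_d_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
  (m1 - m2) / sp   # negative when group 1 has the lower mean
}

# Group A first, group B second, matching the order used by cohens_d()
d <- cohens_d_from_summary(8.1, 2.81, 50, 12.6, 3.59, 50)
print(round(d, 2))
```

The sign depends only on which group is listed first. The magnitude is what is compared against the effect size thresholds.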

QUESTIONS Answer the questions below as a comment within the R script:

Q1) What is the size of the effect?

The effect size is very large (Cohen's d = -1.40, a magnitude of 1.40).

Q2) Which group had the higher average score?

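Medication B had the higher average score (M = 12.6 headache days) compared with Medication A (M = 8.1 headache days). Because the score is the number of headache days, a higher average means more headaches, so participants taking Medication A had fewer headache days than participants taking Medication B.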

group 13

2025-11-19