Team17_Week6_HW

R code and Analysis

CHECK NORMAL DISTRIBUTION

IMPORT EXCEL FILE

Purpose: Import your Excel dataset into R to conduct analyses.

# INSTALL REQUIRED PACKAGE

# install.packages("readxl")

# LOAD THE PACKAGE

library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO

dataset <- read_excel("//apporto.com/dfs/SLU/Users/minhoku_slu/Downloads/A6R1.xlsx")

DESCRIPTIVE STATISTICS

PURPOSE: Calculate the mean, median, SD, and sample size for each group.

# INSTALL REQUIRED PACKAGE

# install.packages("dplyr")

# LOAD THE PACKAGE

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# CALCULATE THE DESCRIPTIVE STATISTICS

dataset %>%
  group_by(Medication) %>%
  summarise(
    Mean = mean(HeadacheDays, na.rm = TRUE),
    Median = median(HeadacheDays, na.rm = TRUE),
    SD = sd(HeadacheDays, na.rm = TRUE),
    N = n()
  )

## # A tibble: 2 × 5
##   Medication  Mean Median    SD     N
##   <chr>      <dbl>  <dbl> <dbl> <int>
## 1 A            8.1    8    2.81    50
## 2 B           12.6   12.5  3.59    50
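One detail of the summary above is worth checking on any dataset: n() counts every row in a group, while mean(), median(), and sd() with na.rm = TRUE silently drop missing scores. The base-R sketch below uses made-up values (not the A6R1.xlsx data) to show how the two counts can differ; the object names are hypothetical.

```r
# Hypothetical toy data (NOT the A6R1.xlsx values): group A has one missing score.
toy <- data.frame(
  Medication   = c("A", "A", "A", "B", "B"),
  HeadacheDays = c(7, NA, 9, 12, 13)
)

by_group <- split(toy$HeadacheDays, toy$Medication)

rows_per_group  <- sapply(by_group, length)                      # what n() would report
valid_per_group <- sapply(by_group, function(x) sum(!is.na(x)))  # scores actually used
mean_per_group  <- sapply(by_group, mean, na.rm = TRUE)          # mean of the valid scores

print(data.frame(Rows = rows_per_group, Valid = valid_per_group, Mean = mean_per_group))
```

In the dplyr pipeline, the equivalent of the Valid column would be N = sum(!is.na(HeadacheDays)). If the dataset has no missing values, both counts agree.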

HISTOGRAMS

Purpose: Visually check the normality of the scores for each group.

# CREATE THE HISTOGRAMS 

hist(dataset$HeadacheDays[dataset$Medication == "A"],
     main = "Histogram of Group 1 Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$HeadacheDays[dataset$Medication == "B"],
     main = "Histogram of Group 2 Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

QUESTIONS

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
The histogram looks symmetrical.
Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
The histogram has a proper bell curve.
Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
The histogram looks symmetrical.
Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
The histogram has a nearly proper bell curve.
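The visual answers above can be backed up with numbers. The following is a minimal sketch, not part of the original assignment: it defines two hypothetical helper functions for moment-based skewness and excess kurtosis and runs them on simulated stand-in scores so the block works without the Excel file.

```r
# Moment-based sample skewness: 0 for a perfectly symmetrical sample,
# positive for a long right tail, negative for a long left tail.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  dev <- x - mean(x)
  mean(dev^3) / mean(dev^2)^1.5
}

# Excess kurtosis: about 0 for a normal bell curve,
# positive when the shape is too tall, negative when it is too flat.
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  dev <- x - mean(x)
  mean(dev^4) / mean(dev^2)^2 - 3
}

# Hypothetical stand-in scores, simulated only so the sketch runs on its own.
set.seed(17)
simulated_scores <- rnorm(50, mean = 8, sd = 3)

cat("Skewness:       ", round(sample_skewness(simulated_scores), 2), "\n")
cat("Excess kurtosis:", round(sample_excess_kurtosis(simulated_scores), 2), "\n")
```

With the real data, the same functions could be called on dataset$HeadacheDays[dataset$Medication == "A"] and on the Medication B scores. Values near 0 for both statistics would agree with the "symmetrical" and "proper bell curve" answers.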

SHAPIRO-WILK TEST

Purpose: Check the normality for each group’s score statistically.

# CONDUCT THE SHAPIRO-WILK TEST

shapiro.test(dataset$HeadacheDays[dataset$Medication == "A"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$HeadacheDays[dataset$Medication == "A"]
## W = 0.97852, p-value = 0.4913

shapiro.test(dataset$HeadacheDays[dataset$Medication == "B"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$HeadacheDays[dataset$Medication == "B"]
## W = 0.98758, p-value = 0.8741

QUESTION

Was the data normally distributed for Variable 1?
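Yes. The Shapiro-Wilk test was not significant for Variable 1 (Medication A: W = 0.979, p = 0.491), so the data can be treated as normally distributed (p > 0.05).
Was the data normally distributed for Variable 2?
Yes. The Shapiro-Wilk test was not significant for Variable 2 (Medication B: W = 0.988, p = 0.874), so the data can be treated as normally distributed (p > 0.05).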
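The decision rule behind these answers can be written out explicitly. This is a small sketch that assumes the conventional 0.05 significance level and reuses the p-values printed in the Shapiro-Wilk output; the variable names are illustrative.

```r
alpha <- 0.05  # conventional significance level (assumed)

# p-values copied from the Shapiro-Wilk output for each medication group
shapiro_p <- c(A = 0.4913, B = 0.8741)

# p > alpha: no evidence against normality; p <= alpha: normality is doubtful
decision <- ifelse(shapiro_p > alpha,
                   "no significant departure from normality",
                   "significant departure from normality")
print(decision)
```

When working from the test object instead of copied numbers, the p-value is available as shapiro.test(x)$p.value. A non-significant result means the test found no evidence against normality, which is slightly weaker than proving the data are normal.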

BOXPLOT

Purpose: Check for any outliers impacting the mean for each group’s scores.

# INSTALL REQUIRED PACKAGE

# install.packages("ggplot2")
# install.packages("ggpubr")

# LOAD THE PACKAGE

library(ggplot2)
library(ggpubr)

# CREATE THE BOXPLOT

ggboxplot(dataset, x = "Medication", y = "HeadacheDays",
          color = "Medication",
          palette = "jco",
          add = "jitter")

QUESTION

Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot or are they very far away?
Only two dots fell outside the boxplot, and both were close to the whiskers.
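Whether a dot is "close to the whiskers" can also be checked numerically. The sketch below uses base R's boxplot.stats(), which applies the 1.5 x IQR rule, on made-up scores (not the A6R1.xlsx values). Note that ggboxplot() may compute its quartiles slightly differently, so the two approaches can disagree on borderline points.

```r
# Hypothetical scores (NOT the A6R1.xlsx values) with one unusually high value.
scores <- c(5, 6, 7, 8, 8, 9, 10, 11, 25)

box <- boxplot.stats(scores)                  # default coef = 1.5 (the 1.5 x IQR rule)
outliers <- box$out                           # points beyond the whiskers
hinge_spread <- box$stats[4] - box$stats[2]   # upper hinge minus lower hinge

# Distance of each outlier from the nearest whisker end, in hinge-spread units
dist_from_whisker <- ifelse(outliers > box$stats[5],
                            outliers - box$stats[5],
                            box$stats[1] - outliers) / hinge_spread

print(outliers)
print(round(dist_from_whisker, 2))
```

A distance well below 1 suggests a point that sits just past the whisker, while a distance of several hinge spreads marks a point that is far away and more likely to distort the mean.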

INDEPENDENT T-TEST

PURPOSE: Test if there was a difference between the means of the two groups.

t.test(HeadacheDays ~ Medication, data = dataset, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  HeadacheDays by Medication
## t = -6.9862, df = 98, p-value = 3.431e-10
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -5.778247 -3.221753
## sample estimates:
## mean in group A mean in group B 
##             8.1            12.6
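As a cross-check, the pooled-variance t statistic can be rebuilt from the group means, SDs, and sample sizes in the descriptive table. This is a sketch under the same equal-variance assumption as var.equal = TRUE. Because the SDs are rounded to two decimals, the result should land close to, but not exactly on, the values printed by t.test().

```r
# Summary statistics copied from the descriptive table (SDs are rounded).
mean_a <- 8.1;  sd_a <- 2.81; n_a <- 50
mean_b <- 12.6; sd_b <- 3.59; n_b <- 50

deg_free  <- n_a + n_b - 2
pooled_sd <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) / deg_free)
std_error <- pooled_sd * sqrt(1 / n_a + 1 / n_b)

t_stat <- (mean_a - mean_b) / std_error                         # A minus B, as in t.test()
ci <- (mean_a - mean_b) + c(-1, 1) * qt(0.975, deg_free) * std_error

cat("t =", round(t_stat, 2), " df =", deg_free, "\n")
cat("95% CI: [", round(ci[1], 2), ",", round(ci[2], 2), "]\n")
```

The negative sign of t follows from the subtraction order: group A's mean is lower than group B's.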

# DETERMINE STATISTICAL SIGNIFICANCE
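A minimal sketch of this step, assuming the conventional 0.05 significance level: the two-sided p-value is recomputed from the reported t statistic and degrees of freedom, then compared with alpha.

```r
alpha    <- 0.05      # conventional significance level (assumed)
t_stat   <- -6.9862   # copied from the t-test output
deg_free <- 98        # copied from the t-test output

p_value <- 2 * pt(-abs(t_stat), deg_free)   # two-sided p-value

if (p_value < alpha) {
  cat("p =", signif(p_value, 3), "< 0.05: the difference in means is statistically significant\n")
} else {
  cat("p =", signif(p_value, 3), ">= 0.05: the difference in means is not statistically significant\n")
}
```

When the test result is stored in an object, for example result <- t.test(...), the same comparison can be made with result$p.value < alpha.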

EFFECT-SIZE

PURPOSE: Determine how big of a difference there was between the group means.

# INSTALL REQUIRED PACKAGE

# install.packages("effectsize")

# LOAD THE PACKAGE

library(effectsize)

# CALCULATE COHEN’S D

cohens_d_result <- cohens_d(HeadacheDays ~ Medication, data = dataset, pooled_sd = TRUE)
print(cohens_d_result)

## Cohen's d |         95% CI
## --------------------------
## -1.40     | [-1.83, -0.96]
## 
## - Estimated using pooled SD.
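Cohen's d with a pooled SD is the mean difference divided by the pooled standard deviation, so it can be approximated by hand from the descriptive table. The sketch below does that and attaches a verbal label. The cut-offs are an assumption loosely following Sawilowsky (2009); a course rubric may draw the lines elsewhere.

```r
# Summary statistics copied from the descriptive table (SDs are rounded).
mean_a <- 8.1;  sd_a <- 2.81; n_a <- 50
mean_b <- 12.6; sd_b <- 3.59; n_b <- 50

pooled_sd <- sqrt(((n_a - 1) * sd_a^2 + (n_b - 1) * sd_b^2) / (n_a + n_b - 2))
d <- (mean_a - mean_b) / pooled_sd   # A minus B, so a lower mean in A gives a negative d

# Assumed cut-offs for |d|; these are illustrative, not the only convention.
label <- cut(abs(d),
             breaks = c(0, 0.2, 0.5, 0.8, 1.2, 2.0, Inf),
             labels = c("very small", "small", "medium", "large", "very large", "huge"),
             right = FALSE)

cat("Cohen's d =", round(d, 2), "->", as.character(label), "\n")
```

Only the magnitude of d is used for the label; the sign only records which group had the lower mean.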

QUESTIONS

Q1) What is the size of the effect?
A Cohen’s d of -1.40 indicates that the difference between the group averages was very large. The sign is negative because Group A’s mean is lower than Group B’s.
Q2) Which group had the higher average score?
Group B had the higher average score (12.6 headache days versus 8.1 for Group A).

Team17_Week6_HW_Scenario1

Min-Ho Ku

2025-11-19

INDEPENDENT T-TEST

Hypotheses

Result paragraph

R code and Analysis

CHECK NORMAL DISTRIBUTION

INDEPENDENT T-TEST