RESEARCH QUESTION:
Can we say that the average scores obtained in math and reading
comprehension are different?
mydata <- read.table("./StudentsPerformance.csv", header = TRUE, sep = ",")
head(mydata)
## gender race.ethnicity parental.level.of.education lunch test.preparation.course math.score reading.score
## 1 female group B bachelor's degree standard none 72 72
## 2 female group C some college standard completed 69 90
## 3 female group B master's degree standard none 90 95
## 4 male group A associate's degree free/reduced none 47 57
## 5 male group C some college standard none 76 78
## 6 female group B associate's degree standard none 71 83
## writing.score
## 1 74
## 2 88
## 3 93
## 4 44
## 5 75
## 6 78
mydata$race.ethnicity <- NULL
mydata$parental.level.of.education <- NULL
mydata$lunch <- NULL
mydata$test.preparation.course <- NULL
mydata$writing.score <- NULL
head(mydata, 10)
## gender math.score reading.score
## 1 female 72 72
## 2 female 69 90
## 3 female 90 95
## 4 male 47 57
## 5 male 76 78
## 6 female 71 83
## 7 female 88 95
## 8 male 40 43
## 9 male 64 64
## 10 female 38 60
Unit of observation: one student
I found this dataset on Kaggle.com under the title Students Performance on Exams. Since the full sample (1000 students) is too large, I will draw a random sample of 150 students.
set.seed(1)
mydata <- mydata[sample(nrow(mydata), 150), ]
head(mydata)
## gender math.score reading.score
## 836 female 60 64
## 679 male 81 75
## 129 male 82 82
## 930 female 48 56
## 509 male 79 78
## 471 female 83 85
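To illustrate why set.seed() matters for this step, here is a minimal sketch that draws the same kind of sample twice. The data frame `toy` and its columns are hypothetical stand-ins, not the Kaggle data.

```r
# Hypothetical toy data standing in for the full 1000-row dataset
toy <- data.frame(id = 1:1000, score = rep(50:99, times = 20))

# Draw 150 row indices without replacement, using a fixed seed
set.seed(1)
first_draw <- toy[sample(nrow(toy), 150), ]

# Resetting the same seed reproduces exactly the same rows
set.seed(1)
second_draw <- toy[sample(nrow(toy), 150), ]

same_rows <- identical(first_draw, second_draw)
no_repeats <- anyDuplicated(first_draw$id) == 0
print(c(same_rows = same_rows, no_repeats = no_repeats))
```

Because sample() draws without replacement by default, no student can appear twice in the sample.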
#install.packages("rstatix")
library("rstatix")
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
#install.packages("tidyverse")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks rstatix::filter(), stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::recode() masks car::recode()
## ✖ purrr::some() masks car::some()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
H0: The differences between the two variables are normally distributed.
H1: The differences between the two variables are not normally distributed.
mydata$Difference <- mydata$math.score - mydata$reading.score
library(ggplot2)
ggplot(mydata, aes(x = Difference)) +
  geom_histogram(binwidth = 4, color = "black") +
  xlab("Differences")
Based on this histogram, the distribution does not appear normal; it looks skewed to the right. To make sure, I will perform a Shapiro-Wilk test.
shapiro.test(mydata$Difference)
##
## Shapiro-Wilk normality test
##
## data: mydata$Difference
## W = 0.97874, p-value = 0.02004
Based on the p-value, I can reject the null hypothesis of normality (p = 0.020), so I will perform the Wilcoxon signed-rank test. First, however, I will also run the paired t-test for comparison.
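The decision rule used here can be sketched as follows. The data are simulated and deliberately non-normal (an assumption made only for illustration), and the 0.05 threshold is the conventional significance level rather than something stated in the analysis above.

```r
# Simulated, clearly non-normal differences (hypothetical data)
set.seed(123)
sim_diff <- rexp(150, rate = 1) - 1

alpha <- 0.05
sw <- shapiro.test(sim_diff)

# Reject normality when the p-value falls below alpha
decision <- if (sw$p.value < alpha) {
  "reject H0: use the Wilcoxon signed-rank test"
} else {
  "do not reject H0: the paired t-test is reasonable"
}
print(decision)
```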
First I will show some descriptive statistics for my data.
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
## The following object is masked from 'package:car':
##
## logit
describe(mydata[ , -1])
## vars n mean sd median trimmed mad min max range skew kurtosis se
## math.score 1 150 65.55 15.76 66.5 66.07 15.57 23 99 76 -0.31 -0.20 1.29
## reading.score 2 150 68.71 14.95 69.5 69.06 16.31 24 100 76 -0.22 -0.35 1.22
## Difference 3 150 -3.16 9.45 -4.0 -3.50 10.38 -21 20 41 0.30 -0.60 0.77
Now I will perform a parametric test: the paired t-test.
H0: The difference between the two means is zero.
H1: The difference between the two means is not zero.
t.test(mydata$math.score, mydata$reading.score,
       paired = TRUE,
       alternative = "two.sided")
##
## Paired t-test
##
## data: mydata$math.score and mydata$reading.score
## t = -4.0939, df = 149, p-value = 6.93e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -4.685248 -1.634752
## sample estimates:
## mean difference
## -3.16
Based on the p-value (p < 0.001), I can reject H0 and conclude that the difference between the two means is not equal to zero.
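As a rough check, the paired t-test can be reconstructed from the descriptive statistics of the Difference variable, since the statistic is the mean difference divided by its standard error. The sketch below uses the rounded values from the describe() output, so it matches the t.test() result only approximately.

```r
# Summary values copied from the describe() output (rounded)
n <- 150
mean_diff <- -3.16
sd_diff <- 9.45

# Standard error of the mean difference and the resulting t statistic
se_diff <- sd_diff / sqrt(n)
t_stat <- mean_diff / se_diff

# Two-sided p-value and 95% confidence interval with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
ci <- mean_diff + c(-1, 1) * qt(0.975, df = n - 1) * se_diff

print(round(c(t = t_stat, lower = ci[1], upper = ci[2]), 2))
print(p_value)
```

Small deviations from the reported t = -4.0939 come from using the rounded standard deviation.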
#install.packages("effectsize")
library(effectsize)
##
## Attaching package: 'effectsize'
## The following object is masked from 'package:psych':
##
## phi
## The following objects are masked from 'package:rstatix':
##
## cohens_d, eta_squared
cohens_d(mydata$Difference)
## Cohen's d | 95% CI
## --------------------------
## -0.33 | [-0.50, -0.17]
interpret_cohens_d(0.33, rules = "sawilowsky2009")
## [1] "small"
## (Rules: sawilowsky2009)
Based on the analysis of the effect size, we can conclude that it is small.
Based on the sample data, I found a difference between the average math score and the average reading score: math scores are on average 3.16 points lower. The difference is statistically significant at p < 0.001 (the effect size is small, d = -0.33).
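For paired data, Cohen's d is, to my understanding, the mean of the differences divided by their standard deviation. A minimal sketch with the rounded values from describe() is shown below; the cut-offs 0.2 and 0.5 are my reading of the Sawilowsky (2009) rules for "small" and should be treated as an assumption.

```r
# Rounded summary values of the Difference variable
mean_diff <- -3.16
sd_diff <- 9.45

# One-sample (paired) Cohen's d: mean difference in standard deviation units
d <- mean_diff / sd_diff
print(round(d, 2))

# Assumed "small" band: 0.2 <= |d| < 0.5
is_small <- abs(d) >= 0.2 && abs(d) < 0.5
print(is_small)
```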
Since the Shapiro-Wilk test indicates that normality is violated, I will also perform the Wilcoxon signed-rank test.
H0: The location of the distribution is the same for both courses.
H1: The location of the distribution is not the same for both courses.
wilcox.test(mydata$math.score, mydata$reading.score,
            paired = TRUE,
            correct = FALSE,
            exact = FALSE,
            alternative = "two.sided")
##
## Wilcoxon signed rank test
##
## data: mydata$math.score and mydata$reading.score
## V = 3199, p-value = 8.516e-05
## alternative hypothesis: true location shift is not equal to 0
Based on the p-value of this corresponding non-parametric test, I can reject the null hypothesis (p < 0.001). This means that the location of the distribution is not the same for the two courses.
In this case, both tests lead to the same conclusion.
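To make the V statistic reported by wilcox.test() more concrete, the sketch below computes it by hand for five hypothetical pairs of scores (invented for illustration, with no ties and no zero differences). V is the sum of the ranks of the absolute differences that belong to positive differences.

```r
# Hypothetical paired scores for five students
math <- c(10, 12, 9, 15, 11)
reading <- c(8, 13, 5, 9, 14)

d <- math - reading          # signed differences
r <- rank(abs(d))            # ranks of the absolute differences
v_manual <- sum(r[d > 0])    # sum of ranks belonging to positive differences

# The built-in test should report the same statistic
v_builtin <- unname(wilcox.test(math, reading, paired = TRUE)$statistic)
print(c(manual = v_manual, builtin = v_builtin))
```

In real data, pairs with a difference of exactly zero are dropped before ranking, and tied absolute differences receive average ranks.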