library(readr)
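# stringsAsFactors = TRUE keeps Section and Sex as factors, matching the summary() output shown later
df <- read.csv("~/Downloads/Data for Assignment 02.csv", stringsAsFactors = TRUE)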
performance <- boxplot(df$Quiz.1, df$Quiz.2, df$Quiz.3,
                       xlab = "Quiz",
                       ylab = "Score",
                       names = c("quiz1", "quiz2", "quiz3"))
The average quiz score increases from quiz 1 to quiz 2 and again from quiz 2 to quiz 3.
Paired Sample T-Test between Quiz 1 and Quiz 2
H0: The true mean difference (mu_d) between quizzes 1 and 2 is zero (mu_d = 0).
HA: The true mean difference between quizzes 1 and 2 is not equal to zero (mu_d ≠ 0).
ttest_q1q2 <- t.test(df$Quiz.1 , df$Quiz.2, paired = TRUE)
ttest_q1q2
##
## Paired t-test
##
## data: df$Quiz.1 and df$Quiz.2
## t = -1.8061, df = 37, p-value = 0.07904
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.39594973 0.08016026
## sample estimates:
## mean of the differences
## -0.6578947
Analysis: The p-value of 0.07904 is greater than the 0.05 significance level (the allowed type I error rate), so we fail to reject the null hypothesis that the mean difference between quiz 1 and quiz 2 is zero. The 95% confidence interval for the mean difference (-1.396 to 0.080) contains zero, so at the 5% level there is no statistically significant difference between quiz 1 and quiz 2 scores.
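The numbers behind this conclusion can be rebuilt from the printed output alone. The sketch below is a consistency check, not part of the original analysis: it copies the t statistic, degrees of freedom, and mean difference from the output above (the variable names are illustrative) and recomputes the standard error, the two-sided p-value, and the 95% confidence interval.

```r
# Values copied from the printed paired t-test for Quiz 1 vs Quiz 2
t_stat    <- -1.8061
dof       <- 37
mean_diff <- -0.6578947

# Standard error of the mean difference, since t = mean_diff / se
se <- mean_diff / t_stat

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), dof)

# 95% confidence interval: mean_diff +/- critical value * se
ci <- mean_diff + c(-1, 1) * qt(0.975, dof) * se

round(c(se = se, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

Because the interval straddles zero, the check agrees with the decision not to reject H0.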
Paired Sample T-Test between Quiz 2 and Quiz 3
H0: The true mean difference (mu_d) between quizzes 2 and 3 is zero (mu_d = 0).
HA: The true mean difference between quizzes 2 and 3 is not equal to zero (mu_d ≠ 0).
ttest_q2q3 <- t.test(df$Quiz.2 , df$Quiz.3, paired = TRUE)
ttest_q2q3
##
## Paired t-test
##
## data: df$Quiz.2 and df$Quiz.3
## t = -4.7316, df = 37, p-value = 3.223e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.2175118 -0.8877513
## sample estimates:
## mean of the differences
## -1.552632
Analysis: The p-value of 0.00003223 is less than the 0.05 significance level (the allowed type I error rate), so we reject the null hypothesis in favor of the alternative hypothesis. The 95% confidence interval for the mean difference (-2.218 to -0.888) excludes zero, so quiz 3 scores are significantly higher than quiz 2 scores, by about 1.55 points on average.
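As a worked check of the interval reported for quiz 2 versus quiz 3, the paired t statistic can be unpacked using only the printed values. The critical value $t_{0.975,\,37} \approx 2.026$ is taken from a standard t table and is the only number not shown in the output.

$$
t = \frac{\bar{d}}{s_d/\sqrt{n}}, \qquad n = \mathrm{df} + 1 = 38
$$

$$
\mathrm{SE} = \frac{s_d}{\sqrt{n}} = \frac{\bar{d}}{t} = \frac{-1.552632}{-4.7316} \approx 0.3281
$$

$$
\bar{d} \pm t_{0.975,\,37}\,\mathrm{SE} \approx -1.5526 \pm 2.026 \times 0.3281 \approx -1.5526 \pm 0.665 = (-2.218,\ -0.888)
$$

This agrees with the printed interval to three decimal places.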
In sections A and C, scores increase steadily from quiz 1 to quiz 3. Section B differs: its scores fluctuate without a consistent trend. Overall, the boxplot suggests that the pattern of scores depends on the section.
sectionA <- subset(df,df$Section == "A")
sectionB <- subset(df,df$Section == "B")
sectionC <- subset(df,df$Section == "C")
performance_section1 <- boxplot(
  sectionA$Quiz.1, sectionA$Quiz.2, sectionA$Quiz.3,
  sectionB$Quiz.1, sectionB$Quiz.2, sectionB$Quiz.3,
  sectionC$Quiz.1, sectionC$Quiz.2, sectionC$Quiz.3,
  xlab = "Quiz",
  ylab = "Score",
  names = c("1A", "2A", "3A", "1B", "2B", "3B", "1C", "2C", "3C")
)
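The per-section trend read off the boxplot could also be checked numerically. The sketch below shows one way to do that with `aggregate()`. It uses a small made-up data frame (`toy`, hypothetical values) with the same column names as `df`, so its numbers say nothing about the real data. Swapping `toy` for `df` should give the real section means.

```r
# Hypothetical toy data with the same column names as df (values are made up)
toy <- data.frame(
  Section = c("A", "A", "B", "B", "C", "C"),
  Quiz.1  = c(3, 5, 4, 6, 2, 4),
  Quiz.2  = c(4, 6, 3, 5, 4, 6),
  Quiz.3  = c(6, 8, 5, 7, 6, 8)
)

# Mean of each quiz within each section (one row per section)
section_means <- aggregate(cbind(Quiz.1, Quiz.2, Quiz.3) ~ Section,
                           data = toy, FUN = mean)
section_means

# TRUE where the section mean rises from each quiz to the next
rising <- with(section_means, Quiz.1 < Quiz.2 & Quiz.2 < Quiz.3)
setNames(rising, section_means$Section)
```

In the toy data, sections A and C rise at every step and section B does not, which is the kind of pattern the text describes.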
For quiz 1 and quiz 2, the median scores for females and males are the same (4 and 5, respectively), and the means differ only modestly (4.44 vs. 3.80 for quiz 1; 4.72 vs. 4.80 for quiz 2). For quiz 3, females score higher than males in both median (7 vs. 6) and mean (7.00 vs. 5.70).
female <- subset(df, df$Sex == "F")
male <- subset(df, df$Sex == "M")
performance_gender <- boxplot(female$Quiz.1, male$Quiz.1,
                              female$Quiz.2, male$Quiz.2,
                              female$Quiz.3, male$Quiz.3,
                              xlab = "Quiz",
                              ylab = "Score",
                              names = c("1F", "1M", "2F", "2M", "3F", "3M"))
summary(female)
## Section Sex Quiz.1 Quiz.2 Quiz.3
## A:6 F:18 Min. :2.000 Min. :2.000 Min. :5
## B:7 M: 0 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:6
## C:5 Median :4.000 Median :5.000 Median :7
## Mean :4.444 Mean :4.722 Mean :7
## 3rd Qu.:5.750 3rd Qu.:5.750 3rd Qu.:8
## Max. :8.000 Max. :7.000 Max. :9
summary(male)
## Section Sex Quiz.1 Quiz.2 Quiz.3
## A:6 F: 0 Min. :2.0 Min. :3.00 Min. :2.0
## B:6 M:20 1st Qu.:3.0 1st Qu.:4.00 1st Qu.:4.0
## C:8 Median :4.0 Median :5.00 Median :6.0
## Mean :3.8 Mean :4.80 Mean :5.7
## 3rd Qu.:5.0 3rd Qu.:5.25 3rd Qu.:7.0
## Max. :6.0 Max. :7.00 Max. :9.0
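The two summaries above are enough to recover the overall mean of each quiz, which ties the by-sex view back to the paired t-tests. The sketch below copies the group sizes and means from the printed output (the variable names are illustrative) and takes a size-weighted average. Because the printed means are rounded, the results match the t-test estimates only to about three decimal places.

```r
# Group sizes and means copied from summary(female) and summary(male)
n <- c(F = 18, M = 20)
group_means <- rbind(
  Quiz.1 = c(F = 4.444, M = 3.8),
  Quiz.2 = c(F = 4.722, M = 4.80),
  Quiz.3 = c(F = 7,     M = 5.7)
)

# Overall mean per quiz = size-weighted average of the two group means
overall <- apply(group_means, 1, weighted.mean, w = n)
round(overall, 3)

# Quiz-to-quiz differences; these should match the paired t-test estimates
round(c(q1_minus_q2 = overall[["Quiz.1"]] - overall[["Quiz.2"]],
        q2_minus_q3 = overall[["Quiz.2"]] - overall[["Quiz.3"]]), 3)
```

The differences come out close to -0.658 and -1.553, in line with the "mean of the differences" values reported by the two paired t-tests.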
As seen in the graph below, females have a higher average score than males in sections B and C for quiz 1, while males have a slightly higher average in section A. Sex therefore appears to have little influence on the quiz 1 average in section A, whereas females score noticeably higher than males in sections B and C (these differences are read from the plot and were not formally tested).
# Mean Quiz 1 score by section, one line per sex
interaction.plot(x.factor = df$Section,
                 trace.factor = df$Sex,
                 response = df$Quiz.1,
                 fun = mean,
                 type = "b",
                 xlab = "Section",
                 ylab = "mean of Quiz1",
                 col = c("red", "blue"),
                 pch = c(19, 17),  # one symbol per level of Sex (F, M)
                 fixed = TRUE)
As seen in the graph below, the quiz 2 mean for males is noticeably lower than the mean for females in sections A and B, while in section C the mean for males is noticeably higher than the mean for females. Sex appears to matter most for females in section A and for males in section B (these differences are read from the plot and were not formally tested).
# Mean Quiz 2 score by section, one line per sex
interaction.plot(x.factor = df$Section,
                 trace.factor = df$Sex,
                 response = df$Quiz.2,
                 fun = mean,
                 type = "b",
                 xlab = "Section",
                 ylab = "mean of Quiz2",
                 col = c("red", "blue"),
                 pch = c(19, 17),  # one symbol per level of Sex (F, M)
                 fixed = TRUE)
As seen in the graph below, the quiz 3 mean for females is noticeably higher than the mean for males in sections A and C, while the two means are almost the same in section B. Sex appears to matter most in sections A and C, where females score higher (these differences are read from the plot and were not formally tested).
# Mean Quiz 3 score by section, one line per sex
interaction.plot(x.factor = df$Section,
                 trace.factor = df$Sex,
                 response = df$Quiz.3,
                 fun = mean,
                 type = "b",
                 xlab = "Section",
                 ylab = "mean of Quiz3",
                 col = c("red", "blue"),
                 pch = c(19, 17),  # one symbol per level of Sex (F, M)
                 fixed = TRUE)
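Each interaction plot above draws one point per Section-by-Sex cell, namely the mean returned by `fun = mean`. Tabulating those cell means makes the gaps easier to compare than reading them off the plot. The sketch below shows one way to do that with `tapply()`. It uses a small made-up data frame (`toy`, hypothetical values) with the same column names as `df`, so the numbers are purely illustrative.

```r
# Hypothetical toy data with the same column names as df (values are made up)
toy <- data.frame(
  Section = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Sex     = c("F", "F", "M", "M", "F", "F", "M", "M"),
  Quiz.1  = c(4, 6, 5, 5, 6, 8, 3, 5)
)

# Cell means: rows are sections, columns are sexes
cell_means <- tapply(toy$Quiz.1, list(toy$Section, toy$Sex), mean)
cell_means

# Female minus male gap within each section
gap <- cell_means[, "F"] - cell_means[, "M"]
gap
```

Replacing `toy` with `df`, and `Quiz.1` with `Quiz.2` or `Quiz.3`, should give the values plotted in each of the three graphs.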