Import data

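# read.csv() is base R, so library(readr) is not needed.
# stringsAsFactors = TRUE keeps Section and Sex as factors (the default before
# R 4.0), which the summary() output further down relies on.
df <- read.csv("~/Downloads/Data for Assignment 02.csv", stringsAsFactors = TRUE)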

Part A

performance <- boxplot(df$Quiz.1 , df$Quiz.2, df$Quiz.3, 
                       xlab = "Quiz", 
                       ylab = "Score", 
                       names = c("quiz1","quiz2","quiz3"))

The average quiz score increases from quiz 1 to quiz 2 and again from quiz 2 to quiz 3.
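A boxplot marks medians rather than means, so a claim about averages is easier to check numerically. The sketch below uses a small hypothetical data frame (`toy`, not the assignment data) with the same `Quiz.1` to `Quiz.3` column names. With the real data, `df` would take the place of `toy`.

```r
# Hypothetical stand-in for the assignment data (not the real scores)
toy <- data.frame(
  Quiz.1 = c(3, 4, 5, 2, 6),
  Quiz.2 = c(4, 4, 6, 3, 6),
  Quiz.3 = c(6, 5, 7, 5, 8)
)

# Mean score per quiz
quiz_means <- colMeans(toy)
print(quiz_means)

# Change in the mean from one quiz to the next; positive values mean an increase
quiz_change <- diff(quiz_means)
print(quiz_change)
```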

Paired t-test

Paired-sample t-test between Quiz 1 and Quiz 2

H0: The true mean difference (mud) between quizzes 1 and 2 is zero (mud = 0).

HA: The true mean difference between quizzes 1 and 2 is not equal to zero (mud != 0).

ttest_q1q2 <- t.test(df$Quiz.1 , df$Quiz.2, paired = TRUE)
ttest_q1q2
## 
##  Paired t-test
## 
## data:  df$Quiz.1 and df$Quiz.2
## t = -1.8061, df = 37, p-value = 0.07904
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.39594973  0.08016026
## sample estimates:
## mean of the differences 
##              -0.6578947

Analysis: Given that the p-value is 0.07904, which is greater than the 0.05 significance level (the allowed type I error rate), we fail to reject the null hypothesis that the mean difference between quizzes 1 and 2 is zero. The 95% confidence interval for the mean difference (-1.40, 0.08) includes zero, so there is no statistically significant difference between quiz 1 and quiz 2 scores.
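For reference, a paired t-test is a one-sample t-test on the per-student differences, with n - 1 degrees of freedom. The sketch below reproduces the statistic by hand on hypothetical scores (`quiz_a` and `quiz_b` are illustrative vectors, not the assignment data) and checks it against `t.test()`.

```r
# Hypothetical paired scores for six students (not the assignment data)
quiz_a <- c(3, 4, 5, 2, 6, 4)
quiz_b <- c(4, 4, 6, 3, 6, 6)

d <- quiz_a - quiz_b                      # per-student differences
n <- length(d)

# t = mean difference divided by its standard error
t_manual <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Same test via t.test()
res <- t.test(quiz_a, quiz_b, paired = TRUE)

print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(c(p_manual = p_manual, p_builtin = res$p.value))
```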

Paired t-test

Paired-sample t-test between Quiz 2 and Quiz 3

H0: The true mean difference (mud) between quizzes 2 and 3 is zero (mud = 0).

HA: The true mean difference between quizzes 2 and 3 is not equal to zero (mud != 0).

ttest_q2q3 <- t.test(df$Quiz.2 , df$Quiz.3, paired = TRUE)
ttest_q2q3
## 
##  Paired t-test
## 
## data:  df$Quiz.2 and df$Quiz.3
## t = -4.7316, df = 37, p-value = 3.223e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.2175118 -0.8877513
## sample estimates:
## mean of the differences 
##               -1.552632

Analysis: Given that the p-value is 0.00003223, which is less than the 0.05 significance level (the allowed type I error rate), we reject the null hypothesis in favor of the alternative hypothesis. The 95% confidence interval for the mean difference (-2.22, -0.89) excludes zero, so quiz 3 scores are significantly higher than quiz 2 scores, by about 1.55 points on average.
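The two decisions can be put side by side. The sketch below copies the p-values and confidence limits from the two `t.test()` printouts above into a small table (the name `results` and its column names are illustrative). It shows that, for a two-sided test at the 5% level, rejecting H0 goes hand in hand with a 95% interval that excludes zero.

```r
# Values copied from the two t.test() printouts above
results <- data.frame(
  comparison = c("Quiz 1 vs Quiz 2", "Quiz 2 vs Quiz 3"),
  p_value    = c(0.07904, 3.223e-05),
  ci_lower   = c(-1.39594973, -2.2175118),
  ci_upper   = c(0.08016026, -0.8877513)
)

alpha <- 0.05

# Decision based on the p-value
results$reject_h0 <- results$p_value < alpha

# Does the 95% confidence interval contain zero?
results$ci_covers_0 <- results$ci_lower <= 0 & results$ci_upper >= 0

print(results)
```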

Part B

Does section make a difference?

Sections A and C show a steady increase in scores from quiz 1 to quiz 3. Section B differs because its scores fluctuate in an irregular pattern. Overall, the boxplots suggest that section makes a difference in scoring.

# One data frame per section; subset() evaluates column names inside df
sectionA <- subset(df, Section == "A")
sectionB <- subset(df, Section == "B")
sectionC <- subset(df, Section == "C")

# Boxes are grouped by section; each label is the quiz number followed by the section
performance_section1 <- boxplot(sectionA$Quiz.1, sectionA$Quiz.2, sectionA$Quiz.3,
                                sectionB$Quiz.1, sectionB$Quiz.2, sectionB$Quiz.3,
                                sectionC$Quiz.1, sectionC$Quiz.2, sectionC$Quiz.3,
                                xlab = "Quiz and section",
                                ylab = "Score",
                                names = c("1A", "2A", "3A", "1B", "2B", "3B", "1C", "2C", "3C"))
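The per-section trend can also be read from a table of means instead of nine boxes. This is a minimal sketch on a hypothetical data frame (`toy`, not the assignment data) that uses the same column names as `df`. `aggregate()` with a formula returns one row per section.

```r
# Hypothetical stand-in for the assignment data (not the real scores)
toy <- data.frame(
  Section = factor(c("A", "A", "B", "B", "C", "C")),
  Quiz.1  = c(3, 5, 4, 6, 2, 4),
  Quiz.2  = c(4, 6, 3, 5, 4, 4),
  Quiz.3  = c(6, 8, 5, 7, 5, 7)
)

# Mean of each quiz within each section
section_means <- aggregate(cbind(Quiz.1, Quiz.2, Quiz.3) ~ Section,
                           data = toy, FUN = mean)
print(section_means)
```

Reading across a row shows how a section's mean moves from quiz to quiz. In this toy data, section B dips on quiz 2 while A and C do not.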

Does sex make a difference?

For quiz 1, females and males have the same median (4), although the female mean is somewhat higher (4.44 vs. 3.80). For quiz 2, the medians are again equal (5) and the means are nearly identical (4.72 vs. 4.80). For quiz 3, females score higher on both the median (7 vs. 6) and the mean (7.0 vs. 5.7).

female <- subset(df, Sex == "F")
male   <- subset(df, Sex == "M")

# Boxes are paired by quiz; each label is the quiz number followed by the sex
performance_gender <- boxplot(female$Quiz.1, male$Quiz.1,
                              female$Quiz.2, male$Quiz.2,
                              female$Quiz.3, male$Quiz.3,
                              xlab = "Quiz and sex",
                              ylab = "Score",
                              names = c("1F", "1M", "2F", "2M", "3F", "3M"))

summary(female)
##  Section Sex        Quiz.1          Quiz.2          Quiz.3 
##  A:6     F:18   Min.   :2.000   Min.   :2.000   Min.   :5  
##  B:7     M: 0   1st Qu.:3.000   1st Qu.:4.000   1st Qu.:6  
##  C:5            Median :4.000   Median :5.000   Median :7  
##                 Mean   :4.444   Mean   :4.722   Mean   :7  
##                 3rd Qu.:5.750   3rd Qu.:5.750   3rd Qu.:8  
##                 Max.   :8.000   Max.   :7.000   Max.   :9
summary(male)
##  Section Sex        Quiz.1        Quiz.2         Quiz.3   
##  A:6     F: 0   Min.   :2.0   Min.   :3.00   Min.   :2.0  
##  B:6     M:20   1st Qu.:3.0   1st Qu.:4.00   1st Qu.:4.0  
##  C:8            Median :4.0   Median :5.00   Median :6.0  
##                 Mean   :3.8   Mean   :4.80   Mean   :5.7  
##                 3rd Qu.:5.0   3rd Qu.:5.25   3rd Qu.:7.0  
##                 Max.   :6.0   Max.   :7.00   Max.   :9.0
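Because the boxplot line marks the median while `summary()` reports both statistics, it can help to tabulate mean and median per sex side by side. The sketch below uses a hypothetical data frame (`toy`, not the assignment data) in which the two groups share a median but differ in mean.

```r
# Hypothetical stand-in for the assignment data (not the real scores)
toy <- data.frame(
  Sex    = factor(c("F", "F", "F", "M", "M", "M")),
  Quiz.1 = c(2, 4, 9, 3, 4, 5)
)

# Split the scores by sex, then compute mean and median for each group;
# the result is a matrix with one column per sex
by_sex <- sapply(split(toy$Quiz.1, toy$Sex),
                 function(x) c(mean = mean(x), median = median(x)))
print(by_sex)
```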

Part C

Quiz 1

As seen in the graph below, females have a higher mean score than males in sections B and C for quiz 1, while males have a slightly higher mean score in section A. Sex therefore has little influence on the mean score in section A, but females score noticeably higher than males in sections B and C.

interaction.plot(x.factor     = df$Section,
                 trace.factor = df$Sex,
                 response     = df$Quiz.1,
                 fun          = mean,
                 type         = "b",
                 xlab         = "Section",
                 ylab         = "Mean of Quiz 1",
                 col          = c("red", "blue"),
                 pch          = c(19, 17),  # one symbol per sex
                 fixed        = TRUE)
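The interaction plot draws one point per section and sex combination, each being the mean score in that cell. The sketch below builds the same table with `tapply()` on a hypothetical data frame (`toy`, not the assignment data), so that the gap between the sexes in each section can be read off as a number.

```r
# Hypothetical stand-in for the assignment data (not the real scores)
toy <- data.frame(
  Section = factor(c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C")),
  Sex     = factor(c("F", "F", "M", "M", "F", "F", "M", "M", "F", "F", "M", "M")),
  Quiz.1  = c(3, 5, 4, 6, 6, 8, 3, 5, 5, 7, 2, 4)
)

# Section-by-sex table of mean scores (rows: sections, columns: sexes)
cell_means <- with(toy, tapply(Quiz.1, list(Section, Sex), mean))
print(cell_means)

# Female minus male mean in each section; a gap that changes size or sign
# across sections is what non-parallel lines in the plot indicate
gap <- cell_means[, "F"] - cell_means[, "M"]
print(gap)
```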

Quiz 2

As seen in the graph below, the mean for males is noticeably lower than the mean for females in sections A and B for quiz 2, while in section C the mean for males is noticeably higher than the mean for females. Sex had the most influence for females in section A and for males in section B in quiz 2.

interaction.plot(x.factor     = df$Section,
                 trace.factor = df$Sex,
                 response     = df$Quiz.2,
                 fun          = mean,
                 type         = "b",
                 xlab         = "Section",
                 ylab         = "Mean of Quiz 2",
                 col          = c("red", "blue"),
                 pch          = c(19, 17),  # one symbol per sex
                 fixed        = TRUE)

Quiz 3

As seen in the graph below, the mean for females is noticeably higher than the mean for males in sections A and C for quiz 3, while in section B the two means are almost the same. Sex had the most influence for females in sections A and C in quiz 3.

interaction.plot(x.factor     = df$Section,
                 trace.factor = df$Sex,
                 response     = df$Quiz.3,
                 fun          = mean,
                 type         = "b",
                 xlab         = "Section",
                 ylab         = "Mean of Quiz 3",
                 col          = c("red", "blue"),
                 pch          = c(19, 17),  # one symbol per sex
                 fixed        = TRUE)