Sample: Using t-test with Student Performance Data

Evaluating if there is a significant difference in final grades between the tutored and non-tutored groups of students

Megan Georges
2022-07-30

Student Performance Dataset

For this example, I’ll be using a publicly available student performance dataset from Kaggle.

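# Load the tidyverse (provides read_csv(), the %>% pipe, dplyr verbs, and ggplot2)
library(tidyverse)

# Read data in
educMETA <- read_csv("../student_data.csv")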

# Display first few rows
head(educMETA)
# A tibble: 6 x 33
  school sex     age address famsize Pstatus  Medu  Fedu Mjob    Fjob 
  <chr>  <chr> <dbl> <chr>   <chr>   <chr>   <dbl> <dbl> <chr>   <chr>
1 GP     F        18 U       GT3     A           4     4 at_home teac~
2 GP     F        17 U       GT3     T           1     1 at_home other
3 GP     F        15 U       LE3     T           1     1 at_home other
4 GP     F        15 U       GT3     T           4     2 health  serv~
5 GP     F        16 U       GT3     T           3     3 other   other
6 GP     M        16 U       LE3     T           4     3 servic~ other
# ... with 23 more variables: reason <chr>, guardian <chr>,
#   traveltime <dbl>, studytime <dbl>, failures <dbl>,
#   schoolsup <chr>, famsup <chr>, paid <chr>, activities <chr>,
#   nursery <chr>, higher <chr>, internet <chr>, romantic <chr>,
#   famrel <dbl>, freetime <dbl>, goout <dbl>, Dalc <dbl>,
#   Walc <dbl>, health <dbl>, absences <dbl>, G1 <dbl>, G2 <dbl>,
#   G3 <dbl>
# i Use `colnames()` to see all variable names

Analysis Goal

Let’s say we’ve been asked to determine whether participation in extra tutoring has an effect on final grades.

First, let’s get some descriptive statistics for the two variables we need: paid (whether the student received extra tutoring) and G3 (the final grade).

educMETA %>%
  group_by(paid) %>%
  summarise(counts = n())
# A tibble: 2 x 2
  paid  counts
  <chr>  <int>
1 no       214
2 yes      181

214 students did not receive tutoring and 181 did receive tutoring.
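The per-group counts extend naturally to per-group summaries of the final grade. Below is a minimal sketch of that idea. It assumes a simulated stand-in for educMETA: the sim data frame and its values are hypothetical, not the Kaggle data, and only the group sizes are taken from the counts above.

```r
library(dplyr)

# Simulated stand-in for educMETA (hypothetical values, NOT the Kaggle data).
# Group sizes match the counts reported above; grades are clipped to 0-20.
set.seed(1)
sim <- data.frame(
  paid = rep(c("no", "yes"), times = c(214, 181)),
  G3   = pmin(pmax(round(rnorm(395, mean = 10.5, sd = 4.5)), 0), 20)
)

# One row per group: size, mean, standard deviation, and median of G3
group_stats <- sim %>%
  group_by(paid) %>%
  summarise(n         = n(),
            mean_G3   = mean(G3),
            sd_G3     = sd(G3),
            median_G3 = median(G3))

group_stats
```

With the real data, the same pipeline would start from educMETA instead of sim.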

summary(educMETA$G3)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    8.00   11.00   10.42   14.00   20.00 

The lowest score in the sample is 0 and the highest is 20, with a median of 11 and a mean of 10.42. Because the mean sits only slightly below the median, the data may be roughly normally distributed, with some extreme scores pulling the lower half of the grades down. Let’s see how a histogram looks.

hist(educMETA$G3, 
     main = "Final Grades Histogram", 
     xlab = "Final Grade")

As suspected, grades of 0 occur more often than a normal distribution would predict, but otherwise the data looks roughly normal.
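A normal Q-Q plot is one common way to back up a visual judgment made from a histogram. The sketch below uses base R's qqnorm() and qqline() on simulated grades. The g3_sim vector is a hypothetical stand-in, so the plot it produces says nothing about the real dataset.

```r
# Simulated grades clipped to the 0-20 scale (hypothetical, NOT the Kaggle data)
set.seed(1)
g3_sim <- pmin(pmax(round(rnorm(395, mean = 10.5, sd = 4.5)), 0), 20)

# Points that follow the reference line suggest approximate normality;
# a pile-up of points at the low end would show up as a flat run at the bottom left
qqnorm(g3_sim, main = "Normal Q-Q Plot of Final Grades (simulated)")
qqline(g3_sim)
```

With the real data, the equivalent call would be qqnorm(educMETA$G3) followed by qqline(educMETA$G3).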

Running t-test

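Now, I’ll run a t-test to determine whether there’s a statistically significant difference in mean final grades between students with and without extra tutoring. We’ll use a significance level of 0.05 as the threshold. Note that R’s t.test() defaults to Welch’s version of the test, which does not assume equal variances in the two groups.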

t.test(G3 ~ paid, data = educMETA)

    Welch Two Sample t-test

data:  G3 by paid
t = -2.0831, df = 386.36, p-value = 0.0379
alternative hypothesis: true difference in means between group no and group yes is not equal to 0
95 percent confidence interval:
 -1.82075017 -0.05259108
sample estimates:
 mean in group no mean in group yes 
         9.985981         10.922652 

The average final grade is 9.99 for students without tutoring and 10.92 for students with tutoring. Because p = 0.038 is below the 0.05 threshold, there is sufficient evidence that mean final grades differ between the two groups (Welch’s t = -2.08, df = 386.36, p = 0.038). The 95% confidence interval indicates that the tutored group’s mean final grade is between 0.05 and 1.82 points higher than the non-tutored group’s.
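To show where the t statistic and the fractional degrees of freedom come from, the sketch below computes Welch's test by hand and compares it with t.test(). It runs on simulated data: the sim data frame, its group means, and its spread are assumptions made for illustration, so the numbers will not match the output above.

```r
# Simulated stand-in for educMETA (hypothetical values, NOT the Kaggle data)
set.seed(42)
sim <- data.frame(
  paid = rep(c("no", "yes"), times = c(214, 181)),
  G3   = c(rnorm(214, mean = 10, sd = 4.5), rnorm(181, mean = 11, sd = 4.5))
)

no  <- sim$G3[sim$paid == "no"]
yes <- sim$G3[sim$paid == "yes"]

# Squared standard error of each group mean: s^2 / n
se2_no  <- var(no)  / length(no)
se2_yes <- var(yes) / length(yes)

# Welch t statistic: difference in means over the combined standard error
# ("no" minus "yes", matching the group order t.test() uses)
t_manual <- (mean(no) - mean(yes)) / sqrt(se2_no + se2_yes)

# Welch-Satterthwaite approximation for the degrees of freedom
df_manual <- (se2_no + se2_yes)^2 /
  (se2_no^2 / (length(no) - 1) + se2_yes^2 / (length(yes) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare against the built-in test
res <- t.test(G3 ~ paid, data = sim)
c(t_manual = t_manual, t_builtin = unname(res$statistic))
c(df_manual = df_manual, df_builtin = unname(res$parameter))
c(p_manual = p_manual, p_builtin = res$p.value)
```

The non-integer degrees of freedom in the Welch output come from this approximation, which weights each group by its own variance instead of pooling them.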

# Compute one mean per group, then plot the two group means as bars
educMETA %>%
  group_by(paid) %>%
  summarise(agg = mean(G3)) %>%
  ggplot(aes(paid, agg, fill = paid)) +
  geom_col() +
  labs(title = "Average Final Grade for Tutored vs. Non-Tutored Group",
       x = "Group", y = "Average Final Grade") +
  scale_fill_manual(name = "Group", labels = c("Not Tutored", "Tutored"),
                    values = c("tomato1", "dodgerblue4")) +
  theme_bw() +
  # The legend already labels the groups, so hide the redundant x-axis labels
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

While we already knew the average score was higher for students who participated in tutoring, we can now say the difference is statistically significant. However, both groups’ average grades remain low on a 20-point scale, so I recommend exploring other variables to identify additional factors that can be addressed, or tools that can be used, to raise final grades.