During the admissions process for the 2024 cohort of the Master’s program in Infectious Diseases and Tropical Medicine at the Federal University of Minas Gerais (UFMG, Brazil), candidates whose projects were approved in the first phase were split into two groups for the oral presentations. One group presented on December 4th and the other on December 6th. This study analyzes the grade distributions of the two groups, explores their discrepancies, and formulates hypotheses to explain them.
The data were extracted directly from the webpage of the admissions process for the 2024 cohort (in Portuguese). The presentation schedules can be accessed directly here (12/4) and here (12/6), and the data were loaded into R using the following commands.
## IDs of the students who presented; M06 is not included because the student
## withdrew from the admissions process before presenting
ID <- c("IMT-M04", "IMT-M05", "IMT-M07", "IMT-M08", "IMT-M09","IMT-M10",
"IMT-M11", "IMT-M12", "IMT-M13", "IMT-M14","IMT-M15","IMT-M16",
"IMT-M17", "IMT-M19", "IMT-M20", "IMT-M21", "IMT-M22",
"IMT-M23", "IMT-M24", "IMT-M25", "IMT-M27")
## Respective grades
grade <- c(81.00, 76.50, 74.50, 79.00, 76.00, 62.50, 76.00, 71.00, 65.00, 73.00,
90.80, 97.35, 91.09, 90.69, 78.14, 88.45, 85.50, 97.65, 82.10, 71.50,
92.85)
## Grades for the first phase
gradeFirstPhase <- c(78.01, 78.06, 73.04, 83.65, 73.88, 75.27, 80.16, 73.02,
75.11, 77.54, 76.60, 77.00, 80.16, 82.87, 74.54, 74.87,
79.80, 82.28, 80.55, 80.14, 80.52)
## factor variable describing which students presented on which day, the first
## ten students presented on the first day and the latter eleven on the second day
day <- factor(rep(c("day 1", "day 2"), c(10, 11)))
## loading dplyr to construct a tibble, which is an easier way to present the data
library(dplyr)
## creating the tibble
tbl1<-tibble(ID = ID, Grade = grade, Day = day, GradeFirstPhase = gradeFirstPhase)
## printing it in its entirety
print(tbl1, n=21)
## # A tibble: 21 × 4
## ID Grade Day GradeFirstPhase
## <chr> <dbl> <fct> <dbl>
## 1 IMT-M04 81 day 1 78.0
## 2 IMT-M05 76.5 day 1 78.1
## 3 IMT-M07 74.5 day 1 73.0
## 4 IMT-M08 79 day 1 83.6
## 5 IMT-M09 76 day 1 73.9
## 6 IMT-M10 62.5 day 1 75.3
## 7 IMT-M11 76 day 1 80.2
## 8 IMT-M12 71 day 1 73.0
## 9 IMT-M13 65 day 1 75.1
## 10 IMT-M14 73 day 1 77.5
## 11 IMT-M15 90.8 day 2 76.6
## 12 IMT-M16 97.4 day 2 77
## 13 IMT-M17 91.1 day 2 80.2
## 14 IMT-M19 90.7 day 2 82.9
## 15 IMT-M20 78.1 day 2 74.5
## 16 IMT-M21 88.4 day 2 74.9
## 17 IMT-M22 85.5 day 2 79.8
## 18 IMT-M23 97.6 day 2 82.3
## 19 IMT-M24 82.1 day 2 80.6
## 20 IMT-M25 71.5 day 2 80.1
## 21 IMT-M27 92.8 day 2 80.5
The resulting tibble can be checked against the results table published on the website to confirm its validity.
A short analysis was undertaken to describe the data and to make inferences about the differences between the two groups.
The mean, median, standard deviation, interquartile range, minimum and maximum of the grades were calculated for each group and for the overall distribution.
## grouping by day and then employing summary statistics
tbl1 %>% group_by(Day) %>% summarise(N = n(), Grade.mean = mean(Grade),
Grade.median = median(Grade),
Grade.std.dev = sd(Grade),
Grade.IQR = IQR(Grade),
Grade.minimal = min(Grade),
Grade.maximum = max(Grade))-> summarised
## creating summary statistics for the entire distribution
tbl1 %>% summarise(N = n(), Grade.mean = mean(Grade),
Grade.median = median(Grade),
Grade.std.dev = sd(Grade),
Grade.IQR = IQR(Grade),
Grade.minimal = min(Grade),
Grade.maximum = max(Grade)) %>%
mutate (Day = "Overall") -> summarisedTotal
## printing the descriptive statistics by group and overall
rbind(summarised, summarisedTotal)
## # A tibble: 3 × 8
## Day N Grade.mean Grade.median Grade.std.dev Grade.IQR Grade.minimal
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 day 1 10 73.4 75.2 5.86 4.88 62.5
## 2 day 2 11 87.8 90.7 7.99 8.17 71.5
## 3 Overall 21 81.0 79 10.1 16.2 62.5
## # ℹ 1 more variable: Grade.maximum <dbl>
From these values, one can already see that the mean grade on day 2 (87.8) is considerably higher than the maximum grade on day 1 (81.0).
Density plots were built for the distribution of grades as a whole and then split by group. Solid lines represent the mean and dashed lines represent the median.
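The comparison above can be checked directly. The following is a minimal base-R sketch; the vectors `day1` and `day2` are names introduced here for illustration and simply repeat the oral-presentation grades listed earlier.

```r
## Oral-presentation grades copied from the data above and split by day
## (day1 and day2 are illustrative names, not part of the original script)
day1 <- c(81.00, 76.50, 74.50, 79.00, 76.00, 62.50, 76.00, 71.00, 65.00, 73.00)
day2 <- c(90.80, 97.35, 91.09, 90.69, 78.14, 88.45, 85.50, 97.65, 82.10, 71.50,
          92.85)

## highest grade awarded on day 1 and mean grade on day 2
max_day1  <- max(day1)
mean_day2 <- mean(day2)

## TRUE if the day 2 mean exceeds the best day 1 grade
mean_day2 > max_day1

## number of day 2 candidates who scored above the best day 1 grade
n_above <- sum(day2 > max_day1)
n_above
```

With the grades listed above, 9 of the 11 day 2 candidates scored above the highest day 1 grade (81.00).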
## Loading ggplot2 to build plots
library(ggplot2)
## creating a density plot
ggplot(tbl1, aes(x=Grade)) + geom_density() +
## adding a vertical solid line for the mean
geom_vline(aes(xintercept = mean(Grade))) +
## adding a vertical dashed line for the median
geom_vline(aes(xintercept = median(Grade)),
linetype = "dashed")
## creating a density plot for the distributions in day 1 and 2 separately
ggplot(tbl1, aes(x=Grade, color = Day)) + geom_density() +
## adding the means and medians for each group
geom_vline(data = summarised, aes(xintercept = Grade.mean, color = Day)) +
geom_vline(data = summarised, aes(xintercept = Grade.median, color = Day),
linetype = "dashed") +
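## labelling the lines; annotate() draws each label once, whereas geom_text()
## with constants inside aes() would redraw it for every row of tbl1
annotate("text", x = 73.45, y = 0.0375, label = "\nMean",
colour = "red", angle = 90, family = "sans") +
annotate("text", x = 75.25, y = 0.0625, label = "\nMedian",
colour = "red", angle = 90, family = "sans") +
annotate("text", x = 87.82909, y = 0.075, label = "\nMean",
colour = "cyan", angle = 90, family = "sans") +
annotate("text", x = 90.69, y = 0.0625, label = "\nMedian",
colour = "cyan", angle = 90, family = "sans")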
The density plot for the overall distribution resembles a bimodal distribution, and the two modes correspond to the two groups once the data are split by day.
To decide whether parametric or nonparametric inferential tests were appropriate, the Shapiro-Wilk test for normality was conducted on the pooled distribution of grades.
tbl1 %>% pull(Grade) %>% shapiro.test
##
## Shapiro-Wilk normality test
##
## data: .
## W = 0.96377, p-value = 0.5951
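Because a two-sample t-test concerns the distribution of grades within each group rather than in the pooled sample, a per-group check may also be informative. The following is a hedged base-R sketch and not part of the original analysis; the names `by_day` and `sw_table` are introduced here for illustration, and no particular outcome is assumed.

```r
## Oral-presentation grades and presentation day, copied from the data above
grade <- c(81.00, 76.50, 74.50, 79.00, 76.00, 62.50, 76.00, 71.00, 65.00, 73.00,
           90.80, 97.35, 91.09, 90.69, 78.14, 88.45, 85.50, 97.65, 82.10, 71.50,
           92.85)
day <- factor(rep(c("day 1", "day 2"), c(10, 11)))

## run the Shapiro-Wilk test separately on the grades of each day
## (by_day is an illustrative name)
by_day <- lapply(split(grade, day), shapiro.test)

## collect the W statistic and p-value of each group in a small matrix
sw_table <- sapply(by_day, function(res) {
  c(W = unname(res$statistic), p.value = res$p.value)
})
sw_table
```

With only 10 and 11 observations per group, the test has limited power, so a non-significant result should be read with caution.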
As the Shapiro-Wilk test did not reject the null hypothesis of normality, a two-sample t-test was conducted to determine whether the difference observed between the groups could be attributed to chance. R’s t.test() applies Welch’s version by default, which does not assume equal variances. The test was two-sided, with the significance level (alpha) set at 0.05.
t.test(data = tbl1, Grade ~ Day)
##
## Welch Two Sample t-test
##
## data: Grade by Day
## t = -4.7303, df = 18.232, p-value = 0.0001616
## alternative hypothesis: true difference in means between group day 1 and group day 2 is not equal to 0
## 95 percent confidence interval:
## -20.759616 -7.998566
## sample estimates:
## mean in group day 1 mean in group day 2
## 73.45000 87.82909
The null hypothesis of no difference between the groups was rejected at the significance level stated above. The 95% confidence interval for the difference in mean grades ranges from 8.0 to 20.8 points in favor of day 2.
For the grades awarded in the first phase, in which the written version of the project was examined, the same t-test comparing the two groups gives the following:
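To make the reported statistic easier to follow, the Welch t statistic, its degrees of freedom and the two-sided p-value can be recomputed by hand. This is a minimal base-R sketch under the assumption that the grades are exactly those listed above; `x`, `y`, `se2_x`, `se2_y`, `t_welch`, `df_welch` and `p_welch` are illustrative names.

```r
## Oral-presentation grades and presentation day, copied from the data above
grade <- c(81.00, 76.50, 74.50, 79.00, 76.00, 62.50, 76.00, 71.00, 65.00, 73.00,
           90.80, 97.35, 91.09, 90.69, 78.14, 88.45, 85.50, 97.65, 82.10, 71.50,
           92.85)
day <- factor(rep(c("day 1", "day 2"), c(10, 11)))

x <- grade[day == "day 1"]
y <- grade[day == "day 2"]

## squared standard error of each group mean
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

## Welch t statistic: difference in means over its standard error
t_welch <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

## Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

## two-sided p-value from the t distribution
p_welch <- 2 * pt(-abs(t_welch), df = df_welch)

c(t = t_welch, df = df_welch, p.value = p_welch)
```

These values should agree with the t, df and p-value reported by t.test() above.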
t.test(data = tbl1, GradeFirstPhase ~ Day)
##
## Welch Two Sample t-test
##
## data: GradeFirstPhase by Day
## t = -1.6426, df = 17.639, p-value = 0.1182
## alternative hypothesis: true difference in means between group day 1 and group day 2 is not equal to 0
## 95 percent confidence interval:
## -5.1456523 0.6336523
## sample estimates:
## mean in group day 1 mean in group day 2
## 76.774 79.030
The difference in first-phase grades is not statistically significant (p = 0.12), which suggests that the discrepancy arose in the evaluation of the second phase.
To check that the results do not depend on the assumption that the second-phase grades are roughly normally distributed, a Mann-Whitney test was also conducted.
wilcox.test(data = tbl1, Grade ~ Day)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Grade by Day
## W = 9, p-value = 0.00135
## alternative hypothesis: true location shift is not equal to 0
The null hypothesis was also rejected by this test, which does not rely on the assumption of normality.
From the results described above, we can affirm that the grades of the students who participated in the admissions process for the 2024 cohort of the Infectious Diseases and Tropical Medicine Master’s Program at UFMG differed significantly depending on the day on which the oral presentation took place.
Further investigation is necessary to ascertain the underlying reasons for this discrepancy. Appropriate hypotheses should consider: the effect that presenting on a different day of the week may have on presenters and evaluators; whether the students’ IDs were assigned randomly or whether some factor led certain students to receive the lowest numbers and others the highest; and, perhaps most importantly, whether different evaluators were involved on the two presentation days.
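The W statistic reported by wilcox.test() has a direct interpretation: it counts the (day 1, day 2) pairs of candidates in which the day 1 grade is the higher one, with ties counted as one half. A minimal base-R sketch of that count follows; `day1`, `day2`, `u_stat` and `n_pairs` are illustrative names.

```r
## Oral-presentation grades copied from the data above and split by day
day1 <- c(81.00, 76.50, 74.50, 79.00, 76.00, 62.50, 76.00, 71.00, 65.00, 73.00)
day2 <- c(90.80, 97.35, 91.09, 90.69, 78.14, 88.45, 85.50, 97.65, 82.10, 71.50,
          92.85)

## compare every day 1 grade with every day 2 grade:
## each pair with a higher day 1 grade counts 1, each tie counts 0.5
u_stat <- sum(outer(day1, day2, ">")) + 0.5 * sum(outer(day1, day2, "=="))

## total number of (day 1, day 2) pairs
n_pairs <- length(day1) * length(day2)

c(W = u_stat, pairs = n_pairs)
```

For these data the count is 9 out of 110 possible pairs, which matches the W = 9 reported above: in almost every pairing, the day 2 candidate received the higher grade.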
Given that the admissions process aims to select the most suitable candidates for the program, an in-depth analysis of the aforementioned factors contributing to the observed differences would be of considerable merit.