Background

Exams are a commonly used assessment designed to test a student's learning over a semester. However, where exams are high stakes (i.e. heavily weighted), they may encourage undesirable behaviours in students such as cheating, collusion and cramming. We examine how common high-stakes assessments are and how this varies across faculties, using the University of Melbourne as a case study.

Approach

Using a data set provided by the University of Melbourne, we examine the weighting of final assessments across faculties.

About the data

The data are largely complete. Some data are missing, but this is unlikely to pose much of a problem:

Timing - About 5% (1,188 of 22,572 records) of the timing data are missing when considering all relevant timing fields (‘Timing’, ‘Timing (in/on/to)’, ‘Other Timing Details’). Over half of the records with missing timing fall under ‘attendance’, ‘participation’ or ‘research project units’, where timing is not applicable. For the formats that are likely to matter for this analysis, very little timing information is missing (e.g. < 1% of written exams and < 1% of practical exams).

To capture data recorded in a non-standard manner, I created a new column “Timing_Endofsemester” which flags all assessments held from week 12 onwards (drawing on the ‘Timing’, ‘Timing (in/on/to)’ and ‘Other timing details’ columns). This gained an additional ~600 records. Note that I excluded all assessments with a duration longer than a week, i.e. if an assessment was due in week 12 but began in any other week, it was not included.

Format - Just under 2% (439 of 22,572 records) of the format data are missing. However, 65% of the records with missing format are postgraduate subjects, which largely consist of research units.

The postgraduate data appear to be less consistent in general, so I’ll examine the trends for each level separately as well as together.
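
The missing-data figures above can be recomputed directly in R. The following is a minimal sketch on a toy data frame: the column names (`Timing_in_on_to`, `Other_Timing_Details`) and the values are assumptions standing in for however `read.csv` names the real columns, and a record is treated as missing only when all three timing fields are empty.

```r
# Toy stand-in for UMdata; column names and values are assumed, not taken from the real file
toy <- data.frame(
  Format               = c("Written Exam", "Written Exam", "Attendance", "Practical Exam"),
  Timing               = c("Week 12", NA, "", "End of semester"),
  Timing_in_on_to      = c(NA, "", NA, NA),
  Other_Timing_Details = c("", NA, "", NA),
  stringsAsFactors     = FALSE
)

# Treat both NA and empty (or whitespace-only) strings as missing
is_blank <- function(x) is.na(x) | trimws(x) == ""

# A record has no timing information only if every timing field is blank
timing_cols <- c("Timing", "Timing_in_on_to", "Other_Timing_Details")
toy$timing_missing <- rowSums(!sapply(toy[timing_cols], is_blank)) == 0

# Overall proportion missing, then the proportion missing within each format
mean(toy$timing_missing)
tapply(toy$timing_missing, toy$Format, mean)
```

Breaking the proportion down by format is what shows whether the gaps sit in formats that matter for the analysis or in formats such as attendance where timing does not apply.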

# Import the case study data set
UMdata <- read.csv("23.08_subject_assessment_list_timingfixed.csv")
UMdata_summary <- read.csv("Proportion_highstakes.csv")
# Show a summary of the data in each column 
summary(UMdata)
# Check the column names in the data set
names(UMdata)

# Helper function for string wrapping (12 character target width)
swr = function(string, nwrap=12) {
  paste(strwrap(string, width=nwrap), collapse="\n")
}
swr = Vectorize(swr)
# Create line breaks in Faculty.company
UMdata$Faculty.company = swr(UMdata$Faculty.company)

library(ggplot2) # for creating graphs
library(plyr) # allows for generating summary statistics across groups and creating a new summary dataset
library(knitr) # allows for printing tables in RMarkdown
library(dplyr)

Data subsetting

The assessment formats examined are Oral Exams, Take-Home Exams, Practical Exams and Written Exams. For Take-Home Exams, we only consider those that are given and due after the end of the teaching semester.

To limit the analysis to assessments that take place at the end of semester, I created a new column in Excel which flags the following “Timing” options: Week 12, during the examination period, end of semester, at the end of the assessment period, end of term, end of the assessment period, end of the teaching period, X weeks after the end of teaching, and during the assessment period. As some subjects listed “Week X”, I also included all assessments listed as ending in week 12 in the column ‘Timing (in/on/to)’, but excluded those with a long duration, i.e. assessments which students work on throughout the semester and which are due in week 12. I also manually checked the “Other Timing” column for assessments due in week 12 and included them.
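
The end-of-semester flag was built by hand in Excel. As a rough illustration of how a similar rule might be expressed in R, the sketch below flags free-text timing values with a regular expression. The pattern, the helper name `is_end_of_semester` and the toy values are all assumptions. The real column contains more variants than are covered here, and the long-duration exclusion described above is not reproduced.

```r
# Hypothetical helper: TRUE when a free-text timing value points to week 12 or later
is_end_of_semester <- function(timing) {
  pattern <- paste(
    "week 12\\b",                              # "Week 12" but not "Week 120"
    "examination period",
    "assessment period",
    "end of (the )?(semester|term|teaching)",  # e.g. "end of semester", "end of the teaching period"
    sep = "|"
  )
  # grepl() returns FALSE for NA input, so missing timings are never flagged
  grepl(pattern, timing, ignore.case = TRUE, perl = TRUE)
}

toy_timing <- c("Week 12", "Week 6", "During the examination period",
                "End of semester", "2 weeks after the end of teaching",
                "From Week 1 to Week 11", NA)

# Recode to the "Yes"/"No" values used by Timing_Endofsemester
toy_flag <- ifelse(is_end_of_semester(toy_timing), "Yes", "No")
data.frame(Timing = toy_timing, Timing_Endofsemester = toy_flag)
```

A scripted rule like this would still need the manual check of the free-text columns, since a pattern only catches the phrasings it was written for.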

UMdata_all_exams_raw <- subset(UMdata, Format == "Oral Exam (Viva Voce)" | Format == "Take-home Exam" | Format == "Practical Exam" | Format == "Written Exam")
summary(UMdata_all_exams_raw$Format)

# Now subset the data so that they only include assessments that fall within the last week of the semester or within the exam period
summary(UMdata_all_exams_raw$Timing)
UMdata_all_exams <- subset(UMdata_all_exams_raw, Timing_Endofsemester == "Yes")
UMdata_all_exams

Descriptive statistics

# Calculate the summary stats - entered into Excel sheet, keep code for re-calculating if necessary 
mean(UMdata_all_exams$Percentage, na.rm=TRUE)

## [1] 53.8583

UMdata_all_exams %>% count(Faculty.company)

##                               Faculty.company   n
## 1      Architecture,\nBuilding\nand\nPlanning  60
## 2                                        Arts 350
## 3                    Business\nand\nEconomics 469
## 4                         Engineering\nand IT 252
## 5                        Fine Arts\nand Music  99
## 6                                         Law 335
## 7  Medicine,\nDentistry\nand Health\nSciences 272
## 8   Melbourne\nGraduate\nSchool of\nEducation   2
## 9                                     Science 281
## 10    Veterinary\nand\nAgricultural\nSciences 162

aggregate(UMdata_all_exams$Percentage, list(UMdata_all_exams$Faculty.company), FUN=mean, na.rm = TRUE)

##                                       Group.1        x
## 1      Architecture,\nBuilding\nand\nPlanning 45.25000
## 2                                        Arts 35.75714
## 3                    Business\nand\nEconomics 55.66524
## 4                         Engineering\nand IT 51.98016
## 5                        Fine Arts\nand Music 40.90909
## 6                                         Law 87.76119
## 7  Medicine,\nDentistry\nand Health\nSciences 40.43019
## 8   Melbourne\nGraduate\nSchool of\nEducation 65.00000
## 9                                     Science 56.17687
## 10    Veterinary\nand\nAgricultural\nSciences 49.46273

# Examine how many exams are high stakes
HS_allformats <- subset(UMdata_all_exams, Percentage > 49)
HS_allformats %>% count(Faculty.company)

##                               Faculty.company   n
## 1      Architecture,\nBuilding\nand\nPlanning  26
## 2                                        Arts 115
## 3                    Business\nand\nEconomics 388
## 4                         Engineering\nand IT 177
## 5                        Fine Arts\nand Music  34
## 6                                         Law 323
## 7  Medicine,\nDentistry\nand Health\nSciences 110
## 8   Melbourne\nGraduate\nSchool of\nEducation   2
## 9                                     Science 197
## 10    Veterinary\nand\nAgricultural\nSciences 113

aggregate(HS_allformats$Percentage, list(HS_allformats$Faculty.company), FUN=mean, na.rm = TRUE)

##                                       Group.1        x
## 1      Architecture,\nBuilding\nand\nPlanning 55.57692
## 2                                        Arts 51.56522
## 3                    Business\nand\nEconomics 59.26546
## 4                         Engineering\nand IT 59.40678
## 5                        Fine Arts\nand Music 72.35294
## 6                                         Law 89.81424
## 7  Medicine,\nDentistry\nand Health\nSciences 58.27273
## 8   Melbourne\nGraduate\nSchool of\nEducation 65.00000
## 9                                     Science 65.00000
## 10    Veterinary\nand\nAgricultural\nSciences 54.84071

# Summary statistics for all exams formats
All_exams_summary <- UMdata_summary[,c(1,6:9)]
kable(All_exams_summary)

Faculty	MeanWeightingExams	NumberOfExams	NumberOfHSExams	PropHSExams
Architecture, Building and Planning	45.25	60	26	0.43
Arts	35.76	350	115	0.33
Business and Economics	55.67	469	388	0.83
Engineering and IT	51.98	252	177	0.70
Fine Arts and Music	40.91	99	34	0.34
Law	87.76	335	323	0.96
Medicine, Dentistry and Health Sciences	40.43	272	110	0.40
Melbourne Graduate School of Education	65.00	2	2	1.00
Science	56.18	281	197	0.70
Veterinary and Agricultural Sciences	49.46	162	113	0.70
Overall	53.86	2282	1485	0.65
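
The summary table above was assembled in Excel from the individual outputs. One possible way to derive the same columns in a single step in R is sketched below. The weightings are made up for illustration; only the column names `Faculty.company` and `Percentage`, the output column names and the `> 49` threshold come from the analysis above.

```r
# Made-up weightings for two faculties (illustrative values only)
toy_exams <- data.frame(
  Faculty.company = c("Arts", "Arts", "Arts", "Law", "Law"),
  Percentage      = c(30, 50, NA, 100, 80)
)

# Split the weightings by faculty, then compute each summary column
by_faculty <- split(toy_exams$Percentage, toy_exams$Faculty.company)
toy_summary <- data.frame(
  Faculty            = names(by_faculty),
  MeanWeightingExams = sapply(by_faculty, mean, na.rm = TRUE),
  NumberOfExams      = sapply(by_faculty, length),
  # High stakes = weighting above 49%; records with no weighting are not counted as high stakes
  NumberOfHSExams    = sapply(by_faculty, function(p) sum(p > 49, na.rm = TRUE)),
  row.names          = NULL
)
toy_summary$PropHSExams <- round(toy_summary$NumberOfHSExams / toy_summary$NumberOfExams, 2)
toy_summary
```

Note that records with a missing weighting still count towards `NumberOfExams` here, which mirrors how `count(Faculty.company)` counts rows regardless of `Percentage`.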

Patterns in the data

A plot of the overall frequency distribution for all end of semester assessments:

# frequency distribution of end of semester assessment weights (all formats) across all faculties at The University of Melbourne 
OverallWeighting_allformats <- ggplot(UMdata_all_exams, aes(x = Percentage)) + 
    geom_histogram(color = "black", fill = "grey", bins = 15) +
    scale_x_continuous(breaks = seq(0, 100, 10), limits = c(0, 100)) +
    theme_classic() +
    ggtitle(label = "Frequency distribution of final assessment weightings (all formats)") +
    xlab(label = "Final assessment weightings") +
    ylab(label = "Frequency")
OverallWeighting_allformats

Density distribution of the weighting of final assessment by format:

# frequency distribution of end of semester assessments weights grouped by the format of the final assessment at The University of Melbourne 
# code adapted from https://rpubs.com/rmulder/765674

# Check the listed formats for consistency 
summary(UMdata_all_exams$Format)
# calculate means for each format 
# Note, not every subject has included data for percentage, so I've added "na.rm = TRUE" to tell R to ignore those subjects
means <- ddply(UMdata_all_exams, "Format", summarise, AvePercentage = round(mean(Percentage, na.rm = TRUE), digits = 1), N = length(Format))
means
# density plot grouped by format
ggplot(UMdata_all_exams, aes(x = Percentage, fill = Format)) + 
  geom_density(alpha = 0.3, adjust = 2) + 
  xlim(0, 100) + 
  theme(text = element_text(size=30)) +
  geom_vline(data = means, aes(xintercept = AvePercentage, colour = Format), linetype = "longdash", size = 1) + 
  labs(title = "Density distribution of final assessment weightings by format \n(dashed lines are means)", 
       y = "Density", x = "Final assessment weightings", caption = "")

Oral Exams (which are largely found in language subjects and some medical subjects) are weighted relatively low compared to other final assessment types.

Returning to the data overall, we plot density plots split by faculty:

# frequency distribution of end of semester assessments (all formats) weights grouped by faculties at The University of Melbourne 
# code adapted from https://rpubs.com/rmulder/765674

# Check the listed faculties for consistency 
summary(UMdata_all_exams$Faculty.company)
# calculate means for each faculty 
# Note, not every subject has included data for percentage, so I've added "na.rm = TRUE" to tell R to ignore those subjects
means <- ddply(UMdata_all_exams, "Faculty.company", summarise, AvePercentage = round(mean(Percentage, na.rm = TRUE), digits = 1), N = length(Faculty.company))
means
# density plot grouped by faculty
ggplot(UMdata_all_exams, aes(x = Percentage, fill = Faculty.company)) + 
  geom_density(alpha = 0.3, adjust = 2) + 
  xlim(0, 100) + 
  theme(text = element_text(size=30)) +
  geom_vline(data = means, aes(xintercept = AvePercentage, colour = Faculty.company), linetype = "longdash", size = 1) + 
  labs(title = "Density distribution of final assessment weightings (all formats) by faculty \n(dashed lines are means)", 
       y = "Density", x = "Final assessment weightings", caption = "")

Weightings of final assessments for Engineering and IT have now increased to be approximately similar to other faculties. Law still has relatively high weightings compared to other faculties.

Again, it’s difficult to match the colours to the faculties given how many faculties there are. Therefore I will present the density distributions as a facet graph for easier comparison.

# density plot for weighting of final assessments (all formats) across all subjects
OverallWeighting_allformats_density <- ggplot(UMdata_all_exams, aes(x = Percentage)) + 
  geom_density(adjust = 2) + 
  xlim(0, 100) + 
  labs(title = "Density distribution of final assessments weightings \n(all formats, all faculties)", 
       y = "Density", x = "Final assessment weightings (all formats)", caption = "")

# Facet graphs vertically 
OverallWeighting_allformats_density + facet_grid(Faculty.company ~ .)

summary(UMdata_all_exams$Faculty.company)

The facet labels still need fixing. In order from top to bottom they should read: Architecture, Building and Planning (n = 58); Arts (n = 350); Business and Economics (n = 467); Engineering and IT (n = 251); Fine Arts and Music (n = 98); Law (n = 64); Medicine, Dentistry and Health Sciences (n = 266); Melbourne Graduate School of Education (n = 2); Science (n = 271); Veterinary and Agricultural Sciences (n = 161).
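
One possible way to fix the facet labels is to build a lookup vector of display labels from the data and pass it to the `labeller` argument of `facet_grid()` via `as_labeller()`. The sketch below uses invented data. `facet_labels` is a hypothetical name, and the counts it produces come from the toy data rather than from the figures listed above.

```r
# Invented data: faculty names carry the "\n" line breaks added by swr()
toy <- data.frame(
  Faculty.company = c("Fine Arts\nand Music", "Fine Arts\nand Music", "Law", "Law", "Law"),
  Percentage      = c(30, 40, 100, 90, 80)
)

# Count records per faculty and build "Name (n = x)" labels, keyed by the original values
n_by_faculty <- table(toy$Faculty.company)
facet_labels <- setNames(
  paste0(gsub("\n", " ", names(n_by_faculty), fixed = TRUE),
         " (n = ", as.vector(n_by_faculty), ")"),
  names(n_by_faculty)
)
facet_labels

# Plot only if ggplot2 is installed; as_labeller() turns the named vector into a lookup table
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = Percentage)) +
    geom_density(adjust = 2) +
    xlim(0, 100) +
    facet_grid(Faculty.company ~ ., labeller = as_labeller(facet_labels)) +
    theme(strip.text.y = element_text(angle = 0))  # horizontal strip text is easier to read
  print(p)
}
```

Because the labels are derived from the plotted data frame, the sample sizes in the strips stay in step with the data if the subsetting changes.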

Overall weighting split by level (undergraduate/postgraduate):

The postgraduate subject data are less consistent, and there may be different expectations at different levels. There is a fairly even split in numbers (n undergraduate subjects = 1,044; n postgraduate subjects = 944). Further, there are likely to be differences in average class sizes too, which may also affect the overall patterns.

# frequency distribution of end of semester assessments weights grouped by the level (undergraduate/postgraduate) at The University of Melbourne 
# code adapted from https://rpubs.com/rmulder/765674

# Check the listed subject levels for consistency 
names(UMdata_all_exams)
summary(UMdata_all_exams$SubjectLevel)

# calculate means for each subject level 
# Note, not every subject has included data for percentage, so I've added "na.rm = TRUE" to tell R to ignore those subjects
means_level <- ddply(UMdata_all_exams, "SubjectLevel", summarise, AvePercentage = round(mean(Percentage, na.rm = TRUE), digits = 1), N = length(SubjectLevel))
means_level

# density plot grouped by subject level
ggplot(UMdata_all_exams, aes(x = Percentage, fill = SubjectLevel)) + 
  geom_density(alpha = 0.3, adjust = 2) + 
  xlim(0, 100) + 
  theme(text = element_text(size=30)) +
  geom_vline(data = means_level, aes(xintercept = AvePercentage, colour = SubjectLevel), linetype = "longdash", size = 1) + 
  labs(title = "Density distribution of final assessment weightings by subject level \n(dashed lines are means)", 
       y = "Density", x = "Final assessment weightings", caption = "")

Weighting is very consistent regardless of level. Undergraduate final assessments are weighted very slightly lower.

Examine whether this is consistent across faculty:

# Change the order of the subject level factors so the graph presents in the order we want i.e. undergrad then postgrad, instead of the opposite
UMdata_all_exams$SubjectLevel = factor(UMdata_all_exams$SubjectLevel, levels=c('Undergraduate','Postgraduate'))

# This is the same graph as in chunk "Density plot presented as a facet graph grouped by faculty for final assessments (all formats)". I have repeated the code here for easy access: as we changed the order of the factors above, the new order won't take effect until the master graph is re-run
OverallWeighting_allformats_density <- ggplot(UMdata_all_exams, aes(x = Percentage)) + 
  geom_density(adjust = 2) + 
  xlim(0, 100) + 
  labs(title = "Density distribution of final assessments weightings \n(all formats, all faculties)", 
       y = "Density", x = "Final assessment weightings (all formats)", caption = "")

# Facet graphs with two groups 
OverallWeighting_allformats_density + facet_grid(Faculty.company ~ SubjectLevel)

Written exams

Of all examined final assessments, the majority are written exams (1,610 of 1,988 records). Other formats are relatively rare. As written exams are much more consistent, we will also look at the patterns focusing just on these.

Descriptive statistics (written exams)

# Calculate the summary stats - entered into Excel sheet, keep code for re-calculating if necessary 
# mean(UMdata_written_exams$Percentage, na.rm=TRUE)
# UMdata_written_exams %>% count(Faculty.company)
# aggregate(UMdata_written_exams$Percentage, list(UMdata_written_exams$Faculty.company), FUN=mean, na.rm = TRUE) 

# Examine how many exams are high stakes
# HS_written <- subset(UMdata_written_exams, Percentage > 49)
# HS_written %>% count(Faculty.company)
# aggregate(HS_written$Percentage, list(HS_written$Faculty.company), FUN=mean, na.rm = TRUE) 

# Summary statistics for written exams
Written_exams_summary <- UMdata_summary[,c(1:5)]
kable(Written_exams_summary)

Faculty	MeanWeightingWrittenExams	NumberOfWrittenExams	NumberOfHSWrittenExams	PropHSWrittenExams
Architecture, Building and Planning	45.53	57	26	0.46
Arts	40.31	208	80	0.38
Business and Economics	56.24	444	373	0.84
Engineering and IT	53.85	236	176	0.75
Fine Arts and Music	25.83	18	1	0.06
Law	72.33	43	38	0.88
Medicine, Dentistry and Health Sciences	43.88	228	109	0.48
Melbourne Graduate School of Education	50.00	1	1	1.00
Science	58.60	249	190	0.76
Veterinary and Agricultural Sciences	46.71	160	113	0.71
Overall	51.61	1644	1107	0.67

# Subset data to only include records for written exams 
UMdata_written_exams_raw <- subset(UMdata, Format == "Written Exam")

# Subset to include the categories that fall at the end of the semester (week 12) or during the exam period
UMdata_written_exams <- subset(UMdata_written_exams_raw, Timing_Endofsemester == "Yes")
# Check the data to make sure it subsetted correctly 
summary(UMdata_written_exams$Timing)

# Final data set to use is "UMdata_written_exams"

# frequency distribution of end of semester written exam weights across all faculties at The University of Melbourne 
OverallWeighting_written <- ggplot(UMdata_written_exams, aes(x = Percentage)) + 
    geom_histogram(color = "black", fill = "grey", bins = 15) +
    scale_x_continuous(breaks = seq(0, 100, 10), limits = c(0, 100)) +
    theme_classic() +
    ggtitle(label = "Frequency distribution of written exam weightings") +
    xlab(label = "Final assessment weightings") +
    ylab(label = "Frequency")
OverallWeighting_written

Across The University of Melbourne, end of semester written exam weightings appear to be approximately normally distributed.
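
The visual impression of normality could be checked more formally. The following is a sketch on simulated weightings, not on the real data: the simulated values, the seed and the object name `sim_weights` are assumptions made purely for illustration. Weightings that cluster on round values such as 50 or 60 may well fail a formal test even when the histogram looks bell-shaped.

```r
set.seed(1)  # arbitrary seed so the simulation is reproducible

# Simulated weightings: roughly bell-shaped, rounded to the nearest 5 and kept within 0-100
sim_weights <- round(rnorm(500, mean = 52, sd = 15) / 5) * 5
sim_weights <- pmin(pmax(sim_weights, 0), 100)

# Symmetry check: mean and median should be close for a symmetric distribution
c(mean = mean(sim_weights), median = median(sim_weights))

# Shapiro-Wilk test (accepts 3 to 5000 observations); a small p-value suggests non-normality
sw <- shapiro.test(sim_weights)
sw$p.value

# Normal Q-Q plot: points far from the reference line indicate departures from normality
qqnorm(sim_weights)
qqline(sim_weights)
```

On the real data, `sim_weights` would be replaced by the non-missing values of `UMdata_written_exams$Percentage`.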

Weighting of written exams by faculty:

# frequency distribution of end of semester written exam weights grouped by faculties at The University of Melbourne 
# code adapted from https://rpubs.com/rmulder/765674

# Check the listed faculties for consistency 
summary(UMdata_written_exams$Faculty.company)
# calculate means for each faculty 
# Note, not every subject has included data for percentage, so I've added "na.rm = TRUE" to tell R to ignore those subjects
means <- ddply(UMdata_written_exams, "Faculty.company", summarise, AvePercentage = round(mean(Percentage, na.rm = TRUE), digits = 1), N = length(Faculty.company))
means
# density plot grouped by faculty
ggplot(UMdata_written_exams, aes(x = Percentage, fill = Faculty.company)) + 
  geom_density(alpha = 0.3, adjust = 2) + 
  xlim(0, 100) + 
  theme(text = element_text(size=30)) +
  geom_vline(data = means, aes(xintercept = AvePercentage, colour = Faculty.company), linetype = "longdash", size = 1) + 
  labs(title = "Density distribution of written exam weightings by faculty \n(dashed lines are means)", 
       y = "Density", x = "Written exam weightings", caption = "")

This shows us that Engineering and IT has relatively low weightings on written exams (perhaps they focus more on practical assessments) and that Law has relatively heavy (yet highly variable) weightings on written exams. All other faculties show variation, yet their written exams are on average worth 50% of the final grade.

# density plot for weighting of written exams across all subjects
OverallWeighting_density_written <- ggplot(UMdata_written_exams, aes(x = Percentage)) + 
  geom_density(adjust = 2) + 
  xlim(0, 100) + 
  labs(title = "Density distribution of written exam weightings \n(across all faculties)", 
       y = "Density", x = "Written exam weightings", caption = "")

# Facet graphs vertically 
OverallWeighting_density_written + facet_grid(Faculty.company ~ .)

Note: There is only one record for the Melbourne Graduate School of Education hence no density plot. This faculty appears to have many ‘written assignments’ yet no exams.

The facet labels still need fixing. In order from top to bottom they should read: Architecture, Building and Planning (n = 57); Arts (n = 208); Business and Economics (n = 444); Engineering and IT (n = 236); Fine Arts and Music (n = 18); Law (n = 43); Medicine, Dentistry and Health Sciences (n = 228); Melbourne Graduate School of Education (n = 1); Science (n = 249); Veterinary and Agricultural Sciences (n = 160).

Overall weighting split by level (undergraduate/postgraduate):

# frequency distribution of end of semester written exam weights grouped by the level (undergraduate/postgraduate) at The University of Melbourne 
# code adapted from https://rpubs.com/rmulder/765674


# calculate means for each subject level 
# Note, not every subject has included data for percentage, so I've added "na.rm = TRUE" to tell R to ignore those subjects
means_level_written <- ddply(UMdata_written_exams, "SubjectLevel", summarise, AvePercentage = round(mean(Percentage, na.rm = TRUE), digits = 1), N = length(SubjectLevel))
means_level_written

# density plot grouped by subject level
ggplot(UMdata_written_exams, aes(x = Percentage, fill = SubjectLevel)) + 
  geom_density(alpha = 0.3, adjust = 2) + 
  xlim(0, 100) + 
  theme(text = element_text(size=30)) +
  geom_vline(data = means_level_written, aes(xintercept = AvePercentage, colour = SubjectLevel), linetype = "longdash", size = 1) + 
  labs(title = "Density distribution of written exam weightings by subject level \n(dashed lines are means)", 
       y = "Density", x = "Written exam weightings", caption = "")

# Change the order of the subject level factors so the graph presents in the order we want i.e. undergrad then postgrad, instead of the opposite
UMdata_written_exams$SubjectLevel = factor(UMdata_written_exams$SubjectLevel, levels=c('Undergraduate','Postgraduate'))

# This is the same graph as the faceted written exam density plot above. I have repeated the code here for easy access: as we changed the order of the factors above, the new order won't take effect until the master graph is re-run
OverallWeighting_density_written <- ggplot(UMdata_written_exams, aes(x = Percentage)) + 
  geom_density(adjust = 2) + 
  xlim(0, 100) + 
  labs(title = "Density distribution of written exam weightings \n(across all faculties)", 
       y = "Density", x = "Written exam weightings", caption = "")

# Facet graphs with two groups 
OverallWeighting_density_written + facet_grid(Faculty.company ~ SubjectLevel)
