Ofqual have provided a detailed technical report about the algorithm they used to determine the grades to be given for A level and GCSE students in England in 2020. Here, I’ll try to explain it step by step. Note this is not (currently) a critique; I’m just sharing it to help people understand it.
If you have any questions or things that you’d like more explanation around, please contact me on Twitter and I’ll do my best.
The algorithm itself is described in Section 8 of Ofqual’s technical report (p83). It includes the following steps:
This algorithm is used if a school has more than fifteen children doing an A level or GCSE in a given subject.
If a school has five or fewer children doing an A level or GCSE in a given subject, steps 1-7 get skipped, and the rough grades that get used to allocate marks to students are based on the grades their teachers originally predicted for them.
If a school has between five and fifteen children doing an A level or GCSE in a given subject, then a combination of the teacher predictions and the algorithmic predictions is used.
As teachers tend to over-estimate grades, this means grades will tend to be higher overall for small classes.
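To make the three cases concrete, here is a minimal sketch in Python. The thresholds (five and fifteen) come from the description above, but the linear taper used for the in-between case is purely my assumption for illustration; it is not Ofqual's actual weighting formula, and the function name `algorithm_weight` is made up.

```python
def algorithm_weight(n_students: int, low: int = 5, high: int = 15) -> float:
    """Return the weight given to the algorithmic grades for one subject at one school.

    0.0 means teacher predictions only, 1.0 means the algorithm only.
    """
    if n_students <= low:
        # Five or fewer students: teacher predictions are used.
        return 0.0
    if n_students > high:
        # More than fifteen students: the algorithm is used.
        return 1.0
    # ASSUMPTION: a simple linear taper between the two thresholds.
    # The real combination rule is defined in Ofqual's technical report.
    return (n_students - low) / (high + 1 - low)


for n in (4, 10, 20):
    print(n, round(algorithm_weight(n), 2))
```

A class of 10 ends up somewhere between the two extremes, which is why small and medium-sized classes are more influenced by teacher predictions than large ones.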
In the first step, Ofqual create a historic profile of the grades pupils have previously achieved for each subject offered at each school.
For A levels, this historic profile looks at the past three years of results.
For GCSEs it’s a bit more complicated because they have recently gone through reform: the grading has changed to numbers (9 to 1) and the curriculum has also changed to be more demanding. This reform has been staggered in phases. For maths and English (Phase One) this happened in 2017. For most other popular GCSEs (Phase Two, including science, humanities and modern foreign languages like French and Spanish) it was in 2018. For less common GCSEs (Phase Three, including astronomy, sociology and Polish), the only grades we have come from 2019, the first year the reformed curriculum came into play.
As a consequence, for Phase One and Phase Two GCSEs, two years of historic data is used. For Phase Three GCSEs, one year of historic data is used. Phase Four GCSEs don’t have any historic data, so, again, raw teacher predictions are used for these subjects (biblical Hebrew, Gujarati, Persian, Portuguese and Turkish).
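The rule for how much history goes into the profile can be written as a small lookup table. This is just a sketch of the rule as described above; the dictionary keys and the helper name are labels I have invented, not anything from Ofqual's report.

```python
# Years of historic results used to build the profile, as described above.
# The key names are illustrative labels only.
HISTORIC_YEARS = {
    "A level": 3,
    "GCSE phase 1": 2,
    "GCSE phase 2": 2,
    "GCSE phase 3": 1,
    "GCSE phase 4": 0,  # no historic data exists for these subjects
}


def uses_teacher_predictions_only(qualification: str) -> bool:
    """True when there is no historic data, so teacher predictions are used."""
    return HISTORIC_YEARS[qualification] == 0


print(uses_teacher_predictions_only("GCSE phase 4"))
print(uses_teacher_predictions_only("A level"))
```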
This example, taken from the technical report, shows how this is done for A levels, but it’s similar for GCSEs.
If you prefer something more visual, here’s a graph of the same numbers.
The majority of the algorithm works on the percentage of students achieving at least a given grade. Here’s what that looks like:
In the algorithm the total number of students from the last one, two or three years achieving at least a given grade is converted into a percentage. The percentage achieving at least the lowest grade is always 100%.
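If you would rather see the conversion as code, here is a small Python sketch. The grade counts are invented for illustration (they are not the numbers from the technical report), and `cumulative_percentages` is a name I have made up.

```python
from itertools import accumulate

# A level grades, ordered from highest to lowest.
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]


def cumulative_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-grade counts into the percentage achieving at least each grade."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no historic results to build a profile from")
    # Running total of students at this grade or better.
    running = accumulate(counts.get(grade, 0) for grade in GRADES)
    return {grade: 100 * n / total for grade, n in zip(GRADES, running)}


# Invented counts, pooled across the historic years for one subject at one school.
historic_counts = {"A*": 2, "A": 5, "B": 10, "C": 8, "D": 3, "E": 1, "U": 1}
profile = cumulative_percentages(historic_counts)
for grade in GRADES:
    print(f"{grade:>2}: {profile[grade]:.1f}%")
```

The last line printed is always 100%, because every student achieved at least the lowest grade.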
In this step, Ofqual zooms out to look at the whole of England and all the students who have previously studied and received results for the relevant qualifications. Ofqual then looks at their prior attainment: how well they did in their GCSEs (for A level students) or in their Key Stage 2 tests (for GCSE students).
Key Stage 2 results come from testing at the end of Year 6 (primary school) and give pupils scores in reading, writing and maths. These scores represent a general level of achievement at the end of primary school and are commonly used as a basis to understand how much progress children make during secondary school.
Rather than using raw scores from this prior attainment, Ofqual created ten buckets, each containing 10% of students, based on their GCSE or Key Stage 2 results.
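Here is a rough sketch of what splitting students into ten equal-sized buckets could look like. I am assuming each student has a single prior attainment score, that bucket 1 holds the lowest-scoring 10%, and that ties are broken by input order; the report may handle these details differently.

```python
def decile_buckets(scores: list[float]) -> list[int]:
    """Return a bucket number (1 to 10) for each student, in input order.

    ASSUMPTION: bucket 1 is the lowest-scoring 10% and bucket 10 the highest.
    """
    n = len(scores)
    # Student indices sorted by score, lowest first (ties keep input order).
    order = sorted(range(n), key=lambda i: scores[i])
    buckets = [0] * n
    for rank, student in enumerate(order):
        # Ranks 0..n-1 are mapped evenly onto buckets 1..10.
        buckets[student] = rank * 10 // n + 1
    return buckets


# Invented scores for 20 students: each bucket ends up with two of them.
example_scores = [float(s) for s in range(20)]
print(decile_buckets(example_scores))
```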
How do these measures of prior attainment relate to the final grades students achieve in different subjects? To understand that, Ofqual create a matrix for each subject. Each row contains a different prior attainment (one of the ten buckets as described above) and each column an A level or GCSE grade as applicable. Here’s an example of what that table looks like.
As you might expect, looking across the whole of England, in general students that do well on Key Stage 2 will do well on GCSEs, and those that do well at GCSEs will do well at A levels. However, there are also some students who despite doing poorly at Key Stage 2 do very well on particular subjects at GCSE, and vice versa.
This matrix can be used to predict how many students will achieve particular grades in a subject, assuming they performed at the national average. To make it useful for doing that, all the numbers are again turned into percentages, and instead of looking at the percentage that achieved each grade, you look at the percentage that achieved at least that grade.
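As a sketch of how such a matrix could be used, the Python below turns each row into "at least this grade" percentages and then averages the rows for a group of students. To keep it short I use only three buckets and three grades, all the numbers are invented, and averaging the rows across a cohort is my reading of the description above rather than a copy of Ofqual's code.

```python
from itertools import accumulate


def row_to_cumulative(row: list[int]) -> list[float]:
    """Turn counts per grade (highest grade first) into 'at least' percentages."""
    total = sum(row)
    return [100 * n / total for n in accumulate(row)]


def predict_cohort(matrix: dict[int, list[int]], cohort: list[int]) -> list[float]:
    """Average the national cumulative percentages over a cohort.

    `matrix` maps a prior attainment bucket to national counts per grade;
    `cohort` lists the bucket each student in the group falls into.
    """
    cumulative = {bucket: row_to_cumulative(row) for bucket, row in matrix.items()}
    n_grades = len(next(iter(cumulative.values())))
    return [
        sum(cumulative[bucket][g] for bucket in cohort) / len(cohort)
        for g in range(n_grades)
    ]


# Invented national counts: three buckets (rows) by three grades (columns).
national = {1: [10, 30, 60], 2: [30, 40, 30], 3: [60, 30, 10]}
print(predict_cohort(national, cohort=[1, 3]))  # [35.0, 65.0, 100.0]
```

In this toy example, a group with one student from the lowest bucket and one from the highest would be expected to have 35% achieving the top grade and 65% achieving at least the middle grade, if they performed in line with the national picture.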