Predictive Analytics in Education: An introductory example

Anshul Kumar

Jan 10 2020

Presentation information

Key you can press   What should happen when you press it
A                   Toggle between seeing all slides and one slide at a time
S                   Make everything on the slide smaller
B                   Make everything on the slide bigger

BE SURE TO PAUSE THE VIDEO ANY TIME YOU FEEL LIKE IT.

Learning goals of this presentation

By the end of this presentation, our goals are to:

  1. Identify the types of questions that predictive analytics methods can help us answer in education.

  2. Build intuition about how machine learning algorithms can help us predict group membership (a task known as classification) using systematically organized data.

Noodles and pasta

I really like to eat noodles and pasta. It’s January 10, 2020, and the weather is really cold here in Boston. A perfect day for hot noodles. I cooked some noodles today in a pot of boiling water.

But now I have a problem to solve:

(Right now the noodles and water are together in the pot)

What is our goal?

Our goal:

  1. Keep all of the noodles.

  2. Discard all of the water.

To be clear: Our goal is to take EVERYTHING IN THE POT and separate it into two groups:

KEEP and DISCARD.

How can we achieve our goal?

We will have to use some kind of sorting mechanism to separate the noodles and water, both of which are currently mixed together in the pot.

Sorting mechanism – Colander filter

Image source: https://www.masterfile.com/image/en/600-02346521/hands-straining-pasta

This is the sorting mechanism we’re going to use. We’ll pour the noodles and water from the pot into a colander filter.

This is called colander filtering.

What did we just do?

In plain words:

In predictive analytics / machine learning terms:

Learning analytics with students

In addition to liking noodles, I am also an educator and administrator in an educational program.

Here are some key details:

I have a problem to solve:

Timeline and data

One-year program timeline for students:

Here’s what we do and don’t know for each cohort:

Cohort        Fall term grades   Spring term grades   Final exam results
C1: 2018–19   Yes                Yes                  Yes
C2: 2019–20   Yes                No                   No
  • C1 data is complete. C2 data is incomplete.
  • Our goal: predict final exam results for C2, using only their fall term grades.

Remember:

  • Today’s date is Jan 10 2020
  • C1 students have all finished and graduated
  • C2 students are in the middle of their year. They have finished the fall term and are about to start the spring term.
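To make this setup concrete, here is a minimal sketch of how the two cohorts' data might be laid out in Python with pandas. Every column name and number below is invented for illustration; the real program records are not shown in this presentation.

```python
import pandas as pd

# Hypothetical layout of the two cohorts' data (all values invented).
c1 = pd.DataFrame({
    "student_id":   [1, 2, 3],
    "fall_grade":   [88, 71, 93],   # known
    "spring_grade": [84, 65, 90],   # known
    "passed_final": [1, 0, 1],      # known (1 = passed, 0 = failed)
})

c2 = pd.DataFrame({
    "student_id": [101, 102, 103],
    "fall_grade": [79, 90, 62],     # known
    # no spring_grade or passed_final yet: C2's year isn't over
})
```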

PAUSE TO MAKE SURE THIS MAKES SENSE, BEFORE YOU CONTINUE.

What is our goal?

Our goal is to predict final exam results for C2, using only their fall term grades.

To be clear: Our goal is to take all of the students in C2 and sort them into two groups:

Predicted to pass final exam

and

Predicted to fail final exam (at-risk students)

Why do we have this goal?

How can we achieve our goal?

Just like with the noodles, we will have to use some kind of sorting mechanism to separate the students in C2 predicted to fail from the students predicted to pass, all of whom are halfway through the one-year program at this point.

Just imagine that you’re “pouring” students into a colander filter that will catch the at-risk students and let the other ones pass through.

Noodles:

Students:

How will we make the right filter (sorting mechanism) for students?

What we already know:

What we want to know:

Modified process diagram – Student learning analytics

  • We want to use the old, complete data from C1 to make predictions from the new, incomplete data we have for C2.
  • The calibration process using old data (from C1) is what the “learning” in “machine learning algorithm” refers to in this case: the computer looks for patterns in C1 and then applies them later to C2.
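Here is a tiny sketch of that learn-then-apply idea, assuming scikit-learn; the classifier (logistic regression) and all numbers are stand-ins, since any classification algorithm could play the model's role.

```python
from sklearn.linear_model import LogisticRegression

# Invented stand-ins: fall grades and final results for C1 (complete),
# fall grades only for C2 (incomplete).
X_c1 = [[88], [71], [93], [55], [80]]
y_c1 = [1, 0, 1, 0, 1]          # 1 = passed final exam, 0 = failed
X_c2 = [[79], [90], [62]]

model = LogisticRegression()
model.fit(X_c1, y_c1)           # "learning": find patterns in old C1 data
print(model.predict(X_c2))      # apply those patterns to C2
```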

Explore student data

Split up C1 data

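Since this slide is mostly visual, here is a minimal sketch of the split, assuming scikit-learn and invented grades for the 100 C1 students:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
fall_grades = rng.uniform(50, 100, size=(100, 3))            # 100 students, 3 fall courses (invented)
passed_final = (fall_grades.mean(axis=1) > 65).astype(int)   # synthetic pass/fail labels

# 75 students to train on, 25 held out for testing, as on the slides.
X_train, X_test, y_train, y_test = train_test_split(
    fall_grades, passed_final, test_size=25, random_state=0
)
print(len(X_train), len(X_test))  # 75 25
```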

Sorting mechanism #1 – SVM

Test accuracy with the 25 students we left out before
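A minimal sketch of this mechanism, continuing from the split above (the default SVC settings here are an assumption; the model used in the video may be tuned differently):

```python
from sklearn.svm import SVC

# Sorting mechanism #1: a support vector machine.
svm = SVC()
svm.fit(X_train, y_train)         # learn from the 75 training students
print(svm.score(X_test, y_test))  # accuracy on the 25 held-out students
```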

Review the process

Pause and go back if needed.

Here’s what we are doing:

  1. Goal: predict whether C2 students will pass or fail, using patterns in data from C1 students.

  2. Randomly separate the 100 C1 students into two datasets: 75 students in the training data, 25 students in the testing data.

  3. Use all fall grades (independent variables) and final grades (dependent variable) of training dataset students to train a machine learning model.

  4. Plug the testing dataset students’ fall grades (independent variables) into the machine learning model to see if it predicts whether they passed or failed. Compare these predictions to what actually happened (which we know because we actually have their final results and we’re just pretending that we don’t).

  5. If the accuracy of the predictions (the success rate) is good enough, use the same machine learning model to predict final grades for C2 students (for whom we do not have the actual final grades).
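Continuing the sketch, step 5 would look something like this; the C2 grades are invented stand-ins, since the real data is not shown here.

```python
# Step 5: once test accuracy looks acceptable, apply the trained model
# to C2, whose final results genuinely do not exist yet.
c2_fall_grades = rng.uniform(50, 100, size=(100, 3))  # stand-in for real C2 grades
print(svm.predict(c2_fall_grades))                    # 1 = predicted pass, 0 = predicted fail
```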

Sorting mechanism #1 – Did we achieve our goal?

Look at the spreadsheet in the video.

Sorting mechanism #1 – Confusion matrix

What we wanted:

                    Actually failed   Actually passed
Predicted to fail   3                 0
Predicted to pass   0                 22

What we actually got:

                    Actually failed   Actually passed
Predicted to fail   2                 6
Predicted to pass   1                 16
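If you are following along in code, scikit-learn can build this table for you. One caution: its convention is rows = actual and columns = predicted, the transpose of the tables above.

```python
from sklearn.metrics import confusion_matrix

# Confusion matrix for the SVM's predictions on the 25 held-out students.
y_pred = svm.predict(X_test)
print(confusion_matrix(y_test, y_pred))  # rows = actual, columns = predicted
```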

Above:

Sorting mechanism #1 – Accuracy (success rate)

The accuracy (success rate) of the sorting mechanism is the number of correctly classified students divided by the total number of students.

\[\text{accuracy} = \frac{\text{correctly classified students}}{\text{all students}} = \frac{\text{true negatives + true positives}}{\text{total students}}\]

\[\text{ideal desired accuracy} = \frac{3+22}{25} = 1\]

\[\text{Sorting mechanism #1 actual accuracy} = \frac{2 + 16}{25} = 0.72\]

Sorting mechanism #2 – Random Forest

Review the process again

Here’s what we are doing again:

  1. Goal: predict whether C2 students will pass or fail, using data from C1 students.

  2. Randomly separate the 100 C1 students into two datasets: 75 students in the training data, 25 students in the testing data.

  3. Use all fall grades (independent variables) and final grades (dependent variable) of training dataset students to train a machine learning model.

  4. Plug the testing dataset students’ fall grades (independent variables) into the machine learning model to see if it predicts whether they passed or failed. Compare these predictions to what actually happened (which we know because we actually have their final results and we’re just pretending that we don’t).

  5. If the accuracy of the predictions (the success rate) is good enough, use the same machine learning model to predict final grades for C2 students (for whom we do not have the actual final grades).
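A minimal sketch of mechanism #2, reusing the same illustrative split so its accuracy is directly comparable to the SVM's:

```python
from sklearn.ensemble import RandomForestClassifier

# Sorting mechanism #2: a random forest.
forest = RandomForestClassifier(random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```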

Sorting mechanism #2 – Confusion matrix

What we wanted:

                    Actually failed   Actually passed
Predicted to fail   3                 0
Predicted to pass   0                 22

What we actually got:

                    Actually failed   Actually passed
Predicted to fail   3                 8
Predicted to pass   0                 14

Above:

Sorting mechanism #2 – Accuracy (success rate)

The accuracy (success rate) of the sorting mechanism is the number of correctly classified students divided by the total number of students.

\[\text{accuracy} = \frac{\text{correctly classified students}}{\text{all students}} = \frac{\text{true negatives + true positives}}{\text{total students}}\]

\[\text{ideal desired accuracy} = \frac{3+22}{25} = 1\]

\[\text{Sorting mechanism #2 actual accuracy} = \frac{3 + 14}{25} = 0.68\]

What we will do next

Before we declare which sorting mechanism is best, let’s review what we’re going to use the best sorting mechanism to do:

Select best sorting mechanism

Compare predictions made by each sorting mechanism on the 25 testing students from C1:

SVM:

                    Actually failed   Actually passed
Predicted to fail   2                 6
Predicted to pass   1                 16
  • Accuracy: 0.72
  • False positives (students predicted to pass who actually failed): 1
  • Number of students to remediate: 2 + 6 = 8

Random Forest:

                    Actually failed   Actually passed
Predicted to fail   3                 8
Predicted to pass   0                 14
  • Accuracy: 0.68
  • False positives (students predicted to pass who actually failed): 0
  • Number of students to remediate: 3 + 8 = 11
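All of these summary numbers can be recomputed directly from the two confusion matrices; here is a sketch, with the matrices copied from the slides (rows = predicted, columns = actual):

```python
import numpy as np

# Confusion matrices from the slides: rows = predicted (fail, pass),
# columns = actual (fail, pass).
matrices = {
    "SVM":           np.array([[2, 6], [1, 16]]),
    "Random forest": np.array([[3, 8], [0, 14]]),
}

for name, cm in matrices.items():
    accuracy = np.trace(cm) / cm.sum()   # correct predictions / all 25 students
    false_positives = cm[1, 0]           # predicted to pass but actually failed
    to_remediate = cm[0].sum()           # everyone predicted to fail
    print(name, accuracy, false_positives, to_remediate)
```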

Criteria to consider while picking the best one:

Select best sorting mechanism

Compare predictions made by each sorting mechanism on the 25 testing students from C1 (with hypothetical C2 numbers in parentheses):

SVM:

                    Actually failed   Actually passed
Predicted to fail   2 (8)             6 (24)
Predicted to pass   1 (4)             16 (64)
  • Accuracy: 0.72 (0.72)
  • False positives: 1 (4)
  • Number of students to remediate: 2 + 6 = 8 (32)

Random Forest:

                    Actually failed   Actually passed
Predicted to fail   3 (12)            8 (32)
Predicted to pass   0 (0 or 1)        14 (56)
  • Accuracy: 0.68 (0.68)
  • False positives: 0 (0 or 1)
  • Number of students to remediate: 3 + 8 = 11 (44)

Criteria to consider while picking the best one:

Full analytics process

What did we just do?

In plain words:

In predictive analytics / machine learning terms:

Technical terms

Most important:

Optional:

We are using supervised machine learning in this example.

Classification algorithms examples

All of the algorithms above use different statistical and/or algorithmic approaches to predict classes (outcome categories) into which our observations (rows of data) fall.
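To illustrate (the slide's own list of algorithms is not reproduced in this text), here are four common classification algorithms in scikit-learn. Because they share the same fit/score interface, any of them could play the "sorting mechanism" role in the pipeline we sketched, reusing the illustrative X_train and X_test from before:

```python
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "Support vector machine": SVC(),
    "Random forest":          RandomForestClassifier(random_state=0),
    "k-nearest neighbors":    KNeighborsClassifier(),
    "Decision tree":          DecisionTreeClassifier(random_state=0),
}

# Each algorithm trains on the same 75 students and is scored on the same 25.
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))
```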

Limitations, pitfalls, and tips

Notes

Discussion

Here are some questions to consider: