Predicting Grades of Online Learners Based on Activity Indicators

Benay Dara-Abrams
August 22, 2015

How Can We Understand and Improve Online Learning Experiences?

  • Many universities now offer massive open online courses (MOOCs)
  • Harvard and MIT collected data from fall 2012 to summer 2013
  • Learners' course activity was tracked for the first 17 online courses
  • Which activities lead to better learning experiences?
  • Shiny app developed to examine HarvardX-MITx dataset

The HarvardX-MITx Person-Course Dataset AY2013

  • Read the dataset into a data frame and count its records (rows)
  • The R code below is evaluated when the slide is rendered
  • Its output is the number of records (rows) in the dataset
df_ed <- read.csv("../data_for_proj/HMXPC13_DI_v2_5-14-14.csv")
nrow(df_ed)
[1] 641138

What Learner Activities Were Tracked?

  • Number of Interactions with Course (nevents)
  • Number of Active Days (ndays_act)
  • Number of Videos Played (nplay_video)
  • Number of Chapters Viewed (nchapters)
  • Number of Forum Posts (nforum_posts)
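
Before modeling, it helps to check how complete these five indicators are. The sketch below is only an illustration: `toy` is an invented stand-in for `df_ed`, its values are made up, and the assumption that the grade column is named `grade` should be checked against the real file.

```r
# Toy stand-in for df_ed; every value here is invented for illustration.
toy <- data.frame(
  grade        = c(0.10, 0.85, NA, 0.40),
  nevents      = c(12, 940, 3, 210),
  ndays_act    = c(2, 45, 1, 14),
  nplay_video  = c(5, NA, 0, 60),
  nchapters    = c(1, 12, NA, 6),
  nforum_posts = c(0, 3, 0, 1)
)

# The five activity indicators listed on this slide.
indicators <- c("nevents", "ndays_act", "nplay_video",
                "nchapters", "nforum_posts")

# Count missing values in each indicator column.
missing_counts <- colSums(is.na(toy[, indicators]))

# Keep only rows where the grade and every indicator are present.
complete_rows <- toy[complete.cases(toy[, c("grade", indicators)]), ]
```

With the real data, replacing `toy` with `df_ed` would show how many learners have a full set of indicators to model on.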

Shiny App

  • Uses the HarvardX-MITx dataset
  • Fits linear regression models
  • Lets the user select different activity indicators as predictors
  • Explores how well different activities predict learners' grades
  • Best predictors: ndays_act, nchapters, nevents
  • Least useful predictors: nforum_posts, nplay_video
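
The comparison of predictors can be sketched with `lm()` and R-squared. This is a minimal sketch, not the app's actual code: `fit_grade_model` is a hypothetical helper, and `sim` is simulated data in which the grade depends on `ndays_act` by construction, so it shows the mechanics only and does not reproduce the findings above.

```r
# Hypothetical helper: fit grade ~ <selected predictors> on a data frame.
# A Shiny server function could call something like this with the
# predictors chosen in the user interface.
fit_grade_model <- function(data, predictors) {
  lm(reformulate(predictors, response = "grade"), data = data)
}

# Simulated data for illustration only (not the HarvardX-MITx data).
set.seed(1)
n <- 200
sim <- data.frame(
  ndays_act    = rpois(n, 10),
  nforum_posts = rpois(n, 1)
)
# By construction, the simulated grade depends on ndays_act only.
sim$grade <- 0.03 * sim$ndays_act + rnorm(n, sd = 0.1)

# Fit one single-predictor model per indicator and collect R-squared.
r_squared <- sapply(c("ndays_act", "nforum_posts"), function(p) {
  summary(fit_grade_model(sim, p))$r.squared
})

# Several predictors can be combined in one model as well.
both <- fit_grade_model(sim, c("ndays_act", "nforum_posts"))
```

A higher R-squared means the chosen indicators explain more of the variation in grades, which is one way to rank predictors.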

Next Steps

  • Examine future HarvardX-MITx datasets
  • Differences among courses?
  • Learner activities for other MOOCs?
  • Use Shiny app with other datasets
  • Extend app to use different activities as predictors
  • Which activities increase learning effectiveness?
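
One way to look for differences among courses is to fit the same model separately for each course. The sketch below assumes a course identifier column named `course_id` and uses simulated data with invented course labels, so the numbers it produces say nothing about the real courses.

```r
# Simulated data for illustration only; course labels are invented.
set.seed(2)
toy <- data.frame(
  course_id = rep(c("CourseA", "CourseB"), each = 50),
  ndays_act = rpois(100, 8)
)
toy$grade <- 0.04 * toy$ndays_act + rnorm(100, sd = 0.1)

# Split the rows by course, fit grade ~ ndays_act within each course,
# and collect the R-squared of each fit.
per_course_r2 <- sapply(split(toy, toy$course_id), function(d) {
  summary(lm(grade ~ ndays_act, data = d))$r.squared
})
```

Large gaps between the per-course values would suggest that an indicator matters more in some courses than in others.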