Introduction

In recent decades, interest and funding have poured into the educational technology industry. Often referred to as the Ed Tech industry, this sector has disrupted the higher education and continuing education landscape.

The goal of this project is to assess learner engagement, defined as the quantity and quality of participation in an educational program, for one of my employer’s (termed “the Company”) asynchronous courses. Engagement is measured by how learners use the course's features: page views, media/videos, open-response components, and assessments.

The Company serves adult learners pursuing college and professional development credits and certifications. Key stakeholders for the project include the Company’s leadership team, editorial department, creative department, and sales department, as well as company partners.

The key outcomes for the project are:

  1. Understand how features are utilized in the course;

  2. Determine a plan of action for features to prioritize or highlight in future course builds, renovations, and marketing.

Data

The data collected for this project was procured from the Company’s proprietary authoring and learning management system (LMS). The data was aggregated at various levels including the domain-, course-, feature-, and individual learner-level. The confidentiality of the learners was paramount and any identifiable learner data, such as log-in name or email, was anonymized for this project.
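The report does not say how the identifiers were anonymized. As one possible approach, the sketch below replaces each log-in name with a keyed hash so that the same learner maps to the same token across data sets. The function name `pseudonymize`, the `SECRET_KEY` value, the `learner_` prefix, and the sample rows are all hypothetical and are not taken from the Company's system.

```python
import hashlib
import hmac

# Hypothetical secret key (assumption); in practice it would be kept
# outside the data set so tokens cannot be recomputed from known log-ins.
SECRET_KEY = b"replace-with-a-private-key"


def pseudonymize(login_name: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token that stands in for a log-in name."""
    # Normalize case and whitespace so one learner yields one token.
    normalized = login_name.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).hexdigest()
    return "learner_" + digest[:12]


# Hypothetical learner-level rows (not real data).
rows = [
    {"login": "jdoe@example.com", "page": "1-1", "visits": 2},
    {"login": "JDoe@example.com ", "page": "1-2", "visits": 1},
]

# Replace the identifying field while keeping the usage fields intact.
anonymized = [{**row, "login": pseudonymize(row["login"])} for row in rows]
```

A keyed hash is used here instead of a plain hash so that someone holding a list of known log-in names cannot rebuild the mapping without the key.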

The data was generated by end users' clicks, views, and content entered directly into the authoring platform, and it varies by course feature.

Examples of the types of data collected are listed below.

Feature: Catalog
  Level 1: course title, category, creation date, CID, total enrolled, course owner, status
  Level 2: N/A

Feature: Content Usage
  Level 1 (page-level): total visits, unique visits, avg. duration, total duration
  Level 2 (learner-level details): log-in name, page visits, avg. duration, total duration

Feature: Media Usage
  Level 1 (media/video-level): page title, unique views, avg. % played, avg. wait
  Level 2 (learner-level details): log-in name, last position, % played, completed (Y/N), wait, date

Feature: Component Usage
  Level 1 (open-response components): page title, question prompt, total submissions
  Level 2 (details): log-in, full name, date

Feature: Assessment Usage
  Level 1: N/A
  Level 2 (details): assignment, log-in name, full name, date, time, time spent, score

Note: this project uses data collected prior to May 2023.

Exploratory Analysis

The analysis began with a summary of one of the Company’s course domains.

Domain Analysis

A domain houses a catalog of course offerings. In this particular domain, there are 388 active courses open for enrollment. The average enrollment in a course is approximately 891 learners, but the median is only about 260, so a small number of large courses pull the average up. Three quarters of the courses (the 75th percentile) have approximately 1000 or fewer learners enrolled. The highest enrollment in a single course was 7735 learners.

##     Title             Created            Category               ID      
##  Length:388         Length:388         Length:388         Min.   :  37  
##  Class :character   Class :character   Class :character   1st Qu.:1159  
##  Mode  :character   Mode  :character   Mode  :character   Median :1666  
##                                                           Mean   :1501  
##                                                           3rd Qu.:1982  
##                                                           Max.   :2213  
##     Enrolled         Owner              Notes              Status         
##  Min.   :   2.0   Length:388         Length:388         Length:388        
##  1st Qu.:  77.0   Class :character   Class :character   Class :character  
##  Median : 259.5   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 890.7                                                           
##  3rd Qu.:1002.5                                                           
##  Max.   :7735.0

Enrollment by Category

Within the domain, courses are identified by title, ID, and category. This category provides a grouping of courses by subject matter. Below is a table of the 34 course categories along with details on the number of courses per category and statistics on enrollment.

The next plot visualizes the distribution of course enrollments by category.

As shown in the plot above, most categories average under 1000 enrollments per course. However, the categories of Nonprofit Management, Communication, HR Management, Leadership, Creativity and Innovation, and Data Analytics stand out with the most total enrollments.

Course Analysis

The course selected for analysis is the Introduction to Data Analysis. This is a 3-hour introductory course featuring assessments, videos, open-text responses, and games (not captured in this project).

Page Usage

There are 24 pages in the Introduction to Data Analysis course. Page usage is determined by the total and unique page visits.

##      Page            Assignment            Visits     Unique Visits 
##  Length:24          Length:24          Min.   :1668   Min.   : 985  
##  Class :character   Class :character   1st Qu.:1908   1st Qu.:1180  
##  Mode  :character   Mode  :character   Median :2174   Median :1228  
##                                        Mean   :2303   Mean   :1257  
##                                        3rd Qu.:2539   3rd Qu.:1334  
##                                        Max.   :3583   Max.   :1555  
##  Avg. Duration        Total Duration      
##  Min.   :1899-12-31   Min.   :1899-12-31  
##  1st Qu.:1899-12-31   1st Qu.:1899-12-31  
##  Median :1899-12-31   Median :1899-12-31  
##  Mean   :1899-12-31   Mean   :1899-12-31  
##  3rd Qu.:1899-12-31   3rd Qu.:1899-12-31  
##  Max.   :1899-12-31   Max.   :1899-12-31

Pages average 2303 total visits, almost twice the number of enrollments, and 1257 unique visits. That works out to roughly 1.8 total visits per unique visitor on a typical page.

Only 63% of the unique users who visited page 1-1 (1555) reach the last page, 1-23 (985). This is also a significant drop from the unique visits to the last content page, 1-21 (1152). It suggests that after learners complete the final assessment on 1-22, which is missing from this data set, they do not return to the course feedback page. If increasing responses to, or visibility of, the course feedback page is important to the Company, page redirection after the final assessment may be an area to revisit.
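The drop-off figures in the paragraph above can be reproduced with a few lines of arithmetic. The sketch below uses only the three unique-visit counts quoted in the text; the helper name `retention` is an illustrative assumption.

```python
# Unique visits quoted in the text for three pages of the course.
unique_visits = {"1-1": 1555, "1-21": 1152, "1-23": 985}


def retention(later: int, earlier: int) -> float:
    """Share of the earlier page's unique visitors seen on the later page."""
    return later / earlier


# First page to the course feedback page.
first_to_last = retention(unique_visits["1-23"], unique_visits["1-1"])

# Last content page to the course feedback page.
content_to_feedback = retention(unique_visits["1-23"], unique_visits["1-21"])
lost_after_content = unique_visits["1-21"] - unique_visits["1-23"]

print(f"1-1 to 1-23: {first_to_last:.0%}")
print(f"1-21 to 1-23: {content_to_feedback:.0%} ({lost_after_content} fewer unique visitors)")
```

Comparing the feedback page against the last content page, rather than only against page 1-1, separates ordinary attrition over the course from the drop that happens right after the final assessment.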

Another area of interest in the page-level view is the nested pages, 1-15.1 to 1-15.3. Visits to the nested page content dip slightly, by about 50 learners.

Page Usage at the Learner-level

As mentioned, learner identifiable information such as log-in names and email are anonymized.

##      Page            Assignment            Visits       Unique Visits  
##  Length:30214       Length:30214       Min.   : 1.000   Min.   : NA    
##  Class :character   Class :character   1st Qu.: 1.000   1st Qu.: NA    
##  Mode  :character   Mode  :character   Median : 1.000   Median : NA    
##                                        Mean   : 1.845   Mean   :NaN    
##                                        3rd Qu.: 2.000   3rd Qu.: NA    
##                                        Max.   :94.000   Max.   : NA    
##                                                         NA's   :30214  
##  Avg. Duration        Total Duration        Login Name       
##  Min.   :1899-12-31   Min.   :1899-12-31   Length:30214      
##  1st Qu.:1899-12-31   1st Qu.:1899-12-31   Class :character  
##  Median :1899-12-31   Median :1899-12-31   Mode  :character  
##  Mean   :1899-12-31   Mean   :1899-12-31                     
##  3rd Qu.:1899-12-31   3rd Qu.:1899-12-31                     
##  Max.   :1899-12-31   Max.   :1899-12-31                     
## 

On average, a learner views a given page about 1.8 times, although the median is a single visit. Within this course, the page with the most total visits is the introduction page (3583).
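To illustrate how learner-level rows roll up into the page-level totals discussed above, the sketch below groups rows by page and derives total visits, unique visits, and visits per learner. The rows, the learner tokens, and the function name `summarize_pages` are hypothetical; the report's actual aggregation code is not shown in the source.

```python
from collections import defaultdict

# Hypothetical learner-level rows: (page, anonymized learner id, visits).
learner_rows = [
    ("1-1", "learner_a", 3),
    ("1-1", "learner_b", 1),
    ("1-2", "learner_a", 1),
    ("1-2", "learner_b", 2),
    ("1-2", "learner_c", 1),
]


def summarize_pages(rows):
    """Aggregate learner-level rows into page-level usage figures."""
    totals = defaultdict(int)
    learners = defaultdict(set)
    for page, learner, visits in rows:
        totals[page] += visits        # every visit counts toward the total
        learners[page].add(learner)   # each learner counts once per page
    return {
        page: {
            "total_visits": totals[page],
            "unique_visits": len(learners[page]),
            "visits_per_learner": totals[page] / len(learners[page]),
        }
        for page in totals
    }


page_summary = summarize_pages(learner_rows)
for page, stats in page_summary.items():
    print(page, stats)
```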

Media Usage

For this study, media includes actor and subject matter expert video content.

## [1] 28  6
##     Media               Page            Unique Plays    Avg. % Played   
##  Length:28          Length:28          Min.   :   1.0   Min.   :0.0000  
##  Class :character   Class :character   1st Qu.:  21.0   1st Qu.:0.3575  
##  Mode  :character   Mode  :character   Median : 542.0   Median :0.4759  
##                                        Mean   : 421.3   Mean   :0.4337  
##                                        3rd Qu.: 694.0   3rd Qu.:0.5962  
##                                        Max.   :1263.0   Max.   :0.6852  
##    Avg. Wait                    Media Type       
##  Min.   :1899-12-31 00:00:00   Length:28         
##  1st Qu.:1899-12-31 00:00:01   Class :character  
##  Median :1899-12-31 00:00:16   Mode  :character  
##  Mean   :1899-12-31 00:00:43                     
##  3rd Qu.:1899-12-31 00:01:30                     
##  Max.   :1899-12-31 00:02:43

Twenty-eight unique media files were identified in the data set. Unique plays average 421 per media file, less than half of the unique visitors on a typical page (1257). The two most-watched videos were 1-9 Video: Concepts of Measurement (1263 unique plays) and 1-3 Video: Introduction to Data Analysis (817). These videos were the only content on their pages, which suggests that this page format may help highlight video content. Among the videos watched, learners played approximately 43% of each video on average.

Component Usage

Components in the course are identified as open-text responses.

##   Assignment          Question          Submissions   
##  Length:15          Length:15          Min.   :  0.0  
##  Class :character   Class :character   1st Qu.:219.0  
##  Mode  :character   Mode  :character   Median :602.0  
##                                        Mean   :497.9  
##                                        3rd Qu.:636.0  
##                                        Max.   :942.0

Fifteen individual components were identified in the data set including reflection questions and case study questions. The average submissions per component was 498.

##                                                                                 Assignment
## Component                                                                        1-14 Data Management...
##   1. Why might it be a good idea to use analytics to determine whether it is ...                       0
##   2. What kind of data would most likely be useful in determining whether peo...                       0
##   3. What are some of the potential worries a human resource manager might ha...                       0
##   A researcher for a pharmaceutical company says she has developed a chemical...                       0
##   Do you use descriptive, predictive, or prescriptive analytics most often in...                       0
##   Question 1. Please review the two charts containing SAT score trends for Ne...                       0
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...                       0
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...                       0
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...                       0
##   Question 3. Do you think Jones' approach to surveying current and past stud...                       0
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...                       0
##   Question 4. How would you advise Linda Jones to present the problem and her...                       0
##   Question 4. If Patel and his team can identify a segment of loyal customers...                       0
##   What are some specific ways that you might guard against data errors?...                           219
##                                                                                 Assignment
## Component                                                                        1-15.3 Biases and Errors...
##   1. Why might it be a good idea to use analytics to determine whether it is ...                           0
##   2. What kind of data would most likely be useful in determining whether peo...                           0
##   3. What are some of the potential worries a human resource manager might ha...                           0
##   A researcher for a pharmaceutical company says she has developed a chemical...                         737
##   Do you use descriptive, predictive, or prescriptive analytics most often in...                           0
##   Question 1. Please review the two charts containing SAT score trends for Ne...                           0
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...                           0
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...                           0
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...                           0
##   Question 3. Do you think Jones' approach to surveying current and past stud...                           0
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...                           0
##   Question 4. How would you advise Linda Jones to present the problem and her...                           0
##   Question 4. If Patel and his team can identify a segment of loyal customers...                           0
##   What are some specific ways that you might guard against data errors?...                                 0
##                                                                                 Assignment
## Component                                                                        1-19 Case Study: Data and Telecommuting...
##   1. Why might it be a good idea to use analytics to determine whether it is ...                                        206
##   2. What kind of data would most likely be useful in determining whether peo...                                        203
##   3. What are some of the potential worries a human resource manager might ha...                                        201
##   A researcher for a pharmaceutical company says she has developed a chemical...                                          0
##   Do you use descriptive, predictive, or prescriptive analytics most often in...                                          0
##   Question 1. Please review the two charts containing SAT score trends for Ne...                                          0
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...                                          0
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...                                          0
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...                                          0
##   Question 3. Do you think Jones' approach to surveying current and past stud...                                          0
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...                                          0
##   Question 4. How would you advise Linda Jones to present the problem and her...                                          0
##   Question 4. If Patel and his team can identify a segment of loyal customers...                                          0
##   What are some specific ways that you might guard against data errors?...                                                0
##                                                                                 Assignment
## Component                                                                        1-20 Case Study: Aligning Sales Objectives with Email Marketing Campaigns...
##   1. Why might it be a good idea to use analytics to determine whether it is ...                                                                            0
##   2. What kind of data would most likely be useful in determining whether peo...                                                                            0
##   3. What are some of the potential worries a human resource manager might ha...                                                                            0
##   A researcher for a pharmaceutical company says she has developed a chemical...                                                                            0
##   Do you use descriptive, predictive, or prescriptive analytics most often in...                                                                            0
##   Question 1. Please review the two charts containing SAT score trends for Ne...                                                                            0
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...                                                                          644
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...                                                                            0
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...                                                                          631
##   Question 3. Do you think Jones' approach to surveying current and past stud...                                                                            0
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...                                                                          625
##   Question 4. How would you advise Linda Jones to present the problem and her...                                                                            0
##   Question 4. If Patel and his team can identify a segment of loyal customers...                                                                          616
##   What are some specific ways that you might guard against data errors?...                                                                                  0
##                                                                                 Assignment
## Component                                                                        1-21 Case Study: Data Analytics in a Suburban School District...
##   1. Why might it be a good idea to use analytics to determine whether it is ...                                                                0
##   2. What kind of data would most likely be useful in determining whether peo...                                                                0
##   3. What are some of the potential worries a human resource manager might ha...                                                                0
##   A researcher for a pharmaceutical company says she has developed a chemical...                                                                0
##   Do you use descriptive, predictive, or prescriptive analytics most often in...                                                                0
##   Question 1. Please review the two charts containing SAT score trends for Ne...                                                              608
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...                                                                0
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...                                                              597
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...                                                                0
##   Question 3. Do you think Jones' approach to surveying current and past stud...                                                              582
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...                                                                0
##   Question 4. How would you advise Linda Jones to present the problem and her...                                                              585
##   Question 4. If Patel and his team can identify a segment of loyal customers...                                                                0
##   What are some specific ways that you might guard against data errors?...                                                                      0
##                                                                                 Assignment
## Component                                                                        1-5 Big Data...
##   1. Why might it be a good idea to use analytics to determine whether it is ...               0
##   2. What kind of data would most likely be useful in determining whether peo...               0
##   3. What are some of the potential worries a human resource manager might ha...               0
##   A researcher for a pharmaceutical company says she has developed a chemical...               0
##   Do you use descriptive, predictive, or prescriptive analytics most often in...             917
##   Question 1. Please review the two charts containing SAT score trends for Ne...               0
##   Question 1. What is the key flaw in Patel's conclusions about the team's Su...               0
##   Question 2. Linda Jones has an untested hypothesis that explains the declin...               0
##   Question 2. Why is Patel's first idea (purchasing more banner ads on third-...               0
##   Question 3. Do you think Jones' approach to surveying current and past stud...               0
##   Question 3. Do you think Logue's gut instinct to rein in email campaigns pr...               0
##   Question 4. How would you advise Linda Jones to present the problem and her...               0
##   Question 4. If Patel and his team can identify a segment of loyal customers...               0
##   What are some specific ways that you might guard against data errors?...                     0

Investigating component usage by unique visits, we discover the following:

  • 1-5 Open Response: 917 participants, 1401 unique visits

  • 1-14 Open Response: 219 participants, 1227 unique visits

  • 1-15.3 Open Response: 737 participants, 1155 unique visits

  • 1-19 Case Study: averaged 200 participants per question (x3), 1182 unique visits

  • 1-20 Case Study: averaged 630 participants per question (x4), 1165 unique visits

  • 1-21 Case Study: averaged 593 participants per question (x4), 1152 unique visits

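Here we find that the 1-20 and 1-21 case studies are used by roughly half of the visitors to the page (about 54% and 51%), while the 1-19 case study draws only about 17%. The 1-5 and 1-15.3 open-response questions are somewhat higher, at about 65% and 64%, although the 1-14 question reaches only about 18%.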
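The participation rates behind this comparison follow from dividing each component's participants by the unique visits to its page. The sketch below uses the figures from the list above; the dictionary layout and variable names are illustrative assumptions.

```python
# (participants, unique page visits) as listed above; case-study values
# are the per-question averages quoted in the text.
components = {
    "1-5 Open Response": (917, 1401),
    "1-14 Open Response": (219, 1227),
    "1-15.3 Open Response": (737, 1155),
    "1-19 Case Study": (200, 1182),
    "1-20 Case Study": (630, 1165),
    "1-21 Case Study": (593, 1152),
}

# Participation rate = participants / unique visitors to the page.
rates = {
    name: participants / visits
    for name, (participants, visits) in components.items()
}

# Print from most-used to least-used component.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.0%}")
```

Expressing usage as a rate rather than a raw count makes components on lightly visited pages comparable with those on heavily visited ones.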

The graph above displays the changes in component usage over time. Both the 1-19 Case Study (teal) and 1-14 Data Management (yellow) open-response questions were phased out by 2021, and the remaining case studies were primarily utilized instead. The case studies saw the most usage between Q2 2021 and Q1 2023. This may point towards an increase in enrollment during roughly the same period.

Overall, open response questions are utilized more by learners than the case studies; however, case studies saw an increase in usage in recent years as earlier reflection questions were phased out.

Assessment Usage

Assessments include both an ungraded pre-assessment and a final self-assessment.

There are 3223 observations in the assessment usage data set.

##   Assignment         Login Name         Full Name              Date           
##  Length:3223        Length:3223        Length:3223        Min.   :2019-03-10  
##  Class :character   Class :character   Class :character   1st Qu.:2021-02-06  
##  Mode  :character   Mode  :character   Mode  :character   Median :2021-12-07  
##                                                           Mean   :2021-10-25  
##                                                           3rd Qu.:2022-09-01  
##                                                           Max.   :2023-04-29  
##      Time            Time Spent            Score            Total    
##  Length:3223        Length:3223        Min.   : 10.00   Min.   :100  
##  Class :character   Class :character   1st Qu.: 53.40   1st Qu.:100  
##  Mode  :character   Mode  :character   Median : 70.00   Median :100  
##                                        Mean   : 68.81   Mean   :100  
##                                        3rd Qu.: 86.00   3rd Qu.:100  
##                                        Max.   :100.00   Max.   :100
## # A tibble: 2 x 2
##   Assignment           `mean(Score)`
##   <chr>                        <dbl>
## 1 1-2 Pre-Assessment            56.4
## 2 1-22 Self Assessment          78.0

A total of 1369 learners participated in the pre-assessment, with an average score of 56%. The self-assessment was taken 1854 times, with an average score of 78%. Many learners took the final assessment multiple times to achieve a passing score of 70% or to improve on their previous score. The higher average on the self-assessment may be partly attributed to its being a requirement for course completion.
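As a rough illustration of how the per-assessment averages and repeat attempts could be derived from learner-level records, the sketch below groups scores by assignment and counts attempts per learner. The records and learner tokens are hypothetical, and the passing score of 70 is the only value taken from the text.

```python
from collections import Counter, defaultdict
from statistics import mean

PASSING_SCORE = 70  # passing threshold stated in the text

# Hypothetical records: (assignment, anonymized learner id, score).
records = [
    ("1-2 Pre-Assessment", "learner_a", 50),
    ("1-22 Self Assessment", "learner_a", 60),
    ("1-22 Self Assessment", "learner_a", 80),
    ("1-2 Pre-Assessment", "learner_b", 60),
    ("1-22 Self Assessment", "learner_b", 90),
]

scores = defaultdict(list)
attempts = Counter()
for assignment, learner, score in records:
    scores[assignment].append(score)
    attempts[(assignment, learner)] += 1

# Mean score per assignment, counting every attempt.
mean_scores = {assignment: mean(values) for assignment, values in scores.items()}

# Learners who took the final self-assessment more than once.
repeat_takers = sorted(
    learner
    for (assignment, learner), count in attempts.items()
    if assignment == "1-22 Self Assessment" and count > 1
)

# Learners whose best self-assessment score meets the passing threshold.
best_scores = defaultdict(int)
for assignment, learner, score in records:
    if assignment == "1-22 Self Assessment":
        best_scores[learner] = max(best_scores[learner], score)
passed = sorted(l for l, s in best_scores.items() if s >= PASSING_SCORE)
```

Because a mean over all attempts counts repeat takers more than once, comparing it with a mean of each learner's best score would show how much retakes contribute to the higher self-assessment average.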

Conclusion

The goal of this project was to understand how features are utilized in the course and determine a plan of action for features to prioritize or highlight in future course builds, renovations, and marketing.

Regarding features, the average participation in media and components was under 500 views/submissions, a third of the total initial enrollments. The open-response components, particularly reflection questions, had a slightly higher rate of utilization than the media/videos. Learners did, however, take advantage of the pre-assessment in the course and took the final assessment more than once to achieve their preferred score.

When determining an action plan, the Company should consider positioning videos on pages without other content to garner higher viewership. The Company may also want to avoid placing necessary content on nested pages, since learners appear more likely to skip them. Finally, to draw more visits to the last page on course feedback, the Company may want to review where and how learners are directed after the final assessment.

Future analysis could review the intersection of learner behavior and assessment scores or enrollment in future courses. Evaluating the quality of open-response components through text analysis could be another area of further research.