We have some indication that remote learning may not best serve the needs of our scholars, but we want to be rigorous about whether that is true and to what extent. Additionally, can we learn which needs require intervention to make remote learning better?
Student achievement is a complex topic, so there is no “one-size-fits-all” solution, and learning type is no exception. We do, however, see general trends that can inform our approach to school re-openings in the post-COVID era.
In general, in-person learning is associated with higher test scores than remote learning. There are notable exceptions, which could be put to creative use in the event of a staged re-opening or a hybrid learning model in the future. The following are takeaways to consider, keeping in mind the relatively small sample size of this data set.
We will start with aggregate-level information regarding remote and in-person learning types across several factors.
Below we see some striking differences between in-person and remote learning, based on the mean percent correct for each question.
We will also aggregate these scores by question type for both learning types.
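The aggregation described here can be sketched as a simple group-and-average step. This is a minimal illustration in plain Python, not the code behind the report. The record layout, the function name `mean_by_group`, and all score values are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (learning_type, question_type, percent_correct).
# These values are invented for illustration and are NOT the report's data.
records = [
    ("in-person", "MC", 0.70),
    ("in-person", "SR", 0.60),
    ("in-person", "ER", 0.50),
    ("remote", "MC", 0.68),
    ("remote", "MC", 0.72),
    ("remote", "SR", 0.40),
    ("remote", "ER", 0.20),
]


def mean_by_group(rows):
    """Return {(learning_type, question_type): mean percent correct}."""
    groups = defaultdict(list)
    for learning_type, question_type, pct in rows:
        groups[(learning_type, question_type)].append(pct)
    return {key: mean(values) for key, values in groups.items()}


summary = mean_by_group(records)

# Print one line per (learning type, question type) pair.
for (learning_type, question_type), avg in sorted(summary.items()):
    print(f"{learning_type:10s} {question_type}  {avg:.2f}")
```

With very few questions in a group (as with the two ER questions), the group mean rests on little data, so differences between such groups should be read with caution.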
We see here a noticeable difference in outcomes between remote and in-person learning for ER (extended response) and SR (short response) problems. The difference in mean score for MC problems is unlikely to be meaningful. One intervention this readily suggests is more practice on extended and short response questions for students who are learning remotely. Note that there are only two ER questions, so it is difficult to know whether the true difference between remote and in-person is actually >40%.
Looking at percent correct across the Common Core standards presents a mixed bag of results. Remote scores appear to be better on several standards, and in-person scores are better on roughly the same number. For a few standards the difference is too close to call with a three-test sample.
Let’s now look a little deeper and break down the standards and tests by learning type to see whether there are more granular insights to be gained.
So we see that for a subset of the standards in-person is noticeably better than remote. There are also some cases in which remote learning outperforms in-person, but the difference is rarely large. 3.RI.8, 3.RI.4, and 3.L.5C are standards for which remote learning noticeably exceeded in-person. Standards 3.RL.&, 3.RL.1, 3.RL.2, 3.RI.5, and 3.RI.1 all had noticeably better outcomes with in-person instruction.
First we will look at test performance over time. Both in-person and remote followed the same general trend, one that might be expected over the course of a school year: students generally test better on material when it is fresh. It is worth noting that while the trend is the same across learning types, the peaks are higher and the lows are not as low for in-person learning. Remote appears to be almost always a little behind in-person.
It is interesting to note that for the TA_U3 test, remote learning was 3% better on average than in-person, but the overall trend for that test over the time interval shows a clear and rapid decline in scores. On the ELA_IA_2 test, the only test analyzed that showed a clear upward score trend, in-person learning generally outscored remote learning on the standards that test covers.
It should be noted that the in-person location does not seem to make a major difference to performance. The different locations show a mix of performance outcomes.
If we select students for whom direct comparisons can be made between learning types, that is, students who took a test in both remote and in-person settings, we see an interesting picture. First, in-person learning is generally associated with better test scores. This is especially true on the cumulative assessment ELA_IA_2. For that test, many students show striking differences in performance when tested in person.
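The within-student comparison described above can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the row layout, the function name `paired_differences`, the student IDs, and the scores are all hypothetical and not taken from the report's data.

```python
from statistics import mean

# Hypothetical rows: (student_id, learning_type, percent_correct).
# Invented for illustration; NOT the report's data.
scores = [
    ("s1", "remote", 0.50), ("s1", "in-person", 0.70),
    ("s2", "remote", 0.60), ("s2", "in-person", 0.55),
    ("s3", "remote", 0.40),      # no in-person score, so excluded
    ("s4", "in-person", 0.80),   # no remote score, so excluded
]


def paired_differences(rows):
    """Per-student difference (in-person mean minus remote mean).

    Only students with at least one score in BOTH settings are kept,
    so each student serves as their own comparison.
    """
    by_student = {}
    for student, learning_type, pct in rows:
        by_student.setdefault(student, {}).setdefault(learning_type, []).append(pct)

    diffs = {}
    for student, by_type in by_student.items():
        if "remote" in by_type and "in-person" in by_type:
            diffs[student] = mean(by_type["in-person"]) - mean(by_type["remote"])
    return diffs


diffs = paired_differences(scores)
mean_diff = mean(diffs.values())  # positive means in-person scored higher on average
print(diffs, round(mean_diff, 3))
```

Comparing each student against themselves removes between-student differences from the comparison, though it does not account for the tests themselves differing in difficulty or timing.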
For comparison on standards, this data set does not have a large enough sample to evaluate the difference thoroughly, but we see a mixed bag of results. More information is needed to determine a meaningful trend. Given the variation across standards, it is likely that some mix of remote and in-person learning could still accomplish performance goals, if that route is needed in the future. We would need to carefully design which standards are taught in which learning context (remote/in-person).
We can now focus on what other factors, if any, might come into play for student performance on these assessments. We will run some statistical and machine learning algorithms to determine which factors present in the dataset are most predictive of the percent correct.
## # A tibble: 2 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 accuracy binary 0.745
## 2 kap binary 0.418
## # A tibble: 2 x 2
## percent_correct pct
## <fct> <dbl>
## 1 0 0.628
## 2 1 0.372
Our model predicts better than a naive guess that every answer is wrong, but only by about 12 percentage points (0.745 accuracy versus a 0.628 share of wrong answers). The Kappa score could also be higher, but to improve it we would need either more data from a “good” scoring test to balance the right and wrong answers, or to upsample the correctly answered questions to better balance the data.
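To make the two numbers in this comparison concrete, here is a minimal sketch in plain Python. The accuracy (0.745) and majority-class share (0.628) are taken from the output above. The confusion-matrix counts passed to the hypothetical helper `accuracy_and_kappa` are invented for illustration, since the report does not show the model's confusion matrix.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Accuracy and Cohen's kappa from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total  # observed agreement, i.e. accuracy

    # Agreement expected by chance, computed from the row and column marginals.
    chance_pos = ((tp + fn) / total) * ((tp + fp) / total)
    chance_neg = ((tn + fp) / total) * ((tn + fn) / total)
    expected = chance_pos + chance_neg

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Values reported in the output above.
model_accuracy = 0.745
majority_share = 0.628  # share of class 0, the "always guess wrong" baseline
lift = model_accuracy - majority_share
print(f"lift over baseline: {lift:.3f}")

# Hypothetical counts, NOT the report's confusion matrix.
acc, kappa = accuracy_and_kappa(tp=20, fp=10, fn=15, tn=55)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

Kappa discounts the agreement a model would get by chance given the class balance, which is why it is a useful companion to accuracy when one class makes up most of the data.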
We see below that the primary factor for percent correct on a given test is the student, which is to be expected. The test date is also important, but that is likely due both to the “freshness” of the material and to the fact that two of the three tests trended downward, a pattern the model picks up quickly. The model also learned the downward trend by weighting the TA_U3 and TA_U4 tests highly. This further suggests the need to upsample correct answers or add more data points.
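Upsampling of the kind suggested here can be sketched as random resampling with replacement of the smaller class. This is a minimal illustration in plain Python, not the report's pipeline. The function name `upsample_minority`, the row layout, and the example rows are assumptions.

```python
import random
from collections import Counter


def upsample_minority(rows, label_index=-1, seed=0):
    """Resample smaller classes with replacement until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_index], []).append(row)

    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        extra = target - len(group)
        if extra:
            balanced.extend(rng.choices(group, k=extra))
    return balanced


# Hypothetical training rows: (question_id, correct), where correct is 0 or 1.
train_rows = [("q1", 0), ("q2", 0), ("q3", 0), ("q4", 0), ("q5", 0),
              ("q6", 1), ("q7", 1), ("q8", 1)]

balanced_rows = upsample_minority(train_rows)
label_counts = Counter(label for _, label in balanced_rows)
print(label_counts)
```

Resampling should be applied to the training split only. Upsampling before splitting can place copies of the same row in both training and test data, which inflates the measured accuracy.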
Interesting points include the high importance of each class (ClassID), the high importance of individual teachers (StaffListID), and the fact that remote learning is very far down the list. This isn’t to say that learning type does not matter to student performance; we have seen striking differences in certain contexts. However, there could also be other factors and interventions that contribute more to performance than location or learning type.