1 Student Performance Remote vs. In-Person

We have some indication that remote learning may not best serve the needs of our scholars, but we want to be rigorous about whether that is true and to what extent. We also want to learn which needs require intervention to make remote learning better.

1.1 Key Takeaways

Student achievement is a complex topic, so there is no “one-size-fits-all” solution, and learning type is no exception. We do, however, see general trends that can inform our approach to school re-openings in the post-COVID era.

In general, in-person learning is associated with higher test scores than remote learning. There are notable exceptions, which could be put to creative use in a staged re-opening or hybrid learning model. The following takeaways should be read with the relatively small sample size of this data in mind.

ER (extended response) and SR (short response) question types should be taught in-person to the greatest extent possible.

42% greater test scores from in-person learning on ER question types (small sample: only two ER questions).
12% greater test scores from in-person learning on SR question types.
MC question types show no meaningful change based on learning type.

There are some standards that should clearly be taught in-person and others that can be taught better remotely.

Standards 3.RL.7, 3.RL.1, 3.RL.2, 3.RI.5, and 3.RI.1 should be taught in-person.
Standards 3.RI.8, 3.RI.4, and 3.L.5C can be taught effectively via remote learning.
Other standards do not show a meaningful difference in student scores.

In-person learning has dramatic results on the cumulative assessment ELA_IA_2.

Students who have scores in both learning types generally performed much better in-person.
The unit tests were more mixed in result.
More data is needed to determine a comparison for performance on standards.

Future research is needed to investigate relative impact size for factors beyond learning type.

Teachers, class membership, and test date rank much higher in predictive value than in-person/remote.
Other interventions may be more impactful moving forward.
Predictive models developed only represent a starting point for analysis.

1.2 Analysis

We will start with aggregate level information regarding remote and in-person learning types across several factors.

1.2.1 Mean Student Scores by Question Type

Below we see some striking differences between in-person and remote learning, based on the mean percent correct for each question.

We will also aggregate these scores by question type for both learning types.

We see here that there is a noticeable difference in outcome between remote and in-person learning for ER and SR type problems. The difference in mean score for MC type problems is unlikely to be meaningful. One intervention that follows directly from this is more practice on extended and short response questions for students who are learning remotely. It should be noted that there are only two ER type questions, so it is difficult to know whether the true difference between remote and in-person is actually greater than 40%.
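The aggregation described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the report: the records are made-up toy values, the names `mean_by_group` and `relative_gain` are hypothetical, and it assumes the percentage differences are relative to the remote mean, which the report does not state explicitly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (question_type, learning_type, percent_correct).
# These values are invented for illustration and are NOT the report's data.
records = [
    ("ER", "in_person", 0.60), ("ER", "remote", 0.40),
    ("SR", "in_person", 0.56), ("SR", "remote", 0.50),
    ("MC", "in_person", 0.50), ("MC", "remote", 0.50),
]

def mean_by_group(rows):
    """Return {(question_type, learning_type): mean percent correct}."""
    groups = defaultdict(list)
    for qtype, ltype, pct in rows:
        groups[(qtype, ltype)].append(pct)
    return {key: mean(vals) for key, vals in groups.items()}

def relative_gain(means, qtype):
    """In-person mean relative to the remote mean (assumed definition)."""
    remote = means[(qtype, "remote")]
    in_person = means[(qtype, "in_person")]
    return (in_person - remote) / remote

means = mean_by_group(records)
for qtype in ("ER", "SR", "MC"):
    print(qtype, round(relative_gain(means, qtype), 3))
```

With only two ER questions in the real data, a group mean like this rests on very few values, which is why the ER figure should be read cautiously.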

1.2.2 Standard Performance

Looking at percent correct across the Common Core standards presents a mixed bag of results. Remote scores appear to be better on several standards, and in-person scores are better on roughly the same number. There are a few standards for which the difference is too close to call with a three-test sample.

Let’s now look a little deeper and break down the standards and tests by learning type to see if there are more granular insights to be made.

So we see that for a subset of the standards in-person is noticeably better than remote. There are also some cases in which remote learning outperforms in-person, but the difference is rarely large. 3.RI.8, 3.RI.4, and 3.L.5C are standards for which remote learning noticeably exceeded in-person. Standards 3.RL.7, 3.RL.1, 3.RL.2, 3.RI.5, and 3.RI.1 all had noticeably better outcomes with in-person instruction.
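One way to make "noticeably better" explicit is to label each standard by the gap between its in-person and remote means. The sketch below is an assumption-laden illustration: the 0.05 threshold, the function name `classify_standards`, and the placeholder standards with their scores are all hypothetical and do not come from the report.

```python
def classify_standards(scores, threshold=0.05):
    """Label each standard by the sign and size of in_person - remote.

    scores: {standard: (in_person_mean, remote_mean)}
    threshold: minimum gap treated as meaningful (an assumed cutoff).
    """
    labels = {}
    for standard, (in_person, remote) in scores.items():
        diff = in_person - remote
        if diff > threshold:
            labels[standard] = "in_person"
        elif diff < -threshold:
            labels[standard] = "remote"
        else:
            labels[standard] = "no_clear_difference"
    return labels

# Placeholder standards with invented means (NOT the report's data).
toy_scores = {
    "STD_A": (0.70, 0.55),
    "STD_B": (0.50, 0.62),
    "STD_C": (0.60, 0.61),
}
labels = classify_standards(toy_scores)
print(labels)
```

A fixed cutoff is a blunt tool for a three-test sample; it only makes the grouping rule visible so it can be debated and adjusted.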

1.2.3 Test Performance

First we will look at test performance over time. We see that both in-person and remote followed the same general trend, which one might expect over the course of a school year. Students generally test better on material when it is fresh. It is meaningful to note that while the trend is the same across learning types, the peaks are higher and the lows are not as low for in-person learning. Remote appears to be almost always a little behind in-person.

It is interesting to note that for the TA_U3 test, remote learning was 3% better on average than in-person, but the overall trend for that test over the time interval shows a clear and rapid decline in scores. On the ELA_IA_2 test, which was the only test analyzed that showed a clear upward score trend, in-person learning generally outscored remote learning on the standards present.

It should be noted that the in-person location does not seem to make a major difference in performance. We see that the different locations have a mix of performance outcomes.

1.2.4 Student Performance

If we select students for whom direct comparisons can be made between learning types, that is, students who took a test in both remote and in-person settings, we see an interesting picture. First, in-person learning is generally associated with better test scores. This is especially true on the cumulative assessment ELA_IA_2. For that test, there are many students with striking differences in performance once it was taken in-person.
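The selection step above amounts to a paired comparison: keep only students with scores under both learning types and look at the per-student difference. A minimal sketch follows, assuming each record is keyed by student and test; the row layout, the name `paired_differences`, and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (student_id, test, learning_type, score).
# Invented values for illustration only.
rows = [
    ("s1", "TEST_A", "remote", 0.50), ("s1", "TEST_A", "in_person", 0.75),
    ("s2", "TEST_A", "remote", 0.60),  # no in-person score, so dropped
    ("s3", "TEST_A", "in_person", 0.25), ("s3", "TEST_A", "remote", 0.50),
]

def paired_differences(rows):
    """Return {(student, test): mean in-person score - mean remote score}
    for students who have at least one score under both learning types."""
    scores = defaultdict(lambda: defaultdict(list))
    for student, test, ltype, score in rows:
        scores[(student, test)][ltype].append(score)
    return {
        key: mean(by_type["in_person"]) - mean(by_type["remote"])
        for key, by_type in scores.items()
        if "in_person" in by_type and "remote" in by_type
    }

diffs = paired_differences(rows)
print(diffs)
print("mean paired difference:", mean(diffs.values()))
```

Comparing each student with themselves removes between-student variation from the comparison, but it does not account for the order in which the two settings occurred.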

For comparison on standards, this data set does not have a large enough sample to evaluate the difference thoroughly, but we see a mixed bag of results. More information is needed to determine a meaningful trend. Given the variation across standards, it is likely that some mix of remote and in-person could still accomplish performance goals, if that route is needed in the future. We would need to carefully design which standards are taught in which learning context (remote/in-person).

1.3 Beyond Learning Type

We can now focus on what other factors, if any, might come into play for student performance on these assessments. We will run some statistical and machine learning algorithms to determine which factors present in the dataset are most predictive of the percent correct.

## # A tibble: 2 x 3
##   .metric  .estimator .estimate
##   <chr>    <chr>          <dbl>
## 1 accuracy binary         0.745
## 2 kap      binary         0.418

## # A tibble: 2 x 2
##   percent_correct   pct
##   <fct>           <dbl>
## 1 0               0.628
## 2 1               0.372

Our model predicts better than a naive guess of all wrong answers, but only by about 12 percentage points (0.745 accuracy versus a 0.628 baseline). The Kappa score could also be higher, but we would either need more data from a “good” scoring test to balance the right and wrong answers, or we could upsample the correctly answered questions to better balance the data.
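To make the comparison with the naive baseline concrete, the sketch below computes accuracy, the majority-class baseline, and Cohen's kappa from label lists using the standard definition kappa = (p_o - p_e) / (1 - p_e). The toy labels and function names are hypothetical; only the 0.745 and 0.628 figures are taken from the output above.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(y_true):
    """Accuracy of always guessing the most common class."""
    return Counter(y_true).most_common(1)[0][1] / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy labels (0 = wrong, 1 = correct); invented for illustration.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
print(accuracy(y_true, y_pred), majority_baseline(y_true))
print(round(cohen_kappa(y_true, y_pred), 3))

# Gap between the reported accuracy and the all-wrong baseline.
reported_gain = 0.745 - 0.628
print(round(reported_gain, 3))
```

Kappa is useful here because the classes are imbalanced: a model can reach 0.628 accuracy by always predicting "wrong", while its kappa would be zero.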

We see below that the primary factor for percent correct on a given test is the student, which should be expected. The test date is also important, but that is likely due to both the “freshness” of the material and the fact that two of the three tests trended downward, which the models learn quickly. The model also picked up the downward trend by weighting the TA_U3 and TA_U4 tests highly. This further suggests the need to upsample correct answers or add more data points.
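Upsampling the minority class, as suggested above, can be sketched as random resampling with replacement until every class matches the largest one. This is a generic illustration rather than the procedure used for the report; `upsample_minority`, the row layout, and the toy rows are hypothetical.

```python
import random

def upsample_minority(rows, label_of, seed=0):
    """Resample smaller classes with replacement until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies only for classes below the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

# Toy rows: (question_id, correct) with 6 wrong and 2 correct answers.
toy_rows = [(i, 0) for i in range(6)] + [(i, 1) for i in range(6, 8)]
balanced = upsample_minority(toy_rows, label_of=lambda r: r[1])
print(len(balanced), sum(label for _, label in balanced))
```

Resampling of this kind should be applied only to the training split; duplicating rows before the train/test split would leak copies of training rows into the test set and inflate the measured accuracy.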

Interesting points include the high importance of each class (ClassID), the high importance of individual teachers (StaffListID), and the fact that remote learning is very far down the list. This isn’t to say that learning type does not matter for student performance; we see striking differences in certain contexts. However, there could be other factors and interventions that contribute more to performance than location or learning type.

Classical Charter School Data Assessment

Jeff Shamp

July 21, 2021