Teacher Sentiments on MTSS

Gagan Shergill

Questions

I continue to be interested in exploring what teachers discuss regarding Multitiered Systems of Supports (MTSS) on Reddit. This time, I am interested in the sentiment of their posts. I am also mistrustful of our robot overlords, so I want to see whether automated sentiment analysis matches my human perception.

Questions:

  1. What is the distribution of sentiments of posts on MTSS on r/Teachers?
  2. Do VADER’s ratings match hand coding?

Context

The data for this project come from r/Teachers, a subreddit with 1.3 million members that is “dedicated to open discussion about all things teaching”. Since MTSS plays a big role in public schools, I thought that this subreddit might contain some insights into how teachers feel about it. Because Reddit allows anonymous posting, teachers may be more willing to express their true opinions about MTSS there. The primary audience for this research is researchers interested in conducting sentiment analysis on Reddit data.

Methods

Source:

Data was scraped using Reddit’s API through the RedditExtractoR package. Specifically, I scraped any post that contained “MTSS” or “Multitiered Systems of Supports”, resulting in 311 unique posts.
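The keyword match and the duplicate removal described under Data Collection can be sketched in a few lines. This is a minimal Python illustration, not the RedditExtractoR workflow used for the project: the field names (url, title, text), the regular expression, and the toy posts are all assumptions made for the example.

```python
import re

# Hypothetical pattern: the acronym or the spelled-out phrase, case-insensitive.
MTSS_PATTERN = re.compile(
    r"\bMTSS\b|multi-?tiered systems? of supports?", re.IGNORECASE
)

def filter_mtss_posts(posts):
    """Keep posts that mention MTSS, dropping repeats of the same URL."""
    seen_urls = set()
    kept = []
    for post in posts:
        text = f"{post['title']} {post['text']}"
        if not MTSS_PATTERN.search(text):
            continue  # post does not mention MTSS
        if post["url"] in seen_urls:
            continue  # duplicate of a post we already kept
        seen_urls.add(post["url"])
        kept.append(post)
    return kept

# Toy records standing in for scraped posts (made up for this sketch).
sample_posts = [
    {"url": "u/1", "title": "MTSS meeting again", "text": "So many acronyms."},
    {"url": "u/1", "title": "MTSS meeting again", "text": "So many acronyms."},
    {"url": "u/2", "title": "Grading question", "text": "How do you handle late work?"},
    {"url": "u/3", "title": "New role", "text": "Leading our multitiered systems of supports team."},
]
unique_mtss_posts = filter_mtss_posts(sample_posts)
```

De-duplicating on the URL rather than on the text is a design choice for the sketch; cross-posted or lightly edited copies would need a fuzzier comparison.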

Data Processing:

I used tidyverse and tidytext for tidying data.

Data Analysis:

To answer Question 1, I used VADER to conduct a sentiment analysis. For Question 2, I hand coded a random sample of 50 posts, rating each as positive, negative, or neutral, and then calculated VADER’s classification accuracy against those ratings. I used the caret package to calculate classification accuracy and ggplot2 to plot a confusion matrix. If you’re curious about how to do this analysis, I primarily followed this post.
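Drawing the validation sample is worth making reproducible so the hand-coded posts can be recovered later. The sketch below is a Python illustration under stated assumptions: the function name, the placeholder post IDs, and the seed value are hypothetical and are not taken from the project.

```python
import random

def draw_validation_sample(post_ids, sample_size=50, seed=2024):
    """Return a reproducible random sample of post IDs (without replacement)."""
    rng = random.Random(seed)  # the seed value is an arbitrary assumption
    return rng.sample(post_ids, sample_size)

# 311 placeholder IDs standing in for the scraped posts.
post_ids = list(range(1, 312))
validation_ids = draw_validation_sample(post_ids)
```

Using a dedicated random.Random instance keeps the draw independent of any other randomness in the session.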

Data Collection

After running RedditExtractoR and getting rid of duplicate posts, we end up with 311 unique posts about MTSS. Check out an example of a (very relatable) post.

Example of an MTSS Related Post
“Not sure if this is a NC statewide thing or just my school system but why are there SO many acronyms for the most menial things. And people, mostly admin, will regularly use them in conversation to the point that I feel like I need a decryption key to understand. Literally, hey, the WRC yesterday about that OUGS training didnt make sense, I thought we needed GHT and KEiBI for the students before we could JJRF them? Some more examples so maybe yall dont think Im insane: FSI, NWF, WTSS, PSF, ORF, LNF, BOY, EOY, EOG, BOG, CKLA, MTSS, ICC, ISS, OSD, OSS, RF, UMI and on and on unto eternity, I could probably list another 40.  I understand its intended for brevity but it seems so convoluted and unnecessary. It genuinely takes longer to decode these acronyms than it would to just say the full phrase! Thank you for reading my frustration, lol.”

Question 1: What are the Sentiments of Posts?

Once we run VADER and average the compound scores, we can see that VADER thinks the majority of posts lean toward the positive end. The counts tell the same story: 209 posts were classified as positive and 102 as negative, or roughly two positive posts for every negative one. Interestingly, no neutral posts were identified, which may indicate that my data is highly polarized (people have strong opinions about this topic). It is unusual to have no neutral scores, so I will need to validate this output.

Mean Compound Score: 0.31

Positive   Neutral   Negative   Ratio (Negative/Positive)
209        0         102        0.49
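The summary above can be reproduced from a list of compound scores. The sketch below is a Python illustration rather than the R code used for the project. It assumes the commonly used VADER cutoffs (compound of 0.05 or higher is positive, -0.05 or lower is negative, anything in between is neutral); whether the original analysis used exactly these cutoffs is an assumption. The toy scores are made up.

```python
from collections import Counter
from statistics import mean

def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a sentiment label (assumed cutoffs)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def summarize(compound_scores):
    """Mean compound score, label counts, and negative-to-positive ratio."""
    labels = Counter(label_from_compound(score) for score in compound_scores)
    positive = labels["positive"]
    return {
        "mean_compound": mean(compound_scores),
        "positive": positive,
        "neutral": labels["neutral"],
        "negative": labels["negative"],
        "neg_to_pos_ratio": labels["negative"] / positive if positive else None,
    }

# Made-up scores for illustration only.
toy_scores = [0.82, 0.40, -0.63, 0.02, 0.91, -0.10]
summary = summarize(toy_scores)

# The reported ratio follows from the reported counts: 102 / 209.
reported_ratio = round(102 / 209, 2)
```

With these cutoffs, a post is only neutral when its compound score falls in a narrow band around zero, which long, opinionated posts rarely do.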

Question 2: Do VADER’s Ratings Match Hand Coding?

To conduct this analysis, I took a random sample of 50 posts and reviewed them manually, classifying them as positive, negative, or neutral. I then appended my ratings to the sample data set and created a confusion matrix, which shows how many posts VADER classified the same way I did and where our ratings differed. The overall accuracy of VADER was 38%, which is not great. In particular, while VADER was good at recognizing posts that I rated as positive (5 out of 6 correct), it was much less accurate with posts that I rated as negative, classifying 21 of them as positive. It also classified 9 posts that I rated as neutral as positive. Overall, it appears that VADER classified posts as being more positive than they truly are.
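The accuracy calculation can be sketched as a cross-tabulation of hand labels against VADER labels. This is a Python illustration, not the caret call used for the project. The label pairs are an illustrative reconstruction chosen to be consistent with the figures reported above; the 14 correctly classified negative posts and the single positive post labeled negative are inferred from the 38% total and the absence of neutral predictions, and are not the raw data.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")

def confusion_counts(hand_labels, vader_labels):
    """Count (hand label, VADER label) pairs."""
    return Counter(zip(hand_labels, vader_labels))

def accuracy(hand_labels, vader_labels):
    """Share of posts where VADER's label equals the hand label."""
    matches = sum(h == v for h, v in zip(hand_labels, vader_labels))
    return matches / len(hand_labels)

# Illustrative reconstruction (see the note above), as (hand, VADER) pairs.
pairs = (
    [("positive", "positive")] * 5
    + [("positive", "negative")] * 1   # inferred, not reported
    + [("negative", "positive")] * 21
    + [("negative", "negative")] * 14  # inferred, not reported
    + [("neutral", "positive")] * 9
)
hand_labels = [hand for hand, _ in pairs]
vader_labels = [vader for _, vader in pairs]

matrix = confusion_counts(hand_labels, vader_labels)
overall_accuracy = accuracy(hand_labels, vader_labels)

# Rows are hand labels, columns are VADER labels, both in LABELS order.
for hand in LABELS:
    print(hand, [matrix[(hand, vader)] for vader in LABELS])
```

Reading the matrix by row makes the bias visible: most of the disagreement sits in the column where VADER said positive.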

Here is an example of a post that I rated as neutral, but VADER classified as positive:

“I’m an IS working in a small school leading the development of our MTSS. In this role, I’m giving curriculum guidance to teachers. Our school wide data shows that we have low math proficiency, so I’d like to work with our math teachers to choose a new curriculum. Ideally, I’m looking for curriculum options that also have tier 2 and tier 3 components.  What math curriculum are you using and what do you like or dislike about that curriculum?”

Here is an example of a post that I rated as negative, but VADER classified as positive:

“The holier-than-thou admins and consultants in charge of American K-12 education are more like fervent religious missionaries who will either convert the nonbelievers or excommunicate those who will not bow down to their God,  ‘Student-driven,’ and whose dogma is MTSS.  On a massive scale, the money is in selling thousands of books to districts who just love throwing money away on the dogma. Because it’s all about *the kids*, right?”

Conclusion

  • Question 1 - What is the distribution of sentiments of posts on MTSS on r/Teachers?
    • Sentiments classified by VADER appear to be positive overall.
  • Question 2 - Do VADER’s ratings match hand coding?
    • No, VADER appears to classify text in a manner that is more positive than what is truly being conveyed. This may indicate that VADER is not well suited to Reddit data overall, or at least data about education.

Next Steps

  • Implications - Researchers seeking to use VADER for Reddit data should validate VADER’s classification with their own independent review.

  • Limitations - A key limitation is the relatively small sample size (N = 311), as well as the small number of posts (n = 50) used to validate outputs. Future research should increase both for more reliable results.

  • Future Analysis - Given the low classification accuracy of VADER, a supervised machine learning approach may yield improved results.