This independent analysis studies the public’s sentiment regarding the Science of Reading (SoR). The Science of Reading refers to evidence-based reading instruction practices that can be designed to meet the needs of individual learners (Snowling & Hulme, 2005). These practices cover language acquisition, phonological and phonemic awareness, phonics and spelling, fluency, vocabulary, oral language, and comprehension. The main motivation for this analysis came from the recent debates over modifying the “Read to Achieve” legislation, which calls for the adoption of Science of Reading curricula in North Carolina (Pondiscio, 2021). In April 2021, North Carolina’s Democratic governor mandated that schools use a phonics-based approach to improve reading instruction. According to North Carolina’s Department of Public Instruction, in implementing this bill, teachers are expected to be trained in SoR and to base their instruction on it. The main purpose of this brief analysis is to highlight the public’s opinion regarding the Science of Reading as a construct. The study followed the Data-Intensive Research Workflow presented by Krumm et al. (2018) to perform the analysis and communicate the findings.
The research questions that guided this study include:
RQ1: What are the most frequent words that represent Science of Reading discussions on Twitter?
RQ2: What is the overall sentiment toward Science of Reading on social network platforms such as Twitter?
The analysis is based on Twitter data pulled through a developer account. The dataset included 781 observations. Because of the limitations of the account, these are tweets posted in the previous nine days.
After loading the libraries, the next step involved pulling Twitter data relating to the Science of Reading through the developer account.
In this study, the wrangling process involved pulling Twitter data by searching for tweets that correspond to the Science of Reading. I then pulled the data into R, selected the variables of interest, tokenized the tweets, and loaded the lexicons for sentiment analysis. I have included comments in the code chunk to describe the manipulations performed.
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
## # A tibble: 2,867 × 2
## word n
## <chr> <int>
## 1 reading 648
## 2 science 546
## 3 #scienceofreading 180
## 4 students 123
## 5 literacy 116
## 6 teachers 112
## 7 amp 103
## 8 read 80
## 9 learning 72
## 10 learn 64
## # … with 2,857 more rows
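The tokenizing and counting step behind the table above can be sketched as follows. This is a minimal illustration, not the original chunk: the `tweets` tibble is a hypothetical three-row stand-in for the pulled data, and the default word tokenizer is assumed in place of the `token = 'tweets'` option named in the warning, since that option may not be available in newer tidytext releases.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the pulled tweets (not the study's corpus)
tweets <- tibble(
  status_id = c("1", "2", "3"),
  text = c("The science of reading helps students learn",
           "Teachers need training in the science of reading",
           "Students love reading")
)

word_counts <- tweets %>%
  select(status_id, text) %>%             # keep only the variables of interest
  unnest_tokens(word, text) %>%           # one lowercased word per row
  anti_join(stop_words, by = "word") %>%  # drop common English stop words
  count(word, sort = TRUE)                # frequency table, most frequent first

print(word_counts)
```

With the real data, `tweets` would be the data frame returned by the Twitter search.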
Some words in this output are expected to appear in any discussion of the topic and do not add meaning to the analysis. I therefore customized the stop words to remove “reading”, “science”, “amp”, and “#scienceofreading”, which simply repeat the main concept or are encoding residue.
## # A tibble: 2,854 × 2
## word n
## <chr> <int>
## 1 students 123
## 2 literacy 116
## 3 teachers 112
## 4 read 80
## 5 learning 72
## 6 learn 64
## 7 teacher 51
## 8 training 49
## 9 school 46
## 10 instruction 43
## # … with 2,844 more rows
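One way to build such a custom stop word list is to append the extra terms to tidytext's built-in `stop_words` table before the anti-join. The sketch below assumes that approach; the `tokens` tibble is a hypothetical toy input, and `custom_stop` and `all_stop` are illustrative names.

```r
library(dplyr)
library(tidytext)

# Terms that repeat the search concept or are encoding residue ("amp" from "&amp;")
custom_stop <- tibble(
  word = c("reading", "science", "amp", "#scienceofreading"),
  lexicon = "custom"
)

# Combine with the standard stop word table shipped with tidytext
all_stop <- bind_rows(stop_words, custom_stop)

# Hypothetical tokens standing in for the tokenized tweets
tokens <- tibble(word = c("reading", "science", "students", "amp",
                          "students", "the", "literacy"))

filtered_counts <- tokens %>%
  anti_join(all_stop, by = "word") %>%
  count(word, sort = TRUE)

print(filtered_counts)
```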
In order to produce a neat word cloud, I selected the top 100 words to include in the visualization.
## Selecting by n
In the word cloud, words such as “debates”, “learn”, “media”, and “instruction” are prevalent in the discussions about the Science of Reading. However, as informative as the word cloud is, plotting the word counts can provide further insight. I therefore created a bar chart to give a clearer count of the words in use.
After exploring the frequent words with the word cloud and bar chart, I loaded the lexicons (AFINN, Bing, and NRC) in order to compute the sentiment of public tweets. Using three lexicons enhances the validity of the study, especially in answering the second research question.
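Selecting the most frequent words and plotting them as a bar chart can be sketched as below. The counts are the first five rows of the frequency table shown earlier; the cutoff of 3 (instead of 100) and the plot styling are assumptions made to keep the example small.

```r
library(dplyr)
library(ggplot2)

# First five rows of the word-frequency table reported above
word_counts <- tibble(
  word = c("students", "literacy", "teachers", "read", "learning"),
  n = c(123L, 116L, 112L, 80L, 72L)
)

# Keep the most frequent words (the report used the top 100)
top_words <- word_counts %>%
  slice_max(order_by = n, n = 3)

# Horizontal bar chart ordered by frequency
p <- ggplot(top_words, aes(x = n, y = reorder(word, n))) +
  geom_col() +
  labs(x = "Count", y = NULL)

print(p)
```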
## # A tibble: 2,477 × 2
## word value
## <chr> <dbl>
## 1 abandon -2
## 2 abandoned -2
## 3 abandons -2
## 4 abducted -2
## 5 abduction -2
## 6 abductions -2
## 7 abhor -3
## 8 abhorred -3
## 9 abhorrent -3
## 10 abhors -3
## # … with 2,467 more rows
## # A tibble: 6,786 × 2
## word sentiment
## <chr> <chr>
## 1 2-faces negative
## 2 abnormal negative
## 3 abolish negative
## 4 abominable negative
## 5 abominably negative
## 6 abominate negative
## 7 abomination negative
## 8 abort negative
## 9 aborted negative
## 10 aborts negative
## # … with 6,776 more rows
## # A tibble: 13,875 × 2
## word sentiment
## <chr> <chr>
## 1 abacus trust
## 2 abandon fear
## 3 abandon negative
## 4 abandon sadness
## 5 abandoned anger
## 6 abandoned fear
## 7 abandoned negative
## 8 abandoned sadness
## 9 abandonment anger
## 10 abandonment fear
## # … with 13,865 more rows
## # A tibble: 4,150 × 2
## word sentiment
## <chr> <chr>
## 1 abandon negative
## 2 abandoned negative
## 3 abandoning negative
## 4 abandonment negative
## 5 abandonments negative
## 6 abandons negative
## 7 abdicated negative
## 8 abdicates negative
## 9 abdicating negative
## 10 abdication negative
## # … with 4,140 more rows
## # A tibble: 205 × 3
## word n value
## <chr> <int> <dbl>
## 1 support 37 2
## 2 free 24 1
## 3 struggling 24 -2
## 4 excited 20 3
## 5 love 20 3
## 6 join 15 1
## 7 opportunity 15 2
## 8 growth 14 2
## 9 importance 12 2
## 10 amazing 11 4
## # … with 195 more rows
## # A tibble: 213 × 3
## word n sentiment
## <chr> <int> <chr>
## 1 support 37 positive
## 2 free 24 positive
## 3 struggling 24 negative
## 4 balanced 20 positive
## 5 excited 20 positive
## 6 love 20 positive
## 7 lead 18 positive
## 8 skill 16 positive
## 9 amazing 11 positive
## 10 gains 10 positive
## # … with 203 more rows
## # A tibble: 803 × 3
## word n sentiment
## <chr> <int> <chr>
## 1 learning 72 positive
## 2 learn 64 positive
## 3 teacher 51 positive
## 4 teacher 51 trust
## 5 school 46 trust
## 6 instruction 43 positive
## 7 instruction 43 trust
## 8 teach 31 joy
## 9 teach 31 positive
## 10 teach 31 surprise
## # … with 793 more rows
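The joined tables above are the result of matching the word counts against each lexicon. A minimal sketch of that join, assuming an `inner_join()` on `word`, is shown for the Bing lexicon only, because it ships with tidytext; AFINN and NRC are fetched through the textdata package and need a one-time download. The four counts are taken from the tables in this report.

```r
library(dplyr)
library(tidytext)

# A few word counts taken from the tables in this report
word_counts <- tibble(
  word = c("support", "free", "struggling", "students"),
  n = c(37L, 24L, 24L, 123L)
)

# Words without an entry in the lexicon (here "students") drop out of the join
sentiment_bing <- word_counts %>%
  inner_join(get_sentiments("bing"), by = "word")

print(sentiment_bing)
```

The same pattern applies to the other lexicons by swapping the argument to `get_sentiments()`.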
## # A tibble: 2 × 2
## sentiment n
## <chr> <int>
## 1 positive 129
## 2 negative 84
## # A tibble: 1 × 4
## lexicon negative positive sentiment
## <chr> <int> <int> <int>
## 1 bing 84 129 45
## # A tibble: 1 × 2
## lexicon sentiment
## <chr> <dbl>
## 1 AFINN 137
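The Bing summary row above is the count of positive words minus the count of negative words. A short sketch of that calculation, using the two counts reported above and assuming a `pivot_wider()` reshaping step, follows.

```r
library(dplyr)
library(tidyr)

# Bing counts reported above
bing_counts <- tibble(
  sentiment = c("positive", "negative"),
  n = c(129L, 84L)
)

summary_bing <- bing_counts %>%
  pivot_wider(names_from = sentiment, values_from = n) %>%   # one column per polarity
  mutate(lexicon = "bing",
         sentiment = positive - negative) %>%                # net sentiment
  relocate(lexicon)

print(summary_bing)
```

For AFINN, which assigns each word a numeric value rather than a label, the analogous summary would be a sum of the `value` column.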
## # A tibble: 698 × 2
## status_id text
## <chr> <chr>
## 1 1492248230003937287 "#dyslexia,#reading-intervention\n#scienceofreading #ear…
## 2 1492230303464792065 "“There is NO comprehension strategy powerful enough to …
## 3 1492228001521709059 "Today I was at @FoxNews talking about how NY politician…
## 4 1492224881949384710 "We LOVE to see this!! #unanimous #earlyliteracy #scienc…
## 5 1492222836123000838 "Great news out of Virginia today! Delegate @CarrieCoyne…
## 6 1492219473755095041 "@overtimerules The District is putting resources toward…
## 7 1490677380372959239 "A little #MondayMotivation for any literacy leaders out…
## 8 1492213780616552448 "We're adding some style to our uniforms with our new Pr…
## 9 1492181809727262721 "7th and 8th grade Earthworm Dissection in our Science L…
## 10 1492209877644681218 "#Educators: do you remember the first time you learned …
## # … with 688 more rows
## Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
## # A tibble: 716 × 3
## status_id word value
## <chr> <chr> <dbl>
## 1 1492248230003937287 recommended 2
## 2 1492230303464792065 powerful 2
## 3 1492224881949384710 love 3
## 4 1492219473755095041 supporting 1
## 5 1492219473755095041 support 2
## 6 1492213780616552448 spirit 1
## 7 1492181809727262721 fun 4
## 8 1492181809727262721 admire 3
## 9 1492209877644681218 shame -2
## 10 1492209877644681218 free 1
## # … with 706 more rows
## # A tibble: 246 × 2
## status_id value
## <chr> <dbl>
## 1 1489301156442431489 4
## 2 1489314374732763137 -4
## 3 1489342994885103616 4
## 4 1489356657457041408 0
## 5 1489357753508368393 2
## 6 1489385427803025408 4
## 7 1489400946362765313 -2
## 8 1489456081449500676 6
## 9 1489472827925610497 2
## 10 1489590243871318017 -1
## # … with 236 more rows
## # A tibble: 231 × 3
## status_id value sentiment
## <chr> <dbl> <chr>
## 1 1489301156442431489 4 positive
## 2 1489314374732763137 -4 negative
## 3 1489342994885103616 4 positive
## 4 1489357753508368393 2 positive
## 5 1489385427803025408 4 positive
## 6 1489400946362765313 -2 negative
## 7 1489456081449500676 6 positive
## 8 1489472827925610497 2 positive
## 9 1489590243871318017 -1 negative
## 10 1489607503998554112 -4 negative
## # … with 221 more rows
## # A tibble: 1 × 3
## negative positive ratio
## <int> <int> <dbl>
## 1 54 177 0.305
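The tweet-level tables above can be reproduced in outline by summing AFINN values per tweet, dropping tweets that net to zero, labelling the rest, and dividing the negative count by the positive count. The sketch below assumes that sequence of steps and uses six word-level rows copied from the AFINN output above.

```r
library(dplyr)
library(tidyr)

# Word-level AFINN matches copied from the output above
afinn_words <- tibble(
  status_id = c("1492219473755095041", "1492219473755095041",
                "1492181809727262721", "1492181809727262721",
                "1492209877644681218", "1492209877644681218"),
  word = c("supporting", "support", "fun", "admire", "shame", "free"),
  value = c(1, 2, 4, 3, -2, 1)
)

tweet_sentiment <- afinn_words %>%
  group_by(status_id) %>%
  summarise(value = sum(value)) %>%        # net score per tweet
  filter(value != 0) %>%                   # neutral tweets are dropped
  mutate(sentiment = if_else(value > 0, "positive", "negative"))

tweet_ratio <- tweet_sentiment %>%
  count(sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n) %>%
  mutate(ratio = negative / positive)      # negative tweets per positive tweet

print(tweet_ratio)
```

In the report's data the same ratio is 54 / 177, or about 0.305.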
## # A tibble: 6 × 3
## method sentiment n
## <chr> <chr> <int>
## 1 AFINN positive 506
## 2 AFINN negative 210
## 3 bing positive 129
## 4 bing negative 84
## 5 nrc positive 244
## 6 nrc negative 86
## Joining, by = "method"
## # A tibble: 6 × 4
## method sentiment n total
## <chr> <chr> <int> <int>
## 1 AFINN positive 506 716
## 2 AFINN negative 210 716
## 3 bing positive 129 213
## 4 bing negative 84 213
## 5 nrc positive 244 330
## 6 nrc negative 86 330
## # A tibble: 6 × 5
## method sentiment n total percent
## <chr> <chr> <int> <int> <dbl>
## 1 AFINN positive 506 716 70.7
## 2 AFINN negative 210 716 29.3
## 3 bing positive 129 213 60.6
## 4 bing negative 84 213 39.4
## 5 nrc positive 244 330 73.9
## 6 nrc negative 86 330 26.1
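The percentage column is each count divided by its lexicon's total number of matched sentiment words. The sketch below recomputes it from the counts reported above, assuming a join against per-method totals as suggested by the `Joining, by = "method"` message.

```r
library(dplyr)

# Counts reported above for each lexicon
summary_sentiment <- tibble(
  method = rep(c("AFINN", "bing", "nrc"), each = 2),
  sentiment = rep(c("positive", "negative"), times = 3),
  n = c(506L, 210L, 129L, 84L, 244L, 86L)
)

# Total matched sentiment words per lexicon
totals <- summary_sentiment %>%
  group_by(method) %>%
  summarise(total = sum(n))

sentiment_percents <- summary_sentiment %>%
  left_join(totals, by = "method") %>%
  mutate(percent = round(n / total * 100, 1))

print(sentiment_percents)
```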
This independent analysis provides baseline information on the public’s sentiment toward the Science of Reading as a concept and on the words that were frequent in those discussions. In this discussion, I use the research questions to guide the presentation of the findings.
The analysis revealed that words such as “literacy”, “students”, “instruction”, and “debates” were prevalent in Twitter posts regarding the Science of Reading. These words suggest that the tweets focused mostly on students, who are the intended target group.
From the sentiment analysis, the findings indicate that the overall sentiment toward the Science of Reading is positive. The share of matched sentiment words that were positive was 70.7% for AFINN, 60.6% for Bing, and 73.9% for NRC. Using three lexicons was important for enhancing the validity of the sentiment analysis. All three lexicons point in the same direction: AFINN and NRC yielded similar shares, while Bing was about 10 to 13 percentage points lower.
Given its scope, this analysis has a number of limitations. The major one is the number of observations used. Because of the limitations of the developer account, the only tweets that could be imported were from the past nine days. An analysis of this kind requires data collected over a longer period to make sure the corpus is sufficient for more accurate results. Secondly, the analysis was very general, especially in gleaning sentiment from the public. It would have been useful to classify the tweets according to who posted them, since classifying sentiment by position and role would have given the study further meaning. For instance, the opinion of teachers who are expected to undergo training to implement the legislation may differ from the sentiment of parents or policy makers who are already convinced of the positive results of the Science of Reading (Pondiscio, 2021). These limitations also point to opportunities for further research and ways this study could be improved.
Although the findings are limited, they can still provide foundational information for policy, research, and practice. The public sentiment on the Science of Reading can provide entry-level information for reformers, policy makers, educators, and researchers who are looking into the Science of Reading and the implementation of the legislation in the North Carolina education system. The keywords can indicate where the discussions on social networks are headed and which issues are the primary concern.
Krumm, A., Means, B., & Bienkowski, M. (2018). Learning analytics goes to school: A collaborative approach to improving education. Routledge.
Snowling, M. J., & Hulme, C. E. (2005). The science of reading: A handbook. Blackwell Publishing.