We started by collecting #Edchat tweets posted between February 1 and May 31, 2020. Tweets from March 1 through May 31 had already been collected and cleaned, so only the February 2020 tweets remained to be gathered. We collected these with two additional TAGS trackers:
1/28/20 - 2/20/20 (65,872 tweets)
2/20/20 - 4/17/20 (182,555 tweets)
We remove duplicate tweets and then get additional tweet metadata with lookup_many_tweets() from the {tidytags} package.
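As a minimal sketch of the de-duplication step, assuming each tracker export is a data frame with a tweet ID column (here called `id_str`, a hypothetical name), the exports can be stacked and reduced to one row per tweet. The toy data are made up, and the metadata lookup with lookup_many_tweets() is left out because it needs Twitter API credentials. Base R is used only to keep the sketch dependency-free.

```r
# Toy stand-ins for the two TAGS tracker exports; all values are made up.
tracker_1 <- data.frame(
  id_str = c("101", "102", "103"),
  text   = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
tracker_2 <- data.frame(
  id_str = c("103", "104"),   # "103" overlaps with tracker_1
  text   = c("c", "d"),
  stringsAsFactors = FALSE
)

# Stack the exports, then keep the first row seen for each tweet ID.
all_tweets    <- rbind(tracker_1, tracker_2)
unique_tweets <- all_tweets[!duplicated(all_tweets$id_str), ]

nrow(unique_tweets)  # 4 unique tweets out of 5 collected rows
```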
This leaves us with 338523 tweets and 100 associated variables for each tweet.
After restricting the data to a precisely defined window (start 2020-02-01, end 2020-05-31 23:59:59), 331538 tweets remain.
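The date restriction can be sketched as an inclusive filter on each tweet's timestamp. The column name `created_at` and the use of UTC are assumptions made for this illustration; the time zone actually applied to the boundaries is not stated here.

```r
# Toy tweets straddling both boundaries; timestamps are made up.
tweets <- data.frame(
  id_str = c("1", "2", "3", "4"),
  created_at = as.POSIXct(
    c("2020-01-31 23:59:59", "2020-02-01 00:00:00",
      "2020-05-31 23:59:59", "2020-06-01 00:00:00"),
    tz = "UTC"  # assumed time zone
  ),
  stringsAsFactors = FALSE
)

window_start <- as.POSIXct("2020-02-01 00:00:00", tz = "UTC")
window_end   <- as.POSIXct("2020-05-31 23:59:59", tz = "UTC")

# Keep tweets at or after the start and at or before the end.
in_window <- tweets[tweets$created_at >= window_start &
                      tweets$created_at <= window_end, ]

in_window$id_str  # "2" "3"
```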
Of these 331538 total tweets, 118163 are original tweets (i.e., not retweets), or 35.64% of #Edchat tweets between February 1 and May 31, 2020. These original tweets came from 12781 different users.
Of the 118163 original tweets, 16787 (14.21%) contained a question.
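The two percentages above can be checked directly, and the filtering behind them can be sketched on toy data. The column name `is_retweet` and the rule "a tweet contains a question if its text includes a question mark" are assumptions for illustration; the rule actually used to detect questions is not stated here.

```r
# Toy tweets; text and retweet flags are made up.
tweets <- data.frame(
  text = c("What tools work for remote teaching? #Edchat",
           "RT @someone: Great thread #Edchat",
           "Sharing our lesson plan #Edchat",
           "RT @someone: Any tips? #Edchat"),
  is_retweet = c(FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Original tweets are those that are not retweets.
originals <- tweets[!tweets$is_retweet, ]

# Assumed rule: flag a question when the text contains a literal "?".
originals$has_question <- grepl("?", originals$text, fixed = TRUE)

# Percentage helper, rounded to two decimals as in the text.
pct <- function(part, whole) round(100 * part / whole, 2)

pct(118163, 331538)  # 35.64 (original tweets among all tweets)
pct(16787, 118163)   # 14.21 (questions among original tweets)
```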
We used the {mapsapi} package to query the Google Maps API and geocode the locations that #Edchat tweeters listed in their profiles. The following table shows the top 10 countries by number of distinct locations identified.
##        country    n
## 1          USA 2821
## 2       Canada  363
## 3           UK  360
## 4    Australia  119
## 5        India   73
## 6      Ireland   47
## 7  New Zealand   37
## 8      Germany   26
## 9        Spain   26
## 10 Philippines   23
From these locations, we inferred each tweeter's country from the location they self-reported. The following table shows the number of #Edchat tweets by tweeter country.
##        country     n
## 1          USA 73859
## 2           UK  8111
## 3       Canada  6220
## 4    Australia  2432
## 5        Spain  1194
## 6        India   981
## 7  New Zealand   595
## 8      Ireland   577
## 9        China   390
## 10    Colombia   300
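The difference between the two tables (distinct locations per country versus tweets per country) can be sketched as follows. The lookup table `geocoded` stands in for the geocoding results and is made up; the column names are hypothetical, and no call to the Google Maps API is made here.

```r
# Hypothetical geocoding results: one row per distinct profile location.
geocoded <- data.frame(
  location = c("Chicago, IL", "London", "Toronto"),
  country  = c("USA", "UK", "Canada"),
  stringsAsFactors = FALSE
)

# Toy tweets, each carrying the tweeter's profile location.
tweets <- data.frame(
  id_str   = as.character(1:6),
  location = c("Chicago, IL", "London", "Chicago, IL",
               "Toronto", "London", "Chicago, IL"),
  stringsAsFactors = FALSE
)

# First table: number of distinct locations per country.
locations_by_country <- table(geocoded$country)

# Second table: attach a country to every tweet, then count tweets.
tweets_geo <- merge(tweets, geocoded, by = "location")
tweets_by_country <- as.data.frame(
  table(country = tweets_geo$country),
  responseName = "n",
  stringsAsFactors = FALSE
)
tweets_by_country <- tweets_by_country[order(-tweets_by_country$n), ]

tweets_by_country
```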
We found that there were 8111 original tweets from UK tweeters and 73859 original tweets from US tweeters during COVID-19, February 1-May 31, 2020.
To compare the two groups, we plotted the relative frequency of words tweeted by UK users against that of US users.
First, here is a list of 20 words that occur with very similar frequencies in tweets from UK and US users, according to the calculated log odds ratio for each word.
## # A tibble: 20 x 4
##    word                    USA         UK   logratio
##    <chr>                 <dbl>      <dbl>      <dbl>
##  1 strategy           0.000251   0.000251   0.000694
##  2 #mypchat          0.0000837  0.0000838   0.000694
##  3 cuts              0.0000837  0.0000838   0.000694
##  4 incredibly        0.0000837  0.0000838   0.000694
##  5 priorities        0.0000837  0.0000838   0.000694
##  6 setting            0.000228   0.000227  -0.000787
##  7 question           0.000682   0.000682   0.000941
##  8 #wellness         0.0000719  0.0000718   -0.00165
##  9 encourage          0.000276   0.000275   -0.00236
## 10 development        0.000540   0.000539   -0.00274
## 11 steam              0.000107   0.000108    0.00383
## 12 variety            0.000107   0.000108    0.00383
## 13 level              0.000465   0.000467    0.00467
## 14 #criticalthinking 0.0000601  0.0000598   -0.00492
## 15 tutorials         0.0000601  0.0000598   -0.00492
## 16 7pm                0.000119   0.000120    0.00493
## 17 event              0.000369   0.000371    0.00525
## 18 examples           0.000169   0.000168   -0.00632
## 19 message            0.000297   0.000299    0.00691
## 20 comment            0.000108   0.000108   -0.00710
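A log ratio of this kind can be sketched as follows. The add-one smoothing and the direction of the ratio, log(UK / USA), are assumptions; the direction is at least consistent with the signs in the table above, where words slightly more frequent in UK tweets have positive values. The counts are made up.

```r
# Toy word counts per country; all numbers are made up.
word_counts <- data.frame(
  word = c("lesson", "pupils", "students", "strategy"),
  USA  = c(40, 1, 90, 10),
  UK   = c(8, 12, 6, 2),
  stringsAsFactors = FALSE
)

# Assumed smoothing: add 1 to each count and to the column total, so that
# a word absent from one group does not produce log(0).
smooth_freq <- function(n) (n + 1) / (sum(n) + 1)

word_counts$USA_freq <- smooth_freq(word_counts$USA)
word_counts$UK_freq  <- smooth_freq(word_counts$UK)

# Positive values: relatively more frequent in UK tweets.
# Negative values: relatively more frequent in US tweets.
word_counts$logratio <- log(word_counts$UK_freq / word_counts$USA_freq)

# Words used most similarly have log ratios closest to zero;
# words used most differently have the largest absolute log ratios.
most_similar   <- word_counts[order(abs(word_counts$logratio)), ]
most_disparate <- word_counts[order(-abs(word_counts$logratio)), ]

most_similar$word[1]    # "lesson"
most_disparate$word[1]  # "pupils"
```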
Second, here is a visualization of 20 words that occur with very disparate frequencies in tweets from UK and US users, according to the calculated log odds ratio for each word.
We are comparing many slopes here and some of them are not statistically significant, so let’s apply an adjustment to the p-values for multiple comparisons.
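As a small illustration of such an adjustment, base R's p.adjust() can be applied to a vector of p-values. The p-values and word names below are made up, and the Holm method (the default of p.adjust()) is an assumption, since the adjustment method is not named here.

```r
# Made-up raw p-values for five hypothetical word slopes.
p_raw <- c(remote = 0.001, online = 0.01, zoom = 0.02,
           lesson = 0.04, school = 0.2)

# Holm adjustment is the default method of p.adjust().
p_adj <- p.adjust(p_raw)

p_adj
# remote online   zoom lesson school
#  0.005  0.040  0.060  0.080  0.200

# Four raw p-values fall below 0.05, but only two survive the adjustment.
sum(p_raw < 0.05)  # 4
sum(p_adj < 0.05)  # 2
```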
Now let’s find the most important slopes. Which words have changed in frequency at a moderately significant level in our tweets?
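One way to obtain and rank such slopes is sketched below, assuming each slope comes from a per-word binomial regression of word counts on time. That modeling choice, the 0.05 threshold, and all column names are assumptions; the weekly counts are made up.

```r
# Made-up weekly counts: `n` uses of a word out of `total` words that week.
counts <- data.frame(
  word  = rep(c("remote", "lesson"), each = 6),
  week  = rep(1:6, times = 2),
  n     = c(5, 9, 14, 22, 30, 41,
            20, 21, 19, 20, 22, 20),
  total = rep(1000, 12),
  stringsAsFactors = FALSE
)

# Fit one binomial GLM per word and return the slope on `week`.
fit_slope <- function(d) {
  m <- glm(cbind(n, total - n) ~ week, data = d, family = binomial)
  coefs <- summary(m)$coefficients
  data.frame(
    word    = d$word[1],
    slope   = coefs["week", "Estimate"],
    p.value = coefs["week", "Pr(>|z|)"],
    stringsAsFactors = FALSE
  )
}

slopes <- do.call(rbind, lapply(split(counts, counts$word), fit_slope))

# Adjust for multiple comparisons, then keep slopes below the assumed
# threshold, smallest adjusted p-value first.
slopes$adjusted.p.value <- p.adjust(slopes$p.value)
top_slopes <- slopes[slopes$adjusted.p.value < 0.05, ]
top_slopes <- top_slopes[order(top_slopes$adjusted.p.value), ]

top_slopes
```

In this toy example only "remote", whose share rises steadily across the six weeks, is retained; "lesson" stays flat and is filtered out.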
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.0 (2020-04-24)
## os macOS Mojave 10.14.6
## system x86_64, darwin17.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2020-07-30
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## anytime * 0.3.7 2020-01-20 [1] CRAN (R 4.0.0)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
## backports 1.1.7 2020-05-13 [1] CRAN (R 4.0.0)
## bitops 1.0-6 2013-08-17 [1] CRAN (R 4.0.0)
## broom * 0.5.6 2020-04-20 [1] CRAN (R 4.0.0)
## callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0)
## caTools 1.18.0 2020-01-17 [1] CRAN (R 4.0.0)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.0.0)
## class 7.3-17 2020-04-26 [1] CRAN (R 4.0.0)
## classInt 0.4-3 2020-04-07 [1] CRAN (R 4.0.0)
## cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
## cluster 2.1.0 2019-06-19 [1] CRAN (R 4.0.0)
## codetools 0.2-16 2018-12-24 [1] CRAN (R 4.0.0)
## colorspace 1.4-1 2019-03-18 [1] CRAN (R 4.0.0)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
## data.table 1.12.8 2019-12-09 [1] CRAN (R 4.0.0)
## DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.0)
## dbplyr 1.4.3 2020-04-19 [1] CRAN (R 4.0.0)
## dendextend 1.13.4 2020-02-28 [1] CRAN (R 4.0.0)
## desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
## devtools 2.3.0 2020-04-10 [1] CRAN (R 4.0.0)
## digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
## dplyr * 1.0.0 2020-05-29 [1] CRAN (R 4.0.0)
## e1071 1.7-3 2019-11-26 [1] CRAN (R 4.0.0)
## ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
## farver 2.0.3 2020-01-16 [1] CRAN (R 4.0.0)
## fastmatch 1.1-0 2017-01-28 [1] CRAN (R 4.0.0)
## forcats * 0.5.0 2020-03-01 [1] CRAN (R 4.0.0)
## foreach 1.5.0 2020-03-30 [1] CRAN (R 4.0.0)
## fs 1.4.1 2020-04-04 [1] CRAN (R 4.0.0)
## gclus 1.3.2 2019-01-07 [1] CRAN (R 4.0.0)
## gdata 2.18.0 2017-06-06 [1] CRAN (R 4.0.0)
## generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0)
## ggplot2 * 3.3.0 2020-03-05 [1] CRAN (R 4.0.0)
## glue 1.4.1 2020-05-13 [1] CRAN (R 4.0.0)
## gplots 3.0.4 2020-07-05 [1] CRAN (R 4.0.0)
## gridExtra 2.3 2017-09-09 [1] CRAN (R 4.0.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0)
## gtools 3.8.2 2020-03-31 [1] CRAN (R 4.0.0)
## haven 2.2.0 2019-11-08 [1] CRAN (R 4.0.0)
## hms 0.5.3 2020-01-08 [1] CRAN (R 4.0.0)
## htmltools 0.4.0 2019-10-04 [1] CRAN (R 4.0.0)
## httr 1.4.1 2019-08-05 [1] CRAN (R 4.0.0)
## ISOcodes 2020.03.16 2020-03-16 [1] CRAN (R 4.0.0)
## iterators 1.0.12 2019-07-26 [1] CRAN (R 4.0.0)
## janeaustenr 0.1.5 2017-06-10 [1] CRAN (R 4.0.0)
## jsonlite 1.6.1 2020-02-02 [1] CRAN (R 4.0.0)
## KernSmooth 2.23-17 2020-04-26 [1] CRAN (R 4.0.0)
## knitr 1.28 2020-02-06 [1] CRAN (R 4.0.0)
## labeling 0.3 2014-08-23 [1] CRAN (R 4.0.0)
## lattice 0.20-41 2020-04-02 [1] CRAN (R 4.0.0)
## lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
## lubridate * 1.7.8 2020-04-06 [1] CRAN (R 4.0.0)
## magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0)
## mapsapi * 0.4.5 2020-04-14 [1] CRAN (R 4.0.0)
## MASS 7.3-51.6 2020-04-26 [1] CRAN (R 4.0.0)
## Matrix 1.2-18 2019-11-27 [1] CRAN (R 4.0.0)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
## mgcv 1.8-31 2019-11-09 [1] CRAN (R 4.0.0)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.0.0)
## modeltools 0.2-23 2020-03-05 [1] CRAN (R 4.0.1)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0)
## nlme 3.1-147 2020-04-13 [1] CRAN (R 4.0.0)
## NLP 0.2-0 2018-10-18 [1] CRAN (R 4.0.0)
## pillar 1.4.4 2020-05-05 [1] CRAN (R 4.0.0)
## pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 4.0.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
## pkgload 1.0.2 2018-10-29 [1] CRAN (R 4.0.0)
## plyr 1.8.6 2020-03-03 [1] CRAN (R 4.0.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
## processx 3.4.2 2020-02-09 [1] CRAN (R 4.0.0)
## ps 1.3.3 2020-05-08 [1] CRAN (R 4.0.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
## quanteda * 2.1.0 2020-07-05 [1] CRAN (R 4.0.1)
## R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
## RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 4.0.0)
## Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0)
## RcppParallel 5.0.1 2020-05-06 [1] CRAN (R 4.0.0)
## readr * 1.3.1 2018-12-21 [1] CRAN (R 4.0.0)
## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.0.0)
## registry 0.5-1 2019-03-05 [1] CRAN (R 4.0.0)
## remotes 2.1.1 2020-02-15 [1] CRAN (R 4.0.0)
## reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.0)
## reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.0.0)
## rlang 0.4.6 2020-05-02 [1] CRAN (R 4.0.0)
## rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.0.0)
## rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
## rstudioapi 0.11 2020-02-07 [1] CRAN (R 4.0.0)
## rvest 0.3.5 2019-11-08 [1] CRAN (R 4.0.0)
## scales * 1.1.1 2020-05-11 [1] CRAN (R 4.0.0)
## seriation * 1.2-8 2019-08-27 [1] CRAN (R 4.0.0)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
## sf 0.9-3 2020-05-04 [1] CRAN (R 4.0.0)
## slam 0.1-47 2019-12-21 [1] CRAN (R 4.0.0)
## SnowballC 0.7.0 2020-04-01 [1] CRAN (R 4.0.0)
## stopwords 2.0 2020-04-14 [1] CRAN (R 4.0.0)
## stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
## testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
## tibble * 3.0.1 2020-04-20 [1] CRAN (R 4.0.0)
## tidyr * 1.1.0 2020-05-20 [1] CRAN (R 4.0.0)
## tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
## tidytags * 0.1.0 2020-06-16 [1] Github (bretsw/tidytags@35f83cc)
## tidytext * 0.2.4 2020-04-17 [1] CRAN (R 4.0.0)
## tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 4.0.0)
## tm 0.7-7 2019-12-12 [1] CRAN (R 4.0.1)
## tokenizers 0.2.1 2018-03-29 [1] CRAN (R 4.0.0)
## topicmodels * 0.2-11 2020-04-19 [1] CRAN (R 4.0.1)
## TSP 1.1-10 2020-04-17 [1] CRAN (R 4.0.0)
## units 0.6-6 2020-03-16 [1] CRAN (R 4.0.0)
## usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.0)
## utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.0)
## vctrs 0.3.1 2020-06-05 [1] CRAN (R 4.0.0)
## viridis * 0.5.1 2018-03-29 [1] CRAN (R 4.0.0)
## viridisLite * 0.3.0 2018-02-01 [1] CRAN (R 4.0.0)
## withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
## xfun 0.14 2020-05-20 [1] CRAN (R 4.0.0)
## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.0)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library