This tutorial covers how to build and interpret a basic structural topic model using the “stm” package in R. To begin, you will need a dataframe containing a “stripped” version of the text you plan to analyze alongside any covariates to be used as metadata. You can find a tutorial covering how to prepare raw text data for analysis, including stripping the text, HERE.
For the tutorial, I will use data consisting of tweets containing at least one of 5 key words related to COVID-19; you can find the process for obtaining the data HERE, but note, due to recent changes to the Twitter API you now need a paid developer account to accomplish the tasks in the tutorial.
If your data are stored on your computer, load them into your workspace.
load("/User/ProjectDirectory/YourDataFrame.RData")
To use metadata variables as covariates in your structural topic model, you should start by constructing a text corpus and document-feature matrix (dfm). I like to begin by constructing a corpus and dfm with the “quanteda” package. Go ahead and load the “quanteda” and “stm” packages.
library(quanteda)
library(stm)
Make sure you call the stripped version of your text variable. Mine here is called “strippedT,” while the original variable is named “text.”
Big_Qcorp <- quanteda::corpus(BigDat$strippedT)
For the dfm, I begin by tokenizing the corpus I just created, which I called “Big_Qcorp”. I want to tokenize at the word level (as opposed to the sentence or collocation level) and retain the document variables. I also want to remove any symbols my initial stripping of the text may have missed. This is not common, but it does happen, so I include the “remove_symbols” argument as a last check when constructing my dfm. I use the quanteda “tokens” function, pass it the corpus, set “what” to “word,” and set “remove_symbols” and “include_docvars” to “TRUE” (or “T”).
Next I create the dfm with the quanteda “dfm()” function.
I can stem the tokenized words in the dfm to their root. This will remove any prefix/suffix. For example, stemming the word “disability” to the root “disab” will group uses of “disability,” “disabled,” “disable,” and “disabling,” which could be important for something like ensuring a disability-related topic accounts for all text discussing disability, whether it refers to the disability itself, the process of becoming disabled, or an identity as a disabled person.
I can remove stopwords, common words like “and,” “the,” “we,” etc. that do not provide important information on their own, with the “dfm_select()” function. This will help the structural topic model construct meaningful topics instead of a topic about “and the we” (doesn’t make much sense, right?).
Big_dfm <- tokens(Big_Qcorp,
                  what = "word",
                  remove_symbols = T,
                  include_docvars = T) %>%
  dfm() %>%
  dfm_wordstem() %>%
  dfm_select(pattern = stopwords("english"),
             selection = c("remove"),
             valuetype = c("fixed"))
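For intuition about what these steps produce, here is a toy base R sketch. It is not the quanteda implementation: the two example documents and the three-word stopword list are made up for illustration, and stemming is left out.

```r
# Toy illustration only: two invented documents and a hypothetical stopword list
docs <- c(d1 = "we protect the workers",
          d2 = "the pandemic and the workers")
toy_stopwords <- c("we", "the", "and")

# Tokenize at the word level by splitting on whitespace
tokens_list <- strsplit(tolower(docs), "\\s+")
names(tokens_list) <- names(docs)

# Drop the stopwords from every document
tokens_list <- lapply(tokens_list, function(tok) tok[!tok %in% toy_stopwords])

# Count each remaining word per document: rows = documents, columns = features
vocab <- sort(unique(unlist(tokens_list)))
toy_dfm <- t(sapply(tokens_list, function(tok) table(factor(tok, levels = vocab))))
toy_dfm
```

Each row is a document and each column a remaining word, which is the same shape of object the quanteda pipeline builds at a much larger scale.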
Now I want to identify the metadata associated with the dfm. I do this with the “meta” function. IMPORTANT: Notice that the function call goes on the left side of the assignment arrow in this case (usually an object name sits on the left and the function call on the right). Here the left side says which part of the object to fill, and the right side supplies the information to place there. In this case, I want to tell the computer that the variables in the “BigDat” dataframe contain the metadata for the “Big_dfm” document-feature matrix.
quanteda::meta(Big_dfm) <- BigDat
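If a function call on the left of the arrow looks odd, it may help to see how R handles this pattern in general. The sketch below defines a hypothetical replacement function, `note<-`, purely for illustration; it is not part of quanteda and says nothing about how quanteda stores metadata internally.

```r
# Hypothetical replacement function (illustration only, not a quanteda function)
`note<-` <- function(x, value) {
  attr(x, "note") <- value  # attach the right-hand side to the object
  x                         # a replacement function must return the modified object
}

scores <- c(1, 2, 3)

# Function call on the left, information on the right.
# R evaluates this as: scores <- `note<-`(scores, value = "toy metadata")
note(scores) <- "toy metadata"

attr(scores, "note")
```

The object keeps its original values; only the attached information changes.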
I can run the “stm” function for the structural topic model with the quanteda dfm directly, but this uses a lot of memory. The stm already uses A LOT of memory, so it’s always best to convert if possible. This is especially important if you have a dataframe with many documents (100,000+). I convert the dfm to stm format with the quanteda “convert” function.
BigDat_prepSTM <- convert(Big_dfm,
                          to = c("stm"),
                          docvars = BigDat)
The new “BigDat_prepSTM” object has three elements, “meta,” “vocab,” and “documents.” I can extract these and store them in their own objects prior to using the “prepDocuments” function. Here I will create three new objects for meta, vocab, and documents, attaching the “og” label to the end of the object name so I know it is from the original prep object.
Bigmeta_og <- BigDat_prepSTM$meta
Bigvocab_og <- BigDat_prepSTM$vocab
Bigdocs_og <- BigDat_prepSTM$documents
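It can help to know roughly what the “documents” element looks like. As far as I understand the stm documentation, it is a list with one two-row integer matrix per document: the first row holds positions in the vocabulary and the second row holds how often each of those words occurs. The toy objects below are invented to show that layout and are not taken from the data above.

```r
# Toy vocabulary and documents in the stm-style layout (invented values)
vocab_toy <- c("bank", "covid", "vaccin")

docs_toy <- list(
  doc1 = matrix(c(2L, 3L,    # row 1: vocabulary indices ("covid", "vaccin")
                  2L, 1L),   # row 2: counts for those words
                nrow = 2, byrow = TRUE),
  doc2 = matrix(c(1L,        # row 1: vocabulary index ("bank")
                  4L),       # row 2: count
                nrow = 2)
)

# Decode the first document back into words and a total token count
words_doc1  <- vocab_toy[docs_toy$doc1[1, ]]
tokens_doc1 <- sum(docs_toy$doc1[2, ])
words_doc1
tokens_doc1
```

The same indexing, for example `Bigvocab_og[Bigdocs_og[[1]][1, ]]`, should let you spot-check a real document, assuming the objects follow this layout.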
Next I use the stm “prepDocuments” function to create an object formatted correctly for structural topic modeling. Here I will name the new object “Big_outprep” so I know it is the output of the prepDocuments for the BigDat/Big_dfm.
Big_outprep <- prepDocuments(Bigdocs_og,
                             Bigvocab_og,
                             Bigmeta_og)
Removing 6965 of 11548 terms (6965 of 63765 tokens) due to frequency
Removing 5 Documents with No Words
Your corpus now has 4995 documents, 4583 terms and 56800 tokens.
If you have documents with no words, the prepDocuments function will eliminate the documents. In this case, I had 5 documents with no words, so they were removed. My new corpus object (Big_outprep) has 4995 documents, while the original BigDat dataframe contained 5000 documents (in this case, tweets).
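The metadata passed to prepDocuments is trimmed for you, but any other object that still has one row per original document will be out of step. The prepDocuments output also includes a “docs.removed” element with the positions of the dropped documents, relative to the documents that were passed in. The sketch below uses a made-up dataframe and a hypothetical `docs_removed` vector standing in for `Big_outprep$docs.removed`.

```r
# Toy stand-ins (invented): five documents, two of which were dropped
toy_meta <- data.frame(id = 1:5,
                       lang = c("en", "en", "fr", "en", "en"))
docs_removed <- c(2L, 5L)  # hypothetical stand-in for Big_outprep$docs.removed

# Guard against an empty index: x[-integer(0), ] would return zero rows
if (length(docs_removed) > 0) {
  toy_meta_kept <- toy_meta[-docs_removed, , drop = FALSE]
} else {
  toy_meta_kept <- toy_meta
}

nrow(toy_meta_kept)
```

The length check matters because negative indexing with an empty vector drops every row instead of none.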
Now I am ready to build my structural topic model using my new corpus object containing the prepped documents, vocabulary, and metadata.
There are some important considerations for the structural topic model:
Number of topics (K)
Initialization type
Max iterations
Prevalence covariates
Content covariates
The most difficult thing about the structural topic model is the lack of objective parameters for determining the above, especially the “K” number of topics.
You can make some arbitrary decisions for number of topics and begin running models, but this is not always the most efficient approach. When you have many documents and it is unclear how many topics might exist, you may find it useful to start with an exploratory run using the Lee and Mimno (2014) algorithm. The standard spectral initialization, the Arora et al. (2014) algorithm, works from the complete, “exact” word co-occurrence matrix and finds an approximate convex hull for the “K” number of topics you request. The Lee and Mimno approach does the reverse: it takes a random, partial (t-SNE) projection of the co-occurrence matrix and finds the exact convex hull of that approximate set of vectors, and the vertices of the hull give a potential “K” number of topics to begin with. To use the Lee and Mimno (2014) approach, you set the “init.type” argument to equal “Spectral” and “K” number of topics to equal “0”.
If you are using the Lee and Mimno (2014) algorithm, you already know you will use a “Spectral” initialization. The “Spectral” initialization with K greater than 0 will instead call its counterpart (and the stm default), the Arora et al. (2014) algorithm. That standard “Spectral” initialization is deterministic, and thus will produce the same results regardless of seed when the same model with the same number of K topics is run several times. The K = 0 version is not deterministic, because its projection step is random, so the number of topics and the results can differ between runs with different seeds.
Of the other available initialization types, “LDA” is useful for communicating your results to many audiences. It initializes the model with a few passes of collapsed Gibbs sampling, and it is useful because the Latent Dirichlet Allocation (LDA) model is a widely used (and therefore widely understood) topic model, and because others can reproduce the results by setting the seed.
The default maximum number of iterations for the stm is 500, meaning the model will terminate if it has not converged after 500 iterations. You can set this number to be higher or lower, or you can set it to 0 to return the initialization. I tend to keep this at the default unless it becomes clear that I will need many more iterations to reach convergence. If your model converges too quickly, it may be a poor fit.
If you already know that you have variables in your metadata whose effect on prevalence/content you are interested in, you can include those covariates from the outset. I like to run all of my structural topic models without covariates first, then include the covariates one at a time in later models (here following the traditional approach for statistical “controls”).
Remember, big memory jobs can make computers angry. To avoid losing your work, SAVE YOUR WORKSPACE before running any structural topic model. Your future self will thank you.
save.image("/User/ProjectDirectory/NameYourWorkspace.RData")
Let’s estimate a structural topic model with no covariates using the Lee and Mimno (2014) algorithm. Remember to store your stm results in a workspace object. I will call mine “BigSTM_0” to reflect my setting K to equal 0 and the initialization to equal spectral. You can observe the estimation (showing how many topics and words/topic) every few iterations by setting “verbose = TRUE” (or “T”). This will give a lot of output, so sometimes you might find it useful to set verbose to equal FALSE (or “F”). I’ll set it to TRUE (“T”) here to show what the output looks like between iterations.
BigSTM_0 <- stm(documents = Big_outprep$documents,
                vocab = Big_outprep$vocab,
                K = 0,
                data = Big_outprep$meta,
                init.type = c("Spectral"),
                verbose = T)
In this case, the model converged after 5 iterations, and the algorithm projected 79 topics.
Now I have to make a decision: I can run another 79-topic model but with the Arora et al. (2014) algorithm, I can run some diagnostics to decide whether to increase or decrease K number of topics in my next model, or I can run the Lee and Mimno (2014) algorithm again with prevalence/content covariates included.
I can plot the STM to get a better idea of what diagnostics mean in context of information from the model, like most likely words for a topic. I can also get this information with the “summary” function, but the output is not in plot format.
summary(BigSTM_0)
A topic model with 79 topics, 4995 documents and a 4584 word dictionary.
Topic 1 Top Words:
Highest Prob: rt, also, protect, covid, lift, say, im
FREX: lift, valu, fcretir, note, wouldnt, set, tilda
Lift: 114f, kiwuikiwi, vash, fcretir, zellekag, 9month, bankcollap
Score: 114f, western, tilda, swinton, lift, valu, record
Topic 2 Top Words:
Highest Prob: rt, billion, dure, pandem, requir, million, worker
FREX: 265, requir, worker, billion, maker, 250, canadian
Lift: 265, quicktak, cancerdrug, officersget, ohioteach, socia, princesskelli
Score: 265, billion, requir, dure, million, worker, cancerdrug
Topic 3 Top Words:
Highest Prob: march, miss, sign, 2023, rt, seen, worth
FREX: march, miss, sign, across, 2023, worth, 23
Lift: 300b, delilah, dy, hialeah, missi, missingkid, ampcov
Score: 300b, march, miss, sign, across, worth, 2023
Topic 4 Top Words:
Highest Prob: record, medic, level, rt, minut, move, mani
FREX: record, minut, level, seneg, villag, medic, struggl
Lift: 42c, seneg, 457c, pgdyne, villag, veteran, quiet
Score: 42c, record, minut, medic, level, seneg, struggl
Topic 5 Top Words:
Highest Prob: 10, potenti, bot, viral, sniper, tech, rt
FREX: sniper, viral, idk, potenti, bot, 70k, 10
Lift: 70k, hanzoyasunaga, liv, maestro, mc, sniper, idk
Score: 70k, bot, potenti, sniper, viral, idk, tech
Topic 6 Top Words:
Highest Prob: now, rt, give, pandem, without, parent, canada
FREX: alberta, give, without, parent, canada, minor, now
Lift: abdaniellesmith, httpstconn9ketzdez, makismd, vgclements1, alberta, cbcs, elkebabiuk
Score: abdaniellesmith, now, give, canada, without, alberta, parent
Topic 7 Top Words:
Highest Prob: start, us, d, might, protect, big, hour
FREX: might, d, start, mail, hour, hesit, xrp
Lift: ada, mail, httpstco1jdavzjepi, hesit, xrp, vitamin, algo
Score: ada, start, d, might, big, hour, us
Topic 8 Top Words:
Highest Prob: rt, thing, student, chang, get, pandem, system
FREX: applaus, vow, ethanol, oregon, thing, alexbruesewitz, realdonaldtrump
Lift: austin, applaus, vow, 368, billioncan, rdnstai, spe
Score: student, austin, thing, chang, oregon, thousand, mrandyngo
Topic 9 Top Words:
Highest Prob: peopl, mani, rt, pandem, million, begin, someth
FREX: jinxland, rting, bitch, 14000, wale, mani, crazi
Lift: bitch, jinxland, rting, 14000, wale, 10yearold, libbi
Score: bitch, mani, jinxland, rting, peopl, crazi, begin
Topic 10 Top Words:
Highest Prob: risk, can, rt, manag, event, lot, lose
FREX: readi, reward, volatil, 查, debt, manag, som
Lift: brows, baat, kg, mann, modi, thewirein, pharmacogenom
Score: brows, risk, tast, manag, can, readi, reward
Topic 11 Top Words:
Highest Prob: educ, cost, protect, econom, awar, risk, alway
FREX: educ, brianbrenberg, sub, econom, cost, compound, assministr
Lift: burden, brianbrenberg, spiroghost, breakfast, michaelmact, bord, cbp
Score: burden, educ, cost, econom, el, sub, brianbrenberg
Topic 12 Top Words:
Highest Prob: risk, rt, way, bank, investor, took, know
FREX: investor, johnnyxbrown, risk, heart, fat, way, mo
Lift: citizenlenz, genflynn, httpstcooomse2vbjx, peril, johnnyxbrown, httpstcoynb67uplr, stevenvoiceov
Score: risk, citizenlenz, investor, epidemiolog, johnnyxbrown, belli, mo
Topic 13 Top Words:
Highest Prob: person, famili, pandem, rt, probabl, opinion, great
FREX: person, famili, st, probabl, opinion, competitor, contrari
Lift: competitor, systemupd, contrari, resea, st, famili, person
Score: competitor, person, famili, probabl, opinion, st, great
Topic 14 Top Words:
Highest Prob: rt, join, amp, seek, risk, crypto, heighten
FREX: hmm, justice4tigray, tigray, join, genocid, heighten, seek
Lift: barrett, conjunct, ctv, insight, lisa, hmm, justice4tigray
Score: conjunct, join, dynam, seek, hmm, justice4tigray, tigray
Topic 15 Top Words:
Highest Prob: fauci, said, rt, call, irrespons, went, amp
FREX: irrespons, cook, karilak, arrest, cnn, fauci, said
Lift: cook, acosta, lollipop, perjuri, queri, michaelrulli, daddi
Score: cook, fauci, irrespons, said, karilak, cnn, call
Topic 16 Top Words:
Highest Prob: explain, rt, corybook, can, 2nd, pleas, sarscov2
FREX: explain, corybook, 2nd, sarscov2, can, pleas, individu
Lift: corybook, explain, supervisor, 2nd, sarscov2, individu, vulner
Score: corybook, explain, 2nd, can, sarscov2, dear, pleas
Topic 17 Top Words:
Highest Prob: rt, virus, dr, exist, infect, gone, parti
FREX: 75, bola, elluu, obi, officialtelz, pastor, gone
Lift: cowan, httpstcoscnez01aus, mariusknulst, therefor, desir, elhopkin, 75
Score: gone, cowan, dr, exist, virus, 75, bola
Topic 18 Top Words:
Highest Prob: dure, pandem, billion, rt, vs, staff, see
FREX: staff, 20192021, loblaw, net, underpay, richer, billion
Lift: cuban, 20192021, loblaw, net, underpay, dyk, genderbas
Score: cuban, billion, dure, vs, richer, pandem, 43
Topic 19 Top Words:
Highest Prob: state, rt, unit, 2022, start, expert, covid
FREX: unit, 2022, state, apr, highrisk, legislatur, decis
Lift: 1745549, ccpvirus, cumul, decemb, excessdeath, germa, mostransl
Score: cumul, state, unit, 2022, music, apr, highrisk
Topic 20 Top Words:
Highest Prob: protect, someon, love, rt, stay, inform, hide
FREX: decept, dishonest, givi, theholisticpsyc, someon, love, hide
Lift: decept, dishonest, givi, theholisticpsyc, happyharpys, chyna, paulbenedict7
Score: decept, someon, love, dishonest, givi, theholisticpsyc, inform
Topic 21 Top Words:
Highest Prob: covid19, vaccin, pandem, individu, energi, test, mass
FREX: energi, httpstcowe74vshlw, covid19, murder, montana, blood, anniversari
Lift: diagnos, ultimahora, 827js, encod, nucleocapsid, httpstcovggd3vl5lg, 0213
Score: diagnos, covid19, energi, httpstcowe74vshlw, vaccin, moderna, montana
Topic 22 Top Words:
Highest Prob: trump, pandem, creat, respons, end, follow, crisi
FREX: team, follow, obamabiden, disband, creat, rail, trump
Lift: disband, cjjohnson17th, rail, team, cre, httpstcomc8d2rsqeu, obamabiden
Score: disband, trump, rail, noliewithbtc, obamabiden, team, crisi
Topic 23 Top Words:
Highest Prob: now, near, rt, handl, go, ive, pandem
FREX: near, handl, gain, tashi343i, touch, coronavirus, acquir
Lift: dot, theatlant, acquir, medrop, notif, retweetfollow, facebookdown
Score: dot, near, youarelobbylud, handl, gain, coronavirus, now
Topic 24 Top Words:
Highest Prob: day, ani, rt, us, next, safe, trust
FREX: harbor, stackhodl, ani, trust, mythic, sashamackinnon, unmint
Lift: drag, harbor, stackhodl, mythic, sashamackinnon, unmint, whitelist
Score: drag, id, ani, day, child, trust, god
Topic 25 Top Words:
Highest Prob: money, govern, rt, everyth, run, bank, risk
FREX: money, unlimit, calvari, proxi, intend, poor, run
Lift: electricalwsop, riskavers, te, calvari, petti, sacrifi, peterstefanovi2
Score: electricalwsop, money, everyth, govern, run, drunk, poor
Topic 26 Top Words:
Highest Prob: will, everi, right, rt, friend, men, tri
FREX: httpstcoy9nzbiqlkc, movetheworldus, bless, smile, citizenfreepr, jackmal, pad
Lift: elimin, useounasscodenamshicouponnoondiscountsivvipromo, farmland, grown, lat, ontariofarm, pave
Score: will, elimin, httpstcoy9nzbiqlkc, movetheworldus, smile, everi, friend
Topic 27 Top Words:
Highest Prob: bailout, rt, svb, weve, far, seen, blue
FREX: bailout, dcdraino, blue, weve, far, disguis, looser
Lift: fai, looser, dcdraino, esg, quantit, skillset, climatetech
Score: fai, bailout, dcdraino, disguis, blue, relief, weve
Topic 28 Top Words:
Highest Prob: let, rt, c, protect, like, build, economi
FREX: let, build, disappear, c, twitterhol, economi, piec
Lift: ash, forb, independ, ingramethoma, kate, regan, clas
Score: forb, let, c, economi, build, twitterhol, disappear
Topic 29 Top Words:
Highest Prob: stop, rt, nation, covid, 1, seem, idea
FREX: nation, excus, politician, stop, pezntjournalist, intellig, seem
Lift: frankdescushin, fleec, monkey, mirror, rearview, reced, viol
Score: frankdescushin, stop, nation, pezntjournalist, politician, seem, intellig
Topic 30 Top Words:
Highest Prob: t, help, rate, fed, rt, find, fight
FREX: fed, help, rate, religion, t, butat, cortesstev
Lift: gilt, butat, cortesstev, risemelbourn, langley, ucc, ukrcancongress
Score: gilt, rate, fed, help, t, find, butat
Topic 31 Top Words:
Highest Prob: sinc, 2, children, rt, p, real, gate
FREX: iluminatibot, abhor, revel, steviemat, devast, sinc, stupid
Lift: greed, iluminatibot, httpstco2smkyugb9n, ndp, fraudul, orchestr, abhor
Score: greed, iluminatibot, sinc, stupid, p, 2, abhor
Topic 32 Top Words:
Highest Prob: know, rt, america, lie, annaapp91838450, china, corrupt
FREX: annaapp91838450, badass, gat, omgpatriot, america, hold, corrupt
Lift: gwenmommabear, kevin, unnot, 3year, dozen, lovenegan, badass
Score: gwenmommabear, know, annaapp91838450, america, badass, gat, omgpatriot
Topic 33 Top Words:
Highest Prob: covid, time, rt, vaccin, dure, wasnt, good
FREX: whistleblow, graphen, oxid, er, httpstco4l2wigdt6, time, wasnt
Lift: httpstco5lij7dslv9, er, httpstco4l2wigdt6, httpstcoemiv7upaig, mbrookerhk, remad, graphen
Score: covid, httpstco5lij7dslv9, time, vaccin, whistleblow, graphen, oxid
Topic 34 Top Words:
Highest Prob: year, rt, 3, pandem, just, dead, pfizer
FREX: 101, reanalysi, unexpect, huy, quan, drloupi, httpstcocnowjseptt
Lift: httpstcocnowjseptt, fastest, pac, huy, quan, wwe, biontain
Score: httpstcocnowjseptt, 3, dead, year, pfizer, drloupi, unexpect
Topic 35 Top Words:
Highest Prob: new, rt, us, can, last, visit, confirm
FREX: dc, default, new, confirm, visit, dod, christma
Lift: httpstcoep68bjb4bm, 35842, 3716045, dc, categori, classifi, viriformsa
Score: new, httpstcoep68bjb4bm, dc, christma, confirm, dod, visit
Topic 36 Top Words:
Highest Prob: death, rt, name, risk, vaccin, gun, dog
FREX: nycacc, ep, govkathyhochul, httpstconetodjsjvz, venetianblond, death, dog
Lift: ep, govkathyhochul, httpstconetodjsjvz, venetianblond, 13m, 330k, denisrancourt
Score: httpstconetodjsjvz, death, name, dog, nycacc, calcul, 13
Topic 37 Top Words:
Highest Prob: virus, got, doe, immun, effect, releas, like
FREX: got, almost, releas, effect, immun, doe, quit
Lift: httpstcorc7tqx2wlg, delthiarick, genom, httpstco1d6pzpy91d, wifeyalpha, proport, enemyinast
Score: httpstcorc7tqx2wlg, got, virus, doe, effect, immun, releas
Topic 38 Top Words:
Highest Prob: tell, yet, rt, havent, heavi, strong, rain
FREX: heavi, 48, atmospher, rain, usstormwatch, feet, yet
Lift: httpstco1ymrfhrnhr, iconoclast1919, johniadarola, uhh, willeckert99, 48, atmospher
Score: iconoclast1919, tell, coffe, httpstcoieid4tgr3c, nywolforg, yet, havent
Topic 39 Top Words:
Highest Prob: rt, bank, action, protect, b, silicon, valley
FREX: action, b, wealth, joshuaphill, millionair, leap, angri
Lift: anastas25608217, illiquid, mitziforpelosi, ryanafourni, stevenjfrisch, 5bn, bitcoinmagazin
Score: illiquid, action, joshuaphill, millionair, billionair, wealth, leap
Topic 40 Top Words:
Highest Prob: make, rt, public, health, actual, fit, takethatct
FREX: make, takethatct, public, kentlee47, fit, meltvirus, health
Lift: kentlee47, meltvirus, takethatct, ghopp, lummhandi, maureenstroud, make
Score: kentlee47, make, public, takethatct, rt, health, ghopp
Topic 41 Top Words:
Highest Prob: get, back, job, noth, shit, rt, work
FREX: augustjpollak, back, shit, noth, job, get, greedi
Lift: kfish2691, augustjpollak, newborn, thinkpiec, airbnb, itjustmeez, betray
Score: kfish2691, get, back, job, augustjpollak, shit, greedi
Topic 42 Top Words:
Highest Prob: rt, maggiehassan, whitehous, usun, senatormenendez, sfrcdem, senateforeign
FREX: maggiehassan, usun, senatormenendez, sfrcdem, whitehous, senateforeign, helentekulu
Lift: maggiehassan, senatormenendez, sfrcdem, usun, senateforeign, whitehous, tigrayt21
Score: maggiehassan, usun, senatormenendez, sfrcdem, senateforeign, whitehous, dear
Topic 43 Top Words:
Highest Prob: didnt, thank, place, remind, rt, zero, plan
FREX: didnt, remind, place, thank, alterivan1, cholera, stakehold
Lift: matam, alterivan1, cholera, stakehold, progress, elainecarol3, danitasteinberg
Score: matam, didnt, place, thank, remind, zero, alterivan1
Topic 44 Top Words:
Highest Prob: rt, protect, pandem, mcfunni, tenebra99, thisisnothappen, turn
FREX: mcfunni, rt, tenebra99, thisisnothappen, protect, turn, negat
Lift: mcfunni, tenebra99, thisisnothappen, negat, rt, turn, brought
Score: mcfunni, rt, tenebra99, thisisnothappen, protect, pandem, negat
Topic 45 Top Words:
Highest Prob: im, rt, ill, sure, hospit, whi, busi
FREX: im, censored4sur, sure, amitrippedcat, alobarkahn, danielbaronn, baffl
Lift: method, cap, cervic, depoprovera, diaphragm, emilylhaus, hormon
Score: method, im, censored4sur, ill, amitrippedcat, sure, noteveri
Topic 46 Top Words:
Highest Prob: man, rt, made, work, protect, wall, bought
FREX: wall, bought, carcinogen, injecti, virul, naomirwolf, man
Lift: monterey, academi, carcinogen, injecti, virul, abfedlabour, albertan
Score: monterey, man, partner, wall, bought, carcinogen, injecti
Topic 47 Top Words:
Highest Prob: veri, black, women, peopl, white, covid, number
FREX: white, black, veri, women, blame, ne, jimmydor
Lift: ne, cbergie1007, flvoicenew, notkus, preexist, male, oppress
Score: ne, black, white, veri, women, blame, jimmydor
Topic 48 Top Words:
Highest Prob: elonmusk, thechiefnerd, rt, mrna, world, covid, product
FREX: thechiefnerd, elonmusk, product, mrna, ai, conserv, cancer
Lift: antibodi, enzolyt, monoclon, onlytrippi, chicago1ray, delhi, ea
Score: onlytrippi, thechiefnerd, elonmusk, mrna, product, spartajustic, cancer
Topic 49 Top Words:
Highest Prob: onli, fdic, happen, rt, insur, protect, u
FREX: fdic, repthomasmassi, happen, premium, paid, u, patrickbetdavid
Lift: 125b, 126, patrickbetdavid, repthomasmassi, 250000, fearmong, selectiv
Score: patrickbetdavid, fdic, insur, repthomasmassi, onli, premium, depositor
Topic 50 Top Words:
Highest Prob: think, read, serious, interest, covid, pandem, e
FREX: think, paragraph, httpstco7eepdml4yd, prisonplanet, e, read, 55000000
Lift: piss, httpstco7eepdml4yd, prisonplanet, morgfair, antivaxx, chud, httpstcobunzmay8um
Score: piss, think, httpstco7eepdml4yd, prisonplanet, late, serious, read
Topic 51 Top Words:
Highest Prob: go, 2, month, print, folk, short, oh
FREX: print, go, folk, month, scoop, basic, ship
Lift: prong, aleresnik, transitori, ninobox, scoop, fau, 40
Score: prong, go, oh, print, ninobox, folk, scoop
Topic 52 Top Words:
Highest Prob: virus, use, rt, whi, spread, covid19, befor
FREX: spread, martin, david, hemorrhag, use, cdcthe, oper
Lift: puneindia, npis, retrospect, transm, hemorrhag, advent, jonddo
Score: puneindia, spread, virus, use, cdcthe, corona, factcheck
Topic 53 Top Words:
Highest Prob: xi, jinp, ccp, rt, weapon, enemi, earth
FREX: jinp, enemi, xi, ccp, weapon, dictat, humankind
Lift: purplerose19999, enemi, dictat, humankind, jinp, xi, weapon
Score: purplerose19999, jinp, xi, enemi, weapon, ccp, dictat
Topic 54 Top Words:
Highest Prob: one, even, rt, becaus, will, via, mean
FREX: closer, totalitarian, tweetiez, jesus, one, via, rest
Lift: rebound, cbsnew, httpstcodturqddxwc, 35mph, nationwid, reddit, rubber
Score: one, rebound, even, via, youtub, closer, totalitarian
Topic 55 Top Words:
Highest Prob: know, everyon, part, tri, just, account, rt
FREX: part, everyon, account, digit, currenc, catturd2, tri
Lift: rosewind2007, httpstcowh7lvakmsr, everyt, marcmeetsworld1, catturd2, currenc, digit
Score: rosewind2007, know, currenc, catturd2, digit, account, scare
Topic 56 Top Words:
Highest Prob: work, lockdown, protect, covid, meant, shift, pandem
FREX: sccounti, soft, shift, meant, remot, work, lockdown
Lift: sccounti, soft, remot, dwr, parjaro, usacehq, core
Score: sccounti, work, lockdown, meant, dwr, parjaro, usacehq
Topic 57 Top Words:
Highest Prob: secur, s, social, rt, abl, american, senatedem
FREX: senatedem, generat, retir, abl, social, secur, s
Lift: featur, artin, erickson, hysteria, massihi, realpaulmuse, whistl
Score: senatedem, secur, s, abl, social, generat, retir
Topic 58 Top Words:
Highest Prob: rt, senatorshaheen, isnt, issu, wrong, senat, correct
FREX: senatorshaheen, issu, isnt, correct, senat, wrong, dear
Lift: senatorshaheen, correct, supervisor, isnt, issu, senat, dear
Score: senatorshaheen, isnt, senat, issu, dear, correct, wrong
Topic 59 Top Words:
Highest Prob: rt, two, human, rememb, commit, import, year
FREX: often, human, two, outsourc, solomonmissouri, peac, posit
Lift: sensass, senato, senatorf, secblinken, eve, outsourc, solomonmissouri
Score: sensass, two, commit, atroc, often, crime, peac
Topic 60 Top Words:
Highest Prob: dont, mask, anyon, rt, protect, pleas, allow
FREX: ausbassi, environ, tonightpray, anyon, dont, abil, yo
Lift: socialistlyakwd, abejmorri, httpstco6lhq2m2apj, propagan, rainnwilson, taylorvizionn, ausbassi
Score: socialistlyakwd, dont, mask, ausbassi, environ, tonightpray, anyon
Topic 61 Top Words:
Highest Prob: peopl, still, cant, around, just, rt, believ
FREX: around, cant, peopl, soulless, believ, still, 1lonerlifestyl
Lift: soulless, dowellml, 1lonerlifestyl, essenc, tryna, utter, ge
Score: soulless, peopl, cant, around, still, believ, 1lonerlifestyl
Topic 62 Top Words:
Highest Prob: offic, rt, fact, continu, program, us, senior
FREX: offic, blind, program, strangerf, senior, continu, fact
Lift: strangerf, cor, blind, foundat, melinda, emeraldrobinson, offic
Score: strangerf, offic, continu, fact, program, senior, blind
Topic 63 Top Words:
Highest Prob: oh, pro, though, pull, look, whi, refus
FREX: exec, pro, though, matthewstol, oh, riskmanag, paul
Lift: suella, derail, proun, tomthunkitsmind, 4190, rizzaislam, ding
Score: suella, oh, pro, though, pull, beg, exec
Topic 64 Top Words:
Highest Prob: great, articl, rt, yeah, twitter, mental, risk
FREX: twitter, great, co2, mental, swipe, aukus, chebianquita
Lift: swipe, plenti, chebianquita, dakeldsen, co2, twitter, submarin
Score: swipe, great, twitter, co2, aukus, chebianquita, dakeldsen
Topic 65 Top Words:
Highest Prob: bank, rt, valley, silicon, rule, trump, media
FREX: ovat, rule, legaci, penni, media, roll, collaps
Lift: tarabull808, insanit, kelli, ovat, bide, lefti, sleight
Score: bank, tarabull808, valley, silicon, rule, roll, donald
Topic 66 Top Words:
Highest Prob: well, rt, pandem, global, 1, im, coupl
FREX: choir, preach, sho, well, brownecfm, coupl, workforc
Lift: taylor, breonna, senduckworth, sworn, woken, choir, httpstcot
Score: taylor, well, choir, preach, sho, coupl, workforc
Topic 67 Top Words:
Highest Prob: alway, protect, talk, hes, anim, constitut, beauti
FREX: alway, beauti, pain, constitut, anim, saw, talk
Lift: thetoddhirsch, dome, midnightmuseumep3, gabi, thebachelor, lekeolushuyi, sili
Score: thetoddhirsch, alway, dome, midnightmuseumep3, talk, anim, protect
Topic 68 Top Words:
Highest Prob: virus, come, rt, china, lab, side, wrong
FREX: 300k, chrosthugo, microfluid, neuron, uniqu, side, come
Lift: time4divorc, 300k, chrosthugo, microfluid, neuron, uniqu, housefli
Score: time4divorc, virus, come, side, wendyor, httpstcoeurywyez3, monkeyking67
Topic 69 Top Words:
Highest Prob: want, rt, protect, kill, die, sever, matter
FREX: justiceforeden, punishedmoth, press, younger, want, sever, voldemorgoret
Lift: quo, toman, justiceforeden, punishedmoth, voldemorgoret, cutest, hyung
Score: want, toman, justiceforeden, punishedmoth, abov, press, stori
Topic 70 Top Words:
Highest Prob: o, reason, rt, expect, univers, covid, becaus
FREX: expect, reason, o, griffith, univers, weak, finger
Lift: chrisjmerch, georgefreemanmp, httpstcowwdgnysdyx, nceoscienc, nercscienc, uniofleicest, friedmanja
Score: uniofleicest, reason, o, expect, univers, griffith, syndrom
Topic 71 Top Words:
Highest Prob: amp, rt, covid, sexual, rape, report, increas
FREX: sa, amp, railroad, report, sexual, block, rape
Lift: url, 57, sa, emiss, andyjay1, dbkell, dec
Score: url, amp, sexual, rape, sa, tigrayan, block
Topic 72 Top Words:
Highest Prob: told, close, presid, suggest, law, guess, offici
FREX: told, suggest, guess, offici, close, restrict, claim
Lift: usbr, marshablackburn, rank, suggest, guess, transgend, incent
Score: usbr, told, close, suggest, presid, guess, law
Topic 73 Top Words:
Highest Prob: take, like, rt, feel, just, wont, life
FREX: take, feel, aidanthejest, brown, wont, manipul, promis
Lift: vanish, aacoek, phrase, aidanthejest, brown, smarter, buildjakapan
Score: vanish, take, trhloffici, like, wont, fuck, feel
Topic 74 Top Words:
Highest Prob: put, push, gov, air, crimin, pandem, cpi
FREX: gov, put, air, cpi, coup, watsonvillec, push
Lift: watsonvillec, cspan, mishandl, coup, gov, cpi, air
Score: watsonvillec, put, gov, push, air, crimin, cpi
Topic 75 Top Words:
Highest Prob: woke, mind, virus, caus, infect, rt, like
FREX: woke, mind, georgetakei, infect, caus, gop, call
Lift: jayblackisfunni, jess, mampm, neutral, potatohead, timodc, unwok
Score: woke, mind, watter, georgetakei, infect, gop, caus
Topic 76 Top Words:
Highest Prob: live, much, bad, show, yes, rt, covid
FREX: live, much, cancel, tv, bad, book, coguest
Lift: whilst, coguest, ian, madg, difficulti, bringbackmask, covidisairborn
Score: whilst, live, much, bad, yes, cancel, show
Topic 77 Top Words:
Highest Prob: covid, rt, today, flu, surviv, credit, https
FREX: spanish, wwii, 104, depressionbutt, epochinspir, grandpa, surviv
Lift: wwii, 104, depressionbutt, epochinspir, grandpa, spanish, fleet
Score: wwii, flu, surviv, today, covid, spanish, 104
Topic 78 Top Words:
Highest Prob: need, rt, see, order, protect, potus, pay
FREX: assert, rahraw999, wealthi, need, order, non, sivb
Lift: zelenskyy, 2x2sometimes5, evict, kievpechersk, lavra, monast, monk
Score: zelenskyy, need, assert, rahraw999, wealthi, non, order
Topic 79 Top Words:
Highest Prob: rt, protect, covid, 신고, pandem, say, us
FREX: rt, protect, 신고, say, covid, whi, becaus
Lift: 신고, rt, protect, say, agre, long, whi
Score: 신고, rt, protect, pandem, covid, say, virus
The “summary” plot type shows the top words for each topic alongside its expected topic proportion (the share of the corpus attributed to that topic, averaged across documents).
par(mar = c(4, 1, 2.5, 3),
oma = c(0, 0, 0, 0),
cex = 0.6,
pin = c(8, 8),
pty = c("m"))
plot(BigSTM_0,
type = c("summary"),
n = 15,
main = c("Topics - Structural Topic Model with 79 Topics"),
xlab = c("Expected Topic Proportion"),
width = 30,
text.cex = 0.7)
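To make the x-axis of the summary plot more concrete, here is a minimal sketch of how I understand the expected topic proportion to be calculated: as the column mean of the document-topic matrix ("theta", one row per document and one column per topic). The matrix below, "toy_theta", is invented for illustration; with the real model you would substitute BigSTM_0$theta.

```r
# Toy document-topic matrix: 4 documents (rows) x 3 topics (columns).
# Each row sums to 1, as the rows of an stm theta matrix do.
# These numbers are invented; substitute BigSTM_0$theta for real results.
toy_theta <- matrix(c(0.70, 0.20, 0.10,
                      0.10, 0.80, 0.10,
                      0.25, 0.25, 0.50,
                      0.05, 0.35, 0.60),
                    nrow = 4, byrow = TRUE)

# Expected topic proportion: average each topic's share across documents.
expected_prop <- colMeans(toy_theta)
names(expected_prop) <- paste("Topic", seq_along(expected_prop))

# Sort from most to least prevalent so the largest topics are listed first.
sorted_prop <- sort(expected_prop, decreasing = TRUE)
print(round(sorted_prop, 3))
```

Because every row of theta sums to 1, the expected proportions also sum to 1 across topics, which is a quick sanity check on a real model.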
Now let’s run some diagnostics. I can start by plotting topic quality, which shows semantic coherence (how much sense do these topics make, are they meaningful?) against exclusivity (how unique are the top words to the topic?). Words that also appear in many other topics indicate low exclusivity, while words that appear in few other topics indicate high exclusivity.
topicQuality(model = BigSTM_0,
documents = Big_outprep$documents,
xlab = "Semantic Coherence",
ylab = "Exclusivity",
labels = 1:ncol(BigSTM_0$theta),
M = 20)
[1] -1023.0853 -1160.9862 -954.0572 -1139.3616 -960.9008 -964.5784
[7] -1091.0152 -829.6952 -901.8136 -1150.0287 -1142.7408 -934.4556
[13] -1101.1677 -821.9230 -771.5503 -1106.2812 -953.2280 -831.2214
[19] -967.9256 -994.1188 -1197.7351 -859.0507 -1168.7762 -1145.7079
[25] -668.8475 -1305.1354 -621.3109 -1080.8161 -994.7254 -871.1093
[31] -1196.4457 -889.3234 -1066.8819 -915.0058 -798.9292 -1060.3618
[37] -1118.3080 -1011.3512 -857.9551 -1317.8524 -708.9720 -815.3447
[43] -1099.9524 -1259.4082 -1080.3005 -1022.0708 -1189.7477 -996.1223
[49] -797.1227 -1069.9073 -921.1437 -804.2461 -1084.5479 -1001.0709
[55] -897.9858 -1078.5455 -1021.1512 -1074.3307 -909.6860 -990.8653
[61] -924.9148 -1200.7047 -1120.2440 -1238.4450 -765.4181 -883.9233
[67] -1123.7198 -853.5613 -994.4648 -1161.5031 -1108.2359 -1110.8285
[73] -1004.9981 -1012.9155 -883.4362 -1073.1122 -1017.7800 -749.4770
[79] -867.3861
[1] 19.82254 19.75266 19.88709 19.88699 19.93562 19.69565 19.82441 19.61715
[9] 19.70505 19.49311 19.75324 19.63148 19.90342 19.69641 19.79883 19.95059
[17] 19.68120 19.80182 19.62447 19.78001 19.59195 19.94585 19.69128 19.49511
[25] 19.74379 19.81719 19.84632 19.62904 19.67723 19.88224 19.82853 19.83595
[33] 19.55016 19.76268 19.66723 19.67487 19.78735 19.75415 19.80070 19.93359
[41] 19.69270 19.95824 19.91858 19.94020 19.75145 19.78643 19.58733 19.65033
[49] 19.84558 19.74274 19.93928 19.65187 19.93177 19.57602 19.88914 19.93758
[57] 19.90345 19.94494 19.74793 19.77599 19.91996 19.94323 19.76807 19.87335
[65] 19.57746 19.79724 19.76826 19.65793 19.76673 19.52631 19.59764 19.91387
[73] 19.66209 19.94798 19.85681 19.74058 19.61259 19.66550 19.93451
In this case, the topics have fairly high exclusivity, with every topic scoring between about 19.5 and 19.96 (scored out of M words, which I set to 20 here). The large negative scores for semantic coherence give me pause: are these topics lacking in semantic coherence (are we getting something like “and the we” as a topic)? The range of semantic coherence is also wide, running from less than -1200 to about -600. My first thought is that I can find a better-fitting model, which makes sense considering the quick convergence (5 iterations with the maximum set to 500).
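Reading 79 pairs of scores off a plot is tedious, so it can help to put them in a table. The sketch below uses only the first six coherence and exclusivity values printed above. The "weak" flag (below the median on both measures) is an arbitrary rule I chose for illustration, not a threshold recommended by the stm authors. For the full model, the two vectors should be obtainable with semanticCoherence(BigSTM_0, Big_outprep$documents, M = 20) and exclusivity(BigSTM_0, M = 20).

```r
# First six values from the topicQuality() output printed above.
coherence   <- c(-1023.0853, -1160.9862, -954.0572,
                 -1139.3616, -960.9008, -964.5784)
exclusivity <- c(19.82254, 19.75266, 19.88709,
                 19.88699, 19.93562, 19.69565)

quality <- data.frame(topic = seq_along(coherence),
                      coherence = coherence,
                      exclusivity = exclusivity)

# Illustrative rule (an assumption, not an stm default): flag a topic as
# weak when it falls below the median on BOTH measures.
quality$weak <- quality$coherence < median(quality$coherence) &
  quality$exclusivity < median(quality$exclusivity)

# List the least coherent topics first.
quality <- quality[order(quality$coherence), ]
print(quality)
```

Topics flagged this way are the ones I would read first when deciding whether the model needs a different number of topics.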
I can also check the residual dispersion. Residual dispersion greater than 1 is evidence that you may need more topics. The null hypothesis is that residual dispersion = 1, and here I am hoping to fail to reject it. If I can reject the null hypothesis because dispersion > 1, I should consider another model with a greater number of topics. If I test many models and residual dispersion becomes negative (undefined), I may start to consider that there are other confounding factors and select the number of topics with the dispersion score closest to zero while still being greater than zero.
checkResiduals(BigSTM_0, Big_outprep$documents)
$dispersion
[1] 172.4054
$pvalue
[1] 0
$df
[1] 120035
Here I have a significant p-value (0.00) and a dispersion score much higher than 1 (172.4054). I could still run another 79-topic model, but based on these results, a model with a larger number of topics will likely be a better fit.
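To see why the p-value prints as exactly 0, here is a rough sketch using the numbers from the output above. It assumes the test statistic is the dispersion multiplied by the degrees of freedom, compared against the upper tail of a chi-squared distribution; that is my reading of the test, so treat it as an assumption. The helper "interpret_dispersion" and its 0.05 cutoff are hypothetical and not part of stm.

```r
# Values copied from the checkResiduals() output above.
res <- list(dispersion = 172.4054, pvalue = 0, df = 120035)

# Assumption: statistic = dispersion * df, tested against a chi-squared
# distribution with df degrees of freedom (upper tail).
statistic <- res$dispersion * res$df
p_upper <- pchisq(statistic, df = res$df, lower.tail = FALSE)

# Hypothetical helper: turn the result into a plain-language recommendation.
interpret_dispersion <- function(res, alpha = 0.05) {
  if (res$dispersion > 1 && res$pvalue < alpha) {
    "overdispersed: consider a model with more topics"
  } else {
    "no evidence of overdispersion at this K"
  }
}

print(p_upper)
print(interpret_dispersion(res))
```

With a statistic this far above its degrees of freedom, the tail probability is too small to represent and is reported as zero.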
The “hist” plot type shows a histogram of the MAP estimates of the document-topic proportions (the MAP scores) for each topic. You can typically plot the MAP scores for up to 20 topics at a time, which means you will need several plots for a model with 79 topics like this one. Let’s pull the MAP scores for the first 20 topics here.
par(mar = c(5, 4, 4, 2),
oma = c(1, 1, 1, 1),
cex = 0.6,
pty = c("m"),
cex.lab = 0.8)
plot(BigSTM_0,
type = c("hist"),
topics = c(1:20))
I can change the range of topics in my code to plot the MAP scores for the remaining topics in the model.
par(mar = c(5, 4, 4, 2),
oma = c(1, 1, 1, 1),
cex = 0.6,
pty = c("m"),
cex.lab = 0.8)
plot(BigSTM_0,
type = c("hist"),
topics = c(21:40))
par(mar = c(5, 4, 4, 2),
oma = c(1, 1, 1, 1),
cex = 0.6,
pty = c("m"),
cex.lab = 0.8)
plot(BigSTM_0,
type = c("hist"),
topics = c(41:60))
par(mar = c(5, 4, 4, 2),
oma = c(1, 1, 1, 1),
cex = 0.6,
pty = c("m"),
cex.lab = 0.8)
plot(BigSTM_0,
type = c("hist"),
topics = c(61:79))
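Rather than copying the plot call once per range, the four ranges can be generated and looped over. This is a minimal sketch: "topic_batches" is a hypothetical helper name, and the plotting loop only runs if BigSTM_0 exists in your workspace and the stm package is installed.

```r
# Split topic indices 1..K into consecutive batches of at most `size` topics.
topic_batches <- function(K, size = 20) {
  idx <- seq_len(K)
  unname(split(idx, ceiling(idx / size)))
}

batches <- topic_batches(79)
print(batches[[4]])  # the final, shorter batch of topics

# Draw one set of histograms per batch, using the same plot call as above.
if (exists("BigSTM_0") && requireNamespace("stm", quietly = TRUE)) {
  for (b in batches) {
    plot(BigSTM_0, type = "hist", topics = b)
  }
}
```

The same helper works for the later models by changing the number of topics, for example topic_batches(100).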
One last thing to note is that corpora with more than 100,000 documents tend to work best with models starting between 60 and 100 topics. In this case, I only have 4,995 documents. I can ignore this for now, or I can make a mental note to keep an eye out for negative residuals. I’ll look for the negatives, and I will run at least 1 other model with fewer than 79 topics.
Depending on your total number of documents, you may decide to test subsequent models with several different K numbers of topics using one of two techniques. You can run each model and compare, or you can use the “manyTopics” function to run a series of models with a list of K numbers of topics. Here I will demonstrate how to use the “manyTopics” function. To obtain one stm with 20 topics, one with 50 topics, and one with 100 topics, I set K to equal “c(20, 50, 100)”. I will set verbose to equal FALSE (“F”) this time to avoid an extremely long output from the three models.
BigSTM_manyT <- manyTopics(documents = Big_outprep$documents,
vocab = Big_outprep$vocab,
K = c(20, 50, 100),
data = Big_outprep$meta,
verbose = F,
init.type = c("Spectral"))
The manyTopics object contains your STM objects. I set K to equal three different values: 20, 50, and 100. This means my manyTopics object should contain three STM objects, one with a 20 topic model, one with a 50 topic model, and one with a 100 topic model. You should extract your STM objects from the manyTopics object, then you can run some diagnostics like we did in Step 6 with the 79 topics model.
Here I will create three new STM objects named “BigSTM_20”, “BigSTM_50”, and “BigSTM_100”
BigSTM_20 <- BigSTM_manyT$out[[1]]
BigSTM_50 <- BigSTM_manyT$out[[2]]
BigSTM_100 <- BigSTM_manyT$out[[3]]
You should always inspect your new objects to ensure you have stored the information correctly (and, stored the correct information).
summary(BigSTM_20)
A topic model with 20 topics, 4995 documents and a 4584 word dictionary.
Topic 1 Top Words:
Highest Prob: rt, vaccin, covid19, covid, befor, new, death
FREX: cdcthe, hughmankind, truth, factcheck, david, often, default
Lift: frequenc, geograph, marburg, nfschagnew, 116, 17000, 17th
Score: vaccin, covid19, ultimahora, truth, hughmankind, cdcthe, wear
Topic 2 Top Words:
Highest Prob: protect, didnt, govern, becaus, take, us, even
FREX: border, wildlif, buy, illeg, confou, lol, legisl
Lift: advantag, bord, cancerdrug, cbp, elimin, nytim, paso
Score: protect, elimin, didnt, govern, buy, tri, border
Topic 3 Top Words:
Highest Prob: pandem, time, rt, get, noth, global, 2
FREX: noth, gate, er, httpstco4l2wigdt6, md, era, augustjpollak
Lift: cliffohio, httpstcozvaj0jhzoz, realest, unseri, greed, er, httpstco4l2wigdt6
Score: pandem, time, greed, noth, 2, era, global
Topic 4 Top Words:
Highest Prob: trump, rt, creat, respons, crisi, end, follow
FREX: rail, decept, dishonest, givi, theholisticpsyc, justiceforeden, punishedmoth
Lift: rope, album, allforj, decept, dishonest, electricalwsop, ep
Score: rail, disband, noliewithbtc, trump, obamabiden, team, administr
Topic 5 Top Words:
Highest Prob: peopl, can, rt, dure, say, children, begin
FREX: sniper, bot, stupid, narrat, brand, begin, control
Lift: 70k, 827js, competitor, maestro, mc, sniper, swipe
Score: peopl, 827js, can, children, dure, begin, say
Topic 6 Top Words:
Highest Prob: rt, bailout, just, state, seen, weve, far
FREX: ausbassi, environ, tonightpray, 1lonerlifestyl, bitch, essenc, soulless
Lift: 1lonerlifestyl, 800, annaekstrom, ausbassi, backward, bitch, bridgen
Score: bailout, disguis, dcdraino, resudesu, seen, weve, relief
Topic 7 Top Words:
Highest Prob: rt, bank, right, day, thank, today, place
FREX: signatur, millionair, action, silverg, dive, taken, lift
Lift: 2012z, 25k, 5bn, 76, anastas25608217, bitcoinmagazin, crown9th
Score: signatur, meltvirus, action, right, thank, bank, day
Topic 8 Top Words:
Highest Prob: virus, rt, know, never, first, th, china
FREX: mrschines, wwe, graphen, oxid, mitch, parasit, fli
Lift: bullsnip, desir, graphen, las, oxid, 3year, 4190
Score: washburnealex, virus, china, wuhan, mrschines, th, first
Topic 9 Top Words:
Highest Prob: rt, virus, fauci, know, china, said, come
FREX: badass, gat, omgpatriot, beij, bio, neurolog, xuanwu
Lift: 1979hab, humankind, ursula, 35mph, academi, allbitenobark88, badass
Score: fauci, badass, gat, omgpatriot, virus, impeach, china
Topic 10 Top Words:
Highest Prob: rt, protect, thing, bank, presid, insur, happen
FREX: premium, ethanol, applaus, vow, repthomasmassi, kevin, unnot
Lift: 250000, alterivan1, cholera, mythic, sashamackinnon, stakehold, unmint
Score: insur, presid, applaus, vow, ethanol, fdic, libsoftiktok
Topic 11 Top Words:
Highest Prob: risk, rt, need, mani, want, die, dont
FREX: peer, sake, denisedewald, ok, discuss, better, greedi
Lift: dist, goodvibepolitik, httpstcoynb67uplr, mollyjongfast, responsib, stevenvoiceov, acct
Score: tag, need, wrong, mani, risk, die, shit
Topic 12 Top Words:
Highest Prob: covid, rt, long, elonmusk, biden, mrna, thechiefnerd
FREX: thechiefnerd, symptom, product, utter, outcom, ge, long
Lift: 33, 57, alvi, amt, auction, chicago, comptrol
Score: covid, genius, thechiefnerd, elonmusk, mrna, long, matter
Topic 13 Top Words:
Highest Prob: rt, amp, pfizer, risk, im, 5, find
FREX: odd, 101, reanalysi, choir, preach, sho, benefit
Lift: arsenal, choir, ethan, httpstcotvkzapivoj, karma, kiwuikiwi, merry123459
Score: readi, pfizer, 101, reanalysi, odd, per, trial
Topic 14 Top Words:
Highest Prob: rt, everi, got, virus, veri, oh, tell
FREX: effect, secret, 75, bola, elluu, obi, officialtelz
Lift: annemar45451941, coffe, httpstcoieid4tgr3c, hydroxychloroquin, ident, kiel, mspopok
Score: heal, got, everi, oh, effect, print, virus
Topic 15 Top Words:
Highest Prob: rt, one, virus, caus, woke, infect, dont
FREX: georgetakei, legaci, ovat, jesus, abhor, closer, revel
Lift: elhopkin, exasper, hemorrhag, hidden, hydroxychloriquin, leilanidowd, permitl
Score: virologist, virus, one, woke, georgetakei, infect, caus
Topic 16 Top Words:
Highest Prob: rt, go, will, risk, let, protect, bank
FREX: budget, penni, key, 300k, chrosthugo, microfluid, neuron
Lift: captiv, dharmatrad, eden, featur, hkeskiva, insanit, kelli
Score: go, battl, let, investor, key, social, budget
Topic 17 Top Words:
Highest Prob: rt, student, sinc, system, school, chang, public
FREX: oregon, prosecut, mrandyngo, mbalter, thousand, coguest, ian
Lift: bioreal, bull, coguest, ian, los, madg, mbalter
Score: mbalter, student, oregon, mrandyngo, thousand, school, ten
Topic 18 Top Words:
Highest Prob: rt, like, bank, risk, amp, look, irrespons
FREX: cook, karilak, sexual, cnn, someth, irrespons, arrest
Lift: belli, bombast, breitbartnew, carcinogen, consider, ding, downr
Score: irrespons, like, cook, karilak, sigmat, bank, look
Topic 19 Top Words:
Highest Prob: year, rt, just, 3, pandem, last, dead
FREX: catastroph, unexpect, berniesand, oscar, ke, huy, quan
Lift: berniesand, christma, huy, looser, observ, oscar, quan
Score: year, dead, unexpect, catastroph, drloupi, huy, quan
Topic 20 Top Words:
Highest Prob: rt, now, think, whi, get, use, start
FREX: feel, think, ive, reduc, must, saw, now
Lift: conceptualjam, hdteeve, veteran, homeless, compli, tashi343i, korevirus
Score: conceptualjam, think, now, feel, whi, ive, start
summary(BigSTM_50)
A topic model with 50 topics, 4995 documents and a 4584 word dictionary.
Topic 1 Top Words:
Highest Prob: covid19, rt, mrna, world, vaccin, warn, matter
FREX: product, covid19, murder, danger, mrna, found, warn
Lift: encod, jama, npis, nucleocapsid, puneindia, retrospect, transm
Score: covid19, ultimahora, mrna, warn, product, spartajustic, danger
Topic 2 Top Words:
Highest Prob: ill, potus, social, kid, black, serv, medicar
FREX: energi, noteveri, serv, ill, black, bless, western
Lift: elimin, useounasscodenamshicouponnoondiscountsivvipromo, httpstcomhl2f97a1r, mingyulog, capac, httpstcog1ycexd5la, jonathanbodnar
Score: elimin, ill, social, potus, black, medicar, western
Topic 3 Top Words:
Highest Prob: children, rt, creat, week, care, abl, one
FREX: outsourc, solomonmissouri, children, abhor, revel, steviemat, straight
Lift: greed, outsourc, solomonmissouri, dcverso1, kevros765, exasper, verbos
Score: greed, creat, children, crisi, abl, week, proposit
Topic 4 Top Words:
Highest Prob: trump, respons, end, rt, pandem, follow, administr
FREX: administr, end, respons, team, noliewithbtc, rail, trump
Lift: backward, repronnyjackson, rail, noliewithbtc, disband, administr, team
Score: trump, disband, rail, obamabiden, noliewithbtc, team, follow
Topic 5 Top Words:
Highest Prob: bank, silicon, valley, took, risk, investor, will
FREX: investor, silicon, bank, valley, mo, 827js, coffe
Lift: 827js, ashishkjha46, calendar, cdcdirector, longcovidawarenessday, march15, norpelsam
Score: bank, silicon, valley, 827js, investor, took, risk
Topic 6 Top Words:
Highest Prob: pandem, time, dure, rt, first, left, wake
FREX: httpstcowe74vshlw, dure, wolsn, treati, cliffohio, httpstcozvaj0jhzoz, left
Lift: cliffohio, httpstcozvaj0jhzoz, unseri, amt, auction, httpstco5lij7dslv9, httpstcowe74vshlw
Score: pandem, time, dure, resudesu, first, treati, left
Topic 7 Top Words:
Highest Prob: us, thank, didnt, rt, also, protect, million
FREX: wouldnt, pressur, thank, abandon, didnt, led, god
Lift: muthoninjakw, streng, meltvirus, ment, pressur, sincer, tinkswonu
Score: thank, meltvirus, didnt, pressur, million, also, continu
Topic 8 Top Words:
Highest Prob: pfizer, amp, 5, find, rt, harm, per
FREX: 101, reanalysi, benefit, pfizer, per, harm, trial
Lift: httpstcoynb67uplr, stevenvoiceov, 101, reanalysi, washburnealex, conjunct, pbis
Score: pfizer, washburnealex, 101, reanalysi, odd, per, trial
Topic 9 Top Words:
Highest Prob: china, rt, th, wuhan, know, origin, virus
FREX: hold, mrschines, th, china, wuhan, origin, mitch
Lift: beij, frame, httpstcoczlrqfornm, impeach, mrschines, neurolog, xuanwu
Score: china, impeach, wuhan, th, mrschines, hold, mitch
Topic 10 Top Words:
Highest Prob: rt, theyr, happen, take, understand, u, protect
FREX: atroc, repthomasmassi, ninobox, premium, theyr, happen, hmm
Lift: 2x2sometimes5, 96, atroc, babyberoo96, brexitbust, charlott, drawbridg
Score: libsoftiktok, theyr, atroc, repthomasmassi, premium, u, commit
Topic 11 Top Words:
Highest Prob: come, person, wrong, rt, lockdown, friend, covid
FREX: wrong, person, come, fcretir, disastr, lockdown, closur
Lift: realpeteyb123, desir, tag, text, wro, fam, wrong
Score: wrong, tag, come, person, lockdown, disastr, fcretir
Topic 12 Top Words:
Highest Prob: covid, rt, bailout, state, far, weve, seen
FREX: graphen, oxid, disguis, attend, dcdraino, whistleblow, coldlik
Lift: captiv, graphen, httpstcotvkzapivoj, karma, los, oxid, politicalprison
Score: covid, bailout, genius, disguis, dcdraino, relief, weve
Topic 13 Top Words:
Highest Prob: risk, rt, svb, manag, money, put, head
FREX: belli, johnnyxbrown, looser, readi, calvari, risk, fai
Lift: belli, goodvibepolitik, johnnyxbrown, responsib, lhfang, readi, reskless
Score: risk, readi, svb, manag, head, money, looser
Topic 14 Top Words:
Highest Prob: got, everi, rt, effect, 2020, anim, doe
FREX: got, secret, anim, effect, everi, httpstcoy9nzbiqlkc, movetheworldus
Lift: explor, haunt, heal, httpstcoy9nzbiqlkc, hydroxychloroquin, importan, ingredi
Score: heal, got, everi, effect, 2020, anim, articl
Topic 15 Top Words:
Highest Prob: dr, media, rt, receiv, kanekoathegreat, ha, stand
FREX: ha, ovat, media, bhakdi, sucharit, hometown, germani
Lift: frequenc, geograph, marburg, virologist, ovat, legaci, yahoonew
Score: virologist, dr, bhakdi, sucharit, hometown, media, ovat
Topic 16 Top Words:
Highest Prob: go, rt, let, protect, fight, point, cut
FREX: butat, cortesstev, fight, go, religion, valu, let
Lift: acosta, battl, butat, cortesstev, lollipop, perjuri, queri
Score: battl, go, let, fight, penni, protect, cut
Topic 17 Top Words:
Highest Prob: one, fauci, elonmusk, great, thechiefnerd, caus, vaccin
FREX: great, elonmusk, mbalter, thechiefnerd, drelidavid, type, cancer
Lift: mbalter, aghuff, synthet, allbitenobark88, recip, bioreal, robertwright
Score: mbalter, elonmusk, thechiefnerd, fauci, great, caus, one
Topic 18 Top Words:
Highest Prob: like, rt, look, begin, someth, fdic, onli
FREX: look, jinxland, rting, roll, project, sexual, someth
Lift: carcinogen, ding, edit, injecti, jinxland, kyl33t, premier
Score: like, look, sigmat, sexual, roll, rape, jinxland
Topic 19 Top Words:
Highest Prob: year, rt, pandem, just, 3, dead, ago
FREX: huy, quan, berniesand, dc, christma, unexpect, sudden
Lift: huy, observ, quan, biontain, biontechgroup, dc, mileston
Score: year, dead, ago, observ, 3, unexpect, drloupi
Topic 20 Top Words:
Highest Prob: can, rt, lot, protect, develop, might, organ
FREX: lot, can, permitl, shannonrwatt, volatil, longer, might
Lift: 10yearold, authent, baat, cognit, conceptualjam, impair, kg
Score: can, conceptualjam, lot, develop, volatil, permitl, shannonrwatt
Topic 21 Top Words:
Highest Prob: never, kill, guy, rt, mayb, hard, protect
FREX: assess, hard, guy, mayb, kill, diedsudden, never
Lift: gwenmommabear, majstar7, mourn, pharmacogenom, rosewind2007, tenebra99, thisisnothappen
Score: gwenmommabear, never, kill, guy, assess, mayb, hard
Topic 22 Top Words:
Highest Prob: work, virus, stop, rt, knew, bad, covid
FREX: jinp, 2009, bio, sniper, work, 2015, humankind
Lift: 70k, competitor, dwr, humankind, maestro, mc, monterey
Score: work, wendyor, virus, stop, knew, jinp, bio
Topic 23 Top Words:
Highest Prob: rt, last, time, covid, place, 4, wors
FREX: er, httpstco4l2wigdt6, 4, alterivan1, cholera, stakehold, chanc
Lift: alterivan1, cholera, crown9th, hrtcoup, kst, mucor, stakehold
Score: stage, last, time, 4, er, httpstco4l2wigdt6, 6
Topic 24 Top Words:
Highest Prob: day, long, veri, next, rt, covid, test
FREX: veri, next, site, chicago, mythic, popup, sashamackinnon
Lift: chicago, mythic, popup, sashamackinnon, unmint, whitelist, alvi
Score: lo, veri, next, day, long, test, studi
Topic 25 Top Words:
Highest Prob: protect, rt, countri, fail, cost, invest, help
FREX: paulmitchellab, andrew, cost, bridgen, def, facil, countri
Lift: bridgen, def, facil, grab, paulmitchellab, 368, andrew
Score: protect, gabi, cost, invest, countri, fail, uk
Topic 26 Top Words:
Highest Prob: rt, student, public, sinc, system, school, chang
FREX: httpstco7eepdml4yd, oregon, prisonplanet, fearmong, selectiv, mrandyngo, public
Lift: httpstco7eepdml4yd, prisonplanet, 300b, ampcov, bund, cap, categori
Score: httpstco7eepdml4yd, student, thousand, oregon, mrandyngo, ten, public
Topic 27 Top Words:
Highest Prob: im, rt, well, sure, mask, wear, befor
FREX: tilda, im, sure, choir, preach, sho, swinton
Lift: choir, merry123459, pierrepoilievr, preach, sho, 114f, 42c
Score: im, pierrepoilievr, sure, well, wear, tilda, workforc
Topic 28 Top Words:
Highest Prob: want, rt, p, fact, stori, protect, opinion
FREX: justiceforeden, punishedmoth, younger, want, press, jake, stori
Lift: electricalwsop, howl, justiceforeden, punishedmoth, riskavers, te, younger
Score: want, baji, justiceforeden, punishedmoth, press, stori, p
Topic 29 Top Words:
Highest Prob: now, start, pandem, rt, covid, thought, vs
FREX: now, vs, 104, depressionbutt, epochinspir, grandpa, wwii
Lift: realest, 104, abdaniellesmith, chrisjmerch, depressionbutt, epochinspir, friedmanja
Score: now, riskfre, start, vs, pandem, surviv, thread
Topic 30 Top Words:
Highest Prob: dont, rt, pleas, anyon, protect, allow, just
FREX: ausbassi, environ, tonightpray, corybook, maggiehassan, senateforeign, senatormenendez
Lift: ournewhomecoach, outperform, ausbassi, corybook, environ, joy, karenkho
Score: beauti, anyon, pleas, dont, yo, allow, ausbassi
Topic 31 Top Words:
Highest Prob: peopl, rt, mani, around, just, protect, walk
FREX: 1lonerlifestyl, bitch, essenc, soulless, tryna, around, peopl
Lift: 125, 1lonerlifestyl, 265, 637, advantag, bake, bitch
Score: peopl, advantag, mani, around, 1lonerlifestyl, bitch, essenc
Topic 32 Top Words:
Highest Prob: get, thing, rt, big, protect, two, presid
FREX: ethanol, applaus, vow, alexbruesewitz, realdonaldtrump, iowa, get
Lift: twitterhol, abejmorri, advent, applaus, bor, hhepplewhit, httpstco6lhq2m2apj
Score: get, stomach, ethanol, applaus, vow, alexbruesewitz, realdonaldtrump
Topic 33 Top Words:
Highest Prob: virus, rt, woke, infect, mind, caus, like
FREX: kdnerak33, 75, bola, elluu, obi, officialtelz, pastor
Lift: elhopkin, 2344b, 75, bola, clade, elluu, httpstco1d6pzpy91d
Score: virus, woke, infect, mind, georgetakei, tryangregori, everyth
Topic 34 Top Words:
Highest Prob: will, rt, fauci, 2017, said, trump, becaus
FREX: 2017, 48, atmospher, usstormwatch, rain, 11, closer
Lift: cw3escripp, elev, emiss, ggreenwald, hydrolog, ordinari, satur
Score: will, ordinari, 2017, 11, fauci, januari, gregrubini
Topic 35 Top Words:
Highest Prob: use, tell, yes, better, love, rt, protect
FREX: yes, tell, away, game, better, definit, use
Lift: beyblad, rip, energet, happyharpys, httpstco1ymrfhrnhr, iconoclast1919, johniadarola
Score: beyblad, tell, yes, love, use, pass, game
Topic 36 Top Words:
Highest Prob: 2, rt, 1, interest, pandem, bill, director
FREX: 300k, chrosthugo, microfluid, neuron, uniqu, director, popul
Lift: 300k, chrosthugo, microfluid, neuron, uniqu, 2012z, 55000000
Score: 2, aftermath, director, interest, 300k, chrosthugo, microfluid
Topic 37 Top Words:
Highest Prob: rt, keep, fed, market, free, forget, protect
FREX: keep, market, civil, free, harbor, ident, stackhodl
Lift: mspopok, 12x, 3m, ghopp, harbor, httpstcovggd3vl5lg, ident
Score: ounassnoonpromonontoyoutofordealhmbathandbodyhampm, keep, fed, market, stock, judg, free
Topic 38 Top Words:
Highest Prob: someon, stay, love, rt, protect, inform, hide
FREX: decept, dishonest, givi, theholisticpsyc, someon, stay, hide
Lift: cbcs, elkebabiuk, factchec, fung4l, httpstcogv3yia0ggg, phys, trudeaus
Score: stole, someon, decept, dishonest, givi, theholisticpsyc, hide
Topic 39 Top Words:
Highest Prob: protect, rt, alway, plan, tri, ani, see
FREX: hell, alway, interact, intimid, popular, father, tl
Lift: alanthompson58, amongst, assministr, serenityin24, wootton, 911onfox, interact
Score: alway, luminouskryst, protect, plan, hell, interact, intimid
Topic 40 Top Words:
Highest Prob: give, join, rt, flu, miss, f, diseas
FREX: join, f, diseas, miss, small, sar, give
Lift: anthrax, barrett, chimera, cocktail, ctv, drlimengyan1, hiv
Score: anthrax, diseas, join, flu, give, miss, small
Topic 41 Top Words:
Highest Prob: whi, say, rt, truth, virus, arent, cdc
FREX: cdcthe, hughmankind, factcheck, truth, martin, david, arent
Lift: album, allforj, fanbas, hardwork, httpstcoeurywyez3, hughmankind, monkeyking67
Score: youtub, say, truth, whi, hughmankind, cdcthe, factcheck
Topic 42 Top Words:
Highest Prob: rt, said, right, call, irrespons, went, state
FREX: cook, joshuaphill, irrespons, karilak, cnn, right, millionair
Lift: breitbartnew, joshuaphill, kansa, 1745549, ccpvirus, cumul, decemb
Score: irrespons, call, jojofromjerz, cook, said, cnn, karilak
Topic 43 Top Words:
Highest Prob: rt, show, secur, live, new, data, brought
FREX: leadership, brought, show, smith, coguest, ian, madg
Lift: coguest, ian, madg, 2023s, 2900, aespa, beck
Score: na, show, secur, brought, reveal, data, coguest
Topic 44 Top Words:
Highest Prob: make, rt, bank, billion, collaps, spread, just
FREX: dive, cheridinovo, 20192021, loblaw, net, underpay, make
Lift: princesskelli, 20192021, assumpt, httpstcopqj0zoe28a, httpstcouzoqeaoegl, katrinapanova, lite
Score: make, oneunderscor, billion, bank, collaps, dive, silverg
Topic 45 Top Words:
Highest Prob: rt, pay, back, noth, much, job, shit
FREX: much, augustjpollak, pay, shit, noth, assert, rahraw999
Lift: ada, algo, augustjpollak, avax, dot, facebookdown, futuretak
Score: pay, xbox, much, greedi, augustjpollak, shit, back
Topic 46 Top Words:
Highest Prob: covid, still, rt, cant, real, believ, peopl
FREX: bs, real, bohemianatmosp1, utter, stupid, cant, ge
Lift: confou, barstoolsport, economy4health, httpstcowwayqt5xtj, luminari, httpstcodaunmusrbh, rumbl
Score: barstoolsport, cant, covid, still, real, believ, stupid
Topic 47 Top Words:
Highest Prob: virus, rt, oh, done, print, lab, becom
FREX: becom, alobarkahn, danielbaronn, print, aleresnik, transitori, amitrippedcat
Lift: aleresnik, delthiarick, genom, managertact, pedophil, phrase, proun
Score: managertact, virus, gone, oh, print, becom, censored4sur
Topic 48 Top Words:
Highest Prob: know, rt, everyon, account, lie, part, govern
FREX: badass, gat, omgpatriot, mccarthi, kevin, unnot, digit
Lift: 13m, 330k, 77steez, denisrancourt, drs4covideth, httpstco1lo4ezd8dl, httpstcoekwqg
Score: know, httpstco1lo4ezd8dl, badass, gat, omgpatriot, currenc, annaapp91838450
Topic 49 Top Words:
Highest Prob: need, rt, democrat, protect, dont, bc, agre
FREX: need, democrat, agre, handl, bc, chiron, kelli
Lift: insanit, kiwuikiwi, mandatori, pat300000, spell, vash, 2nda
Score: need, poverti, democrat, bc, agre, oh, protect
Topic 50 Top Words:
Highest Prob: rt, think, protect, covid, pandem, just, becaus
FREX: think, rt, mattwalshblog, protect, covid, becaus, see
Lift: mattwalshblog, think, live, learn, see, rt, protect
Score: mattwalshblog, protect, think, rt, covid, pandem, see
summary(BigSTM_100)
A topic model with 100 topics, 4995 documents and a 4584 word dictionary.
Topic 1 Top Words:
Highest Prob: covid19, rt, vaccin, pandem, may, mass, warn
FREX: covid19, soon, murder, scientist, request, reject, mass
Lift: 155, 235, benjaminmateus7, dyk, genderbas, httpstcovggd3vl5lg, occurr
Score: covid19, ultimahora, moderna, soon, mass, vaccin, scientist
Topic 2 Top Words:
Highest Prob: happen, understand, u, theyr, depositor, insur, take
FREX: repthomasmassi, understand, u, happen, premium, paid, depositor
Lift: elimin, useounasscodenamshicouponnoondiscountsivvipromo, repthomasmassi, bori, dawnwestg, kanga, thetrumptrain
Score: elimin, repthomasmassi, premium, u, happen, understand, theyr
Topic 3 Top Words:
Highest Prob: dure, children, rt, week, abl, face, era
FREX: era, abhor, revel, steviemat, abl, week, children
Lift: greed, ga, abhor, revel, steviemat, dcverso1, kevros765
Score: greed, dure, era, children, abl, week, devast
Topic 4 Top Words:
Highest Prob: virus, spread, fauci, chines, lab, covid, dr
FREX: chines, spread, al, heard, lab, frequenc, geograph
Lift: repronnyjackson, frequenc, geograph, marburg, career, chines, 4tn
Score: repronnyjackson, spread, chines, fauci, lab, virus, via
Topic 5 Top Words:
Highest Prob: p, dear, whitehous, corybook, maggiehassan, senateforeign, senatormenendez
FREX: corybook, maggiehassan, senateforeign, senatormenendez, senatorshaheen, sfrcdem, usun
Lift: 827js, corybook, maggiehassan, senateforeign, senatormenendez, senatorshaheen, sfrcdem
Score: 827js, httpstcoeurywyez3, monkeyking67, corybook, maggiehassan, senateforeign, senatormenendez
Topic 6 Top Words:
Highest Prob: befor, famili, rt, hard, 50, patient, crypto
FREX: famili, 50, befor, suit, wtf, hard, guarante
Lift: resudesu, royalti, whil, 50, suit, wtf, guarante
Score: resudesu, befor, famili, 50, suit, guarante, hard
Topic 7 Top Words:
Highest Prob: thank, rt, didnt, also, day, million, mani
FREX: thank, abandon, wake, disastr, wouldnt, fcretir, didnt
Lift: muthoninjakw, streng, meltvirus, ment, sincer, tinkswonu, wonwoo
Score: meltvirus, thank, mor, wake, million, obama, denisedewald
Topic 8 Top Words:
Highest Prob: even, better, taxpay, rt, twice, pandem, total
FREX: twice, better, even, rhebright, taxpay, washburnealex, bet
Lift: pandemiccost, washburnealex, rhebright, twice, wast, consist, sullydish
Score: washburnealex, even, better, rhebright, taxpay, twice, bet
Topic 9 Top Words:
Highest Prob: china, rt, know, hold, wuhan, th, virus
FREX: hold, china, wuhan, mrschines, mitch, joe, th
Lift: allbitenobark88, bide, frame, httpstcoczlrqfornm, impeach, lefti, recip
Score: china, impeach, hold, wuhan, mrschines, mitch, annaapp91838450
Topic 10 Top Words:
Highest Prob: origin, rt, releas, theyr, countri, child, consid
FREX: origin, amaz, du, jeffrey, releas, repres, consid
Lift: barcelona, dox, groom, libsoftiktok, marsh, muslimdaili, larger
Score: libsoftiktok, origin, releas, child, consid, theyr, du
Topic 11 Top Words:
Highest Prob: wrong, person, rt, lockdown, isnt, covid, closur
FREX: wrong, person, closur, isnt, realpeteyb123, lockdown, wro
Lift: desir, ashishkjha46, calendar, cdcdirector, longcovidawarenessday, march15, norpelsam
Score: tag, wrong, lockdown, person, realpeteyb123, isnt, closur
Topic 12 Top Words:
Highest Prob: covid, rt, long, death, mask, wear, arent
FREX: coldlik, plethora, chuckcallesto, astroaugusto, long, chicago1ray, delhi
Lift: chicago1ray, coldlik, cretin, delhi, denounc, ea, european
Score: covid, genius, long, chuckcallesto, death, coldlik, plethora
Topic 13 Top Words:
Highest Prob: manag, free, discuss, havent, readi, low, secret
FREX: readi, secret, hmm, justice4tigray, tigray, interven, manag
Lift: readi, httpstcoynb67uplr, stevenvoiceov, hmm, justice4tigray, tigray, coffe
Score: readi, manag, free, secret, discuss, hmm, justice4tigray
Topic 14 Top Words:
Highest Prob: trade, immun, 2020, risk, rt, natur, effect
FREX: trade, immun, counterparti, 2020, wclementeiii, lower, natur
Lift: heal, counterparti, certainti, programmat, nfts, trade, wclementeiii
Score: heal, trade, immun, 2020, natur, counterparti, cold
Topic 15 Top Words:
Highest Prob: studi, covid, rt, question, longcovid, catch, level
FREX: studi, virologist, longcovid, npr, question, obes, catch
Lift: virologist, npr, yahoonew, kiss, calvinfroedg, transform, loscharlo
Score: virologist, studi, question, longcovid, catch, npr, contract
Topic 16 Top Words:
Highest Prob: let, go, rt, fight, help, t, covid
FREX: fight, let, butat, cortesstev, religion, prosecut, drelidavid
Lift: mandatei, tooth, battl, butat, clas, cortesstev, imm
Score: battl, let, fight, prosecut, butat, cortesstev, religion
Topic 17 Top Words:
Highest Prob: cours, join, rt, covid, shot, fauci, mbalter
FREX: mbalter, cours, mrddmia, gridiron, join, shot, attend
Lift: gridiron, conjunct, mbalter, mrddmia, pbis, pbiuk, pharmacogenom
Score: mbalter, cours, join, mrddmia, gridiron, shot, dinner
Topic 18 Top Words:
Highest Prob: like, rt, look, begin, someth, fdic, pandem
FREX: jinxland, rting, detransit, 125b, 126, patrickbetdavid, someth
Lift: detransit, eden, jinxland, rting, aidanthejest, brown, edit
Score: sigmat, like, look, jinxland, rting, fdic, crazi
Topic 19 Top Words:
Highest Prob: year, rt, ago, pandem, 3, lost, last
FREX: huy, quan, berniesand, looser, ago, midst, lost
Lift: huy, quan, looser, observ, andyjay1, biontain, biontechgroup
Score: year, observ, ago, huy, quan, berniesand, midst
Topic 20 Top Words:
Highest Prob: can, pandem, rt, entir, organ, explain, medic
FREX: can, entir, stress, organ, emot, ongo, cognit
Lift: cognit, impair, unmitig, conceptualjam, stress, emot, can
Score: can, conceptualjam, stress, entir, ongo, choic, emot
Topic 21 Top Words:
Highest Prob: know, tri, everyon, part, govern, account, rt
FREX: part, everyon, tri, know, currenc, account, digit
Lift: citizenlenz, genflynn, gwenmommabear, httpstcooomse2vbjx, peril, rosewind2007, tenebra99
Score: know, gwenmommabear, currenc, digit, catturd2, part, tri
Topic 22 Top Words:
Highest Prob: work, stop, virus, rt, covid, fauci, ani
FREX: jinp, sniper, work, bio, 2015, xi, stop
Lift: 70k, competitor, dwr, maestro, mc, monterey, parjaro
Score: work, wendyor, stop, jinp, bot, bio, dictat
Topic 23 Top Words:
Highest Prob: last, year, rt, us, visit, can, abridgen
FREX: christma, dc, dod, visit, last, abridgen, confirm
Lift: 1979hab, ursula, dc, stage, christma, 2000, crisesth
Score: stage, last, dc, christma, dod, washington, visit
Topic 24 Top Words:
Highest Prob: veri, rt, great, ivermectin, ignor, poor, healthi
FREX: veri, ignor, healthi, deworm, ivermectin, poor, hcq
Lift: arewer, deworm, lo, disturb, dig, ssn, vkabramowicz
Score: lo, veri, ignor, sever, ivermectin, deworm, arewer
Topic 25 Top Words:
Highest Prob: cost, rt, countri, educ, protect, deal, tonight
FREX: cost, sub, 368, billioncan, pi, rdnstai, spe
Lift: 368, appl, ash, barbar, billioncan, bow, breakfast
Score: gabi, cost, pi, 368, billioncan, rdnstai, spe
Topic 26 Top Words:
Highest Prob: put, crimin, push, pandem, gov, littl, httpstco7eepdml4yd
FREX: put, httpstco7eepdml4yd, crimin, push, gov, target, mishandl
Lift: httpstco7eepdml4yd, cspan, httpstconizojpgzu6, michigan, mishandl, heavili, coup
Score: httpstco7eepdml4yd, put, crimin, push, mishandl, littl, gov
Topic 27 Top Words:
Highest Prob: im, rt, sure, well, mask, wear, covid
FREX: im, sure, swinton, choir, preach, sho, brownecfm
Lift: choir, merry123459, pierrepoilievr, preach, sho, airbnb, itjustmeez
Score: im, pierrepoilievr, sure, tilda, swinton, choir, preach
Topic 28 Top Words:
Highest Prob: want, rt, kill, stori, sever, welcom, opinion
FREX: justiceforeden, punishedmoth, younger, want, stori, press, jake
Lift: custodia, justiceforeden, nonfract, opti, peterktodd, punishedmoth, quo
Score: want, baji, justiceforeden, punishedmoth, press, sever, stori
Topic 29 Top Words:
Highest Prob: thought, build, best, rt, aukus, uk, thread
FREX: build, realest, thought, defenceprof, aukus, design, best
Lift: realest, riskfre, trilater, ukaus, defenceprof, treason, hitler
Score: riskfre, build, thought, aukus, best, thread, realest
Topic 30 Top Words:
Highest Prob: still, ani, feel, ive, alreadi, rt, beauti
FREX: feel, still, beauti, ive, ani, alreadi, father
Lift: hdteeve, beauti, feel, brave, field, ive, still
Score: beauti, still, feel, ive, ani, hdteeve, alreadi
Topic 31 Top Words:
Highest Prob: rt, point, secur, medicar, clear, social, cut
FREX: medicar, penni, clear, point, cut, budget, key
Lift: penni, sensherrodbrown, 96, advantag, demvoice1, drs4covideth, httpstcoekwqg
Score: advantag, medicar, penni, secur, budget, cut, point
Topic 32 Top Words:
Highest Prob: get, take, rt, like, covid, vaccin, biden
FREX: get, take, piec, holi, twitterhol, trhloffici, probabl
Lift: twitterhol, vanish, cihi, cihiici, cvsd, stomach, trhloffici
Score: get, stomach, take, trhloffici, twitterhol, holi, disappear
Topic 33 Top Words:
Highest Prob: virus, rt, infect, name, woke, tomorrow, brain
FREX: kdnerak33, name, 75, bola, elluu, obi, officialtelz
Lift: elhopkin, 75, bola, elluu, hemorrhag, kdnerak33, obi
Score: gone, tryangregori, name, infect, woke, virus, kdnerak33
Topic 34 Top Words:
Highest Prob: will, rt, fauci, said, 2017, trump, 11
FREX: 2017, will, rain, 48, atmospher, usstormwatch, closer
Lift: closer, cw3escripp, elev, hydrolog, ordinari, rohn, satur
Score: will, ordinari, 2017, fauci, machin, 11, gregrubini
Topic 35 Top Words:
Highest Prob: tell, yes, n, true, let, y, us
FREX: yes, tell, beyblad, rip, true, n, road
Lift: beyblad, rip, ccaiano7, dbongino, httpstco1ymrfhrnhr, iconoclast1919, johniadarola
Score: beyblad, yes, tell, realiti, rip, true, innov
Topic 36 Top Words:
Highest Prob: 2, rt, 1, interest, oh, gate, popul
FREX: 2, 300k, chrosthugo, microfluid, neuron, uniqu, popul
Lift: 300k, aftermath, aleresnik, chrosthugo, delilah, dy, hialeah
Score: 2, aftermath, 300k, chrosthugo, microfluid, neuron, uniqu
Topic 37 Top Words:
Highest Prob: risk, fed, keep, rt, bank, must, reduc
FREX: keep, fed, civil, juri, requir, reduc, tough
Lift: ounassnoonpromonontoyoutofordealhmbathandbodyhampm, mspopok, ident, toler, process, civil, keep
Score: ounassnoonpromonontoyoutofordealhmbathandbodyhampm, fed, keep, juri, civil, ident, risk
Topic 38 Top Words:
Highest Prob: someon, love, stay, rt, inform, protect, hide
FREX: someon, decept, dishonest, givi, theholisticpsyc, stay, love
Lift: decept, dishonest, fung4l, givi, httpstcogv3yia0ggg, phys, stole
Score: stole, someon, decept, dishonest, givi, theholisticpsyc, hide
Topic 39 Top Words:
Highest Prob: alway, rt, hell, popular, tl, see, saw
FREX: hell, interact, intimid, tl, alway, popular, lolli
Lift: assministr, burden, demolit, dilftawi, everybodi, featur, foster
Score: alway, luminouskryst, hell, interact, intimid, tl, popular
Topic 40 Top Words:
Highest Prob: small, rt, serv, diseas, virus, chicken, boat
FREX: small, chicken, serv, pox, boat, domin, diseas
Lift: chicken, pox, anthrax, drlimengyan1, housefli, hpaih5n1, vector
Score: anthrax, small, serv, chicken, diseas, pox, boat
Topic 41 Top Words:
Highest Prob: truth, rt, arent, virus, cdcthe, david, amp
FREX: cdcthe, truth, david, 116, martin, arent, oper
Lift: 116, 35mph, album, allforj, fanbas, glennkirschner2, grand
Score: youtub, truth, cdcthe, david, factcheck, hughmankind, martin
Topic 42 Top Words:
Highest Prob: right, rt, b, action, govern, wealth, angri
FREX: joshuaphill, right, millionair, wealth, leap, angri, b
Lift: joshuaphill, transgend, 1745549, ccpvirus, cumul, decemb, excessdeath
Score: jojofromjerz, right, joshuaphill, billionair, millionair, wealth, action
Topic 43 Top Words:
Highest Prob: thing, secur, rt, two, social, presid, big
FREX: ethanol, applaus, vow, thing, realdonaldtrump, iowa, alexbruesewitz
Lift: 2023s, applaus, ethanol, na, sponsor, vow, 2900
Score: na, ethanol, applaus, vow, alexbruesewitz, secur, realdonaldtrump
Topic 44 Top Words:
Highest Prob: make, see, billion, rt, covid, profit, final
FREX: cheridinovo, make, 20192021, loblaw, net, underpay, trend
Lift: 20192021, 300000, assumpt, httpstcouzoqeaoegl, loblaw, marbl, minneapoli
Score: oneunderscor, make, billion, cheridinovo, 20192021, loblaw, net
Topic 45 Top Words:
Highest Prob: pay, rt, much, job, get, shit, potus
FREX: pay, investor, much, shit, augustjpollak, greedi, job
Lift: augustjpollak, smarter, xbox, assert, rahraw999, wealthi, lavinemmanuell
Score: pay, xbox, investor, augustjpollak, much, greedi, mor
Topic 46 Top Words:
Highest Prob: cant, real, covid, rt, pandem, believ, 19
FREX: real, cant, 19, confou, rest, anybodi, believ
Lift: confou, artin, barstoolsport, davidkra, erickson, hysteria, massihi
Score: 19, barstoolsport, real, cant, confou, rest, covid
Topic 47 Top Words:
Highest Prob: virus, becom, censored4sur, rt, amitrippedcat, alobarkahn, danielbaronn
FREX: becom, alobarkahn, danielbaronn, amitrippedcat, censored4sur, twitter, suppress
Lift: suppress, chyna, lite, managertact, paulbenedict7, theoret, alobarkahn
Score: managertact, censored4sur, becom, amitrippedcat, alobarkahn, danielbaronn, virus
Topic 48 Top Words:
Highest Prob: becaus, death, rt, democrat, covid, held, win
FREX: becaus, held, confin, captiv, politicalprison, solitari, thrown
Lift: captiv, mandatori, pat300000, politicalprison, solitari, thrown, vac
Score: becaus, httpstco1lo4ezd8dl, death, democrat, held, captiv, politicalprison
Topic 49 Top Words:
Highest Prob: need, rt, agre, dont, oh, go, u
FREX: agre, need, chiron, voter, focus, spell, shortag
Lift: spell, 2nda, alyalyoutnfre, custodi, dang, derail, interdogrescu
Score: need, poverti, agre, voter, oh, spell, difficult
Topic 50 Top Words:
Highest Prob: women, rt, men, pass, play, ban, bill
FREX: men, women, sport, play, pass, breitbartnew, kansa
Lift: breitbartnew, kansa, mattwalshblog, authent, nurseosavag, men, sport
Score: mattwalshblog, women, men, pass, play, sport, ban
Topic 51 Top Words:
Highest Prob: turn, gain, kid, rt, function, fact, pandem
FREX: gain, turn, function, handl, daili, kid, fact
Lift: 2x2sometimes5, acosta, amus, evict, httpstcoqgbvxrcxzw, httpstcorqajbyf9mv, kievpechersk
Score: medrop, gain, turn, handl, function, daili, kid
Topic 52 Top Words:
Highest Prob: fear, dont, amount, fool, puppet, socialistlyakwd, takethatct
FREX: fear, socialistlyakwd, takethatct, fool, master, amount, abejmorri
Lift: abejmorri, ghopp, httpstco6lhq2m2apj, kentlee47, lummhandi, maureenstroud, propagan
Score: ghopp, takethatct, fear, amount, socialistlyakwd, abejmorri, httpstco6lhq2m2apj
Topic 53 Top Words:
Highest Prob: onli, rt, lie, america, corrupt, expos, know
FREX: badass, gat, omgpatriot, corrupt, idea, lie, onli
Lift: princesskelli, badass, dropout, earlier, gat, newborn, omgpatriot
Score: earlier, badass, gat, omgpatriot, onli, annaapp91838450, journalist
Topic 54 Top Words:
Highest Prob: protect, rt, amp, parent, pray, ask, thebachelor
FREX: protect, thebachelor, rt, pray, parent, dad, amp
Lift: thebachelor, dad, disney, protect, parent, rt, pray
Score: protect, rt, thebachelor, amp, pray, dad, parent
Topic 55 Top Words:
Highest Prob: virus, mind, zombi, call, made, clear, recent
FREX: zombi, reviv, virus, corona, cell, mind, billackman
Lift: bingo, timodc, unwok, reviv, cowan, httpstcoscnez01aus, mariusknulst
Score: virus, timodc, mind, corona, zombi, woke, gone
Topic 56 Top Words:
Highest Prob: bailout, ukrain, rt, weve, far, state, seen
FREX: bailout, ukrain, disguis, weve, httpstcolmzfjphrt, dcdraino, far
Lift: 114f, 42c, 457c, httpstcolmzfjphrt, item, matam, pgdyne
Score: httpstcolmzfjphrt, bailout, ukrain, disguis, dcdraino, relief, loan
Topic 57 Top Words:
Highest Prob: just, mani, around, rt, walk, peopl, wasnt
FREX: 1lonerlifestyl, bitch, essenc, soulless, tryna, around, walk
Lift: 1lonerlifestyl, bitch, concentra, elid, essenc, insular, soulless
Score: inventor, just, mani, around, 1lonerlifestyl, bitch, essenc
Topic 58 Top Words:
Highest Prob: rt, articl, cdc, mask, amp, 2020, director
FREX: articl, redfield, importan, journal, prof, solidar, tufekci
Lift: autopsi, babyd1111229, cite, ellen, httpstcorz3l9irp36, importan, jama
Score: cite, articl, cdc, redfield, director, importan, journal
Topic 59 Top Words:
Highest Prob: ill, rt, kanekoathegreat, media, receiv, ha, dr
FREX: ill, ovat, kanekoathegreat, germani, legaci, ha, fat
Lift: contribut, johnnyxbrown, ovat, walkerbragman, jimmydor, darktimes1984, noteveri
Score: ill, contribut, ovat, kanekoathegreat, hometown, welcom, legaci
Topic 60 Top Words:
Highest Prob: new, rt, show, march, data, case, 2023
FREX: genet, new, data, march, show, 35842, 3716045
Lift: 35842, 3716045, 20202021, categori, classifi, comptrol, dinapoli
Score: categori, new, genet, march, data, show, date
Topic 61 Top Words:
Highest Prob: vaccin, covid, 10, bc, 1, minut, covid19
FREX: 10, bc, minut, minist, blood, 645, montana
Lift: 5pm, 645, expedit, unsaf, encod, nucleocapsid, ronnyjacksontx
Score: 5pm, vaccin, 10, bc, minut, covid, shirtless
Topic 62 Top Words:
Highest Prob: lot, rt, serious, subject, st, congress, time
FREX: lot, st, serious, arsenal, ethan, nwaneri, spare
Lift: arsenal, ethan, nwaneri, spare, teenag, attitud, brexitbust
Score: xxclusionari, lot, subject, serious, spare, arsenal, ethan
Topic 63 Top Words:
Highest Prob: never, good, rt, covid, use, get, pandem
FREX: good, tv, never, smith, economy4health, httpstcowwayqt5xtj, luminari
Lift: coguest, ian, madg, permitl, shannonrwatt, economy4health, explor
Score: juic, never, good, economy4health, httpstcowwayqt5xtj, luminari, smith
Topic 64 Top Words:
Highest Prob: think, rt, anyth, pandem, destroy, republican, democraticitizn
FREX: think, conspir, democraticitizn, msm, destroy, theyd, anyth
Lift: 240m, adamchamb, alanthompson58, alexgre36024221, amongst, bell, mtnutz
Score: think, alexgre36024221, anyth, democraticitizn, destroy, msm, alanthompson58
Topic 65 Top Words:
Highest Prob: say, rt, yeah, desanti, market, covid, bad
FREX: idk, say, yeah, impli, weak, desanti, unfilteredboss1
Lift: advent, idk, jonddo, pedophil, rino, httpstcoxmb1zjabwo, truste
Score: idk, say, yeah, anxious, desanti, pretti, advent
Topic 66 Top Words:
Highest Prob: woke, everyth, caus, call, infect, gop, like
FREX: everyth, woke, gop, caus, georgetakei, infect, electricalwsop
Lift: electricalwsop, riskavers, sacrifi, te, httpstcomhl2f97a1r, peterstefanovi2, michaelrulli
Score: woke, everyth, georgetakei, gop, httpstcomhl2f97a1r, infect, caus
Topic 67 Top Words:
Highest Prob: increas, rt, buy, bank, instead, power, presid
FREX: increas, buy, instead, power, collinrugg, previous, shut
Lift: cancerdrug, 57, ada, algo, avax, dot, futuretak
Score: httpstcoieqgdwfjm6, buy, increas, instead, shut, power, wo
Topic 68 Top Words:
Highest Prob: everi, matter, anim, respect, michaelpseng, covid, fact
FREX: howl, httpstcoy9nzbiqlkc, everi, cult, respect, anim, animaladvoc
Lift: animaladvoc, haunt, howl, httpstcoy9nzbiqlkc, ingredi, marketend, milk
Score: httpstcoy9nzbiqlkc, everi, anim, howl, cult, respect, animaladvoc
Topic 69 Top Words:
Highest Prob: sexual, includ, rape, rt, amp, girl, women
FREX: includ, sexual, tigrayan, mutil, rape, 查, gang
Lift: 33, militar, 查, 盗, dowellml, mutilatio, shape
Score: 盗, sexual, includ, rape, tigrayan, violenc, gang
Topic 70 Top Words:
Highest Prob: man, made, close, beg, bought, look, conserv
FREX: beg, carcinogen, injecti, virul, man, joy, bought
Lift: carcinogen, injecti, virul, joy, bord, cbp, paso
Score: joy, man, beg, bought, carcinogen, injecti, virul
Topic 71 Top Words:
Highest Prob: covid, flu, rt, today, credit, got, https
FREX: flu, 104, depressionbutt, epochinspir, grandpa, wwii, spanish
Lift: 104, canberra, deliveri, depressionbutt, driveway, epochinspir, grandpa
Score: ya, flu, 104, depressionbutt, epochinspir, grandpa, wwii
Topic 72 Top Words:
Highest Prob: happi, pandem, declar, today, birthday, anniversari, middl
FREX: birthday, happi, anniversari, declar, third, class, middl
Lift: janisirwin, kronii, birthday, anniversari, kronillust, krotanjoubi, third
Score: kronii, happi, anniversari, birthday, third, declar, class
Topic 73 Top Words:
Highest Prob: bank, silicon, valley, rt, risk, fail, manag
FREX: ewarren, silicon, roll, fail, valley, bank, head
Lift: 5bn, bitcoinmagazin, consider, healthier, queer, spearhead, ahmedbaba
Score: bank, silicon, valley, btcarchiv, fail, ewarren, donald
Topic 74 Top Words:
Highest Prob: law, friend, rt, amp, gun, notic, communiti
FREX: law, friend, refer, mayor, notic, dedic, vice
Lift: cbergie1007, enforcem, everytown, faucinot, flvoicenew, gallopinggay, matjendav4
Score: cbergie1007, law, friend, notic, gun, refer, un
Topic 75 Top Words:
Highest Prob: come, next, rt, tweet, done, back, day
FREX: come, next, parliament, couldnt, tweet, stone, done
Lift: bridgen, def, facil, mythic, sashamackinnon, unmint, whitelist
Score: salami, come, next, tweet, done, refresh, mythic
Topic 76 Top Words:
Highest Prob: elonmusk, thechiefnerd, mrna, rt, world, matter, danger
FREX: thechiefnerd, elonmusk, mrna, synthet, product, danger, ppl
Lift: synthet, aghuff, antibodi, enzolyt, fals, monoclon, onlytrippi
Score: thechiefnerd, fals, mrna, elonmusk, product, danger, spartajustic
Topic 77 Top Words:
Highest Prob: start, pandem, vs, black, problem, rt, today
FREX: vs, start, black, problem, defend, sig, highlight
Lift: sig, easier, vs, highlight, black, hunter, signal
Score: sig, start, vs, black, problem, highlight, pandem
Topic 78 Top Words:
Highest Prob: trump, crisi, pandem, creat, rt, respons, end
FREX: crisi, rail, disband, noliewithbtc, follow, end, creat
Lift: cjjohnson17th, disband, rail, dive, impend, noliewithbtc, buildjakapan
Score: disband, rail, trump, noliewithbtc, crisi, obamabiden, cjjohnson17th
Topic 79 Top Words:
Highest Prob: safe, rt, seem, decis, covid, bank, fall
FREX: safe, decis, seem, coordin, apr, coast, fall
Lift: admiss, harbor, stackhodl, frankdescushin, mirror, phrase, rearview
Score: admiss, safe, seem, decis, apr, coordin, babszon
Topic 80 Top Words:
Highest Prob: student, sinc, pandem, chang, rt, system, school
FREX: oregon, sinc, student, mrandyngo, thousand, school, cbcs
Lift: cbcs, elkebabiuk, factchec, ournewhomecoach, outperform, trudeaus, accou
Score: student, ournewhomecoach, oregon, sinc, thousand, mrandyngo, ten
Topic 81 Top Words:
Highest Prob: care, rt, health, plan, built, pandem, wall
FREX: outsourc, solomonmissouri, built, care, wall, plan, health
Lift: alterivan1, cholera, stakehold, 17th, academi, caringacrossgen, lanceusa70
Score: trumpsdontgiveadamn, care, health, built, plan, outsourc, solomonmissouri
Topic 82 Top Words:
Highest Prob: live, life, yet, anoth, want, rt, age
FREX: life, live, age, yet, mattletiss7, anoth, arriv
Lift: mattletiss7, toman, mvankerkhov, age, cute, life, live
Score: toman, life, live, yet, age, mattletiss7, anoth
Topic 83 Top Words:
Highest Prob: rt, expert, bank, us, fund, war, pandem
FREX: expert, shutdown, slo, war, trigger, 2022, annemar45451941
Lift: annemar45451941, kiel, thaigerman, anastas25608217, barrett, ctv, httpstcovdvmyezplr
Score: httpstcovdvmyezplr, expert, trigger, shutdown, war, fund, 2022
Topic 84 Top Words:
Highest Prob: peopl, rt, irrespons, said, went, amp, fauci
FREX: irrespons, cook, karilak, arrest, peopl, cnn, england
Lift: innoc, yash25571056, 14000, risemelbourn, wale, 3year, dozen
Score: innoc, peopl, irrespons, yash25571056, karilak, cook, 14000
Topic 85 Top Words:
Highest Prob: break, rt, learn, covid, news, r, spartajustic
FREX: graphen, oxid, break, whistleblow, learn, rope, content
Lift: graphen, los, oxid, rope, voucher, climatetech, sector
Score: rope, break, graphen, oxid, whistleblow, vial, pfizer
Topic 86 Top Words:
Highest Prob: risk, dont, pleas, anyon, allow, access, yo
FREX: risk, yo, ausbassi, environ, tonightpray, access, anyon
Lift: reskless, tast, ausbassi, environ, tonightpray, click, thera666
Score: risk, tast, dont, abil, ausbassi, environ, tonightpray
Topic 87 Top Words:
Highest Prob: o, rt, brain, research, univers, expect, reason
FREX: univers, griffith, identifi, o, expect, brain, fatigu
Lift: chrisjmerch, friedmanja, georgefreemanmp, griffith, httpstcowwdgnysdyx, nceoscienc, nercscienc
Score: chrisjmerch, univers, expect, o, brain, research, fatigu
Topic 88 Top Words:
Highest Prob: time, first, covid, pandem, wors, 4, last
FREX: time, er, httpstco4l2wigdt6, md, default, httpstcocnowjseptt, episod
Lift: cliffohio, amt, auction, httpstco5lij7dslv9, inherit, statut, unclaim
Score: time, cliffohio, er, httpstco4l2wigdt6, md, first, default
Topic 89 Top Words:
Highest Prob: whi, rt, thing, support, love, 1goodtern, hospit
FREX: beij, neurolog, xuanwu, seal, whi, baffl, 1goodtern
Lift: beij, bin, capac, context, curent, happyharpys, httpstcog1ycexd5la
Score: happyharpys, whi, beij, neurolog, xuanwu, baffl, 1goodtern
Topic 90 Top Words:
Highest Prob: pfizer, 5, rt, find, amp, benefit, harm
FREX: 101, reanalysi, benefit, 5, ninobox, clinic, per
Lift: 101, reanalysi, sl, ninobox, 300b, ampcov, bund
Score: pfizer, 10000, 101, reanalysi, trial, clinic, moderna
Topic 91 Top Words:
Highest Prob: guy, hope, without, rt, see, nice, pandem
FREX: nice, hope, guy, kick, without, privat, majstar7
Lift: majstar7, camera, coronavirustyp, crowther, httpstco2smkyugb9n, laura, loos
Score: httpstco2smkyugb9n, guy, without, hope, nice, littl, privat
Topic 92 Top Words:
Highest Prob: biden, knew, rt, virus, democrat, 2009, right
FREX: knew, hrc, 2009, biden, 17, socialist, democrat
Lift: censor, clinton, cre, freespeech, hrc, httpstcomc8d2rsqeu, oxymoron
Score: knew, clinton, biden, 2009, hrc, angelatough, brianpr52840873
Topic 93 Top Words:
Highest Prob: one, rt, blame, said, pandem, hate, first
FREX: destruct, blame, one, dam, hate, unpreced, cycl
Lift: destruct, bull, dam, destin, exasper, guardrail, jeffroushpoetri
Score: destruct, one, blame, dam, immeme0, jacobisrael71, exasper
Topic 94 Top Words:
Highest Prob: financi, rt, bank, result, failur, sign, 2018
FREX: financi, leader, direct, 2018, result, failur, assum
Lift: dharmatrad, goodvibepolitik, hkeskiva, plea, responsib, summar, httpstcolak3xf7zhn
Score: httpstcolak3xf7zhn, financi, result, failur, direct, leader, 2018
Topic 95 Top Words:
Highest Prob: now, rt, pandem, went, know, gave, risktak
FREX: kevin, unnot, mccarthi, risktak, bi, gave, now
Lift: fewer, makismd, youarelobbylud, 125, 265, 637, abdaniellesmith
Score: youarelobbylud, now, kevin, unnot, mccarthi, reckless, gave
Topic 96 Top Words:
Highest Prob: rt, base, liber, damag, new, whi, covid
FREX: base, liber, wipe, owner, firearm, pmcculloughmd, damag
Lift: grab, lieu, nyc, rubh, therein, wipe, owner
Score: nyc, base, wipe, liber, firearm, vigilantfox, pmcculloughmd
Topic 97 Top Words:
Highest Prob: die, million, rt, dead, pandem, 3, climat
FREX: die, gender, sudden, million, climat, unexpect, dead
Lift: fearmong, selectiv, voldemorgoret, unexpect, gender, sake, sudden
Score: voldemorgoret, drloupi, unexpect, catastroph, gender, die, sudden
Topic 98 Top Words:
Highest Prob: realli, vote, pandem, late, book, ad, sorri
FREX: realli, late, vote, sorri, prisonplanet, ad, book
Lift: prisonplanet, late, sorri, realli, ad, realiz, vote
Score: prisonplanet, realli, vote, late, book, sorri, ad
Topic 99 Top Words:
Highest Prob: talk, rt, rememb, import, step, posit, peac
FREX: rememb, peac, often, posit, step, import, talk
Lift: 9month, bankcollap, 0213, alvi, dioxin, eve, height
Score: dioxin, talk, import, posit, peac, rememb, often
Topic 100 Top Words:
Highest Prob: rt, pandem, covid, 007mss, amp, go, use
FREX: rt, pandem, 007mss, covid, go, amp, use
Lift: 007mss, rt, pandem, covid, use, go, republican
Score: 007mss, rt, pandem, covid, amp, go, use
You should run diagnostics on each of your models to assess which number of topics is the “best fit” for your data.
Let’s get the residual dispersion first.
checkResiduals(BigSTM_20, documents = Big_outprep$documents)
$dispersion
[1] 15.67758
$pvalue
[1] 0
$df
[1] 766811
checkResiduals(BigSTM_50, documents = Big_outprep$documents)
$dispersion
[1] 33.36291
$pvalue
[1] 0
$df
[1] 413797
checkResiduals(BigSTM_100, documents = Big_outprep$documents)
$dispersion
[1] -92.38647
$pvalue
[1] NaN
$df
[1] -188808
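Notice that the dispersion score for the 100 topic model is negative! Its degrees of freedom are negative as well, so the p-value is undefined (NaN) and the test cannot be interpreted for that model. I can reject the null that dispersion = 1 for both the 20 and 50 topic models (the p-value is 0 in each case). Their dispersion scores (15.68 and 33.36) are both well above 1, with the 20 topic model sitting closer to 1 than the 50 topic model.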
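To compare the three results side by side, I can gather them into one small table. This is a minimal sketch in base R: the values are copied by hand from the checkResiduals output above, and the columns "interpretable" and "dist_from_1" are my own labels, not something stm returns.

```r
# Dispersion results copied by hand from the checkResiduals() output above.
dispersion_tab <- data.frame(
  topics     = c(20, 50, 100),
  dispersion = c(15.67758, 33.36291, -92.38647),
  df         = c(766811, 413797, -188808)
)

# A chi-squared test needs positive degrees of freedom, so flag the rows
# where the reported test can actually be read. ("interpretable" is a
# hypothetical helper column, not part of the stm output.)
dispersion_tab$interpretable <- dispersion_tab$df > 0

# Distance of each dispersion score from the null value of 1.
dispersion_tab$dist_from_1 <- abs(dispersion_tab$dispersion - 1)

print(dispersion_tab)
```

Only the rows flagged as interpretable should be compared on their distance from 1.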
Given this information, I could go ahead and run more models, or I could continue with my diagnostics before making the decision about further models.
Let’s keep looking at these models, starting with plotting the semantic coherence and exclusivity for the topics.
par(mar = c(5, 4, 2, 1),
    oma = c(0.8, 0.8, 0.8, 0.5),
    cex = 0.8)
plot(BigSTM_manyT$semcoh[[1]],
     BigSTM_manyT$exclusivity[[1]],
     col = "blue",
     pch = 19,
     xlab = c(" "),
     ylab = c(" "),
     xlim = c(-280, -40),
     ylim = c(9.4, 10.1)
)
points(BigSTM_manyT$semcoh[[2]],
       BigSTM_manyT$exclusivity[[2]],
       col = "green",
       pch = 19
)
points(BigSTM_manyT$semcoh[[3]],
       BigSTM_manyT$exclusivity[[3]],
       col = "red",
       pch = 19
)
legend("bottomleft",
       legend = c("20 Topics","50 Topics","100 Topics"),
       fill = c("blue","green","red"),
       title = c("Models"))
title(main = c("Topic Quality"),
      xlab = c("Semantic Coherence"),
      ylab = c("Exclusivity"))
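When the points for different models overlap, it can help to put numbers next to the plot. The sketch below averages semantic coherence and exclusivity within each model. It assumes an object laid out like BigSTM_manyT, with a "semcoh" list and an "exclusivity" list holding one numeric vector per model. The helper name summarise_quality and the toy values are hypothetical and are only there so the example runs on its own.

```r
# Hypothetical helper: average topic quality per model.
# semcoh and exclusivity are lists with one numeric vector per model,
# the same layout as BigSTM_manyT$semcoh and BigSTM_manyT$exclusivity.
summarise_quality <- function(semcoh, exclusivity, K) {
  data.frame(
    topics           = K,
    mean_semcoh      = vapply(semcoh, mean, numeric(1)),
    mean_exclusivity = vapply(exclusivity, mean, numeric(1))
  )
}

# Toy stand-in values (made up, NOT results from the tutorial data).
toy <- list(
  semcoh      = list(c(-100, -120), c(-80, -90, -100)),
  exclusivity = list(c(9.5, 9.7), c(9.8, 9.9, 10.0))
)
quality <- summarise_quality(toy$semcoh, toy$exclusivity, K = c(2, 3))
print(quality)

# With the tutorial's object the call would be:
# summarise_quality(BigSTM_manyT$semcoh, BigSTM_manyT$exclusivity,
#                   K = c(20, 50, 100))
```

Averages hide the spread across topics, so I would read them alongside the scatterplot rather than instead of it.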
Looking at the plot of topic quality for the three models, the ideal model would have points clustered in the top right corner of the graph. The closest model is the 50 topic model, represented by the green points on the plot. Let’s look at the MAP estimates for the 50 topic model.
plot(BigSTM_50, type = "hist", topics = 1:20)
View the output for topics 1-20 (50 topic model).
plot(BigSTM_50, type = "hist", topics = 21:40)
View the output for topics 21-40 (50 topic model).
plot(BigSTM_50, type = "hist", topics = 41:50)
View the output for topics 41-50 (50 topic model).
These estimates all cluster toward 0. Taken together, the diagnostics on the 50 topic model indicate that it is an okay fit, but we can probably find a better model.
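The histograms summarise the document-topic proportions stored in the model's "theta" matrix (one row per document, one column per topic). As a rough numeric check, I can compute the share of documents whose proportion for each topic falls below some cutoff. This is a sketch on a made-up matrix: the helper share_near_zero and the 0.05 cutoff are my own assumptions, not stm defaults.

```r
# Hypothetical helper: for each topic (column), the share of documents
# (rows) whose topic proportion is below `cutoff`.
share_near_zero <- function(theta, cutoff = 0.05) {
  colMeans(theta < cutoff)
}

# Toy document-topic matrix (made up): 4 documents, 3 topics, rows sum to 1.
toy_theta <- matrix(
  c(0.90, 0.08, 0.02,
    0.01, 0.97, 0.02,
    0.02, 0.03, 0.95,
    0.96, 0.02, 0.02),
  nrow = 4, byrow = TRUE
)

print(share_near_zero(toy_theta))

# Expected topic proportions are the column means of theta.
print(colMeans(toy_theta))

# With the tutorial's model the call would be:
# share_near_zero(BigSTM_50$theta)
```

A topic with a high share near zero is one that most documents barely touch, which matches histograms piled up at the left edge.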
For the purposes of the tutorial, I will move to the next step. If I were using this analysis for a publication, I would probably try to find a better base model before adding covariates.
Let’s finish out the tutorial with some more graphs, sticking with the 50 topic model.
I can find words that load exclusively onto a topic with the “checkBeta” function. Let’s see if any of the topics in the 50 topic model have exclusive words.
checkBeta(BigSTM_50)
$problemTopics
$problemTopics[[1]]
integer(0)
$topicErrorTotal
$topicErrorTotal[[1]]
[1] 0
$problemWords
$problemWords[[1]]
Word Topic
[1,] "No Words" "Load Exclusively Onto 1 Topic"
$wordErrorTotal
$wordErrorTotal[[1]]
[1] 0
$checked
[1] TRUE
In this case, no words load exclusively onto 1 topic.
I can make a wordcloud that shows the marginal distribution of words in the entire corpus (across all topics), or I can look at individual topics. I know I want to look at the entire corpus, so let’s start there.
The “type” argument specifies whether to look at the probability of a word given a topic, or whether to instead plot words within documents with a topic proportion higher than some threshold.
I’ll start with the “model” type to look at the probability of a word given a topic. The max words defaults to 100. I like to start with slightly fewer words than this when I have under 100,000 documents (which here, I do). I’ll set the “max.words” argument to equal 75 here.
cloud(BigSTM_50,
topic = NULL,
type = c("model"),
max.words = 75)
Now let’s try with the max words set to 150.
cloud(BigSTM_50,
topic = NULL,
type = c("model"),
max.words = 150)
Looking at these clouds, the first important thing I notice is that “rt” is the largest word. This makes sense given that these data are tweets: “RT” is commonly used to note that one is “retweeting” the comments that follow. This is often the result of wanting to “retweet” a post from a user whose profile is set to private (meaning their posts cannot be retweeted, aka reposted). It can also appear in the comments of a post to note strong agreement.
The next largest words are “pandem” (the root for “pandemic”), “protect,” “covid,” “bank,” “virus,” “risk,” “peopl” (the root for “people”), and “trump.” These words are slightly more informative. Importantly, the words “covid,” “pandemic,” “risk,” “protection,” and “virus” were the keywords queried to obtain these tweets, so it makes sense that they would be prevalent across topics and documents. The words “bank,” “peopl,” and “trump” are notable for being prevalent without being among the query keywords. In these data, many of the tweets related to former President Donald Trump, and many mentioned more general things like “people” and financial institutions.
Now, say I want to look at words within documents with a threshold of 0.5. I would specify the “documents” type and set the threshold to equal 0.5. Let’s try it.
cloud(BigSTM_50,
type = c("documents"),
documents = Big_outprep$documents,
thresh = 0.5,
max.words = 200)
In this case, the results do not change very much. I can plot the expected topic proportions to get a better idea of what these words look like in the topics.
par(mar = c(4, 1, 2.5, 3),
oma = c(0, 0, 0, 0),
cex = 0.6,
pin = c(8, 8),
pty = c("m"))
plot(BigSTM_50,
type = c("summary"),
n = 15,
main = c("Topics - Structural Topic Model with 50 Topics"),
xlab = c("Expected Topic Proportion"),
width = 30,
text.cex = 0.9)
Topic 12 here gives some insight into the prevalence of the word “bank” in the wordclouds. It looks like there were many discussions of the United States federal budget, which makes sense, as this has been an ongoing issue in Congress since the start of 2023, and these data were collected in early 2023 (March 14).
I can get extended information and top words using different algorithms with the stm “sageLabels” function.
sageLabels(BigSTM_50, n = 15)
Topic 1:
Marginal Highest Prob: covid19, rt, mrna, world, vaccin, warn, matter, new, product, danger, spartajustic, death, health, thechiefnerd, elonmusk
Marginal FREX: product, covid19, murder, danger, mrna, found, warn, jeno, world, matter, spartajustic, date, induc, mortal, protein
Marginal Lift: encod, jama, npis, nucleocapsid, puneindia, retrospect, transm, ultimahora, longev, nutrit, occurr, tomborelli, mvankerkhov, 17000, diagnos
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, ultimahora, covid19, mrna
Topic Kappa:
Kappa with Baseline:
Topic 2:
Marginal Highest Prob: ill, potus, social, kid, black, serv, medicar, will, western, women, without, energi, law, noteveri, speech
Marginal FREX: energi, noteveri, serv, ill, black, bless, western, kid, baddcompani, joke, surpris, potus, evil, speech, social
Marginal Lift: elimin, useounasscodenamshicouponnoondiscountsivvipromo, httpstcomhl2f97a1r, mingyulog, capac, httpstcog1ycexd5la, jonathanbodnar, energi, faucinot, gallopinggay, matjendav4, noteveri, bord, cbp, paso
Marginal Score: elimin, justiceforeden, punishedmoth, stra, ill, social, potus, black, medicar, western, serv, energi, kid, noteveri, without
Topic Kappa:
Kappa with Baseline:
Topic 3:
Marginal Highest Prob: children, rt, creat, week, care, abl, one, face, crisi, two, straight, proposit, outsourc, solomonmissouri, abhor
Marginal FREX: outsourc, solomonmissouri, children, abhor, revel, steviemat, straight, abl, week, proposit, care, yash25571056, creat, face, exasper
Marginal Lift: greed, outsourc, solomonmissouri, dcverso1, kevros765, exasper, verbos, abhor, revel, steviemat, yash25571056, cbsnew, httpstcodturqddxwc, rebound, straight
Marginal Score: anyway, bust, cult, greed, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, trustwallet, creat, children, crisi
Topic Kappa:
Kappa with Baseline:
Topic 4:
Marginal Highest Prob: trump, respons, end, rt, pandem, follow, administr, team, crisi, creat, obamabiden, noliewithbtc, rail, disband, direct
Marginal FREX: administr, end, respons, team, noliewithbtc, rail, trump, follow, obamabiden, disband, crisi, direct, creat, weaken, senwarren
Marginal Lift: backward, repronnyjackson, rail, noliewithbtc, disband, administr, team, obamabiden, weaken, follow, end, senwarren, direct, respons, crisi
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, repronnyjackson, stra, trustwallet, trump, disband
Topic Kappa:
Kappa with Baseline:
Topic 5:
Marginal Highest Prob: bank, silicon, valley, took, risk, investor, will, know, protect, potus, expert, mo, signatur, deposit, dont
Marginal FREX: investor, silicon, bank, valley, mo, 827js, coffe, httpstcoieid4tgr3c, nywolforg, took, expert, sig, inflict, recount, signatur
Marginal Lift: 827js, ashishkjha46, calendar, cdcdirector, longcovidawarenessday, march15, norpelsam, notrecov, jp, morgan, investor, anastas25608217, illiquid, mitziforpelosi, ryanafourni
Marginal Score: 827js, anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, bank, silicon
Topic Kappa:
Kappa with Baseline:
Topic 6:
Marginal Highest Prob: pandem, time, dure, rt, first, left, wake, treati, wolsn, httpstcowe74vshlw, certain, hous, money, turn, chuckcallesto
Marginal FREX: httpstcowe74vshlw, dure, wolsn, treati, cliffohio, httpstcozvaj0jhzoz, left, unseri, time, first, pandem, certain, wake, chuckcallesto, livestream
Marginal Lift: cliffohio, httpstcozvaj0jhzoz, unseri, amt, auction, httpstco5lij7dslv9, httpstcowe74vshlw, inherit, resudesu, statut, unclaim, livestream, walkerbragman, squar, wolsn
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, resudesu, stra, trustwallet, pandem, time
Topic Kappa:
Kappa with Baseline:
Topic 7:
Marginal Highest Prob: us, thank, didnt, rt, also, protect, million, mani, continu, wouldnt, god, american, abandon, led, taken
Marginal FREX: wouldnt, pressur, thank, abandon, didnt, led, god, continu, also, smoke, taken, denisedewald, lift, million, us
Marginal Lift: muthoninjakw, streng, meltvirus, ment, pressur, sincer, tinkswonu, wonwoo, expedit, unsaf, medium, resea, kristinoem, wouldnt, smoke
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, meltvirus, mr, non, punishedmoth, reall, stra, trustwallet, thank, didnt
Topic Kappa:
Kappa with Baseline:
Topic 8:
Marginal Highest Prob: pfizer, amp, 5, find, rt, harm, per, 3, vigilantfox, benefit, 10000, clinic, trial, time, moderna
Marginal FREX: 101, reanalysi, benefit, pfizer, per, harm, trial, odd, find, 5, 10000, clinic, vigilantfox, washburnealex, httpstcoynb67uplr
Marginal Lift: httpstcoynb67uplr, stevenvoiceov, 101, reanalysi, washburnealex, conjunct, pbis, pbiuk, ukbas, odd, pandemiccost, benefit, trial, fundrais, per
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, washburnealex, pfizer, 101
Topic Kappa:
Kappa with Baseline:
Topic 9:
Marginal Highest Prob: china, rt, th, wuhan, know, origin, virus, biden, hold, busi, joe, account, will, america, amp
Marginal FREX: hold, mrschines, th, china, wuhan, origin, mitch, beij, neurolog, xuanwu, baffl, joe, 1goodtern, busi, oct
Marginal Lift: beij, frame, httpstcoczlrqfornm, impeach, mrschines, neurolog, xuanwu, oct, httpstcoemiv7upaig, mbrookerhk, remad, resum, mitch, baffl, 33
Marginal Score: anyway, bust, cult, greedi, impeach, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, china, wuhan
Topic Kappa:
Kappa with Baseline:
Topic 10:
Marginal Highest Prob: rt, theyr, happen, take, understand, u, protect, even, depositor, talk, paid, insur, rememb, mass, fdic
Marginal FREX: atroc, repthomasmassi, ninobox, premium, theyr, happen, hmm, justice4tigray, tigray, paid, u, understand, scoop, commit, mass
Marginal Lift: 2x2sometimes5, 96, atroc, babyberoo96, brexitbust, charlott, drawbridg, drrf, enchantress, entic, eve, evict, fawcett, heaven, jenric
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, libsoftiktok, mr, non, punishedmoth, reall, stra, trustwallet, theyr, atroc
Topic Kappa:
Kappa with Baseline:
Topic 11:
Marginal Highest Prob: come, person, wrong, rt, lockdown, friend, covid, disastr, fcretir, closur, 12, sound, 1st, amp, immigr
Marginal FREX: wrong, person, come, fcretir, disastr, lockdown, closur, friend, realpeteyb123, immigr, sound, queer, wro, fatal, color
Marginal Lift: realpeteyb123, desir, tag, text, wro, fam, wrong, queer, drpaulmarik1, stretch, tho, vitsm, fcretir, disastr, supervisor
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, tag, trustwallet, wrong, come
Topic Kappa:
Kappa with Baseline:
Topic 12:
Marginal Highest Prob: covid, rt, bailout, state, far, weve, seen, ukrain, blue, loan, dcdraino, relief, disguis, student, vaccin
Marginal FREX: graphen, oxid, disguis, attend, dcdraino, whistleblow, coldlik, plethora, bailout, 14000, wale, far, dropout, blue, relief
Marginal Lift: captiv, graphen, httpstcotvkzapivoj, karma, los, oxid, politicalprison, prouddoggiemom, solitari, thrown, vac, voucher, antivaxx, chicago1ray, chud
Marginal Score: anyway, bust, cult, genius, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, covid, bailout
Topic Kappa:
Kappa with Baseline:
Topic 13:
Marginal Highest Prob: risk, rt, svb, manag, money, put, head, run, system, failur, high, simpl, lobbi, way, bailout
Marginal FREX: belli, johnnyxbrown, looser, readi, calvari, risk, fai, svb, cvpayn, debt, simpl, manag, lobbi, reward, failur
Marginal Lift: belli, goodvibepolitik, johnnyxbrown, responsib, lhfang, readi, reskless, tes, brianbrenberg, climatetech, sector, vital, calvari, fai, looser
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, readi, reall, stra, trustwallet, risk, svb
Topic Kappa:
Kappa with Baseline:
Topic 14:
Marginal Highest Prob: got, everi, rt, effect, 2020, anim, doe, articl, mask, travel, secret, mother, earli, protect, anybodi
Marginal FREX: got, secret, anim, effect, everi, httpstcoy9nzbiqlkc, movetheworldus, anybodi, 2020, travel, zeynep, ritual, articl, importan, journal
Marginal Lift: explor, haunt, heal, httpstcoy9nzbiqlkc, hydroxychloroquin, importan, ingredi, journal, marketend, milk, movetheworldus, novavax, ritual, solidar, toni
Marginal Score: anyway, bust, cult, greedi, heal, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, got, everi
Topic Kappa:
Kappa with Baseline:
Topic 15:
Marginal Highest Prob: dr, media, rt, receiv, kanekoathegreat, ha, stand, bhakdi, sucharit, welcom, germani, hometown, hero, legaci, ovat
Marginal FREX: ha, ovat, media, bhakdi, sucharit, hometown, germani, legaci, kanekoathegreat, dr, receiv, virologist, stand, hero, frequenc
Marginal Lift: frequenc, geograph, marburg, virologist, ovat, legaci, yahoonew, hometown, germani, bhakdi, sucharit, ha, jimmydor, stand, kanekoathegreat
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, virologist, dr, bhakdi
Topic Kappa:
Kappa with Baseline:
Topic 16:
Marginal Highest Prob: go, rt, let, protect, fight, point, cut, clear, valu, singl, blame, t, key, budget, help
Marginal FREX: butat, cortesstev, fight, go, religion, valu, let, penni, blame, cut, singl, point, httpstcolmzfjphrt, thatdayin1992, clear
Marginal Lift: acosta, battl, butat, cortesstev, lollipop, perjuri, queri, clas, everybodi, foster, qampa, aft, httpstcolmzfjphrt, thatdayin1992, imm
Marginal Score: anyway, battl, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, go, let
Topic Kappa:
Kappa with Baseline:
Topic 17:
Marginal Highest Prob: one, fauci, elonmusk, great, thechiefnerd, caus, vaccin, rt, cancer, covid, cours, use, discuss, mrna, prosecut
Marginal FREX: great, elonmusk, mbalter, thechiefnerd, drelidavid, type, cancer, synthet, discuss, fauci, polio, cours, potenti, caus, idk
Marginal Lift: mbalter, aghuff, synthet, allbitenobark88, recip, bioreal, robertwright, sullydish, mediocr, drelidavid, email, polio, idk, scientif, tumor
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mbalter, mr, non, punishedmoth, reall, stra, trustwallet, elonmusk, thechiefnerd
Topic Kappa:
Kappa with Baseline:
Topic 18:
Marginal Highest Prob: like, rt, look, begin, someth, fdic, onli, sexual, project, rule, sign, roll, back, rape, girl
Marginal FREX: look, jinxland, rting, roll, project, sexual, someth, 125b, 126, patrickbetdavid, ewarren, begin, like, crazi, girl
Marginal Lift: carcinogen, ding, edit, injecti, jinxland, kyl33t, premier, rting, sigmat, virul, 800, aidanthejest, annaekstrom, brown, disembowel
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, sigmat, stra, trustwallet, like, look
Topic Kappa:
Kappa with Baseline:
Topic 19:
Marginal Highest Prob: year, rt, pandem, just, 3, dead, ago, last, die, climat, sudden, million, health, gender, unexpect
Marginal FREX: huy, quan, berniesand, dc, christma, unexpect, sudden, midst, drloupi, year, ago, dod, catastroph, gender, climat
Marginal Lift: huy, observ, quan, biontain, biontechgroup, dc, mileston, okd, rwanda, senatorf, spreadi, secblinken, senato, sensass, craigaspenc
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, observ, punishedmoth, reall, stra, trustwallet, year, dead
Topic Kappa:
Kappa with Baseline:
Topic 20:
Marginal Highest Prob: can, rt, lot, protect, develop, might, organ, gun, longer, volatil, damag, brain, longcovid, function, use
Marginal FREX: lot, can, permitl, shannonrwatt, volatil, longer, might, stress, develop, organ, 20230314, finger, carri, teen, gun
Marginal Lift: 10yearold, authent, baat, cognit, conceptualjam, impair, kg, libbi, mann, modi, nurseosavag, permitl, shannonrwatt, thewirein, unmitig
Marginal Score: anyway, bust, conceptualjam, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, can, lot
Topic Kappa:
Kappa with Baseline:
Topic 21:
Marginal Highest Prob: never, kill, guy, rt, mayb, hard, protect, assess, one, heard, covid, almost, wife, tri, diedsudden
Marginal FREX: assess, hard, guy, mayb, kill, diedsudden, never, wife, almost, anxieti, ft, medicin, object, heard, majstar7
Marginal Lift: gwenmommabear, majstar7, mourn, pharmacogenom, rosewind2007, tenebra99, thisisnothappen, tragic, disord, heartbreak, diedsudden, ft, assess, object, wife
Marginal Score: anyway, bust, cult, greedi, gwenmommabear, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, never, kill
Topic Kappa:
Kappa with Baseline:
Topic 22:
Marginal Highest Prob: work, virus, stop, rt, knew, bad, covid, ani, fauci, whi, gregrubini, bio, xi, bot, jinp
Marginal FREX: jinp, 2009, bio, sniper, work, 2015, humankind, coverup, knew, xi, bot, stop, jessekellydc, dictat, pmcculloughmd
Marginal Lift: 70k, competitor, dwr, humankind, maestro, mc, monterey, parjaro, sccounti, sniper, swipe, usacehq, usbr, watsonvillec, 2009
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, wendyor, work, virus
Topic Kappa:
Kappa with Baseline:
Topic 23:
Marginal Highest Prob: rt, last, time, covid, place, 4, wors, chanc, 6, 2023, remind, vaccin, mitig, storiesofinjuri, md
Marginal FREX: er, httpstco4l2wigdt6, 4, alterivan1, cholera, stakehold, chanc, md, last, 6, sustain, remind, wors, 25k, storiesofinjuri
Marginal Lift: alterivan1, cholera, crown9th, hrtcoup, kst, mucor, stakehold, delilah, dy, guardrail, hialeah, jeffroushpoetri, missi, missingkid, stage
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stage, stra, trustwallet, last, time
Topic Kappa:
Kappa with Baseline:
Topic 24:
Marginal Highest Prob: day, long, veri, next, rt, covid, test, studi, still, sever, peopl, just, number, sometim, larg
Marginal FREX: veri, next, site, chicago, mythic, popup, sashamackinnon, unmint, whitelist, long, refresh, test, day, studi, stone
Marginal Lift: chicago, mythic, popup, sashamackinnon, unmint, whitelist, alvi, arewer, height, jmetr22b, lo, zishan, deworm, 77, hors
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, lo, mr, non, punishedmoth, reall, stra, trustwallet, veri, next
Topic Kappa:
Kappa with Baseline:
Topic 25:
Marginal Highest Prob: protect, rt, countri, fail, cost, invest, help, uk, biden, educ, member, us, say, ani, polic
Marginal FREX: paulmitchellab, andrew, cost, bridgen, def, facil, countri, parliament, uk, abc, sub, educ, invest, wow, fail
Marginal Lift: bridgen, def, facil, grab, paulmitchellab, 368, andrew, appl, ash, billioncan, breakfast, cryptocurr, forb, gabi, independ
Marginal Score: anyway, bust, cult, gabi, greedi, jkirk, mr, non, reall, stra, trustwallet, protect, cost, invest, countri
Topic Kappa:
Kappa with Baseline:
Topic 26:
Marginal Highest Prob: rt, student, public, sinc, system, school, chang, pandem, lose, ten, mrandyngo, thousand, oregon, whole, wasnt
Marginal FREX: httpstco7eepdml4yd, oregon, prisonplanet, fearmong, selectiv, mrandyngo, public, late, school, sinc, student, sister, thousand, ten, 250000
Marginal Lift: httpstco7eepdml4yd, prisonplanet, 300b, ampcov, bund, cap, categori, cervic, classifi, concentra, ctas, depoprovera, diaphragm, dioxin, elid
Marginal Score: anyway, bust, cult, greedi, httpstco7eepdml4yd, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, student, thousand
Topic Kappa:
Kappa with Baseline:
Topic 27:
Marginal Highest Prob: im, rt, well, sure, mask, wear, befor, just, covid, record, wont, coupl, fuck, realli, say
Marginal FREX: tilda, im, sure, choir, preach, sho, swinton, workforc, coupl, note, wear, well, fuck, record, brownecfm
Marginal Lift: choir, merry123459, pierrepoilievr, preach, sho, 114f, 42c, 457c, aacoek, lean, matam, pgdyne, tilda, workforc, airbnb
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, pierrepoilievr, punishedmoth, reall, stra, trustwallet, im, sure
Topic Kappa:
Kappa with Baseline:
Topic 28:
Marginal Highest Prob: want, rt, p, fact, stori, protect, opinion, new, sever, buy, press, tell, jake, kill, everyth
Marginal FREX: justiceforeden, punishedmoth, younger, want, press, jake, stori, howl, fact, 250k, opinion, intend, brother, p, electricalwsop
Marginal Lift: electricalwsop, howl, justiceforeden, punishedmoth, riskavers, te, younger, baji, bigpharma, cutest, hyung, jakez0n, markhamppc, testdem, theatr
Marginal Score: anyway, baji, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, want, press
Topic Kappa:
Kappa with Baseline:
Topic 29:
Marginal Highest Prob: now, start, pandem, rt, covid, thought, vs, best, thread, ive, good, parent, https, surviv, build
Marginal FREX: now, vs, 104, depressionbutt, epochinspir, grandpa, wwii, thought, surviv, spanish, thread, start, minor, https, best
Marginal Lift: realest, 104, abdaniellesmith, chrisjmerch, depressionbutt, epochinspir, friedmanja, georgefreemanmp, grandpa, httpstco2smkyugb9n, httpstconn9ketzdez, httpstcowwdgnysdyx, nceoscienc, ndp, nercscienc
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, riskfre, stra, trustwallet, now, start
Topic Kappa:
Kappa with Baseline:
Topic 30:
Marginal Highest Prob: dont, rt, pleas, anyon, protect, allow, just, yo, access, pray, abil, man, dear, ausbassi, environ
Marginal FREX: ausbassi, environ, tonightpray, corybook, maggiehassan, senateforeign, senatormenendez, senatorshaheen, sfrcdem, usun, senatedem, pleas, beauti, yo, abil
Marginal Lift: ournewhomecoach, outperform, ausbassi, corybook, environ, joy, karenkho, maggiehassan, puppymaximum, senateforeign, senatormenendez, senatorshaheen, sfrcdem, tonightpray, usun
Marginal Score: anyway, beauti, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, anyon, pleas
Topic Kappa:
Kappa with Baseline:
Topic 31:
Marginal Highest Prob: peopl, rt, mani, around, just, protect, walk, million, world, die, absolut, pandem, hospit, 1lonerlifestyl, bitch
Marginal FREX: 1lonerlifestyl, bitch, essenc, soulless, tryna, around, peopl, sarah, 30, absolut, walk, mani, purchas, richer, senior
Marginal Lift: 125, 1lonerlifestyl, 265, 637, advantag, bake, bitch, cuban, essenc, gunnelswarren, mollyjongfast, risemelbourn, soulless, tryna, karen
Marginal Score: advantag, anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, peopl, mani
Topic Kappa:
Kappa with Baseline:
Topic 32:
Marginal Highest Prob: get, thing, rt, big, protect, two, presid, iowa, realdonaldtrump, alexbruesewitz, secur, social, ethanol, applaus, vow
Marginal FREX: ethanol, applaus, vow, alexbruesewitz, realdonaldtrump, iowa, get, thing, twitterhol, big, piec, presid, disappear, two, trhloffici
Marginal Lift: twitterhol, abejmorri, advent, applaus, bor, hhepplewhit, httpstco6lhq2m2apj, jonddo, propagan, rainnwilson, robbystarbuck, stomach, taylorvizionn, vow, ethanol
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stomach, stra, trustwallet, get, ethanol
Topic Kappa:
Kappa with Baseline:
Topic 33:
Marginal Highest Prob: virus, rt, woke, infect, mind, caus, like, everyth, one, onli, dont, gop, call, georgetakei, die
Marginal FREX: kdnerak33, 75, bola, elluu, obi, officialtelz, pastor, georgetakei, infect, woke, elhopkin, zombi, gop, mind, voldemorgoret
Marginal Lift: elhopkin, 2344b, 75, bola, clade, elluu, httpstco1d6pzpy91d, httpstcoqg2rc25xfv, kdnerak33, nya, obi, officialtelz, oobbbear, pastor, tryangregori
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, tryangregori, virus, woke
Topic Kappa:
Kappa with Baseline:
Topic 34:
Marginal Highest Prob: will, rt, fauci, 2017, said, trump, becaus, face, 11, gregrubini, januari, made, yet, 9, pandem
Marginal FREX: 2017, 48, atmospher, usstormwatch, rain, 11, closer, totalitarian, tweetiez, 9, will, jesus, river, januari, california
Marginal Lift: cw3escripp, elev, emiss, ggreenwald, hydrolog, ordinari, satur, 48, atmospher, bankcollaps, closer, drbashir2018, jesus, rohn, totalitarian
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, ordinari, punishedmoth, reall, stra, trustwallet, will, 2017
Topic Kappa:
Kappa with Baseline:
Topic 35:
Marginal Highest Prob: use, tell, yes, better, love, rt, protect, game, away, pass, just, befor, appear, one, us
Marginal FREX: yes, tell, away, game, better, definit, use, appear, seal, regret, pass, necessari, car, enjoy, main
Marginal Lift: beyblad, rip, energet, happyharpys, httpstco1ymrfhrnhr, iconoclast1919, johniadarola, muchan1207chan, shape, ter, uhh, willeckert99, yimbyhvg, lordoftheyeti1, eugenicist
Marginal Score: anyway, beyblad, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, tell, yes
Topic Kappa:
Kappa with Baseline:
Topic 36:
Marginal Highest Prob: 2, rt, 1, interest, pandem, bill, director, gate, rate, isol, network, side, popul, treat, 3
Marginal FREX: 300k, chrosthugo, microfluid, neuron, uniqu, director, popul, 55000000, biontech, conservativecsn, fastest, pac, gate, 2, network
Marginal Lift: 300k, chrosthugo, microfluid, neuron, uniqu, 2012z, 55000000, aftermath, biontech, conservativecsn, context, curent, fastest, hnx, httpstcocgjm2u
Marginal Score: aftermath, anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, 2, director
Topic Kappa:
Kappa with Baseline:
Topic 37:
Marginal Highest Prob: rt, keep, fed, market, free, forget, protect, safe, stock, case, t, price, judg, bitcoin, civil
Marginal FREX: keep, market, civil, free, harbor, ident, stackhodl, fed, judg, price, forget, reject, request, mspopok, stock
Marginal Lift: mspopok, 12x, 3m, ghopp, harbor, httpstcovggd3vl5lg, ident, kentlee47, lummhandi, maureenstroud, mercuri, ounassnoonpromonontoyoutofordealhmbathandbodyhampm, purplerose19999, royalti, stackhodl
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, ounassnoonpromonontoyoutofordealhmbathandbodyhampm, punishedmoth, reall, stra, trustwallet, keep, fed
Topic Kappa:
Kappa with Baseline:
Topic 38:
Marginal Highest Prob: someon, stay, love, rt, protect, inform, hide, decept, dishonest, givi, theholisticpsyc, term, total, recal, els
Marginal FREX: decept, dishonest, givi, theholisticpsyc, someon, stay, hide, stole, love, inform, term, cbcs, elkebabiuk, factchec, trudeaus
Marginal Lift: cbcs, elkebabiuk, factchec, fung4l, httpstcogv3yia0ggg, phys, trudeaus, decept, dishonest, givi, stole, theholisticpsyc, impend, alexgre36024221, mtnutz
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stole, stra, trustwallet, someon, decept
Topic Kappa:
Kappa with Baseline:
Topic 39:
Marginal Highest Prob: protect, rt, alway, plan, tri, ani, see, hand, think, hell, job, trust, father, shut, popular
Marginal FREX: hell, alway, interact, intimid, popular, father, tl, plan, armi, saw, hand, njbbari3, lolli, rotten, smil
Marginal Lift: alanthompson58, amongst, assministr, serenityin24, wootton, 911onfox, interact, intimid, lekeolushuyi, lolli, luminouskryst, rotten, sili, smil, stolen
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, luminouskryst, mr, non, punishedmoth, reall, stra, trustwallet, alway, protect
Topic Kappa:
Kappa with Baseline:
Topic 40:
Marginal Highest Prob: give, join, rt, flu, miss, f, diseas, small, virus, global, practic, also, dr, signific, vaccin
Marginal FREX: join, f, diseas, miss, small, sar, give, flu, pox, yan, infecti, accord, stevebigpond, thisweekabc, vaughnmis
Marginal Lift: anthrax, barrett, chimera, cocktail, ctv, drlimengyan1, hiv, httpstcoxmb1zjabwo, insight, lisa, mer, xis, pox, nile, 03
Marginal Score: anthrax, anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, diseas, join
Topic Kappa:
Kappa with Baseline:
Topic 41:
Marginal Highest Prob: whi, say, rt, truth, virus, arent, cdc, amp, cdcthe, david, hughmankind, martin, oper, factcheck, journalist
Marginal FREX: cdcthe, hughmankind, factcheck, truth, martin, david, arent, whi, toll, say, oper, 116, donat, via, disast
Marginal Lift: album, allforj, fanbas, hardwork, httpstcoeurywyez3, hughmankind, monkeyking67, neverthelessyea, pour, toast, youtub, 116, 35mph, 645, 722
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, youtub, say, truth
Topic Kappa:
Kappa with Baseline:
Topic 42:
Marginal Highest Prob: rt, said, right, call, irrespons, went, state, amp, fauci, cnn, arrest, protect, b, peopl, cook
Marginal FREX: cook, joshuaphill, irrespons, karilak, cnn, right, millionair, angri, breitbartnew, kansa, arrest, call, said, went, 1745549
Marginal Lift: breitbartnew, joshuaphill, kansa, 1745549, ccpvirus, cumul, decemb, excessdeath, germa, httpstcobvc6dvxtln, jojofromjerz, mostransl, ushimaru2020, cook, bin
Marginal Score: anyway, bust, cult, greedi, jkirk, jojofromjerz, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, irrespons, call
Topic Kappa:
Kappa with Baseline:
Topic 43:
Marginal Highest Prob: rt, show, secur, live, new, data, brought, protect, reveal, us, biden, impact, australia, covid, cancel
Marginal FREX: leadership, brought, show, smith, coguest, ian, madg, truli, data, reveal, jnkengasong, mome, cancel, tv, impact
Marginal Lift: coguest, ian, madg, 2023s, 2900, aespa, beck, comptrol, dinapoli, fourth, fud31, glenn, jnkengasong, mome, na
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, na, non, punishedmoth, reall, stra, trustwallet, show, secur
Topic Kappa:
Kappa with Baseline:
Topic 44:
Marginal Highest Prob: make, rt, bank, billion, collaps, spread, just, today, c, covid, profit, whether, regul, took, trump
Marginal FREX: dive, cheridinovo, 20192021, loblaw, net, underpay, make, billion, svbs, trend, silverg, profit, collaps, 43, monday
Marginal Lift: princesskelli, 20192021, assumpt, httpstcopqj0zoe28a, httpstcouzoqeaoegl, katrinapanova, lite, loblaw, medrop, minneapoli, minnesota, net, notif, oneunderscor, retweetfollow
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, oneunderscor, punishedmoth, reall, stra, trustwallet, make, billion
Topic Kappa:
Kappa with Baseline:
Topic 45:
Marginal Highest Prob: rt, pay, back, noth, much, job, shit, get, order, protect, due, treasuri, second, cant, insur
Marginal FREX: much, augustjpollak, pay, shit, noth, assert, rahraw999, wealthi, treasuri, order, greedi, job, ri, non, back
Marginal Lift: ada, algo, augustjpollak, avax, dot, facebookdown, futuretak, icloud, imessag, qanplatform, qanx, xbox, zellekag, sacrifi, assert
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, xbox, pay, much
Topic Kappa:
Kappa with Baseline:
Topic 46:
Marginal Highest Prob: covid, still, rt, cant, real, believ, peopl, opportun, 1, pandem, bohemianatmosp1, even, stupid, lie, safe
Marginal FREX: bs, real, bohemianatmosp1, utter, stupid, cant, ge, opportun, believ, still, economy4health, httpstcowwayqt5xtj, luminari, 19, repeat
Marginal Lift: confou, barstoolsport, economy4health, httpstcowwayqt5xtj, luminari, httpstcodaunmusrbh, rumbl, dowellml, iluminatibot, paxlovid, cjjohnson17th, davidkra, camera, morningjo, bs
Marginal Score: anyway, barstoolsport, bust, cult, greedi, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, cant, covid
Topic Kappa:
Kappa with Baseline:
Topic 47:
Marginal Highest Prob: virus, rt, oh, done, print, lab, becom, month, hes, 2, say, ever, event, inflat, censored4sur
Marginal FREX: becom, alobarkahn, danielbaronn, print, aleresnik, transitori, amitrippedcat, twitter, done, oh, censored4sur, hes, gone, 40, came
Marginal Lift: aleresnik, delthiarick, genom, managertact, pedophil, phrase, proun, rino, tomthunkitsmind, transitori, alobarkahn, danielbaronn, slo, mechan, darpa
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, managertact, mr, non, punishedmoth, reall, stra, trustwallet, virus, gone
Topic Kappa:
Kappa with Baseline:
Topic 48:
Marginal Highest Prob: know, rt, everyon, account, lie, part, govern, onli, just, becaus, corrupt, america, bank, tri, expos
Marginal FREX: badass, gat, omgpatriot, mccarthi, kevin, unnot, digit, currenc, corrupt, catturd2, everyon, part, account, know, investi
Marginal Lift: 13m, 330k, 77steez, denisrancourt, drs4covideth, httpstco1lo4ezd8dl, httpstcoekwqg, presentatio, rancourt, rattl, teaser, 76, badass, gat, httpstcowh7lvakmsr
Marginal Score: anyway, bust, cult, greedi, httpstco1lo4ezd8dl, jkirk, justiceforeden, mr, non, punishedmoth, reall, stra, trustwallet, know, badass
Topic Kappa:
Kappa with Baseline:
Topic 49:
Marginal Highest Prob: need, rt, democrat, protect, dont, bc, agre, run, just, handl, know, oh, covid, u, safeti
Marginal FREX: need, democrat, agre, handl, bc, chiron, kelli, censorship, safeti, tyrann, babi, focus, run, lore, voter
Marginal Lift: insanit, kiwuikiwi, mandatori, pat300000, spell, vash, 2nda, 3h, 7h, acct, alyalyoutnfre, chiron, compassionateconsist, custodi, degreedummi
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mr, non, poverti, punishedmoth, reall, stra, trustwallet, need, democrat
Topic Kappa:
Kappa with Baseline:
Topic 50:
Marginal Highest Prob: rt, think, protect, covid, pandem, just, becaus, see, mattwalshblog, live, amp, take, learn, dont, one
Marginal FREX: think, rt, mattwalshblog, protect, covid, becaus, see, live, learn, just, pandem, take, life, even, amp
Marginal Lift: mattwalshblog, think, live, learn, see, rt, protect, becaus, fear, covid, just, life, w, take, shock
Marginal Score: anyway, bust, cult, greedi, jkirk, justiceforeden, mattwalshblog, mr, non, punishedmoth, reall, stra, trustwallet, protect, think
Topic Kappa:
Kappa with Baseline:
Notice that the “Topic Kappa” and “Kappa with Baseline” fields are empty in the output. This makes sense given the results of our checkBeta test, which showed no words loading exclusively onto one topic, and given that we excluded covariates from the model.
The last thing I can do is look at how the topics correlate with one another. The stm “topicCorr” function estimates the correlations between topics and returns an object that includes an adjacency matrix of positive correlations (“posadj”), which can be read as network ties between topics. You’ll need the “igraph” package for the visualizations.
I’ll start by storing the output in a workspace object called “Big50_tcorr”.
Big50_tcorr <- topicCorr(BigSTM_50)
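To make the structure of that object less of a black box, here is a minimal base R sketch of the kind of calculation behind it. It assumes the “simple” method, where the document-topic proportions are correlated and any correlation above a small cutoff counts as a tie. The “theta” matrix, the 0.01 cutoff, and every object name here are illustrative stand-ins that I simulate for the sketch; they are not taken from the tutorial’s model.

```r
# Toy document-topic proportions: 200 documents, 4 topics
# (simulated for illustration, not the tutorial data)
set.seed(42)
raw <- matrix(rexp(200 * 4), nrow = 200, ncol = 4)
theta <- raw / rowSums(raw)               # each row sums to 1
colnames(theta) <- paste0("Topic", 1:4)

cutoff <- 0.01                            # assumed cutoff for calling two topics "tied"
cormat <- cor(theta)                      # topic-by-topic correlation matrix
posadj <- ifelse(cormat > cutoff, 1, 0)   # 1 where the correlation exceeds the cutoff
poscor <- cormat * posadj                 # keep correlation values only where a tie exists

posadj
```

The 0/1 matrix plays the same role as “Big50_tcorr$posadj” in the steps that follow: a 1 means two topics tend to appear in the same documents, and a 0 means no tie is drawn between them.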
Next, attach the “igraph” package if you have not already done so.
library(igraph)
Now we can plot the topic correlation network.
plot(Big50_tcorr)
I can change the parameters to make the graph easier to read.
plot(Big50_tcorr,
vertex.color = c("pink"),
vertex.label.cex = 0.3,
vertex.size = 9,
asp = 0.4)
Finally, I can create an igraph object to give myself more control over the graph’s parameters, like the layout.
Big50_adj <- graph_from_adjacency_matrix(Big50_tcorr$posadj,
                                         mode = "undirected",
                                         weighted = NULL,
                                         diag = F)
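If you would rather keep the strength of each correlation instead of a simple tie/no-tie network, one option is to build the graph from a matrix of correlation values and set “weighted = TRUE”. The sketch below uses a small made-up correlation matrix (the “toy_cor” values are invented for illustration); with the tutorial’s objects, the analogous input would presumably be “Big50_tcorr$poscor”.

```r
library(igraph)

# Hypothetical 4-topic matrix of positive correlations (made-up values)
toy_cor <- matrix(c(1.0, 0.3, 0.0, 0.0,
                    0.3, 1.0, 0.2, 0.0,
                    0.0, 0.2, 1.0, 0.0,
                    0.0, 0.0, 0.0, 1.0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("T", 1:4), paste0("T", 1:4)))

toy_g <- graph_from_adjacency_matrix(toy_cor,
                                     mode = "undirected",
                                     weighted = TRUE,   # nonzero entries become edge weights
                                     diag = FALSE)      # ignore each topic's tie to itself

E(toy_g)$weight   # the correlation attached to each edge
```

The weights can then be mapped onto the plot, for example by passing something like “edge.width = E(toy_g)$weight * 10” to “plot”, so that stronger correlations show up as thicker lines.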
I obtain the layout coordinates by using the “layout_” functions in igraph. I’ll show three different layouts: nicely, tree, and star.
Big50_nice <- layout_nicely(Big50_adj)
Big50_tree <- layout_as_tree(Big50_adj)
Big50_star <- layout_as_star(Big50_adj)
plot(Big50_adj,
layout = Big50_nice,
vertex.color = c("pink"),
vertex.size = 10,
vertex.label.cex = 0.6,
vertex.shape = c("square"),
loops = F,
curved = F,
asp = 0.7
)
plot(Big50_adj,
layout = Big50_tree,
vertex.color = c("pink"),
vertex.size = 5,
vertex.label.cex = 0.4,
vertex.shape = c("circle"),
loops = F,
curved = F,
asp = 0.3
)
plot(Big50_adj,
layout = Big50_star,
vertex.color = c("pink"),
vertex.size = 6,
vertex.label.cex = 0.5,
vertex.shape = c("circle"),
loops = F,
curved = F,
asp = 0.6
)
Notice that there are several isolates, or topics that are not tied (correlated) to any other topics. These topics might warrant a closer look, so if I were doing this for a publication, I might explore these individual topics further to see if I can figure out why they weren’t correlated to the other topics (what about these topics was different?).
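Rather than picking the isolates out of the plot by eye, you can ask igraph for every vertex with zero ties. Here is a minimal sketch on a made-up five-topic network; “toy_adj” and its topic names are invented for illustration.

```r
library(igraph)

# Hypothetical adjacency matrix: T1-T2 and T2-T3 are tied, T4 and T5 have no ties
toy_adj <- matrix(0, nrow = 5, ncol = 5,
                  dimnames = list(paste0("T", 1:5), paste0("T", 1:5)))
toy_adj[1, 2] <- toy_adj[2, 1] <- 1
toy_adj[2, 3] <- toy_adj[3, 2] <- 1

toy_g <- graph_from_adjacency_matrix(toy_adj, mode = "undirected", diag = FALSE)

# A vertex with degree 0 has no ties to any other vertex
isolates <- V(toy_g)$name[degree(toy_g) == 0]
isolates
```

With the graph built earlier, the equivalent check would be something like “which(degree(Big50_adj) == 0)”, which should return the positions (topic numbers) of the isolates; I use “which” there on the assumption that the vertices in “Big50_adj” carry no name attribute.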