STM

Author

Giuseppe A. Veltri

STM in R

Structural Topic Modeling (STM) is a method for analyzing textual data that allows you to estimate topics while accounting for document-level covariates. The stm package in R is widely used for conducting STM analysis. Here’s an example of how to perform STM using the stm package in R:

The first step is to load data into R. The stm package represents a text corpus in three parts:

  • a documents list containing word indices and their associated counts,

  • a vocab character vector containing the words associated with the word indices,

  • a metadata matrix containing document covariates.
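As a rough illustration of this three-part layout, the sketch below builds a tiny corpus by hand in base R. The two example texts, the object names, and the covariate values are invented for illustration; in practice these objects are created by the package's processing functions rather than written manually.

```r
# Toy corpus: two short "documents" (invented for illustration).
texts <- c("tax plan tax", "war plan")

# vocab: sorted character vector of the unique words.
tokens <- strsplit(texts, " ")
vocab <- sort(unique(unlist(tokens)))

# documents: one 2-row integer matrix per document.
# Row 1 holds indices into vocab, row 2 holds the count of that word.
documents <- lapply(tokens, function(tok) {
  counts <- table(match(tok, vocab))
  rbind(as.integer(names(counts)), as.integer(counts))
})

# meta: one row of covariates per document (hypothetical values).
meta <- data.frame(rating = c("Liberal", "Conservative"), day = c(3L, 10L))

vocab
documents[[1]]
```

Each column of a document matrix pairs a vocabulary index with how often that word occurs in the document, and row i of the metadata describes document i.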

Reading in data from a “spreadsheet”:

The stm package provides a simple tool to quickly get started for a common use case:

  1. Get data: The researcher has stored text data alongside covariates related to the text in a spreadsheet, such as a .csv file that contains textual data and associated metadata.

  2. Read data: The researcher reads this data into an R data frame.

  3. Convert and process data: The stm package’s function textProcessor then converts and processes the data to make it ready for analysis in the stm package.

Note: The model does not permit estimation when there are variables used in the model that have missing values. As such, it can be helpful to subset data to observations that do not have missing values for metadata that will be used in the stm model.
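One way to do this subsetting in base R is sketched below. The toy data frame and its values are invented; only the column names (documents, rating, day) follow the example data used in this tutorial.

```r
# Toy data frame standing in for the spreadsheet (values invented).
data_toy <- data.frame(
  documents = c("first post", "second post", "third post"),
  rating = c("Liberal", NA, "Conservative"),
  day = c(1L, 2L, NA),
  stringsAsFactors = FALSE
)

# Covariates that will enter the model formula.
covariates <- c("rating", "day")

# Keep only the rows where every model covariate is observed.
keep <- complete.cases(data_toy[, covariates])
data_complete <- data_toy[keep, ]

nrow(data_complete)
```

Restricting the check to the covariates that are actually used avoids discarding documents because of missing values in unrelated columns.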

To illustrate how to use the stm package, we will use a collection of blog posts about American politics written in 2008, taken from the CMU 2008 Political Blog Corpus (Eisenstein and Xing 2010). The posts were gathered from six different blogs: American Thinker, Digby, Hot Air, Michelle Malkin, Think Progress, and Talking Points Memo. Each blog has its own particular political bent. The day within 2008 on which each post was written was also recorded. Thus, for each blog post, there is metadata available on the day it was written and the political ideology of the blog for which it was written. In this case, each blog post is a row in a .csv file, with the text contained in a variable called documents.

# Link spreadsheet:
# http://scholar.princeton.edu/sites/default/files/bstewart/files/poliblogs2008.csv
library(stm)
stm v1.3.6 successfully loaded. See ?stm for help. 
 Papers, resources, and other materials at structuraltopicmodel.com
data <- read.csv("poliblogs2008.csv")

Pre-processing text content

It is often useful to engage in some processing of the text data before modeling it. The most common processing steps are stemming (reducing words to their root form), dropping punctuation, and stop word removal (e.g., the, is, at). The textProcessor function implements each of these steps across multiple languages by using the tm package.

processed <- textProcessor(data$documents, metadata = data) 
Building corpus... 
Converting to Lower Case... 
Removing punctuation... 
Removing stopwords... 
Removing numbers... 
Stemming... 
Creating Output... 
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
Removing 83198 of 123990 terms (83198 of 2298953 tokens) due to frequency 
Your corpus now has 13246 documents, 40792 terms and 2215755 tokens.
docs <- out$documents
vocab <- out$vocab
meta <- out$meta

After reading in the data, use the utility function prepDocuments to process the loaded data to make sure it is in the right format. prepDocuments also removes infrequent terms depending on the user-set parameter lower.thresh. The utility function plotRemoved will plot the number of words and documents removed for different thresholds.

For example, the user can use the plotRemoved call below to evaluate how many words and documents would be removed from the data set at each word threshold, that is, the minimum number of documents in which a word must appear to be kept in the vocabulary. The user can then select a preferred threshold within prepDocuments.

Importantly, prepDocuments will also re-index all metadata/document relationships if any changes occur due to processing. If a document is completely removed due to pre-processing (for example because it contained only rare words), then prepDocuments will drop the corresponding row in the metadata as well. After reading in and processing the text data, it is important to inspect features of the documents and the associated vocabulary list to make sure they have been correctly pre-processed.

plotRemoved(processed$documents, lower.thresh = seq(1, 200, by = 100))

out <- prepDocuments(processed$documents, processed$vocab, processed$meta, lower.thresh = 15)
Removing 114230 of 123990 terms (220223 of 2298953 tokens) due to frequency 
Your corpus now has 13246 documents, 9760 terms and 2078730 tokens.
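The thresholding idea can be sketched in base R on a toy corpus. This is an illustration of the logic only, not the code prepDocuments runs: the tokens and metadata are invented, and the rule "keep words that appear in more than lower_thresh documents" is an assumption based on the function's documentation.

```r
# Toy tokenized corpus and metadata (invented for illustration).
docs_tokens <- list(
  c("tax", "plan", "tax"),
  c("war", "plan"),
  c("oil", "gas")
)
meta_toy <- data.frame(day = c(1L, 2L, 3L))

# Document frequency: in how many documents does each word appear?
doc_freq <- table(unlist(lapply(docs_tokens, unique)))

# Assumed rule: keep words appearing in more than lower_thresh documents.
lower_thresh <- 1
kept_vocab <- names(doc_freq)[doc_freq > lower_thresh]

# Drop removed words from each document; some documents may become empty.
docs_kept <- lapply(docs_tokens, function(tok) tok[tok %in% kept_vocab])
keep_doc <- lengths(docs_kept) > 0

# Empty documents are dropped together with their metadata rows.
docs_kept <- docs_kept[keep_doc]
meta_kept <- meta_toy[keep_doc, , drop = FALSE]

kept_vocab
nrow(meta_kept)
```

Here the third toy document contains only rare words, so it disappears along with its metadata row, which mirrors the re-indexing behaviour described above.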

Estimating the structural topic model


The data import process will output documents, vocabulary and metadata that can be used for an analysis. In this section, let’s see how to estimate the STM. Next we move to a range of functions to evaluate, understand, and visualize the fitted model object.

The key innovation of the STM is that it incorporates metadata into the topic modeling framework. In STM, metadata can be entered in the topic model in two ways:

  • topical prevalence: Metadata covariates for topical prevalence allow the observed metadata to affect the frequency with which a topic is discussed.

  • topical content: Covariates in topical content allow the observed metadata to affect the rate of word use within a given topic, that is, how a particular topic is discussed.

Estimation for both topical prevalence and content proceeds via the workhorse stm function.

Estimation with topical prevalence parameter

In this example, let’s use the rating variable (blog ideology) as a covariate in the topic prevalence portion of the model with the CMU Poliblog data described above. Each document is modeled as a mixture of multiple topics.

Topical prevalence captures how much each topic contributes to a document. Because different documents come from different sources, it is natural to want to allow this prevalence to vary with metadata that we have about document sources.

We will let prevalence be a function of the “rating” variable, which is coded as either “Liberal” or “Conservative,” and the variable “day” which is an integer measure of days running from the first to the last day of 2008. To illustrate, let’s estimate a 20 topic STM model. The user can then pass the output from the model, poliblogPrevFit, through the various functions discussed below (e.g., the plot method for ‘STM’ objects) to inspect the results.

If a user wishes to specify additional prevalence covariates, she would do so using the standard formula notation in R. A feature of the stm function is that “prevalence” can be expressed as a formula that includes multiple covariates, whether factor or continuous. For example, by using the formula setup you can enter other covariates additively. Additionally, users can include more flexible functional forms of continuous covariates, including standard transforms like log(), as well as ns() or bs() from the splines package.

The stm package also includes a convenience function s(), which selects a fairly flexible b-spline basis. In the current example, the variable day is allowed to be estimated with a spline. Interactions between covariates can also be added using the standard notation for R formulas.
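Because prevalence formulas are ordinary one-sided R formulas, their structure can be inspected with base R before any model is fitted. The sketch below uses the rating and day covariates from the example; the object names are made up, and the interaction and log() variants are shown only to illustrate the notation, not as recommended specifications.

```r
# Three ways of writing a prevalence formula (object names are hypothetical).
f_additive <- ~ rating + s(day)    # additive, spline on day
f_interact <- ~ rating * s(day)    # main effects plus interaction
f_log      <- ~ rating + log(day)  # standard transform of day

# terms() expands a formula without evaluating s() or needing the data,
# so the implied model terms can be checked up front.
attr(terms(f_additive), "term.labels")
attr(terms(f_interact), "term.labels")

# all.vars() lists the metadata columns the formula requires.
all.vars(f_log)
```

Checking all.vars() against the column names of the metadata is a cheap way to catch a misspelled covariate before starting a long estimation run.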

In the example below, we enter in the variables additively, by allowing for the day variable, an integer variable measuring which day the blog was posted, to have a non-linear relationship in the topic estimation stage.

poliblogPrevFit <- stm(documents = out$documents, vocab = out$vocab, K = 20, prevalence = ~rating + s(day), max.em.its = 15, data = out$meta, init.type = "Spectral")
Beginning Spectral Initialization 
     Calculating the gram matrix...
     Finding anchor words...
    ....................
     Recovering initialization...
    .................................................................................................
Initialization complete.
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.677) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.520, relative change = 2.045e-02) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.483, relative change = 4.961e-03) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.467, relative change = 2.095e-03) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.459, relative change = 1.116e-03) 
Topic 1: obama, mccain, poll, voter, democrat 
 Topic 2: peopl, one, like, think, polit 
 Topic 3: tax, will, mccain, economi, econom 
 Topic 4: one, ’re, get, ’ll, like 
 Topic 5: vote, elect, voter, state, republican 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, stori, time, like 
 Topic 8: democrat, senat, bill, republican, legisl 
 Topic 9: will, price, oil, govern, energi 
 Topic 10: court, law, case, right, will 
 Topic 11: mccain, campaign, john, million, oil 
 Topic 12: iran, nuclear, will, iranian, russia 
 Topic 13: palin, sarah, governor, report, state 
 Topic 14: israel, terrorist, attack, will, kill 
 Topic 15: school, obama, will, educ, work 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, climat, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: hillari, clinton, democrat, obama, will 
 Topic 20: will, allah, muslim, say, muhammad 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.454, relative change = 6.643e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.451, relative change = 4.272e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.448, relative change = 2.932e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.447, relative change = 2.069e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.446, relative change = 1.550e-04) 
Topic 1: obama, mccain, poll, voter, democrat 
 Topic 2: peopl, like, think, polit, one 
 Topic 3: tax, will, economi, econom, plan 
 Topic 4: one, ’re, get, don’t, like 
 Topic 5: vote, elect, voter, republican, state 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, stori, time, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: will, oil, energi, price, govern 
 Topic 10: court, law, case, right, rule 
 Topic 11: mccain, campaign, million, john, report 
 Topic 12: iran, nuclear, will, iranian, russia 
 Topic 13: palin, sarah, governor, state, alaska 
 Topic 14: terrorist, israel, attack, kill, pakistan 
 Topic 15: school, work, union, educ, obama 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, chang, climat 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: hillari, clinton, will, democrat, obama 
 Topic 20: will, muslim, allah, islam, say 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.445, relative change = 1.198e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.444, relative change = 9.612e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.443, relative change = 7.984e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.443, relative change = 6.582e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Model Terminated Before Convergence Reached 

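In the example above, the model is set to run for a maximum of 15 EM iterations (controlled by max.em.its), which is why the output ends with “Model Terminated Before Convergence Reached”; in practice a larger maximum is usually needed. Convergence is monitored by the change in the approximate variational lower bound. Once the bound has a small enough change between iterations, the model is considered converged.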
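The convergence check can be sketched with the bounds printed in the log above. The relative-change formula and the tolerance of 1e-5 (believed to be the default of the emtol argument; see ?stm) are assumptions, and because the printed bounds are rounded to three decimals the results only approximately match the logged values.

```r
# Approximate per-word bounds copied from the first five iterations above.
bound <- c(-7.677, -7.520, -7.483, -7.467, -7.459)

# Relative change between consecutive iterations (assumed definition).
rel_change <- diff(bound) / abs(head(bound, -1))

# Assumed tolerance; believed to be the default emtol in stm().
emtol <- 1e-5
converged <- rel_change < emtol

signif(rel_change, 3)
any(converged)
```

The first value is close to the 2.045e-02 reported for iteration 2, and none of the changes falls below the tolerance, consistent with the run stopping at the iteration cap rather than at convergence.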

Model initialization for a fixed number of topics

As with all mixed-membership topic models, the posterior is intractable and non-convex, which creates a multi-modal estimation problem that can be sensitive to initialization. Put differently, the answers the estimation procedure comes up with may depend on starting values of the parameters (e.g., the distribution over words for a particular topic).

There are two approaches to dealing with this that the stm package facilitates.

  1. The first is to use an initialization based on the method of moments, which is deterministic and globally consistent under reasonable conditions (Arora et al. 2013; Roberts et al. 2016a). This is known as a spectral initialization because it uses a spectral decomposition (non-negative matrix factorization) of the word co-occurrence matrix. In practice, this initialization is very helpful. This can be chosen by setting init.type = "Spectral" in the stm function. This option is used in the above example. This means that no matter the seed that is set, the same results will be generated. When the vocabulary is larger than 10,000 words, the function will temporarily subset the vocabulary size for the duration of the initialization.

  2. The second approach is to initialize the model with a short run of a collapsed Gibbs sampler for LDA. For completeness, researchers can also initialize the model randomly, but this is generally not recommended. In practice, the spectral initialization is generally recommended, as it has been found to produce the best results consistently (Roberts et al. 2016a).

Model selection for a fixed number of topics

When not using the spectral initialization, the analyst should estimate many models, each from different initializations, and then evaluate each model according to some separate standard (several are provided below).

The function selectModel automates this process to facilitate finding a model with desirable properties. Users specify the number of “runs,” which in the example below is set to 20. selectModel first casts a net where “runs” (below 20) models are run for two EM steps, and then models with low likelihoods are discarded.

Next, the default returns the 20% of models with the highest likelihoods, which are then run until convergence or the EM iteration maximum is reached. Notice that options for the stm function can be passed to selectModel, such as max.em.its. If users would like to select a larger number of models to be run completely, this can also be set with an option specified in the help file for this function.
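The net-casting step can be sketched in base R as follows. This is an illustration of the idea, not selectModel's internal code: the likelihood values are simulated stand-ins, and the variable names and the use of ceiling() are assumptions.

```r
set.seed(1)  # only so the toy numbers are reproducible

runs <- 20        # number of short initial runs, as in the example
frac_keep <- 0.2  # share of runs carried forward (the 20% described above)

# Stand-in likelihoods for the short runs; real values come from the EM steps.
net_likelihood <- rnorm(runs, mean = -7.57, sd = 0.01)

# Keep the runs with the highest likelihoods.
n_keep <- ceiling(runs * frac_keep)
selected <- order(net_likelihood, decreasing = TRUE)[seq_len(n_keep)]

n_keep
selected
```

With 20 runs and 20% retained, four models are carried forward and run to convergence or to the iteration maximum.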

poliblogSelect <- selectModel(out$documents, out$vocab, K = 20, prevalence = ~rating + s(day), max.em.its = 15, data = out$meta, runs = 20, seed = 8458159)
Casting net 
1 models in net 
2 models in net 
3 models in net 
4 models in net 
5 models in net 
6 models in net 
7 models in net 
8 models in net 
9 models in net 
10 models in net 
11 models in net 
12 models in net 
13 models in net 
14 models in net 
15 models in net 
16 models in net 
17 models in net 
18 models in net 
19 models in net 
20 models in net 
Running select models 
1 select model run 
Beginning LDA Initialization 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.568) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.562, relative change = 8.286e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.559, relative change = 3.654e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.557, relative change = 2.847e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.554, relative change = 3.087e-04) 
Topic 1: will, tax, american, economi, plan 
 Topic 2: poll, state, voter, new, race 
 Topic 3: will, year, energi, price, global 
 Topic 4: hous, money, million, financi, crisi 
 Topic 5: mccain, john, said, sen, mccain’ 
 Topic 6: obama, barack, polit, report, media 
 Topic 7: think, like, peopl, know, dont 
 Topic 8: will, clinton, democrat, hillari, obama 
 Topic 9: show, get, like, one, post 
 Topic 10: report, time, stori, new, one 
 Topic 11: iran, group, israel, polici, state 
 Topic 12: american, america, countri, peopl, will 
 Topic 13: ’re, don’t, question, think, say 
 Topic 14: one, day, will, man, men 
 Topic 15: attack, govern, terrorist, forc, nuclear 
 Topic 16: vote, republican, democrat, senat, elect 
 Topic 17: iraq, bush, war, militari, iraqi 
 Topic 18: law, said, hous, legisl, administr 
 Topic 19: palin, women, sarah, like, experi 
 Topic 20: obama, mccain, campaign, barack, john 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.551, relative change = 3.760e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.548, relative change = 4.394e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.544, relative change = 4.818e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.541, relative change = 5.066e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.537, relative change = 4.976e-04) 
Topic 1: will, tax, american, plan, economi 
 Topic 2: poll, state, voter, race, new 
 Topic 3: will, energi, year, price, global 
 Topic 4: money, hous, million, financi, crisi 
 Topic 5: mccain, john, said, sen, mccain’ 
 Topic 6: obama, barack, polit, media, report 
 Topic 7: think, like, peopl, know, just 
 Topic 8: will, clinton, democrat, hillari, parti 
 Topic 9: show, get, post, like, one 
 Topic 10: time, report, stori, new, one 
 Topic 11: iran, group, israel, polici, state 
 Topic 12: american, america, countri, will, peopl 
 Topic 13: ’re, don’t, question, say, want 
 Topic 14: one, will, day, man, men 
 Topic 15: attack, govern, terrorist, nuclear, forc 
 Topic 16: vote, republican, democrat, senat, elect 
 Topic 17: iraq, bush, war, militari, iraqi 
 Topic 18: law, said, hous, legisl, administr 
 Topic 19: palin, women, sarah, experi, run 
 Topic 20: obama, mccain, campaign, barack, john 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.533, relative change = 4.892e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.530, relative change = 4.780e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.526, relative change = 4.640e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.523, relative change = 4.530e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Model Terminated Before Convergence Reached 
2 select model run 
Beginning LDA Initialization 
....................................................................................................
Completed E-Step (7 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.567) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.562, relative change = 7.471e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.559, relative change = 3.222e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.557, relative change = 2.518e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.555, relative change = 2.733e-04) 
Topic 1: think, peopl, like, polit, dont 
 Topic 2: one, women, life, word, man 
 Topic 3: democrat, republican, senat, bill, vote 
 Topic 4: mccain, obama, campaign, john, said 
 Topic 5: oil, energi, compani, financi, price 
 Topic 6: obama, palin, barack, polit, campaign 
 Topic 7: nation, will, secur, foreign, polici 
 Topic 8: school, use, famili, peopl, get 
 Topic 9: said, report, today, former, meet 
 Topic 10: state, group, work, organ, public 
 Topic 11: ’re, get, don’t, think, question 
 Topic 12: iraq, war, militari, iran, iraqi 
 Topic 13: bush, presid, administr, said, white 
 Topic 14: media, time, stori, one, news 
 Topic 15: law, court, rule, state, case 
 Topic 16: will, can, america, countri, presid 
 Topic 17: will, world, one, nuclear, even 
 Topic 18: tax, plan, econom, economi, will 
 Topic 19: new, time, last, two, day 
 Topic 20: obama, clinton, hillari, democrat, voter 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.553, relative change = 3.265e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.550, relative change = 3.824e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.546, relative change = 4.355e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.543, relative change = 4.676e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.539, relative change = 4.721e-04) 
Topic 1: think, peopl, like, polit, dont 
 Topic 2: women, one, life, man, word 
 Topic 3: democrat, republican, senat, bill, vote 
 Topic 4: mccain, campaign, john, obama, said 
 Topic 5: oil, energi, compani, financi, price 
 Topic 6: obama, barack, palin, polit, campaign 
 Topic 7: nation, will, polici, foreign, secur 
 Topic 8: school, use, famili, peopl, said 
 Topic 9: said, report, today, former, meet 
 Topic 10: state, group, work, organ, public 
 Topic 11: get, ’re, don’t, think, question 
 Topic 12: iraq, war, militari, iran, iraqi 
 Topic 13: bush, presid, administr, said, white 
 Topic 14: media, time, stori, news, one 
 Topic 15: law, court, rule, case, state 
 Topic 16: will, can, america, presid, countri 
 Topic 17: will, world, nuclear, one, also 
 Topic 18: tax, plan, econom, economi, will 
 Topic 19: new, last, time, day, two 
 Topic 20: obama, clinton, hillari, democrat, voter 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.536, relative change = 4.576e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.533, relative change = 4.298e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.530, relative change = 3.972e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.527, relative change = 3.754e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Model Terminated Before Convergence Reached 
3 select model run 
Beginning LDA Initialization 
....................................................................................................
Completed E-Step (6 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.571) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.564, relative change = 8.730e-04) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.561, relative change = 3.836e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.559, relative change = 2.950e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.557, relative change = 3.266e-04) 
Topic 1: one, will, man, peopl, love 
 Topic 2: tax, will, plan, american, econom 
 Topic 3: mccain, republican, will, john, conserv 
 Topic 4: said, report, citi, offici, new 
 Topic 5: iraq, war, militari, forc, secur 
 Topic 6: govern, hous, crisi, financi, market 
 Topic 7: bush, presid, said, hous, administr 
 Topic 8: iran, terrorist, israel, foreign, polici 
 Topic 9: obama, barack, campaign, polit, obama’ 
 Topic 10: law, court, case, rule, justic 
 Topic 11: palin, media, news, report, press 
 Topic 12: question, ’re, get, don’t, say 
 Topic 13: clinton, hillari, democrat, obama, will 
 Topic 14: vote, elect, democrat, state, republican 
 Topic 15: peopl, think, know, dont, just 
 Topic 16: year, like, one, even, legisl 
 Topic 17: right, issu, can, peopl, will 
 Topic 18: mccain, john, campaign, obama, new 
 Topic 19: will, oil, energi, new, nation 
 Topic 20: time, report, public, new, york 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.554, relative change = 3.904e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.550, relative change = 4.590e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.546, relative change = 5.174e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.542, relative change = 5.462e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.538, relative change = 5.577e-04) 
Topic 1: will, one, man, peopl, love 
 Topic 2: tax, plan, will, american, econom 
 Topic 3: republican, mccain, will, conserv, john 
 Topic 4: said, report, citi, offici, new 
 Topic 5: iraq, war, militari, secur, iraqi 
 Topic 6: govern, hous, crisi, financi, market 
 Topic 7: bush, presid, said, hous, administr 
 Topic 8: iran, terrorist, israel, foreign, state 
 Topic 9: obama, barack, campaign, polit, obama’ 
 Topic 10: law, court, case, investig, justic 
 Topic 11: palin, media, news, report, press 
 Topic 12: question, get, ’re, say, don’t 
 Topic 13: clinton, hillari, democrat, obama, will 
 Topic 14: vote, elect, democrat, state, republican 
 Topic 15: peopl, think, know, just, like 
 Topic 16: year, like, one, even, legisl 
 Topic 17: right, issu, can, will, bill 
 Topic 18: mccain, john, campaign, new, poll 
 Topic 19: will, oil, energi, nation, new 
 Topic 20: time, report, public, new, york 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.534, relative change = 5.434e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.530, relative change = 5.143e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.526, relative change = 4.804e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.523, relative change = 4.580e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Model Terminated Before Convergence Reached 
4 select model run 
Beginning LDA Initialization 
....................................................................................................
Completed E-Step (6 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.571) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.564, relative change = 8.214e-04) 
....................................................................................................
Completed E-Step (5 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.562, relative change = 3.658e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.560, relative change = 2.856e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.557, relative change = 3.162e-04) 
Topic 1: poll, obama, state, mccain, among 
 Topic 2: time, report, use, new, said 
 Topic 3: bush, hous, oil, bill, energi 
 Topic 4: legisl, will, one, say, ayer 
 Topic 5: palin, new, state, sarah, rep 
 Topic 6: obama, clinton, hillari, will, win 
 Topic 7: say, think, want, talk, obama 
 Topic 8: iraq, war, militari, iraqi, troop 
 Topic 9: tax, econom, economi, govern, money 
 Topic 10: polit, obama, black, one, wright 
 Topic 11: american, will, countri, america, peopl 
 Topic 12: law, report, state, offici, investig 
 Topic 13: mccain, said, say, news, report 
 Topic 14: iran, terrorist, attack, israel, foreign 
 Topic 15: will, can, right, rule, court 
 Topic 16: like, know, think, just, thing 
 Topic 17: chicago, time, one, two, citi 
 Topic 18: obama, mccain, campaign, john, barack 
 Topic 19: democrat, republican, vote, elect, parti 
 Topic 20: get, even, one, like, want 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.554, relative change = 3.842e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.551, relative change = 4.323e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.547, relative change = 4.670e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.544, relative change = 4.821e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.540, relative change = 4.788e-04) 
Topic 1: poll, state, among, percent, obama 
 Topic 2: report, time, new, use, said 
 Topic 3: bush, hous, bill, oil, congress 
 Topic 4: legisl, will, say, one, ayer 
 Topic 5: palin, new, sarah, state, rep 
 Topic 6: obama, clinton, hillari, will, win 
 Topic 7: say, think, want, talk, get 
 Topic 8: iraq, war, militari, iraqi, troop 
 Topic 9: tax, econom, economi, govern, money 
 Topic 10: polit, black, obama, wright, one 
 Topic 11: american, will, countri, america, peopl 
 Topic 12: law, report, state, offici, investig 
 Topic 13: said, mccain, say, sen, mccain’ 
 Topic 14: iran, terrorist, attack, israel, foreign 
 Topic 15: will, right, rule, can, court 
 Topic 16: like, know, just, think, thing 
 Topic 17: chicago, citi, time, stori, two 
 Topic 18: obama, mccain, campaign, john, barack 
 Topic 19: democrat, republican, vote, elect, parti 
 Topic 20: get, even, one, want, like 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.537, relative change = 4.726e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.533, relative change = 4.670e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.530, relative change = 4.603e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.526, relative change = 4.494e-04) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Model Terminated Before Convergence Reached 

To select a model for further investigation, the user must choose one of the candidate runs returned by selectModel. To help with this choice, plotModels plots two scores, semantic coherence and exclusivity, for each model and topic. Each criterion is calculated for every topic within a model run.

The plotModels function calculates the average across all topics for each run of the model and plots these averages, labeling each model run with a numeral. Often users will select a model with desirable properties in both dimensions (i.e., models with average scores towards the upper right side of the plot).

plotModels(poliblogSelect, pch = c(1, 2, 3, 4), legend.position = "bottomright")

As the figure shows, the plotModels function also plots each topic’s values, which gives a sense of how much these scores vary across topics. For a given model, the user can plot the semantic coherence and exclusivity scores with the topicQuality function.

Next, the user selects one of these models to work with. For example, the third model can be extracted from the object returned by selectModel.

selectedmodel <- poliblogSelect$runout[[3]]
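
The visual choice made with plotModels can also be approximated numerically. The following is a minimal sketch and not part of the stm package. It assumes that the object returned by selectModel stores per-topic scores in list elements named semcoh and exclusivity (one numeric vector per run), and it uses invented numbers in place of real output. The helper name rank_runs and the equal weighting of the two standardized averages are illustrative assumptions.

```r
# Toy stand-in for a selectModel result: per-topic scores for three runs.
# All numbers are invented for illustration.
toy_select <- list(
  semcoh      = list(c(-60, -75, -58), c(-55, -62, -57), c(-80, -90, -70)),
  exclusivity = list(c(9.1, 9.3, 8.9), c(9.4, 9.2, 9.5), c(9.6, 9.7, 9.8))
)

# Hypothetical helper: average each score over topics, standardize the
# averages across runs, and add them (higher means closer to the upper right).
rank_runs <- function(semcoh, exclusivity) {
  mean_semcoh <- vapply(semcoh, mean, numeric(1))
  mean_excl   <- vapply(exclusivity, mean, numeric(1))
  combined    <- as.numeric(scale(mean_semcoh)) + as.numeric(scale(mean_excl))
  data.frame(run = seq_along(semcoh), mean_semcoh, mean_excl, combined)
}

scores   <- rank_runs(toy_select$semcoh, toy_select$exclusivity)
best_run <- scores$run[which.max(scores$combined)]
print(scores)
```

With a real fit, the same helper could be called on poliblogSelect$semcoh and poliblogSelect$exclusivity, provided those elements exist in the installed version of the package. Such a ranking is a convenience and does not replace inspecting the plot.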

Model search across numbers of topics

STM assumes a fixed, user-specified number of topics. There is no "right" answer to the number of topics that is appropriate for a given corpus (Grimmer and Stewart 2013), but the function searchK uses a data-driven approach to selecting the number of topics. The function performs several automated tests to help choose the number of topics, including calculating the held-out log-likelihood (Wallach, Murray, Salakhutdinov, and Mimno 2009) and performing a residual analysis (Taddy 2012).

For example, one could estimate an STM model for 7 and 10 topics and compare the results along each of the criteria. The default initialization is the spectral initialization due to its stability.

This function will also calculate a range of quantities of interest, including the average exclusivity and semantic coherence.

storage <- searchK(out$documents, out$vocab, K = c(7, 10), prevalence = ~rating + s(day), data = meta)
Beginning Spectral Initialization 
     Calculating the gram matrix...
     Finding anchor words...
    .......
     Recovering initialization...
    .................................................................................................
Initialization complete.
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.758) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.640, relative change = 1.520e-02) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.606, relative change = 4.392e-03) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.592, relative change = 1.874e-03) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.584, relative change = 1.013e-03) 
Topic 1: obama, democrat, campaign, hillari, clinton 
 Topic 2: mccain, john, palin, campaign, said 
 Topic 3: peopl, will, one, american, can 
 Topic 4: one, like, get, time, just 
 Topic 5: will, hous, tax, govern, american 
 Topic 6: state, vote, elect, will, court 
 Topic 7: iraq, war, will, militari, bush 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.580, relative change = 6.297e-04) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.577, relative change = 4.132e-04) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.574, relative change = 2.851e-04) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.573, relative change = 2.007e-04) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.572, relative change = 1.471e-04) 
Topic 1: obama, democrat, campaign, will, clinton 
 Topic 2: mccain, john, palin, said, campaign 
 Topic 3: peopl, will, one, american, can 
 Topic 4: one, like, get, time, just 
 Topic 5: will, tax, hous, govern, year 
 Topic 6: state, vote, senat, elect, court 
 Topic 7: iraq, war, will, bush, militari 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.571, relative change = 1.133e-04) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.570, relative change = 8.973e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.570, relative change = 7.297e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.569, relative change = 6.099e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 15 (approx. per word bound = -7.569, relative change = 5.247e-05) 
Topic 1: obama, democrat, campaign, will, clinton 
 Topic 2: mccain, john, said, palin, campaign 
 Topic 3: peopl, one, will, american, can 
 Topic 4: one, like, time, get, just 
 Topic 5: will, tax, year, govern, american 
 Topic 6: state, senat, law, vote, court 
 Topic 7: iraq, war, will, militari, bush 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 16 (approx. per word bound = -7.568, relative change = 4.681e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 17 (approx. per word bound = -7.568, relative change = 4.309e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 18 (approx. per word bound = -7.568, relative change = 4.027e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 19 (approx. per word bound = -7.567, relative change = 3.799e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 20 (approx. per word bound = -7.567, relative change = 3.379e-05) 
Topic 1: obama, democrat, campaign, will, clinton 
 Topic 2: mccain, said, john, palin, campaign 
 Topic 3: peopl, will, one, american, can 
 Topic 4: one, like, time, get, media 
 Topic 5: will, tax, year, govern, american 
 Topic 6: state, senat, law, hous, report 
 Topic 7: iraq, war, will, militari, bush 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 21 (approx. per word bound = -7.567, relative change = 2.864e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 22 (approx. per word bound = -7.567, relative change = 2.560e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 23 (approx. per word bound = -7.567, relative change = 2.337e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 24 (approx. per word bound = -7.566, relative change = 2.130e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 25 (approx. per word bound = -7.566, relative change = 1.966e-05) 
Topic 1: obama, democrat, campaign, will, clinton 
 Topic 2: mccain, said, john, palin, campaign 
 Topic 3: peopl, will, one, american, think 
 Topic 4: one, like, time, get, media 
 Topic 5: will, tax, year, govern, american 
 Topic 6: state, hous, senat, law, report 
 Topic 7: iraq, war, will, militari, american 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 26 (approx. per word bound = -7.566, relative change = 1.811e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 27 (approx. per word bound = -7.566, relative change = 1.634e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 28 (approx. per word bound = -7.566, relative change = 1.361e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 29 (approx. per word bound = -7.566, relative change = 1.032e-05) 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Model Converged 
Beginning Spectral Initialization 
     Calculating the gram matrix...
     Finding anchor words...
    ..........
     Recovering initialization...
    .................................................................................................
Initialization complete.
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.740) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.599, relative change = 1.816e-02) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.563, relative change = 4.735e-03) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.550, relative change = 1.824e-03) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.543, relative change = 9.099e-04) 
Topic 1: obama, hillari, democrat, clinton, campaign 
 Topic 2: mccain, obama, campaign, john, palin 
 Topic 3: peopl, will, one, american, can 
 Topic 4: one, like, get, ’re, ’ll 
 Topic 5: hous, bush, presid, senat, bill 
 Topic 6: state, vote, elect, republican, court 
 Topic 7: iraq, war, said, militari, iraqi 
 Topic 8: will, tax, american, year, govern 
 Topic 9: media, time, stori, report, news 
 Topic 10: will, nation, govern, israel, world 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.539, relative change = 5.379e-04) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.536, relative change = 3.484e-04) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.534, relative change = 2.350e-04) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.533, relative change = 1.665e-04) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.532, relative change = 1.247e-04) 
Topic 1: obama, democrat, hillari, will, clinton 
 Topic 2: mccain, obama, john, campaign, palin 
 Topic 3: peopl, will, american, one, can 
 Topic 4: one, get, like, ’re, ’ll 
 Topic 5: bush, hous, presid, senat, administr 
 Topic 6: state, vote, elect, republican, court 
 Topic 7: iraq, war, militari, iraqi, said 
 Topic 8: will, tax, american, year, govern 
 Topic 9: media, time, like, stori, report 
 Topic 10: will, israel, iran, nation, world 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.531, relative change = 9.701e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.531, relative change = 7.895e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.530, relative change = 6.560e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.530, relative change = 5.515e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 15 (approx. per word bound = -7.529, relative change = 4.717e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, obama, campaign, palin 
 Topic 3: peopl, will, american, one, can 
 Topic 4: one, get, like, ’re, ’ll 
 Topic 5: bush, hous, presid, administr, senat 
 Topic 6: state, vote, elect, republican, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, american, govern 
 Topic 9: media, time, like, think, know 
 Topic 10: will, iran, israel, nuclear, world 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 16 (approx. per word bound = -7.529, relative change = 4.151e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 17 (approx. per word bound = -7.529, relative change = 3.802e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 18 (approx. per word bound = -7.529, relative change = 3.558e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 19 (approx. per word bound = -7.528, relative change = 3.257e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 20 (approx. per word bound = -7.528, relative change = 2.978e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, campaign, obama, palin 
 Topic 3: peopl, american, will, one, america 
 Topic 4: one, get, like, ’re, don’t 
 Topic 5: bush, hous, presid, administr, said 
 Topic 6: state, vote, elect, republican, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, govern, american 
 Topic 9: think, media, like, time, know 
 Topic 10: iran, will, israel, nuclear, world 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 21 (approx. per word bound = -7.528, relative change = 2.782e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 22 (approx. per word bound = -7.528, relative change = 2.574e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 23 (approx. per word bound = -7.528, relative change = 2.390e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 24 (approx. per word bound = -7.527, relative change = 2.234e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 25 (approx. per word bound = -7.527, relative change = 2.077e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, campaign, palin, obama 
 Topic 3: peopl, american, will, one, america 
 Topic 4: one, get, like, ’re, don’t 
 Topic 5: bush, hous, presid, administr, said 
 Topic 6: state, vote, elect, campaign, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, govern, american 
 Topic 9: think, like, media, know, say 
 Topic 10: iran, will, israel, world, nuclear 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 26 (approx. per word bound = -7.527, relative change = 1.930e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 27 (approx. per word bound = -7.527, relative change = 1.792e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 28 (approx. per word bound = -7.527, relative change = 1.677e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 29 (approx. per word bound = -7.527, relative change = 1.605e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 30 (approx. per word bound = -7.527, relative change = 1.534e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, campaign, palin, obama 
 Topic 3: american, will, peopl, one, america 
 Topic 4: one, get, like, ’re, don’t 
 Topic 5: bush, hous, presid, administr, said 
 Topic 6: state, vote, elect, campaign, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, govern, american 
 Topic 9: think, like, know, peopl, say 
 Topic 10: iran, will, israel, world, nuclear 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 31 (approx. per word bound = -7.526, relative change = 1.463e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 32 (approx. per word bound = -7.526, relative change = 1.394e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 33 (approx. per word bound = -7.526, relative change = 1.332e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 34 (approx. per word bound = -7.526, relative change = 1.268e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 35 (approx. per word bound = -7.526, relative change = 1.200e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, campaign, palin, obama 
 Topic 3: american, will, peopl, one, america 
 Topic 4: one, get, like, ’re, ’ll 
 Topic 5: bush, hous, presid, administr, said 
 Topic 6: state, vote, elect, campaign, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, govern, american 
 Topic 9: think, like, peopl, know, say 
 Topic 10: iran, will, israel, world, nuclear 
....................................................................................................
Completed E-Step (1 seconds). 
Completed M-Step. 
Completing Iteration 36 (approx. per word bound = -7.526, relative change = 1.146e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 37 (approx. per word bound = -7.526, relative change = 1.101e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 38 (approx. per word bound = -7.526, relative change = 1.078e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 39 (approx. per word bound = -7.526, relative change = 1.050e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 40 (approx. per word bound = -7.526, relative change = 1.045e-05) 
Topic 1: obama, democrat, will, hillari, clinton 
 Topic 2: mccain, john, campaign, palin, said 
 Topic 3: will, american, peopl, one, america 
 Topic 4: one, get, like, ’re, ’ll 
 Topic 5: bush, hous, presid, administr, said 
 Topic 6: state, vote, elect, campaign, senat 
 Topic 7: iraq, war, militari, iraqi, troop 
 Topic 8: will, tax, year, govern, american 
 Topic 9: think, like, peopl, know, say 
 Topic 10: iran, will, israel, world, nuclear 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 41 (approx. per word bound = -7.526, relative change = 1.024e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Completing Iteration 42 (approx. per word bound = -7.526, relative change = 1.003e-05) 
....................................................................................................
Completed E-Step (2 seconds). 
Completed M-Step. 
Model Converged 
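
To make the held-out likelihood criterion more concrete, here is a minimal base R sketch of a document-completion score: some words of each document are set aside, and the score is the average log probability that the fitted model assigns to them. The matrices theta (document-topic proportions) and beta (topic-word probabilities), the helper heldout_loglik, and all numbers are made-up illustrations. This is not the code that searchK runs internally.

```r
# Toy fitted quantities (invented): 2 documents, 2 topics, 4 vocabulary words.
vocab <- c("war", "will", "tax", "new")
theta <- rbind(c(0.8, 0.2),
               c(0.3, 0.7))
beta  <- rbind(c(0.5, 0.3, 0.1, 0.1),
               c(0.1, 0.3, 0.5, 0.1))
colnames(beta) <- vocab

# Held-out tokens per document, given as vocabulary indices.
heldout <- list(c(1, 1, 3),   # "war", "war", "tax"
                c(3, 2))      # "tax", "will"

# Hypothetical helper: mean log probability of the held-out tokens,
# averaged within each document and then across documents.
heldout_loglik <- function(theta, beta, heldout) {
  per_doc <- vapply(seq_along(heldout), function(d) {
    word_prob <- as.numeric(theta[d, ] %*% beta)  # p(word | document d)
    mean(log(word_prob[heldout[[d]]]))
  }, numeric(1))
  mean(per_doc)
}

score <- heldout_loglik(theta, beta, heldout)
```

Values closer to zero mean that the model predicts the held-out words better, which is the sense in which candidate values of K can be compared on this criterion.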

Another, more preliminary, selection strategy is based on work by Lee and Mimno (2014). When the initialization type is set to “Spectral”, the user can specify K = 0 to use the algorithm of Lee and Mimno (2014) to select the number of topics. The core idea of the spectral initialization is to approximately find the vertices of the convex hull of the word co-occurrences.

The algorithm of Lee and Mimno (2014) projects the matrix into a low dimensional space using t-distributed stochastic neighbor embedding (Van der Maaten 2014) and then exactly solves for the convex hull. This has the advantage of automatically selecting the number of topics. The added randomness from the projection means that the algorithm is not deterministic like the standard “Spectral” initialization type. Running it with a different seed can result in not only different results but also a different number of topics. This procedure has no particular statistical guarantees and should not be seen as estimating the "true" number of topics. However, it can be a useful starting point and has the computational advantage that it only needs to be run once.

Interpreting the STM by plotting and inspecting results


After choosing a model, the user must next interpret the model results. There are many ways to investigate the output, such as inspecting the words associated with topics or the relationship between metadata and topics. To investigate the output of the model, the stm package provides a number of options.

  1. Displaying words associated with topics (labelTopics, plot method for 'STM' objects with argument type = "labels", sageLabels, plot method for 'STM' objects with argument type = "perspectives") or documents highly associated with particular topics (findThoughts, plotQuote).

  2. Estimating relationships between metadata and topics as well as topical content (estimateEffect).

  3. Calculating topic correlations (topicCorr).

Understanding topics through words and example documents

Two approaches are described for users to explore the topics that have been estimated. The first approach is to look at collections of words that are associated with topics. The second approach is to examine actual documents that are estimated to be highly associated with each topic. Both of these approaches should be used. Below, the 20-topic model estimated with the spectral initialization is used.

To explore the words associated with each topic, the labelTopics function can be used. For models where a content covariate is included sageLabels can also be used. Both these functions will print words associated with each topic to the console. The function by default prints several different types of word profiles, including highest probability words and FREX words.

FREX weights words by their overall frequency and how exclusive they are to the topic (calculated as given in Equation 6). Lift weights words by dividing by their frequency in other topics, therefore giving higher weight to words that appear less frequently in other topics. For more information on lift, see Taddy (2013).
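
The difference between these word profiles is easier to see on a small example. The sketch below uses a made-up topic-word matrix (beta) with two topics and five stemmed words; it is an illustration in base R, not the code used inside labelTopics. Two assumptions are worth flagging: lift is computed against a word's overall frequency, taken here to be the column mean of beta, and FREX is computed as the weighted harmonic mean of the within-topic empirical CDF of exclusivity and of frequency, with weight w = 0.5.

```r
# Toy topic-word probabilities (rows = topics, columns = words); each row sums to 1.
beta <- rbind(
  topicA = c(said = 0.40, obama = 0.45, palin = 0.05, goolsbe = 0.09, monegan = 0.01),
  topicB = c(said = 0.38, obama = 0.10, palin = 0.42, goolsbe = 0.01, monegan = 0.09)
)

# Assumption: both topics are equally common, so a word's overall
# frequency is the column mean of beta.
word_freq <- colMeans(beta)

# Lift: topic probability relative to overall frequency.
lift <- sweep(beta, 2, word_freq, "/")

# Exclusivity: share of a word's probability mass held by each topic.
excl <- sweep(beta, 2, colSums(beta), "/")

# FREX: weighted harmonic mean of the within-topic ECDF values of
# exclusivity and frequency (w = 0.5 weights them equally).
w <- 0.5
frex <- t(sapply(rownames(beta), function(k) {
  e <- ecdf(excl[k, ])(excl[k, ])
  f <- ecdf(beta[k, ])(beta[k, ])
  1 / (w / e + (1 - w) / f)
}))
colnames(frex) <- colnames(beta)

# Top word per topic under each criterion.
top_word <- function(m) colnames(m)[apply(m, 1, which.max)]
top_prob <- top_word(beta)
top_lift <- top_word(lift)
top_frex <- top_word(frex)
print(rbind(top_prob, top_lift, top_frex))
```

In this toy example highest probability and FREX both pick obama and palin, while lift picks the rare words goolsbe and monegan. This mirrors the pattern in real output, where lift tends to surface rare but highly distinctive terms.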

Similar to lift, score divides the log frequency of the word in the topic by the log frequency of the word in other topics. For more information on score, see the lda R package. To translate these results into a format that can easily be used within a paper, the plot method for 'STM' objects with argument type = "labels" will print topic words to a graphic device. Notice that the labels option is specified explicitly here, because the plot method for 'STM' objects has several other functionalities that are described below (the options for “perspectives” and “summary”).

labelTopics(poliblogPrevFit, c(6, 13, 18))
Topic 6 Top Words:
     Highest Prob: obama, mccain, campaign, barack, john, said, say 
     FREX: obama, barack, mccain, obama’, debat, wright, joe 
     Lift: oct, goolsbe, schieffer, town-hal, bayh, austan, robocal 
     Score: obama, mccain, oct, campaign, barack, biden, wright 
Topic 13 Top Words:
     Highest Prob: palin, sarah, governor, state, polit, alaska, blagojevich 
     FREX: blagojevich, palin, sarah, palin’, rezko, alaska, governor 
     Lift: monegan, blagojevich, wasilla, juneau, rezko’, blago, “palin 
     Score: palin, sarah, monegan, blagojevich, palin’, alaska, governor 
Topic 18 Top Words:
     Highest Prob: bush, presid, said, administr, hous, white, report 
     FREX: cheney, bush’, rove, cia, perino, tortur, interrog 
     Lift: addington, torture”, fratto, cia’, mcclellan, ashcroft, waterboard 
     Score: addington, bush, tortur, perino, cia, rove, presid 
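
The second approach mentioned above, reading documents that load heavily on a topic, amounts to ranking documents by their estimated topic proportion. For a fitted model this is what findThoughts is for; the sketch below only illustrates the idea in base R with a made-up document-topic matrix. The names theta, texts and top_documents are hypothetical.

```r
# Toy document-topic proportions: 5 documents (rows) by 3 topics (columns).
theta <- rbind(
  c(0.70, 0.20, 0.10),
  c(0.10, 0.80, 0.10),
  c(0.25, 0.25, 0.50),
  c(0.90, 0.05, 0.05),
  c(0.30, 0.60, 0.10)
)
texts <- paste("hypothetical blog post", seq_len(nrow(theta)))

# Rank documents by their proportion of one topic and keep the top n.
top_documents <- function(theta, texts, topic, n = 2) {
  idx <- order(theta[, topic], decreasing = TRUE)[seq_len(n)]
  data.frame(doc = idx, proportion = theta[idx, topic], text = texts[idx])
}

top_topic1 <- top_documents(theta, texts, topic = 1)
print(top_topic1)
```

Documents 4 and 1 come out on top for the first topic, so those are the texts one would read to check whether the word-based label is sensible.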

Estimating metadata/topic relationships

Estimating the relationship between metadata and topics is a core feature of the stm package. These relationships can also play a key role in validating the topic model's usefulness (Grimmer 2010; Grimmer and Stewart 2013). While stm estimates the relationship for the (K − 1) simplex, the workhorse function for extracting the relationships and associated uncertainty on all K topics is estimateEffect. This function simulates a set of parameters which can then be plotted.

Typically, users will pass the same model of topical prevalence used in estimating the STM to the estimateEffect function. The syntax of the estimateEffect function is designed so users specify the set of topics they wish to use for estimation, and then a formula for metadata of interest. Different estimation strategies and standard plot design features can be used by calling the plot method for 'estimateEffect' objects.

estimateEffect can calculate uncertainty in several ways. The default is “Global”, which will incorporate estimation uncertainty of the topic proportions into the uncertainty estimates using the method of composition. If users do not wish to propagate the full amount of uncertainty, e.g., in order to reduce computation time, they can choose uncertainty = "None", which will generally result in narrower confidence intervals because it will not include the additional estimation uncertainty. Calling summary on the 'estimateEffect' object will generate a regression table.
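
The following base R sketch illustrates why ignoring the uncertainty in the topic proportions narrows the intervals. It is a simplified stand-in for the method of composition, not the estimateEffect code: the simulated data, the normal noise used in place of draws from the model's posterior over topic proportions, and all variable names are assumptions made for illustration.

```r
set.seed(42)
n_docs <- 200
liberal <- rep(c(0, 1), each = n_docs / 2)

# Point estimates of one topic's proportion, plus an assumed
# posterior standard deviation for each document's proportion.
theta_hat <- 0.20 + 0.05 * liberal + rnorm(n_docs, sd = 0.05)
theta_sd <- 0.05

# Analogue of uncertainty = "None": regress the point estimates once.
fit_none <- lm(theta_hat ~ liberal)
se_none <- summary(fit_none)$coefficients["liberal", "Std. Error"]

# Analogue of "Global" (method of composition): redraw the proportions,
# refit, then draw coefficients from each fit's sampling distribution.
n_sims <- 500
draws <- replicate(n_sims, {
  theta_sim <- theta_hat + rnorm(n_docs, sd = theta_sd)
  fit <- lm(theta_sim ~ liberal)
  # One multivariate normal draw centred on the fitted coefficients.
  drop(coef(fit) + t(chol(vcov(fit))) %*% rnorm(2))
})
se_global <- sd(draws[2, ])  # row 2 holds the slope on `liberal`

print(c(none = se_none, global = se_global))
```

The pooled draws are more spread out than the single-regression standard error, because they carry both the regression's sampling variability and the uncertainty in the proportions themselves.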

out$meta$rating <- as.factor(out$meta$rating)
prep <- estimateEffect(1:20 ~ rating + s(day), poliblogPrevFit, metadata = out$meta, uncertainty = "Global")
summary(prep, topics = 1)

Call:
estimateEffect(formula = 1:20 ~ rating + s(day), stmobj = poliblogPrevFit, 
    metadata = out$meta, uncertainty = "Global")


Topic 1:

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.0672878  0.0116892   5.756 8.78e-09 ***
ratingLiberal -0.0040968  0.0029685  -1.380   0.1676    
s(day)1       -0.0055655  0.0225761  -0.247   0.8053    
s(day)2       -0.0332943  0.0137225  -2.426   0.0153 *  
s(day)3        0.0016440  0.0161324   0.102   0.9188    
s(day)4       -0.0298897  0.0138301  -2.161   0.0307 *  
s(day)5       -0.0210947  0.0147670  -1.429   0.1532    
s(day)6       -0.0062131  0.0147803  -0.420   0.6742    
s(day)7        0.0009988  0.0142191   0.070   0.9440    
s(day)8        0.0440837  0.0176700   2.495   0.0126 *  
s(day)9       -0.1014548  0.0176449  -5.750 9.13e-09 ***
s(day)10      -0.0217291  0.0167761  -1.295   0.1953    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
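
The ten s(day) rows in the table come from the spline basis used for day. As a rough illustration, and assuming that s() behaves as a thin wrapper around splines::bs() with ten degrees of freedom by default, the sketch below builds such a basis for a hypothetical day-of-year variable. The individual spline coefficients are not meaningful on their own, which is why the effect of day is usually examined through a plot.

```r
library(splines)

# Hypothetical day-of-year covariate for 2008 (a leap year).
day <- 1:366

# Cubic B-spline basis with 10 degrees of freedom.
basis <- bs(day, df = 10)

# One row per day, one column per basis function.
print(dim(basis))
```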

The functions described previously to understand STM results can be leveraged to visualize results for formal presentation. In this section let’s focus on several of these visualization tools.

Summary visualization

Corpus-level visualization can be done in several different ways. The first relates to the expected proportion of the corpus that belongs to each topic. This can be plotted using the plot method for 'STM' objects with argument type = "summary". An example from the political blogs data is given in the next figure. We see, for example, that the Sarah Palin/vice president topic (Topic 13) is actually a relatively minor proportion of the discourse. The most common topic is a general topic full of words that bloggers commonly use, and therefore is not very interpretable. The words listed in the figure are the top three words associated with the topic.

plot(poliblogPrevFit, type = "summary", xlim = c(0, 0.3))
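
The quantity on the horizontal axis of the summary plot can be thought of as the average of each topic's proportion across documents. The sketch below computes it in base R for a made-up document-topic matrix; for a fitted model the corresponding matrix is stored in the theta element of the model object. The toy values and names are assumptions for illustration.

```r
# Toy document-topic proportions: 5 documents (rows) by 3 topics (columns).
theta <- rbind(
  c(0.70, 0.20, 0.10),
  c(0.10, 0.80, 0.10),
  c(0.25, 0.25, 0.50),
  c(0.90, 0.05, 0.05),
  c(0.30, 0.60, 0.10)
)
colnames(theta) <- paste("Topic", 1:3)

# Expected share of the corpus per topic, most common first.
expected_prop <- sort(colMeans(theta), decreasing = TRUE)
print(expected_prop)
```

Since every row of the matrix sums to one, the expected proportions also sum to one across topics.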

In order to plot features of topics in greater detail, there are a number of options in the plot method for 'STM' objects, such as plotting larger sets of words highly associated with a topic or words that are exclusive to the topic. Furthermore, the cloud function will plot a standard word cloud of words in a topic, and the plotQuote function provides an easy-to-use graphical wrapper such that complete examples of specific documents can easily be included in the presentation of results.

Metadata/topic relationship visualization

Now let’s discuss plotting metadata/topic relationships, as the ability to estimate these relationships is a core advantage of the STM model. The core plotting function is the plot method for 'estimateEffect' objects, which handles the output of estimateEffect.

First, users must specify the variable that they wish to use for calculating an effect. If multiple variables are specified in estimateEffect, all other variables are held at their sample median. The quantities that can be plotted include the expected proportion of a document that belongs to a topic as a function of a covariate, or a first-difference estimate, in which the prevalence of a particular topic is contrasted between two groups (e.g., liberal versus conservative). Because uncertainty estimates can be time-intensive to calculate, and because users may wish to plot different quantities of interest from the same simulated parameters, estimateEffect should be run and its output saved before plotting. The output can then be plotted.

When the covariate of interest is binary, or users are interested in a particular contrast, the method = "difference" option will plot the change in topic proportion shifting from one specific value to another. Figure 6 gives an example. For factor variables, users may wish to plot the marginal topic proportion for each of the levels (“pointestimate”).

plot(prep, covariate = "rating", topics = c(6, 13, 18), model = poliblogPrevFit, method = "difference", cov.value1 = "Liberal", cov.value2 = "Conservative", xlab = "More Conservative ... More Liberal", main = "Effect of Liberal vs. Conservative", xlim = c(-0.1, 0.1), labeltype = "custom", custom.labels = c("Obama/McCain", "Sarah Palin", "Bush Presidency"))

We see that Topic 6 is used slightly more by liberals than by conservatives, while Topic 13 is close to the middle but still conservative-leaning. Topic 18, the discussion of Bush, was largely associated with liberal writers, which is in line with the observed trend of conservatives distancing themselves from Bush after his presidency.
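
A first difference of this kind can be sketched with an ordinary linear model. The example below uses simulated data and lm rather than estimateEffect, so it ignores the simulation-based uncertainty; the data frame, coefficients and variable names are all assumptions for illustration. It contrasts predicted topic proportions for the two rating groups while holding day at its sample median.

```r
set.seed(7)
n <- 300
meta <- data.frame(
  rating = factor(sample(c("Conservative", "Liberal"), n, replace = TRUE)),
  day = sample(1:366, n, replace = TRUE)
)

# Simulated proportions of one topic, slightly higher for liberal blogs.
meta$theta <- 0.10 + 0.04 * (meta$rating == "Liberal") -
  0.0001 * meta$day + rnorm(n, sd = 0.02)

fit <- lm(theta ~ rating + day, data = meta)

# Contrast the two groups with day held at its sample median.
grid <- data.frame(
  rating = factor(c("Liberal", "Conservative"), levels = levels(meta$rating)),
  day = median(meta$day)
)
pred <- predict(fit, newdata = grid)
first_diff <- unname(pred[1] - pred[2])  # Liberal minus Conservative

print(first_diff)
```

In a model without interactions the first difference equals the coefficient on ratingLiberal, whatever value day is held at. Holding the other covariates at a fixed value starts to matter once interactions or non-linear terms are involved.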

Notice how the function makes use of standard labeling options available in the native plot() function. This allows the user to customize labels and other features of their plots. Note that in the package, generics are leveraged for the plot functions. As such, one can simply use plot and rely on method dispatch.
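
For readers less familiar with S3 method dispatch in R, the toy sketch below shows the mechanism with a made-up class. The class name toySTM and its contents are hypothetical and unrelated to the package internals; summary is used instead of plot only so that the example does not need a graphics device.

```r
# An object tagged with a made-up S3 class.
toy_fit <- structure(list(k = 20), class = "toySTM")

# A method for the summary() generic, found by its name: generic.class
summary.toySTM <- function(object, ...) {
  paste("toy model with", object$k, "topics")
}

# Calling the generic dispatches to the method for the object's class.
msg <- summary(toy_fit)
print(msg)
```

The same mechanism is why a call to plot picks the right method depending on whether it receives a fitted model or the output of estimateEffect.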

When users have variables that they want to treat continuously, they can choose between assuming a linear fit or using splines. In the previous example, the day variable was allowed to have a non-linear relationship in the topic estimation stage. Its effect on topics can then be plotted. In Figure 7, the relationship between time and the vice president topic, Topic 13, is plotted. The topic peaks when Sarah Palin became John McCain's running mate at the end of August in 2008.

plot(prep, "day", method = "continuous", topics = 13, model = poliblogPrevFit, printlegend = FALSE, xaxt = "n", xlab = "Time (2008)")
monthseq <- seq(from = as.Date("2008-01-01"), to = as.Date("2008-12-01"), by = "month")
monthnames <- months(monthseq)
axis(1,at = as.numeric(monthseq) - min(as.numeric(monthseq)), labels = monthnames)
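
The tick positions passed to axis() are day offsets counted from 1 January 2008. The short sketch below computes them on their own, without drawing anything, so the mapping from months to positions on the day axis is easy to inspect. The name tick_at is introduced here for illustration.

```r
monthseq <- seq(from = as.Date("2008-01-01"), to = as.Date("2008-12-01"), by = "month")

# Days elapsed since the first date in the sequence.
tick_at <- as.numeric(monthseq) - min(as.numeric(monthseq))

print(tick_at)
# 0 31 60 91 121 152 182 213 244 274 305 335
```

Because 2008 was a leap year, March starts at offset 60 rather than 59, and the end of August falls just before offset 244.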

Plotting covariate interactions

Another modification that is possible in this framework is to allow for interactions between covariates such that one variable may "moderate" the effect of another variable. In this example, the STM is re-estimated to allow for an interaction between day (entered linearly) and rating. The same interaction is then included in estimateEffect(), so that the plot method for 'estimateEffect' objects can plot it. The results are displayed in Figure 10 for Topic 20 (Bush administration). You can observe that conservatives never wrote much about this topic, whereas liberals discussed this topic a great deal, but over time the topic diminished in salience.

poliblogInteraction <- stm(out$documents, out$vocab, K = 20, prevalence =~ rating * day, max.em.its = 75, data = out$meta, init.type = "Spectral")
Beginning Spectral Initialization 
     Calculating the gram matrix...
     Finding anchor words...
    ....................
     Recovering initialization...
    .................................................................................................
Initialization complete.
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 1 (approx. per word bound = -7.677) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 2 (approx. per word bound = -7.520, relative change = 2.041e-02) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 3 (approx. per word bound = -7.483, relative change = 4.927e-03) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 4 (approx. per word bound = -7.468, relative change = 2.081e-03) 
....................................................................................................
Completed E-Step (4 seconds). 
Completed M-Step. 
Completing Iteration 5 (approx. per word bound = -7.459, relative change = 1.108e-03) 
Topic 1: obama, mccain, poll, voter, democrat 
 Topic 2: peopl, one, like, think, polit 
 Topic 3: tax, will, mccain, economi, econom 
 Topic 4: one, ’re, get, ’ll, like 
 Topic 5: vote, elect, voter, state, republican 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, stori, time, like 
 Topic 8: democrat, senat, bill, republican, legisl 
 Topic 9: will, price, oil, govern, energi 
 Topic 10: court, law, case, right, will 
 Topic 11: mccain, campaign, john, million, report 
 Topic 12: iran, nuclear, will, iranian, russia 
 Topic 13: palin, sarah, governor, report, state 
 Topic 14: israel, terrorist, attack, will, kill 
 Topic 15: school, obama, educ, will, work 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, climat, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: hillari, clinton, democrat, obama, will 
 Topic 20: will, allah, muslim, say, muhammad 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 6 (approx. per word bound = -7.455, relative change = 6.608e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 7 (approx. per word bound = -7.451, relative change = 4.231e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 8 (approx. per word bound = -7.449, relative change = 2.892e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 9 (approx. per word bound = -7.448, relative change = 2.060e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 10 (approx. per word bound = -7.447, relative change = 1.537e-04) 
Topic 1: obama, mccain, poll, voter, democrat 
 Topic 2: peopl, like, think, polit, one 
 Topic 3: tax, will, economi, econom, plan 
 Topic 4: one, ’re, get, don’t, like 
 Topic 5: vote, elect, voter, republican, state 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, stori, time, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: will, oil, energi, price, govern 
 Topic 10: court, law, case, right, rule 
 Topic 11: mccain, campaign, million, john, report 
 Topic 12: iran, nuclear, will, iranian, russia 
 Topic 13: palin, sarah, governor, state, alaska 
 Topic 14: terrorist, israel, attack, kill, pakistan 
 Topic 15: school, work, union, educ, year 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, climat, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: hillari, clinton, will, democrat, obama 
 Topic 20: will, muslim, allah, islam, say 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 11 (approx. per word bound = -7.446, relative change = 1.190e-04) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 12 (approx. per word bound = -7.445, relative change = 9.684e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 13 (approx. per word bound = -7.444, relative change = 7.868e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 14 (approx. per word bound = -7.444, relative change = 6.483e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 15 (approx. per word bound = -7.443, relative change = 5.447e-05) 
Topic 1: obama, mccain, poll, voter, state 
 Topic 2: peopl, think, like, polit, dont 
 Topic 3: tax, will, econom, economi, plan 
 Topic 4: one, ’re, like, get, don’t 
 Topic 5: vote, elect, voter, republican, state 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, stori, time, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, will, energi, price, drill 
 Topic 10: law, court, case, right, rule 
 Topic 11: mccain, campaign, million, money, report 
 Topic 12: iran, nuclear, will, world, iranian 
 Topic 13: palin, sarah, governor, state, polit 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: school, work, union, educ, famili 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: hillari, clinton, will, democrat, obama 
 Topic 20: will, muslim, allah, christian, god 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 16 (approx. per word bound = -7.443, relative change = 4.466e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 17 (approx. per word bound = -7.443, relative change = 3.849e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 18 (approx. per word bound = -7.443, relative change = 3.304e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 19 (approx. per word bound = -7.442, relative change = 2.940e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 20 (approx. per word bound = -7.442, relative change = 2.816e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: peopl, think, like, dont, polit 
 Topic 3: tax, will, econom, economi, plan 
 Topic 4: one, ’re, like, get, don’t 
 Topic 5: vote, elect, voter, republican, ballot 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, right, rule 
 Topic 11: mccain, campaign, million, money, report 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, sarah, governor, polit, state 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: school, work, famili, union, educ 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: will, muslim, christian, god, allah 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 21 (approx. per word bound = -7.442, relative change = 2.788e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 22 (approx. per word bound = -7.442, relative change = 2.963e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 23 (approx. per word bound = -7.442, relative change = 2.989e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 24 (approx. per word bound = -7.441, relative change = 2.972e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 25 (approx. per word bound = -7.441, relative change = 2.946e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: peopl, think, like, dont, polit 
 Topic 3: tax, will, econom, economi, plan 
 Topic 4: one, like, get, ’re, don’t 
 Topic 5: vote, elect, voter, republican, ballot 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, rule, right 
 Topic 11: campaign, million, mccain, money, report 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, sarah, governor, biden, polit 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: school, work, famili, year, union 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: will, muslim, christian, god, allah 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 26 (approx. per word bound = -7.441, relative change = 2.891e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 27 (approx. per word bound = -7.441, relative change = 2.709e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 28 (approx. per word bound = -7.440, relative change = 2.526e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 29 (approx. per word bound = -7.440, relative change = 2.368e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 30 (approx. per word bound = -7.440, relative change = 2.309e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: peopl, think, like, dont, know 
 Topic 3: tax, will, econom, economi, plan 
 Topic 4: one, like, get, ’re, don’t 
 Topic 5: vote, elect, voter, republican, ballot 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, rule, state 
 Topic 11: campaign, million, money, report, new 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, biden, sarah, governor, polit 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: school, famili, work, american, year 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, presid, said, administr, hous 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: church, will, muslim, christian, god 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 31 (approx. per word bound = -7.440, relative change = 2.248e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 32 (approx. per word bound = -7.440, relative change = 2.151e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 33 (approx. per word bound = -7.440, relative change = 2.077e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 34 (approx. per word bound = -7.439, relative change = 2.059e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 35 (approx. per word bound = -7.439, relative change = 2.044e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: peopl, think, like, dont, know 
 Topic 3: tax, will, econom, economi, govern 
 Topic 4: one, like, get, ’re, don’t 
 Topic 5: vote, elect, voter, republican, ballot 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, rule, state 
 Topic 11: million, campaign, money, report, new 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, biden, sarah, governor, obama 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: school, american, famili, work, year 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, said, presid, administr, hous 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: church, will, muslim, christian, god 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 36 (approx. per word bound = -7.439, relative change = 2.056e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 37 (approx. per word bound = -7.439, relative change = 1.960e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 38 (approx. per word bound = -7.439, relative change = 1.885e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 39 (approx. per word bound = -7.439, relative change = 1.845e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 40 (approx. per word bound = -7.439, relative change = 1.825e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: peopl, think, like, dont, know 
 Topic 3: tax, will, econom, economi, govern 
 Topic 4: one, like, get, ’re, don’t 
 Topic 5: vote, elect, voter, ballot, republican 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, bill, vote 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, state, rule 
 Topic 11: million, campaign, money, report, new 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, biden, sarah, governor, obama 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: american, school, famili, america, work 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, said, presid, hous, administr 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: church, will, muslim, christian, god 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 41 (approx. per word bound = -7.438, relative change = 1.784e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 42 (approx. per word bound = -7.438, relative change = 1.679e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 43 (approx. per word bound = -7.438, relative change = 1.532e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 44 (approx. per word bound = -7.438, relative change = 1.484e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 45 (approx. per word bound = -7.438, relative change = 1.365e-05) 
Topic 1: obama, poll, mccain, voter, state 
 Topic 2: think, peopl, like, dont, know 
 Topic 3: tax, will, econom, economi, govern 
 Topic 4: one, like, get, ’re, don’t 
 Topic 5: vote, elect, voter, ballot, republican 
 Topic 6: obama, mccain, campaign, barack, john 
 Topic 7: media, news, time, stori, report 
 Topic 8: democrat, senat, republican, vote, bill 
 Topic 9: oil, energi, will, price, drill 
 Topic 10: law, court, case, state, legal 
 Topic 11: million, money, campaign, report, new 
 Topic 12: iran, nuclear, world, will, nation 
 Topic 13: palin, biden, sarah, governor, obama 
 Topic 14: terrorist, attack, israel, kill, govern 
 Topic 15: american, school, america, famili, year 
 Topic 16: iraq, war, iraqi, troop, militari 
 Topic 17: global, warm, abort, gay, chang 
 Topic 18: bush, said, presid, hous, sen 
 Topic 19: clinton, hillari, will, democrat, primari 
 Topic 20: church, will, muslim, christian, god 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Completing Iteration 46 (approx. per word bound = -7.438, relative change = 1.178e-05) 
....................................................................................................
Completed E-Step (3 seconds). 
Completed M-Step. 
Model Converged
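
To see what the prevalence formula rating * day in the call above implies, it can help to look at the design matrix that R builds from such a formula. The sketch below uses a tiny made-up metadata frame with the same two variable names; it shows standard R formula expansion and makes no claim about how the model uses the matrix internally.

```r
# Hypothetical metadata with the same variable names as in the example.
meta <- data.frame(
  rating = factor(c("Conservative", "Liberal", "Liberal", "Conservative")),
  day = c(10, 10, 200, 200)
)

# rating * day expands to both main effects plus their interaction.
design <- model.matrix(~ rating * day, data = meta)
print(design)
```

The interaction column ratingLiberal:day equals day for liberal blogs and zero otherwise, which is what lets the slope over time differ between the two groups.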