## Package version: 2.1.2
## Parallel computing: 2 of 8 threads used.
## See https://quanteda.io for tutorials and examples.
## 
## Attaching package: 'quanteda'
## The following object is masked from 'package:utils':
## 
##     View
## Loading required package: usethis
## 
## Attaching package: 'quanteda.textmodels'
## The following object is masked from 'package:quanteda':
## 
##     data_dfm_lbgexample
## 
## Attaching package: 'seededlda'
## The following object is masked from 'package:stats':
## 
##     terms
## 
## Attaching package: 'rsconnect'
## The following object is masked from 'package:devtools':
## 
##     lint
## 
## Attaching package: 'packrat'
## The following objects are masked from 'package:devtools':
## 
##     install, install_local

#read in the PDF files

## readtext object consisting of 30 documents and 1 docvar.
## # Description: df[,3] [30 x 3]
##   doc_id          text                docvar1    
##   <chr>           <chr>               <chr>      
## 1 Catalonia1.pdf  "\"          \"..." Catalonia1 
## 2 Catalonia10.pdf "\"          \"..." Catalonia10
## 3 Catalonia11.pdf "\"          \"..." Catalonia11
## 4 Catalonia12.pdf "\"      'Cat\"..." Catalonia12
## 5 Catalonia13.pdf "\"     Catal\"..." Catalonia13
## 6 Catalonia14.pdf "\" Catalonia\"..." Catalonia14
## # ... with 24 more rows
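
The `docvar1` column above matches each file name without its extension, which is what `readtext()` produces when `docvarsfrom = "filenames"` is set. The chunk that read the PDFs is not shown, so the following is only a minimal sketch under that assumption. The folder `toy_articles` and the two text files are hypothetical stand-ins for the real PDF folder.

```r
library(readtext)

# Hypothetical stand-in for the folder of PDFs: two small text files
toy_dir <- file.path(tempdir(), "toy_articles")
dir.create(toy_dir, showWarnings = FALSE)
writeLines("Catalonia held a vote.", file.path(toy_dir, "Toy1.txt"))
writeLines("Madrid objected to the vote.", file.path(toy_dir, "Toy2.txt"))

# A glob pattern picks up every matching file; docvarsfrom = "filenames"
# stores the file name (minus extension) as a document variable
toy_files <- readtext(file.path(toy_dir, "*.txt"), docvarsfrom = "filenames")
toy_files
```

For the real data the glob would presumably end in `*.pdf`, and reading PDFs may need additional packages installed alongside `readtext`.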

#create corpus

## [1] 30
##       docvar1
## 1  Catalonia1
## 2 Catalonia10
## 3 Catalonia11
## 4 Catalonia12
## 5 Catalonia13
## 6 Catalonia14
## Corpus consisting of 30 documents, showing 30 documents:
## 
##             Text Types Tokens Sentences     docvar1
##   Catalonia1.pdf   245    496        16  Catalonia1
##  Catalonia10.pdf  1124   4368       237 Catalonia10
##  Catalonia11.pdf  1352   4398       223 Catalonia11
##  Catalonia12.pdf   291    677        22 Catalonia12
##  Catalonia13.pdf   512   1160        33 Catalonia13
##  Catalonia14.pdf   285    633        24 Catalonia14
##  Catalonia15.pdf   461   1047        34 Catalonia15
##  Catalonia16.pdf   448    992        30 Catalonia16
##  Catalonia17.pdf   475   1018        32 Catalonia17
##  Catalonia18.pdf  1427   5410       217 Catalonia18
##  Catalonia19.pdf   296    660        19 Catalonia19
##   Catalonia2.pdf   355    771        14  Catalonia2
##  Catalonia20.pdf   423    947        33 Catalonia20
##  Catalonia21.pdf   422    824        27 Catalonia21
##  Catalonia22.pdf   274    497        17 Catalonia22
##  Catalonia23.pdf   654   2128        63 Catalonia23
##  Catalonia24.pdf   667   1814        42 Catalonia24
##  Catalonia25.pdf   397    811        31 Catalonia25
##  Catalonia26.pdf   273    594        25 Catalonia26
##  Catalonia27.pdf   467    990        24 Catalonia27
##  Catalonia28.pdf   453   1095        34 Catalonia28
##  Catalonia29.pdf   436    914        31 Catalonia29
##   Catalonia3.pdf   461   1162        39  Catalonia3
##  Catalonia30.pdf   325    669        22 Catalonia30
##   Catalonia4.pdf   461   1029        40  Catalonia4
##   Catalonia5.pdf   353    663        20  Catalonia5
##   Catalonia6.pdf   487   1181        36  Catalonia6
##   Catalonia7.pdf   385    937        45  Catalonia7
##   Catalonia8.pdf   307    652        21  Catalonia8
##   Catalonia9.pdf   287    541        18  Catalonia9
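
The three outputs above look like a document count, the first rows of the document variables, and a corpus summary. As a rough sketch of calls that produce output of this shape, assuming the corpus was built with `corpus()` (the toy texts and the object names are invented for illustration):

```r
library(quanteda)

# Hypothetical two-document stand-in for the 30 articles
toy_texts <- c(
  Toy1.txt = "Catalonia held a vote. Madrid objected.",
  Toy2.txt = "Protesters filled the streets of Barcelona."
)
toy_corpus <- corpus(
  toy_texts,
  docvars = data.frame(docvar1 = c("Toy1", "Toy2"), stringsAsFactors = FALSE)
)

ndoc(toy_corpus)            # number of documents
head(docvars(toy_corpus))   # first rows of the document variables
summary(toy_corpus)         # types, tokens and sentences per document
```

With a `readtext` object, passing it straight to `corpus()` should carry the document variables over without the explicit `docvars` argument.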

#create dfm

## Length  Class   Mode 
##  97440    dfm     S4
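
The `Length` reported for a dfm is the number of cells in the matrix, documents times features. For the 30 documents here, 97440 / 30 = 3248, so the dfm appears to hold 3248 features. A small sketch of that relationship on invented toy data (the original chunk is not shown, and `dfm(tokens(...))` is used here because it works across quanteda versions):

```r
library(quanteda)

# Hypothetical toy corpus
toy_corpus <- corpus(c(
  d1 = "Catalonia held a vote.",
  d2 = "Madrid objected to the vote."
))

# Tokenise first, then build the document-feature matrix
toy_dfm <- dfm(tokens(toy_corpus))

dim(toy_dfm)                                        # documents x features
length(toy_dfm) == ndoc(toy_dfm) * nfeat(toy_dfm)   # cells = docs * features
```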

#Cleaning up using tokens

##                 Length Class  Mode     
## Catalonia1.pdf   239   -none- character
## Catalonia10.pdf 1743   -none- character
## Catalonia11.pdf 1994   -none- character
## Catalonia12.pdf  339   -none- character
## Catalonia13.pdf  594   -none- character
## Catalonia14.pdf  346   -none- character
## Catalonia15.pdf  521   -none- character
## Catalonia16.pdf  513   -none- character
## Catalonia17.pdf  525   -none- character
## Catalonia18.pdf 2808   -none- character
## Catalonia19.pdf  325   -none- character
## Catalonia2.pdf   382   -none- character
## Catalonia20.pdf  474   -none- character
## Catalonia21.pdf  460   -none- character
## Catalonia22.pdf  285   -none- character
## Catalonia23.pdf 1112   -none- character
## Catalonia24.pdf  828   -none- character
## Catalonia25.pdf  428   -none- character
## Catalonia26.pdf  317   -none- character
## Catalonia27.pdf  446   -none- character
## Catalonia28.pdf  550   -none- character
## Catalonia29.pdf  443   -none- character
## Catalonia3.pdf   578   -none- character
## Catalonia30.pdf  362   -none- character
## Catalonia4.pdf   496   -none- character
## Catalonia5.pdf   360   -none- character
## Catalonia6.pdf   567   -none- character
## Catalonia7.pdf   466   -none- character
## Catalonia8.pdf   331   -none- character
## Catalonia9.pdf   289   -none- character
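
The token counts above are well below the raw counts in the corpus summary, and the tokens printed later are lower-cased stems, so the cleaning step most likely dropped punctuation and stopwords and then stemmed. The exact options used are not shown; the following is a sketch of one plausible pipeline on an invented sentence.

```r
library(quanteda)

# Hypothetical one-document example
toy_tokens <- tokens(
  c(d1 = "The voters occupied 3 schools, chanting slogans!"),
  remove_punct = TRUE,
  remove_numbers = TRUE
)

toy_tokens <- tokens_tolower(toy_tokens)                  # lower-case
toy_tokens <- tokens_remove(toy_tokens, stopwords("en"))  # drop stopwords
toy_tokens <- tokens_wordstem(toy_tokens)                 # reduce to stems

toy_tokens
```

Stemming is why the keyword searches further down use forms such as `occupi` and `repres` rather than full words.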

#kwic() does not accept a dfm, so it has to be run on the tokens object

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| Catalonia19.pdf | 4 | 4 | madrid seal spain | fate | salmond pro-independ catalan | fate |
| Catalonia19.pdf | 107 | 107 | judg actual seal | fate | spanish state see | fate |
| Catalonia19.pdf | 227 | 227 | madrid seal spain | fate | salmond pro-independ catalan | fate |
| Catalonia2.pdf | 325 | 325 | govern surviv hing | fate | talk aim walk | fate |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| Catalonia16.pdf | 210 | 210 | cover face scarv | occupi | high-spe railway track | occupi |
| Catalonia24.pdf | 568 | 568 | referendum artur mas | occupi | presid catalan govern | occupi |
| Catalonia26.pdf | 229 | 229 | would-b voter chant | occupi | forc vote other | occupi |
| Catalonia28.pdf | 42 | 42 | support referendum organis | occupi | school region can | occupi |
| Catalonia28.pdf | 123 | 123 | go ahead ballot | occupi | least school throughout | occupi |
| Catalonia28.pdf | 321 | 321 | use children weekend | occupi | school catalonia can | occupi |
| Catalonia28.pdf | 330 | 330 | station parent pupil | occupi | school polic dismantl | occupi |
| Catalonia3.pdf | 99 | 99 | polic demonstr attempt | occupi | barcelona-el prat airport | occupi |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| Catalonia26.pdf | 163 | 163 | rajoy said said | suppress | show dread extern | suppress |
| Catalonia7.pdf | 177 | 177 | back control madrid | suppress | catalan languag catalan | suppress |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| Catalonia15.pdf | 401 | 401 | voter spanish govern | repres | catalonia enric millo | repres |
| Catalonia18.pdf | 316 | 316 | import step legitim | repres | citizen follow peopl | repres |
| Catalonia18.pdf | 1382 | 1382 | spain expect send | repres | rule region long | repres |
| Catalonia2.pdf | 153 | 153 | hold dialogu catalan | repres | puigdemont given hope | repres |
| Catalonia23.pdf | 270 | 270 | import step legitim | repres | citizen follow peopl | repres |
| Catalonia23.pdf | 718 | 718 | meet plenari session | repres | citizen sovereignti abl | repres |
| Catalonia25.pdf | 361 | 361 | within spain catalonia | repres | fifth spain gross | repres |
| Catalonia29.pdf | 140 | 140 | catalan offici turnout | repres | region million elig | repres |
| Catalonia3.pdf | 126 | 126 | torra carri claim | repres | view region peopl | repres |
| Catalonia4.pdf | 322 | 322 | said torrent us | repres | opportun take vote | repres |
| Catalonia6.pdf | 555 | 555 | sole author necessarili | repres | duran http://infobrics.org/post/31770/ https://theduran.com/cdn-cgi/l/email- | repres |
| Catalonia8.pdf | 99 | 99 | case state attorney-gener | repres | madrid govern call | repres |

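
Each keyword-in-context row above has three tokens of context on either side of the match, and the keywords are stems. A minimal sketch of a call that gives rows of this shape, assuming `kwic()` was run on the stemmed tokens with `window = 3` (the sentence and object names are invented):

```r
library(quanteda)

# Hypothetical one-document example, stemmed so the stem can be searched
toy_tokens <- tokens(c(d1 = "police said parents occupied the school overnight"))
toy_tokens <- tokens_wordstem(toy_tokens)

# Three tokens of context before and after each match
toy_kwic <- kwic(toy_tokens, pattern = "occupi", window = 3)
toy_kwic
```

A pattern with no matches returns a table with only the column names, which is what the empty header rows in the output correspond to.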
## Tokens consisting of 30 documents and 1 docvar.
## Catalonia1.pdf :
##  [1] "catalan"   "elect"     "quim"      "torra"     "remov"     "may"      
##  [7] "plebiscit" "independ"  "nation"    "scotland"  "septemb"   "tuesday"  
## [ ... and 227 more ]
## 
## Catalonia10.pdf :
##  [1] "catalan"   "presid"    "outlin"    "challeng"  "spanish"   "daili"    
##  [7] "interview" "bbc"       "monitor"   "europ"     "polit"     "suppli"   
## [ ... and 1,731 more ]
## 
## Catalonia11.pdf :
##  [1] "catalan"  "agree"    "independ" "mean"     "atlantic" "online"  
##  [7] "decemb"   "friday"   "atlantic" "month"    "group"    "inc"     
## [ ... and 1,981 more ]
## 
## Catalonia12.pdf :
##  [1] "catalonia"  "want"       "independ"   "catalan"    "politician"
##  [6] "ignores"    "voter"      "amid"       "threat"     "express"   
## [11] "online"     "october"   
## [ ... and 327 more ]
## 
## Catalonia13.pdf :
##  [1] "catalonia" "independ"  "ten"       "thousand"  "take"      "barcelona"
##  [7] "street"    "support"   "unit"      "spain"     "dens"      "crowd"    
## [ ... and 582 more ]
## 
## Catalonia14.pdf :
##  [1] "catalonia" "live"      "webcam"    "watch"     "crowd"     "celebr"   
##  [7] "catalan"   "independ"  "spain"     "live"      "express"   "online"   
## [ ... and 334 more ]
## 
## [ reached max_ndoc ... 24 more documents ]
docname from to pre keyword post pattern

Catalonia.tokens2

#create dfm from first tokenized steps

#word cloud

#co-occurrence

##      spain     region       call  barcelona     declar parliament      rajoy 
##     107899     105823      68097      59083      52247      49770      49719 
##     madrid    protest       fear 
##      48936      46295      44430
##  [1] "spain"      "region"     "call"       "barcelona"  "declar"    
##  [6] "parliament" "rajoy"      "madrid"     "protest"    "fear"      
## [11] "puigdemont" "rule"       "peopl"      "minist"     "crisi"     
## [16] "control"    "support"    "back"       "state"      "forc"      
## [21] "spanish"    "parti"      "also"       "prime"      "caption"
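
The first output above appears to be the ten features with the highest co-occurrence counts, and the second the names of the top 25, which is the usual way to trim a feature co-occurrence matrix before plotting it. The original chunk is not shown, so this is only a sketch of that pattern on invented toy tokens, assuming document-level co-occurrence.

```r
library(quanteda)

# Hypothetical toy tokens
toy_tokens <- tokens(c(
  d1 = "spain region call barcelona",
  d2 = "spain region protest",
  d3 = "spain madrid"
))

# Feature co-occurrence matrix: counts of features appearing in the same document
toy_fcm <- fcm(dfm(toy_tokens))

# Keep only the most frequently co-occurring features
top_feats <- names(topfeatures(toy_fcm, 3))
toy_fcm_top <- fcm_select(toy_fcm, pattern = top_feats)
toy_fcm_top
```

Passing the tokens to `fcm()` with `context = "window"` instead would count co-occurrence within a sliding window rather than within whole documents.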