## Package version: 2.1.2
## Parallel computing: 2 of 8 threads used.
## See https://quanteda.io for tutorials and examples.
## 
## Attaching package: 'quanteda'
## The following object is masked from 'package:utils':
## 
##     View
## Loading required package: usethis
## 
## Attaching package: 'quanteda.textmodels'
## The following object is masked from 'package:quanteda':
## 
##     data_dfm_lbgexample
## 
## Attaching package: 'seededlda'
## The following object is masked from 'package:stats':
## 
##     terms
## 
## Attaching package: 'rsconnect'
## The following object is masked from 'package:devtools':
## 
##     lint
## 
## Attaching package: 'packrat'
## The following objects are masked from 'package:devtools':
## 
##     install, install_local

#read the PDF files with readtext

## readtext object consisting of 30 documents and 2 docvars.
## # Description: df[,4] [30 x 4]
##   doc_id              text                docvar1 docvar2    
##   <chr>               <chr>               <chr>   <chr>      
## 1 New_Caledonia1.pdf  "\"   A bit o\"..." New     Caledonia1 
## 2 New_Caledonia10.pdf "\"          \"..." New     Caledonia10
## 3 New_Caledonia11.pdf "\"          \"..." New     Caledonia11
## 4 New_Caledonia12.pdf "\"    High t\"..." New     Caledonia12
## 5 New_Caledonia13.pdf "\"         H\"..." New     Caledonia13
## 6 New_Caledonia14.pdf "\" New Caled\"..." New     Caledonia14
## # ... with 24 more rows
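
The two docvars in the output above look like each file name split at the underscore. A minimal sketch of how such an object could be produced, assuming `readtext()` was called with `docvarsfrom = "filenames"`. The temporary `.txt` files and their contents are hypothetical stand-ins for the 30 PDFs, which are not available here:

```r
library(readtext)

# Hypothetical stand-in files; the original analysis reads 30 PDFs instead.
demo_dir <- file.path(tempdir(), "nc_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("New Caledonia votes on independence.",
           file.path(demo_dir, "New_Caledonia1.txt"))
writeLines("Voters reject independence from France.",
           file.path(demo_dir, "New_Caledonia2.txt"))

# Split each file name (without extension) at "_" into docvar1 and docvar2,
# which are the default names when docvarnames is not supplied.
nc_files <- readtext(
  paste0(demo_dir, "/*.txt"),
  docvarsfrom = "filenames",
  dvsep = "_"
)
print(nc_files)
```

With real data, the glob would point at the folder holding the PDFs, and `docvarnames` could be used to give the two variables more meaningful names.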

#create corpus

## [1] 30
##   docvar1     docvar2
## 1     New  Caledonia1
## 2     New Caledonia10
## 3     New Caledonia11
## 4     New Caledonia12
## 5     New Caledonia13
## 6     New Caledonia14
## Corpus consisting of 30 documents, showing 30 documents:
## 
##                 Text Types Tokens Sentences docvar1     docvar2
##   New_Caledonia1.pdf   598   1375        61     New  Caledonia1
##  New_Caledonia10.pdf   424    898        23     New Caledonia10
##  New_Caledonia11.pdf   318    563        20     New Caledonia11
##  New_Caledonia12.pdf   387    776        30     New Caledonia12
##  New_Caledonia13.pdf   435   1085        43     New Caledonia13
##  New_Caledonia14.pdf   253    506        18     New Caledonia14
##  New_Caledonia15.pdf   556   1357        46     New Caledonia15
##  New_Caledonia16.pdf   557   1288        38     New Caledonia16
##  New_Caledonia17.pdf   256    516        16     New Caledonia17
##  New_Caledonia18.pdf   341    696        19     New Caledonia18
##  New_Caledonia19.pdf   259    511        15     New Caledonia19
##   New_Caledonia2.pdf   502   1264        43     New  Caledonia2
##  New_Caledonia20.pdf   448    865        14     New Caledonia20
##  New_Caledonia21.pdf   449   1019        39     New Caledonia21
##  New_Caledonia22.pdf   894   2536        83     New Caledonia22
##  New_Caledonia23.pdf   739   2006        70     New Caledonia23
##  New_Caledonia24.pdf   262    460        18     New Caledonia24
##  New_Caledonia25.pdf   427    873        29     New Caledonia25
##  New_Caledonia26.pdf   458   1098        35     New Caledonia26
##  New_Caledonia27.pdf  1382   4800       154     New Caledonia27
##  New_Caledonia28.pdf   517   1113        37     New Caledonia28
##  New_Caledonia29.pdf   413    872        26     New Caledonia29
##   New_Caledonia3.pdf   307    569        22     New  Caledonia3
##  New_Caledonia30.pdf   239    427        13     New Caledonia30
##   New_Caledonia4.pdf   205    367        13     New  Caledonia4
##   New_Caledonia5.pdf   518   1259        51     New  Caledonia5
##   New_Caledonia6.pdf   259    581        19     New  Caledonia6
##   New_Caledonia7.pdf   470   1186        39     New  Caledonia7
##   New_Caledonia8.pdf   462   1001        33     New  Caledonia8
##   New_Caledonia9.pdf   430    904        24     New  Caledonia9
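
The three outputs above (document count, first docvars, per-document summary) can be reproduced along these lines. This is a sketch only: the two-row data frame is a hypothetical stand-in for the readtext object, and the object names are assumptions.

```r
library(quanteda)

# Hypothetical two-document data frame shaped like a readtext object.
nc_df <- data.frame(
  doc_id  = c("New_Caledonia1.pdf", "New_Caledonia2.pdf"),
  text    = c("New Caledonia votes on independence. Turnout was high.",
              "Voters reject independence from France."),
  docvar1 = c("New", "New"),
  docvar2 = c("Caledonia1", "Caledonia2"),
  stringsAsFactors = FALSE
)

# corpus() picks up the doc_id and text columns by default;
# the remaining columns become docvars.
nc_corpus <- corpus(nc_df)

ndoc(nc_corpus)            # number of documents
head(docvars(nc_corpus))   # document-level variables
summary(nc_corpus)         # types, tokens and sentences per document
```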

#create dfm

## Length  Class   Mode 
##  88530    dfm     S4
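
The `Length` of 88530 reported above is consistent with a 30-document matrix holding 2951 features (30 × 2951 = 88530), since the length of a dfm is its number of cells. A small sketch with hypothetical texts shows the same relationship; it tokenizes first and then builds the dfm, which is an assumption about the hidden code rather than a reproduction of it:

```r
library(quanteda)

# Hypothetical mini-corpus
txt <- c(d1 = "France and New Caledonia", d2 = "New Caledonia votes")
nc_dfm <- dfm(tokens(txt))

ndoc(nc_dfm)                    # rows: documents
nfeat(nc_dfm)                   # columns: distinct features
ndoc(nc_dfm) * nfeat(nc_dfm)    # number of cells, equal to length(nc_dfm)
```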

#clean up using tokens

##                     Length Class  Mode     
## New_Caledonia1.pdf   687   -none- character
## New_Caledonia10.pdf  458   -none- character
## New_Caledonia11.pdf  312   -none- character
## New_Caledonia12.pdf  386   -none- character
## New_Caledonia13.pdf  543   -none- character
## New_Caledonia14.pdf  260   -none- character
## New_Caledonia15.pdf  673   -none- character
## New_Caledonia16.pdf  645   -none- character
## New_Caledonia17.pdf  270   -none- character
## New_Caledonia18.pdf  378   -none- character
## New_Caledonia19.pdf  262   -none- character
## New_Caledonia2.pdf   614   -none- character
## New_Caledonia20.pdf  450   -none- character
## New_Caledonia21.pdf  517   -none- character
## New_Caledonia22.pdf 1252   -none- character
## New_Caledonia23.pdf 1001   -none- character
## New_Caledonia24.pdf  239   -none- character
## New_Caledonia25.pdf  409   -none- character
## New_Caledonia26.pdf  593   -none- character
## New_Caledonia27.pdf 2479   -none- character
## New_Caledonia28.pdf  587   -none- character
## New_Caledonia29.pdf  450   -none- character
## New_Caledonia3.pdf   307   -none- character
## New_Caledonia30.pdf  225   -none- character
## New_Caledonia4.pdf   178   -none- character
## New_Caledonia5.pdf   647   -none- character
## New_Caledonia6.pdf   286   -none- character
## New_Caledonia7.pdf   581   -none- character
## New_Caledonia8.pdf   504   -none- character
## New_Caledonia9.pdf   461   -none- character
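
The lengths above are token counts per document after cleaning. The exact cleaning steps are not shown, so the following is only a plausible sketch: it assumes punctuation and numbers were removed, text was lowercased, English stopwords were dropped and the remaining tokens were stemmed, which would match stems such as "coloni" and "independ" seen later in the output. The example sentence is hypothetical.

```r
library(quanteda)

txt <- c(d1 = "The colonies voted for independence in 2018.")

toks <- tokens(txt, remove_punct = TRUE, remove_numbers = TRUE)
toks <- tokens_tolower(toks)                             # lowercase everything
toks <- tokens_remove(toks, pattern = stopwords("en"))   # drop "the", "for", "in"
toks <- tokens_wordstem(toks)                            # reduce words to stems

print(toks)
ntoken(toks)   # tokens left per document after cleaning
```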

#kwic() does not work on a dfm, so it has to be run on the tokens object

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| New_Caledonia28.pdf | 516 | 516 | papuan kanak consid | fate | intertwin believ can | fate |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| New_Caledonia1.pdf | 90 | 90 | independ franc lost | coloni | possess sinc djibouti | coloni |
| New_Caledonia10.pdf | 349 | 349 | territori legaci countri | coloni | histori sometim dub | coloni |
| New_Caledonia10.pdf | 398 | 398 | caledonia throw shackl | coloni | author pari kanak | coloni |
| New_Caledonia12.pdf | 324 | 324 | franc becam penal | coloni | new caledonia current | coloni |
| New_Caledonia12.pdf | 361 | 361 | last fulli french | coloni | becom independ djibouti | coloni |
| New_Caledonia15.pdf | 98 | 98 | iii use penal | coloni | much next half-centuri | coloni |
| New_Caledonia16.pdf | 58 | 58 | year french territori | coloni | sinc final stage | coloni |
| New_Caledonia20.pdf | 135 | 135 | self-determin throw shackl | coloni | author pari indigen | coloni |
| New_Caledonia24.pdf | 116 | 116 | foreign affair french | coloni | archipelago becam french | coloni |
| New_Caledonia24.pdf | 129 | 129 | use decad prison | coloni | kanak suffer tough | coloni |
| New_Caledonia24.pdf | 146 | 146 | grant kanak old | coloni | tension fuel conflict | coloni |
| New_Caledonia25.pdf | 255 | 255 | sovereignti sinc end | coloni | constitut much part | coloni |
| New_Caledonia25.pdf | 391 | 391 | focus away former | coloni | power everi place | coloni |
| New_Caledonia27.pdf | 36 | 36 | polit french pacif | coloni | hot support oppon | coloni |
| New_Caledonia27.pdf | 108 | 108 | rose revolt french | coloni | 1980s’see noumea accord | coloni |
| New_Caledonia27.pdf | 1331 | 1331 | french nation resid | coloni | can vote new | coloni |
| New_Caledonia27.pdf | 1766 | 1766 | highlight link across | coloni | boundari dew gorod | coloni |
| New_Caledonia28.pdf | 304 | 304 | cultur shut anti- | coloni | protest gave birth | coloni |
| New_Caledonia28.pdf | 344 | 344 | rememb guillotin order | coloni | governor kanak men | coloni |
| New_Caledonia28.pdf | 510 | 510 | melanesian nation still | coloni | rule west papuan | coloni |
| New_Caledonia9.pdf | 351 | 351 | territori legaci countri | coloni | histori sometim dub | coloni |
| New_Caledonia9.pdf | 400 | 400 | caledonia throw shackl | coloni | author pari kanak | coloni |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| New_Caledonia2.pdf | 393 | 393 | year now aim | occupi | import place nickel | occupi |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| New_Caledonia1.pdf | 340 | 340 | mile north noumea | massacr | took place attend | massacr |
| New_Caledonia28.pdf | 130 | 130 | caledonia 1980s culmin | massacr | kanak french commando | massacr |
| New_Caledonia28.pdf | 417 | 417 | band nodeak document | massacr | french special forc | massacr |

| docname | from | to | pre | keyword | post | pattern |
|---|---|---|---|---|---|---|
| New_Caledonia1.pdf | 42 | 42 | french tricolor overlook | repres | assembl chamber gendarm | repres |
| New_Caledonia1.pdf | 484 | 484 | one territori current | repres | french congress philipp | repres |
| New_Caledonia11.pdf | 223 | 223 | includ nativ kanak | repres | per cent popul | repres |
| New_Caledonia13.pdf | 177 | 177 | parliament new caledonia | repres | two deputi two | repres |
| New_Caledonia15.pdf | 454 | 454 | senat french state | repres | high commission republ | repres |
| New_Caledonia15.pdf | 650 | 650 | press caledonia elect | repres | express share desir | repres |
| New_Caledonia16.pdf | 101 | 101 | signatori nouméa accord | repres | french state set | repres |
| New_Caledonia2.pdf | 124 | 124 | referendum kanak independentist | repres | slight less half | repres |
| New_Caledonia2.pdf | 486 | 486 | ampl new caledonia | repres | without doubt import | repres |
| New_Caledonia2.pdf | 586 | 586 | colon indigen kanak | repres | percent popul archipelago | repres |
| New_Caledonia21.pdf | 444 | 444 | new caledonia continu | repres | pro- independ kanak | repres |
| New_Caledonia22.pdf | 556 | 556 | prepar committe compris | repres | union territori congress | repres |
| New_Caledonia22.pdf | 774 | 774 | august october figur | repres | slight improv end | repres |
| New_Caledonia27.pdf | 235 | 235 | origin sign may | repres | french state flnks | repres |
| New_Caledonia27.pdf | 919 | 919 | creation consult bodi | repres | ethnic communiti refus | repres |
| New_Caledonia27.pdf | 1336 | 1336 | vote new caledonia | repres | nation assembl senat | repres |
| New_Caledonia27.pdf | 2038 | 2038 | commiss develop flag | repres | kanak ident futur | repres |
| New_Caledonia29.pdf | 173 | 173 | caledonia indigen parti | repres | less half elector | repres |
| New_Caledonia3.pdf | 174 | 174 | includ nativ kanak | repres | percent popul peopl | repres |
| New_Caledonia5.pdf | 58 | 58 | econom growth doorstep | repres | promis kept also | repres |
| New_Caledonia6.pdf | 190 | 190 | forrest told pacnews | repres | larg vote popul | repres |

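
A dfm only stores counts per document, so the word order that a keyword-in-context search needs is no longer there; the tokens object still has it. The results above show three tokens on either side of each keyword, which would be consistent with `window = 3` (an assumption, since the original call is not shown). A minimal sketch with a hypothetical sentence:

```r
library(quanteda)

txt <- c(d1 = "France lost its last colony when Djibouti became independent.")
toks <- tokens_tolower(tokens(txt, remove_punct = TRUE))

# Search the tokens object, keeping three tokens of context on each side.
nc_kwic <- kwic(toks, pattern = "colony", window = 3)
print(nc_kwic)
```

Several keywords can be searched in one call by passing a character vector as `pattern`; patterns without a match simply contribute no rows.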
## Tokens consisting of 30 documents and 2 docvars.
## New_Caledonia1.pdf :
##  [1] "bit"       "franc"     "south"     "pacif"     "vote"      "year"     
##  [7] "independ"  "new"       "caledonia" "chanc"     "go"        "way"      
## [ ... and 675 more ]
## 
## New_Caledonia10.pdf :
##  [1] "franc"     "pacif"     "pebbl"     "new"       "caledonia" "reject"   
##  [7] "independ"  "agenc"     "franc"     "press"     "english"   "novemb"   
## [ ... and 446 more ]
## 
## New_Caledonia11.pdf :
##  [1] "french"    "pacif"     "territori" "new"       "caledonia" "reject"   
##  [7] "independ"  "premium"   "official"  "new"       "novemb"    "friday"   
## [ ... and 300 more ]
## 
## New_Caledonia12.pdf :
##  [1] "high"       "turnourt"   "new"        "caledonia"  "independ"  
##  [6] "referendum" "vote"       "close"      "dpa"        "intern"    
## [11] "englischer" "dienst"    
## [ ... and 374 more ]
## 
## New_Caledonia13.pdf :
##  [1] "histor"     "independ"   "vote"       "new"        "caledonia" 
##  [6] "today"      "long-await" "referendum" "french"     "island"    
## [11] "territori"  "affect"    
## [ ... and 531 more ]
## 
## New_Caledonia14.pdf :
##  [1] "new"        "caledonia"  "french"     "pm"         "promis"    
##  [6] "new"        "caledonia"  "referendum" "return"     "rebel"     
## [11] "chief"      "head"      
## [ ... and 248 more ]
## 
## [ reached max_ndoc ... 24 more documents ]

New_Caledonia.tokens2

#create dfm from the tokens produced in the first tokenization step

#word cloud

#co-occurrence

##      timor      parti        sea      kanak    support       east     remain 
##      63828      30145      29197      27777      27298      26624      25737 
##      elect referendum       also 
##      25608      25392      25040
##  [1] "timor"      "parti"      "sea"        "kanak"      "support"   
##  [6] "east"       "remain"     "elect"      "referendum" "also"      
## [11] "flnks"      "polit"      "rump"       "frogier"    "caledonia" 
## [16] "noumea"     "boundari"   "uc"         "accord"     "new"       
## [21] "gome"       "nickel"     "transfer"   "howev"      "provinc"
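
The two outputs above look like the top 10 features of a feature co-occurrence matrix with their scores, followed by the names of the top 25. A sketch of that pattern, assuming `fcm()` and `topfeatures()` were used; the texts, object names and the document-level context are all hypothetical choices:

```r
library(quanteda)

# Hypothetical, already-cleaned mini-corpus
txt <- c(d1 = "kanak parti support referendum",
         d2 = "kanak parti remain",
         d3 = "timor sea kanak")

# Count how often features appear together in the same document.
nc_fcm <- fcm(tokens(txt), context = "document")

topfeatures(nc_fcm, 3)                        # highest-ranked features with scores
top_feats <- names(topfeatures(nc_fcm, 3))    # just their names

# Reduce the matrix to those features, e.g. before plotting a network.
nc_fcm_top <- fcm_select(nc_fcm, pattern = top_feats)
dim(nc_fcm_top)
```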