About the files

## [1] "en_US.blogs.txt"   "en_US.corpus.rds"  "en_US.corpus.txt" 
## [4] "en_US.news.txt"    "en_US.sample.txt"  "en_US.twitter.txt"
## Warning in readLines(news_con, encoding = "UTF-8", skipNul = TRUE): incomplete
## final line found on 'D:/R Projects/Coursera/Data Science Capstone/data/final/
## en_US/en_US.news.txt'
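
The warning above is what `readLines` reports when the input ends without a final newline. On Windows, a text-mode connection can also stop early at a control byte such as 0x1A, which would make a file's line count too low. The output alone does not show whether that happened here. A minimal sketch of one common workaround, opening the connection in binary mode, follows. The helper name `read_lines_binary` and the toy file are hypothetical and are not taken from the original code.

```r
# Hypothetical helper: read a text file through a binary-mode connection,
# so that control bytes inside the file are not treated as end-of-file.
read_lines_binary <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  # warn = FALSE silences the "incomplete final line" notice.
  readLines(con, encoding = "UTF-8", skipNul = TRUE, warn = FALSE)
}

# Toy file with an embedded 0x1A byte and no trailing newline (made-up data).
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("first line\nsecond\x1aline\nthird line"), tmp)

toy_lines <- read_lines_binary(tmp)
print(length(toy_lines))
```

Comparing the line count from this reader with the count from a text-mode read is a quick way to check whether a file was cut short.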

File Size

File     Lines    LinesNEmpty  Chars      CharsNWhite  TotalWords  WPL_Min  WPL_Mean  WPL_Max
blogs    899288   899288       206824382  170389539    37570839    0        41.75109  6726
news     77259    77259        15639408   13072698     2651432     1        34.61779  1123
twitter  2360148  2360148      162096241  134082806    30451170    1        12.75065  47

LinesNEmpty is the number of non-empty lines, CharsNWhite is the number of non-whitespace characters, and WPL is words per line.
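
As a rough illustration of how statistics like these can be computed, here is a sketch in base R. The column names in the table resemble the output of `stringi::stri_stats_general`, but that is an assumption. The helper `summarize_lines`, its whitespace-based word count, and the toy input are all hypothetical, so its numbers will not exactly match a dedicated tokenizer.

```r
# Hypothetical helper: line, character and word statistics for one file,
# given its contents as a character vector (one element per line).
summarize_lines <- function(lines) {
  # Words per line, approximated as runs of non-whitespace characters.
  wpl <- lengths(regmatches(lines, gregexpr("[^[:space:]]+", lines)))
  data.frame(
    Lines       = length(lines),
    LinesNEmpty = sum(grepl("[^[:space:]]", lines)),
    Chars       = sum(nchar(lines, type = "chars")),
    CharsNWhite = sum(nchar(gsub("[[:space:]]", "", lines), type = "chars")),
    TotalWords  = sum(wpl),
    WPL_Min     = min(wpl),
    WPL_Mean    = mean(wpl),
    WPL_Max     = max(wpl)
  )
}

# Toy input (made-up data, not from the corpus).
toy <- c("happy new year", "", "cant wait to see you")
toy_stats <- summarize_lines(toy)
print(toy_stats)
```

Applying the helper to each file's lines and stacking the results with `rbind` would give one row per file, as in the table above.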

Unigram, Bigram and Trigram graphs

Top 10 unigrams

term  freq
just  2579
get   2498
like  2355
one   2298
will  2195
love  1952
can   1906
time  1888
day   1801
make  1589

Top 10 bigrams

term            freq
right now       223
cant wait       194
look like       164
last night      151
feel like       145
look forward    141
dont know       126
im go           115
thank follow    113
happi birthday  98

Top 10 trigrams

term               freq
happi mother day   37
cant wait see      33
happi new year     21
let us know        16
im pretti sure     14
look forward see   14
love love love     14
feel like im       13
cinco de mayo      12
dream come true    11
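
Frequency tables of this shape can be produced by counting word n-grams line by line. The sketch below uses only base R. The helper `ngram_freq` and the toy sentences are hypothetical, and the original analysis may have used a text-mining package instead. The terms above appear to be stemmed (for example "happi") and to have some function words removed. The sketch skips both steps and only lower-cases the text and strips non-letters.

```r
# Hypothetical helper: the `top` most frequent word n-grams in a character vector.
ngram_freq <- function(lines, n = 1, top = 10) {
  # Lower-case, keep only letters and spaces, then split on runs of spaces.
  clean  <- gsub("[^a-z ]", "", tolower(lines))
  tokens <- strsplit(trimws(clean), " +")
  # Build n-grams within each line so that no n-gram spans two lines.
  grams <- unlist(lapply(tokens, function(w) {
    w <- w[nzchar(w)]
    if (length(w) < n) return(character(0))
    starts <- seq_len(length(w) - n + 1)
    vapply(starts,
           function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  counts <- sort(table(grams), decreasing = TRUE)
  out <- data.frame(term = names(counts), freq = as.integer(counts),
                    stringsAsFactors = FALSE)
  head(out, top)
}

# Toy input (made-up data, not from the corpus).
toy <- c("Happy new year!", "happy new year to you", "cant wait to see you")
toy_bigrams  <- ngram_freq(toy, n = 2)
toy_trigrams <- ngram_freq(toy, n = 3)
print(toy_trigrams)
```

The resulting `term` and `freq` columns can be passed to a plotting function such as `barplot(out$freq, names.arg = out$term)` to draw a bar chart for each n-gram size.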