I will explore this file to obtain its word frequency distribution and identify the most common words. To do that, I will take a 10% sample of the text for an initial exploratory analysis. Keep in mind that this file is drawn from news articles.
First, here are the number of lines, words, and characters in the news file, along with an example line.
## [1] "Lines: 1010242"
## [1] "Words: 34372598"
## [1] "Characters: 203791405"
## [1] "Example: Of course, Paul was 20 as a ro ..."
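The sampling and counting step described above can be sketched roughly as follows. This is a minimal Python illustration, not the code used to produce the numbers in this report. The function names `sample_lines` and `word_frequencies`, the fixed seed, and the simple lowercase tokenizer are all assumptions made for the example, and the small in-memory list stands in for the real news file.

```python
import random
import re
from collections import Counter

# Assumed tokenizer: lowercase letters, with an optional inner apostrophe (e.g. "don't").
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def sample_lines(lines, fraction=0.10, seed=42):
    """Return a random sample holding roughly `fraction` of the lines.

    The seed is fixed (an assumption) so the sample is reproducible.
    """
    rng = random.Random(seed)
    # Keep at least one line when there is any input, and never more than exists.
    k = min(len(lines), max(1, round(len(lines) * fraction)))
    return rng.sample(lines, k)


def word_frequencies(lines):
    """Count lowercase word tokens across all of the given lines."""
    counts = Counter()
    for line in lines:
        counts.update(WORD_PATTERN.findall(line.lower()))
    return counts


if __name__ == "__main__":
    # Hypothetical stand-in for the news file: a few made-up lines, repeated.
    news_lines = [
        "The council approved the new budget on Monday.",
        "Officials said the vote was closer than expected.",
        "The mayor did not comment on the decision.",
    ] * 10

    sample = sample_lines(news_lines, fraction=0.10)
    frequencies = word_frequencies(sample)

    print("Sampled lines:", len(sample))
    for word, count in frequencies.most_common(5):
        print(word, count)
```

Sampling whole lines keeps each news item intact, which matters later if word sequences are counted instead of single words. On the real file, the same idea applies: read the lines, sample 10% of them, and pass the sample to the counting function.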