This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
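The plot itself did not survive in this copy, so here is a minimal sketch of what such a chunk might contain. It assumes the chunk header reads {r pressure, echo=FALSE} and that the plotted data is R's built-in pressure data set, as in the default R Markdown template; neither is confirmed by the text above.

```r
# Body of a chunk whose header is assumed to be {r pressure, echo=FALSE}.
# `pressure` is a data set shipped with base R (an assumption about the original plot).
# With echo = FALSE, the knitted document shows the figure but not this line of code.
plot(pressure)
```

Setting echo = TRUE on the same chunk would print the plot(pressure) call above the figure.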
# Packages assumed by the code below (the setup chunk is not shown in the original)
library(ggplot2)   # ggplot(), geom_histogram(), labs()
library(dplyr)     # %>%, tibble(), count(), top_n()
library(tidytext)  # unnest_tokens()

# Read the first 10,000 lines of each corpus
blogs <- readLines("en_US.blogs.txt", n = 10000)
## Warning in readLines("en_US.blogs.txt", n = 10000): incomplete final line found
## on ‘en_US.blogs.txt’
news <- readLines("en_US.news.txt", n = 10000)
## Warning in readLines("en_US.news.txt", n = 10000): incomplete final line found
## on ‘en_US.news.txt’
twitter <- readLines("en_US.twitter.txt", n = 10000)
## Warning in readLines("en_US.twitter.txt", n = 10000): incomplete final line
## found on ‘en_US.twitter.txt’

# Histogram of the number of characters per blog line
blogs_char <- nchar(blogs)
ggplot(data.frame(length = blogs_char), aes(x = length)) +
  geom_histogram(binwidth = 50, fill = "blue", color = "black") +
  labs(title = "Distribution of line lengths in Blogs",
       x = "Number of characters", y = "Frequency")

# Split each blog line into one word per row
blogs_words <- tibble(text = blogs) %>%
  unnest_tokens(word, text)
blogs_words %>% count(word, sort=TRUE) %>% top_n(10)
## Selecting by n
## # A tibble: 1 × 2
##   word      n
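Because the table above is cut off, the following sketch illustrates the same word-frequency idea on toy data. It deliberately swaps in base R (strsplit, table, sort) for the tidytext pipeline, and the sample_lines vector is a hypothetical stand-in for blogs, not data from the original files.

```r
# Toy stand-in for the `blogs` vector (hypothetical data, not from en_US.blogs.txt).
sample_lines <- c("The cat sat on the mat.", "The dog chased the cat!")

# Lower-case the text, then split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(sample_lines), "[^a-z']+"))
tokens <- tokens[nzchar(tokens)]  # drop any empty strings produced by the split

# Tabulate the words and keep the (up to) ten most frequent.
word_counts <- sort(table(tokens), decreasing = TRUE)
top_words <- head(word_counts, 10)
print(top_words)
```

In this toy input "the" is the most frequent word, which hints at why stop words are often removed before ranking terms in a real corpus.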
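The "incomplete final line" warnings reported by readLines earlier can be silenced when they are expected. The sketch below is one possible approach, not the original author's code: read_sample is a hypothetical helper, and the choice of UTF-8 encoding and skipNul = TRUE are assumptions about the input files.

```r
# Hypothetical helper: read up to n lines without the incomplete-final-line warning.
read_sample <- function(path, n = 10000) {
  con <- file(path, open = "r", encoding = "UTF-8")  # encoding is an assumption
  on.exit(close(con))                                # always release the connection
  readLines(con, n = n, warn = FALSE, skipNul = TRUE)
}

# Demonstration on a temporary file that has no trailing newline.
tmp <- tempfile(fileext = ".txt")
cat("first line\nsecond line\nthird line", file = tmp)
demo_lines <- read_sample(tmp)
print(demo_lines)
unlink(tmp)
```

With a helper like this, the three corpora could be loaded as read_sample("en_US.blogs.txt") and so on, keeping the knitted output free of warning lines.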