Text Mining

Text mining is the mining of textual data: the process of extracting important information from text by analyzing unstructured data for patterns, associations, changes, and anomalies that are useful for producing knowledge.

The tm package

The tm package is a classic choice for text mining in R. When data are unstructured, they need to be prepared beforehand, a step that can be regarded as a kind of preprocessing.

A textual database is known as a corpus (plural: corpora), and the texts in it are treated as 'documents'. Each document in a corpus can have different characteristics: the length of the text (character sequences), the type of content (subject matter), the language in which it is written, or the style of language adopted, among other examples.

Transforming a corpus into a data set that can be submitted to analysis procedures means generating a representation that describes each document in terms of its characteristics.
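As a minimal sketch of such a representation, the base R snippet below counts how often each term occurs in each document, giving one row per document and one column per term. The three toy documents and the names `docs`, `tokens`, `vocab`, and `dtm` are assumptions made for illustration only; the tm package builds a comparable structure with its own classes.

```r
# Three toy documents (hypothetical data, for illustration only)
docs <- c(d1 = "o gato viu o rato",
          d2 = "o rato fugiu",
          d3 = "gato e rato")

# Split each document into lowercase tokens on whitespace
tokens <- strsplit(tolower(docs), "\\s+")

# Vocabulary: every distinct term across all documents
vocab <- sort(unique(unlist(tokens)))

# Count each vocabulary term per document, then transpose so that
# rows are documents and columns are terms
dtm <- t(sapply(tokens, function(tk) table(factor(tk, levels = vocab))))

print(dtm)
```

Each row of `dtm` is a numeric description of one document, which is the kind of input that later analysis steps can work with.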

# pacotes <- c("tm", "SnowballC", "RColorBrewer", "wordcloud", "biclust", "cluster", "igraph", "fpc")
# install.packages(pacotes, dependencies = TRUE)

library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.6.1
## -- Attaching packages ---------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1     v purrr   0.3.2
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## Warning: package 'ggplot2' was built under R version 3.6.1
## Warning: package 'tibble' was built under R version 3.6.1
## Warning: package 'tidyr' was built under R version 3.6.1
## Warning: package 'dplyr' was built under R version 3.6.1
## Warning: package 'stringr' was built under R version 3.6.1
## -- Conflicts ------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
library(tm)
## Warning: package 'tm' was built under R version 3.6.1
## Loading required package: NLP
## 
## Attaching package: 'NLP'
## The following object is masked from 'package:ggplot2':
## 
##     annotate
# Cleaning the data set:
# add further stopwords to be removed;
# novas=c()

# Corpus treatment

tratar_corpus <- function(x) {
  x %>%
    tm_map(stripWhitespace) %>%    # collapse extra whitespace
    tm_map(removePunctuation) %>%  # remove punctuation
    tm_map(removeNumbers) %>%      # remove numbers
    tm_map(removeWords, c(stopwords("portuguese"))) %>%  # remove stopwords; create a vector named "novas" and append it here to add more
    tm_map(stripWhitespace) %>%    # collapse extra whitespace again
    tm_map(removeNumbers)          # remove numbers again
    # Note: the stopword list is lowercase, so with tolower disabled a
    # capitalized stopword such as "Em" is not removed.
    # tm_map(content_transformer(tolower)) %>%  # convert all characters to lowercase
    # tm_map(stemDocument)                      # reduce words to their stems
}
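The comment about a vector named "novas" can be illustrated with a small base R stand-in for stopword removal. This is only a sketch: `remover_palavras` is a hypothetical helper (not part of tm), the two words in `novas` are arbitrary examples, and the word-boundary pattern is only exercised here on unaccented text. In the tm pipeline itself, the extra words would be passed as `c(stopwords("portuguese"), novas)`.

```r
# Hypothetical extra stopwords, chosen only for this example
novas <- c("pode", "cada")

# Hypothetical helper: delete whole-word matches, then tidy the whitespace
remover_palavras <- function(texto, palavras) {
  # One alternation pattern anchored on word boundaries
  padrao <- paste0("\\b(", paste(palavras, collapse = "|"), ")\\b")
  sem_palavras <- gsub(padrao, "", texto, perl = TRUE)
  # Collapse the gaps left behind and trim both ends
  trimws(gsub("\\s+", " ", sem_palavras))
}

resultado <- remover_palavras("cada documento pode assumir diferentes tamanhos", novas)
print(resultado)
```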
library(stringr)

frase <- 'Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos.'

str_split(frase, fixed('.')) # splits the string at each period
## [[1]]
## [1] "Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos"
## [2] ""
# Corpus

corpus<-VCorpus(VectorSource(frase))
as.VCorpus(corpus)
## <<VCorpus>>
## Metadata:  corpus specific: 0, document level (indexed): 0
## Content:  documents: 1
summary(corpus)
##   Length Class             Mode
## 1 2      PlainTextDocument list
inspect(corpus[1])
## <<VCorpus>>
## Metadata:  corpus specific: 0, document level (indexed): 0
## Content:  documents: 1
## 
## [[1]]
## <<PlainTextDocument>>
## Metadata:  7
## Content:  chars: 333
writeLines(as.character(frase[1]))
## Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos.
# Treating a corpus

corpus=tratar_corpus(corpus)
summary(corpus)
##   Length Class             Mode
## 1 2      PlainTextDocument list
inspect(corpus[[1]])
## <<PlainTextDocument>>
## Metadata:  7
## Content:  chars: 268
## 
## Em bases dados textuais conhecidos corpus corpora tratado documentos cada documento corpus pode assumir diferentes características relação tamanho texto sequências caracteres tipo conteúdo assunto abordado língua é escrito tipo linguagem adotada dentro outros exemplos
# Creating the term-document matrix:

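# Keep terms with at least 2 characters; wordLengths is the current tm
# control option (minWordLength and minDocFreq are not recognized by recent versions)
corpus_tf <- TermDocumentMatrix(corpus, control = list(wordLengths = c(2, Inf)))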

# Convert to a matrix to allow manipulation:

matriz <- as.matrix(corpus_tf)

# Sort terms by decreasing frequency

matriz <- sort(rowSums(matriz), decreasing = TRUE)

# Build a data.frame from the sorted frequencies

matriz <- data.frame(word = names(matriz), freq = matriz)

head(matriz, n=10)
# Plot

library(ggplot2)
head(matriz, n = 5) %>%
  ggplot(aes(word, freq)) +
  geom_bar(stat = "identity", color = "black", fill = "#87CEFA") +
  geom_text(aes(hjust = 1.3, label = freq)) +
  coord_flip() +
  labs(title = "5 most frequent words", x = "Words", y = "Number of uses")

We can build a dictionary of bigrams, trigrams, and four-grams, collectively called n-grams, which are sequences of n consecutive words.
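To make the definition concrete, here is a minimal base R sketch that lists the n-grams of a short string by sliding a window of n words over it. The function name `ngramas` is hypothetical and the splitting rule (whitespace only) is a simplifying assumption; it is independent of the RWeka tokenizers used in this section.

```r
# Hypothetical helper: all sequences of n consecutive words in a string
ngramas <- function(texto, n) {
  palavras <- strsplit(trimws(texto), "\\s+")[[1]]
  # A text with fewer than n words has no n-grams
  if (length(palavras) < n) return(character(0))
  inicio <- seq_len(length(palavras) - n + 1)
  vapply(inicio,
         function(i) paste(palavras[i:(i + n - 1)], collapse = " "),
         character(1))
}

bigramas  <- ngramas("tipo de linguagem adotada", 2)
trigramas <- ngramas("tipo de linguagem adotada", 3)
print(bigramas)
print(trigramas)
```

A text with k words yields k - n + 1 n-grams, so the four-word example gives three bigrams and two trigrams.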

library(rJava)
library(RWeka)
## Warning: package 'RWeka' was built under R version 3.6.1
BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
TrigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 3, max = 3))
FourgramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 4, max = 4))

As an example, we will create a dictionary of trigrams (three-word phrases), building it with the tm and RWeka packages as shown below. More details are available at:

https://gomesfellipe.github.io/post/2017-12-17-string/string/

trigram.Tdm <- tm::TermDocumentMatrix(corpus, control = list(tokenize = TrigramTokenizer))

# Convert to a matrix to allow manipulation:
matriz <- as.matrix(trigram.Tdm)

# Sort terms by decreasing frequency
matriz <- sort(rowSums(matriz), decreasing = TRUE)

# Build a data.frame from the sorted frequencies
matriz <- data.frame(word = names(matriz), freq = matriz)

head(matriz, n=10)
head(matriz, n = 5) %>%
  ggplot(aes(word, freq)) +
  geom_bar(stat = "identity", color = "black", fill = "#87CEFA") +
  geom_text(aes(hjust = 1.3, label = freq)) +
  coord_flip() +
  labs(title = "5 most frequent trigrams", x = "Trigrams", y = "Number of uses")

Word cloud

A word cloud is a graphical device for showing the most frequent terms in a given text. The font size in which a word is drawn is a function of the word's frequency in the text: more frequent words are drawn in larger fonts, and less frequent words in smaller fonts.
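One simple way to picture "font size as a function of frequency" is a linear rescaling of counts onto a font-size range. The sketch below is an assumption chosen for clarity and is not necessarily the formula used by the wordcloud package; the function name `tamanho_fonte`, the size range, and the counts are illustrative.

```r
# Illustrative word counts (hypothetical values)
freq <- c(casa = 241, agora = 187, logo = 177, rua = 60)

# Hypothetical helper: map counts linearly onto [minimo, maximo]
tamanho_fonte <- function(freq, minimo = 0.8, maximo = 4) {
  # If every word has the same count, give them all the same size
  if (max(freq) == min(freq)) return(rep(maximo, length(freq)))
  minimo + (freq - min(freq)) / (max(freq) - min(freq)) * (maximo - minimo)
}

tamanhos <- tamanho_fonte(freq)
print(round(tamanhos, 2))
```

The most frequent word receives the largest size and the least frequent the smallest, with the others placed proportionally in between.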

# install.packages(c("wordcloud", "tm", "textreadr", "tidytext"), dependencies = TRUE)
library(wordcloud)
## Warning: package 'wordcloud' was built under R version 3.6.1
## Loading required package: RColorBrewer
library(tm)
library(textreadr)
## Warning: package 'textreadr' was built under R version 3.6.1
library(tidytext)
## Warning: package 'tidytext' was built under R version 3.6.1
library(readr)
library(tidyverse)

arquivoPdf<-"http://objdigital.bn.br/Acervo_Digital/Livros_eletronicos/cortico.pdf"
texto<-read_pdf(arquivoPdf) 
texto <- as_tibble(texto)
glimpse(texto)
## Observations: 5,975
## Variables: 3
## $ page_id    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ element_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ text       <chr> "MINISTÉRIO DA CULTURA", "Fundação Biblioteca Nacio...
texto <-as.character(texto$text)
# texto

# Cleaning and tidying the text

texto.corpus <- Corpus(VectorSource(texto))
texto.corpus <- texto.corpus %>%
  tm_map(removePunctuation) %>%  # remove punctuation
  tm_map(removeNumbers) %>%      # remove numbers
  tm_map(stripWhitespace)        # collapse extra whitespace
## Warning in tm_map.SimpleCorpus(., removePunctuation): transformation drops
## documents
## Warning in tm_map.SimpleCorpus(., removeNumbers): transformation drops
## documents
## Warning in tm_map.SimpleCorpus(., stripWhitespace): transformation drops
## documents
texto.corpus <- texto.corpus %>%
  tm_map(tolower) %>%  # convert all letters to lowercase
  tm_map(removeWords, stopwords("por"))
## Warning in tm_map.SimpleCorpus(., tolower): transformation drops documents
## Warning in tm_map.SimpleCorpus(., removeWords, stopwords("por")):
## transformation drops documents
# other words to remove

texto.corpus <- tm_map(texto.corpus, removeWords, c("nao", "porque", "entao"))
## Warning in tm_map.SimpleCorpus(texto.corpus, removeWords, c("nao",
## "porque", : transformation drops documents
# Text stemming can be used to collapse multiple forms/derivations of the same word.

texto.corpus <- tm_map(texto.corpus, stemDocument)
## Warning in tm_map.SimpleCorpus(texto.corpus, stemDocument): transformation
## drops documents
# Word frequencies

texto.counts <- as.matrix(TermDocumentMatrix(texto.corpus))
texto.freq <- sort(rowSums(texto.counts), decreasing = TRUE)
head(texto.freq)
##  todo    lá  casa agora  logo outro 
##   263   246   241   187   177   177
# Word cloud

library(wordcloud)

set.seed(123)
wordcloud(words = names(texto.freq), freq = texto.freq, scale = c(10, 0.8), max.words = 250, random.order = TRUE)
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : agora could not be fit on page. It will not be plotted.
## (The same warning repeats for 160 more words, among them casa, todo, lá and logo.)
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fazer could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : deus could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : meio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fazia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : passo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : homem could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tempo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : saber could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : sol could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : queria could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ocasião could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : portuguê could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : força could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mãe could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : quanto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : desd could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cabelo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : piedad could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : senhor could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : família could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : filho could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pode could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : debaixo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : porta could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : boa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : daquela could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dent could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : amigo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cortiço could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : estar could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : outro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mundo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dua could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fez could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : nada could not be fit on page. It will not be plotted.
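
The warnings above arise when a word, drawn at the size implied by `scale`, no longer fits in the free space of the plotting area. As a minimal sketch, not the author's method, a smaller `scale` (here the package default `c(4, 0.5)`) and less rotation usually reduce the number of dropped words. `freq_exemplo` is a hypothetical stand-in for `texto.freq`, and `avisos` is an illustrative name for the collected warning messages.

```r
library(wordcloud)

# Hypothetical stand-in for texto.freq: a named numeric vector of term counts.
freq_exemplo <- c(casa = 120, homem = 95, tempo = 80, trabalho = 60,
                  porta = 40, mundo = 35, amigo = 20, nada = 15)

# Collect warnings raised while drawing, so dropped words can be counted
# instead of flooding the console.
avisos <- character(0)
withCallingHandlers(
  wordcloud(words = names(freq_exemplo), freq = freq_exemplo,
            scale = c(4, 0.5),      # smaller largest-word size than c(10, 0.8)
            max.words = 250,
            random.order = FALSE,   # place the most frequent words first
            rot.per = 0.2),         # fewer rotated words
  warning = function(w) {
    avisos <<- c(avisos, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Number of words that still could not be placed (depends on the device size).
length(avisos)
```

Enlarging the graphics device before plotting is another option. Which combination works best depends on the corpus and on the size of the output figure.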

library(wesanderson)
## Warning: package 'wesanderson' was built under R version 3.6.1
wordcloud(words = names(texto.freq), freq = texto.freq,
          scale = c(10, 0.8), max.words = 250,
          random.order = TRUE, rot.per = 0.6,
          colors = wes_palette("Darjeeling1"))
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : toda could not be fit on page. It will not be plotted.
## (The same warning repeats for many more words, among them: ainda, miranda, bem, janela, pouco, trabalho, mulher, cortiço, bertoleza, saia.)
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cheio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : nunca could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : deu could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : assim could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fim could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : frent could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tanto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : joão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : durant could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : jerônimo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : bruno could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : meio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : perna could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : homem could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mesa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : porém could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cavouqueiro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : contra could not be fit on page. It will not be plotted.
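
The "could not be fit on page" warnings above mean that `wordcloud()` skipped words it could not place at the requested size; the call shown in the warning text uses `scale = c(10, ...`, which makes the largest words very big. A minimal sketch of one way to reduce such warnings, assuming a smaller `scale` and a lower `rot.per` are acceptable for the plot. The frequencies in `freq_exemplo` are made-up toy values, not counts from the book:

```r
library(wordcloud)
library(RColorBrewer)

# Hypothetical toy frequencies, for illustration only
freq_exemplo <- c(casa = 50, agora = 40, logo = 30, ainda = 20, tudo = 10)

# "Dark2" provides at most 8 colors
paleta <- brewer.pal(8, "Dark2")

set.seed(42)  # word placement is partly random; fix the seed for reproducibility
wordcloud(words = names(freq_exemplo),
          freq = freq_exemplo,
          scale = c(4, 0.5),     # smaller maximum size than c(10, ...)
          min.freq = 1,
          random.order = FALSE,
          rot.per = 0.2,         # rotate only 20% of the words
          colors = paleta)
```

Enlarging the graphics device or lowering `max.words` are other options; which one works best depends on the corpus and on the output size.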

#### Another way to do it

# Function to normalize text

NormalizaParaTextMining <- function(texto){
  texto %>% 
    chartr(
    old = "áéíóúÁÉÍÓÚýÝàèìòùÀÈÌÒÙâêîôûÂÊÎÔÛãõÃÕñÑäëïöüÄËÏÖÜÿçÇ´`^~¨:.!?&$@#0123456789",
    new = "aeiouAEIOUyYaeiouAEIOUaeiouAEIOUaoAOnNaeiouAEIOUycC                       ",
    x = .) %>% # Strip accents; replace digits and the listed symbols with spaces
    str_squish() %>% # Collapse extra whitespace
    tolower() # Convert to lowercase; the last value is returned implicitly
}
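
For comparison, the same kind of normalization can be sketched with a transliteration rule instead of a hand-written character table. This is an alternative, not the function above: `normaliza_alternativa` is a hypothetical name, and unlike the `chartr()` version it turns every non-letter character into a space, not only the listed symbols.

```r
library(stringi)
library(stringr)

# Hypothetical helper, shown only as an alternative sketch
normaliza_alternativa <- function(texto) {
  texto <- stri_trans_general(texto, "Latin-ASCII")     # remove accents
  texto <- str_replace_all(texto, "[^A-Za-z\\s]", " ")  # non-letters become spaces
  texto <- str_squish(texto)                            # collapse extra whitespace
  tolower(texto)                                        # convert to lowercase
}

resultado <- normaliza_alternativa(c("João Romão", "  Cortiço, 1890! "))
print(resultado)
```

Applying the function to a few known strings like this is a quick way to check that accents, digits, and punctuation are handled as intended before running it on the whole corpus.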

# List of words to remove

palavrasRemover <- c(stopwords(kind = "pt"), letters) %>%
  as_tibble() %>% 
  rename(Palavra = value) %>% 
  mutate(Palavra = NormalizaParaTextMining(Palavra))

arquivoPdf<-"http://objdigital.bn.br/Acervo_Digital/Livros_eletronicos/cortico.pdf"

# Build a table of words and their frequencies
# (unnest_tokens() is from tidytext; read_pdf() is assumed to come from textreadr)

frequenciaPalavras <- arquivoPdf %>% 
  read_pdf() %>% 
  as_tibble() %>% 
  select(text) %>% 
  unnest_tokens(Palavra, text) %>% 
  mutate(Palavra = NormalizaParaTextMining(Palavra)) %>% 
  anti_join(palavrasRemover, by = "Palavra") %>% 
  count(Palavra, sort = TRUE) %>% 
  filter(Palavra != "")
print(frequenciaPalavras)
## # A tibble: 10,915 x 2
##    Palavra     n
##    <chr>   <int>
##  1 la        343
##  2 casa      232
##  3 agora     187
##  4 logo      178
##  5 ainda     163
##  6 tudo      157
##  7 joao      154
##  8 bem       149
##  9 romao     148
## 10 homem     145
## # ... with 10,905 more rows
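
The pipeline above can be tried on a tiny in-memory example before downloading the PDF. A minimal sketch, assuming the `tidytext` package is installed; `documento`, `remover`, and `frequencia_exemplo` are hypothetical names and the two sentences are invented for illustration:

```r
library(dplyr)
library(tibble)
library(tidytext)

# Two invented "pages" standing in for the text column of the PDF
documento <- tibble(text = c("A casa de Joao", "Joao saiu de casa agora"))

# A tiny stand-in for the stopword table
remover <- tibble(Palavra = c("a", "de"))

frequencia_exemplo <- documento %>%
  unnest_tokens(Palavra, text) %>%        # one lowercase token per row
  anti_join(remover, by = "Palavra") %>%  # drop rows whose word is in `remover`
  count(Palavra, sort = TRUE)             # frequency of each remaining word

print(frequencia_exemplo)
```

Passing `by = "Palavra"` to `anti_join()` makes the join column explicit and suppresses the "Joining, by" message.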
library(quanteda)
## Warning: package 'quanteda' was built under R version 3.6.1
## Package version: 1.5.1
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
## 
## Attaching package: 'quanteda'
## The following objects are masked from 'package:tm':
## 
##     as.DocumentTermMatrix, stopwords
## The following object is masked from 'package:utils':
## 
##     View
library(RColorBrewer)

wordcloud(words = frequenciaPalavras$Palavra, 
  freq = frequenciaPalavras$n,
  min.freq = 10,
  max.words = 180, 
  random.order = FALSE, 
  rot.per = 0.8, 
  colors = brewer.pal(8, "Dark2")) # "Dark2" has at most 8 colors
## Warning in wordcloud(words = frequenciaPalavras$Palavra, freq =
## frequenciaPalavras$n, : sangue could not be fit on page. It will not be
## plotted.