Text mining is data mining applied to text: the process of extracting useful information from unstructured textual data by identifying patterns, associations, changes, and anomalies that support the production of knowledge.
The tm package is a classic choice for text mining in R. Because text data are unstructured, they need to be prepared before analysis, a step usually treated as preprocessing.
A textual database is known as a corpus (plural: corpora), and its units are treated as 'documents'. Each document in a corpus can differ in text length (number of characters), type of content (subject matter), language, or style of language, among other characteristics.
Turning a corpus into a data set that can be submitted to analysis procedures means generating a representation that describes each document in terms of its features.
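One common representation of this kind is a document-term matrix, in which each row is a document and each column counts the occurrences of a term. The sketch below builds one with base R only; the three toy documents and all object names are assumptions made for illustration and do not come from the data used in this text.

```r
# Hypothetical toy corpus: three short "documents" (illustration only).
docs <- c(d1 = "text mining in R",
          d2 = "mining text data",
          d3 = "R for data analysis")

# Tokenize: lower-case each document and split it on whitespace.
tokens <- strsplit(tolower(docs), "\\s+")

# Vocabulary: every distinct term found in the corpus.
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per document, one column per term,
# each cell holding how many times the term occurs in the document.
dtm <- t(vapply(tokens,
                function(tk) as.integer(table(factor(tk, levels = vocab))),
                integer(length(vocab))))
colnames(dtm) <- vocab

print(dtm)
```

Each document is now described by a numeric vector of term counts, which is the kind of input most analysis procedures expect.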
# pacotes <- c("tm", "SnowballC", "RColorBrewer", "wordcloud", "biclust", "cluster", "igraph", "fpc")
# install.packages(pacotes, dependencies = TRUE)
library(tidyverse)
library(magrittr)
library(tm)
# Cleaning the data set.
# To remove additional stopwords, define a character vector such as
# novas <- c()
# and append it to the stopword list used below.
# Corpus preprocessing
tratar_corpus <- function(x) {
  x %>%
    tm_map(stripWhitespace) %>%                          # collapse repeated whitespace
    tm_map(removePunctuation) %>%                        # remove punctuation
    tm_map(removeNumbers) %>%                            # remove numbers
    tm_map(removeWords, c(stopwords("portuguese"))) %>%  # remove stopwords; add "novas" here to extend the list
    tm_map(stripWhitespace)                              # collapse the whitespace left behind by the removals
  # Optional steps, disabled here. removeWords is case-sensitive, so without
  # lower-casing first, capitalized stopwords such as "Em" are kept.
  # tm_map(content_transformer(tolower))  # convert all characters to lower case
  # tm_map(stemDocument)                  # reduce words to their stems
}
library(stringr)
frase <- 'Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos.'
str_split(frase, fixed('.')) # split into separate strings at each period
## [[1]]
## [1] "Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos"
## [2] ""
# Corpus
corpus<-VCorpus(VectorSource(frase))
as.VCorpus(corpus)
## <<VCorpus>>
## Metadata: corpus specific: 0, document level (indexed): 0
## Content: documents: 1
summary(corpus)
## Length Class Mode
## 1 2 PlainTextDocument list
inspect(corpus[1])
## <<VCorpus>>
## Metadata: corpus specific: 0, document level (indexed): 0
## Content: documents: 1
##
## [[1]]
## <<PlainTextDocument>>
## Metadata: 7
## Content: chars: 333
writeLines(as.character(frase[1]))
## Em bases de dados textuais, conhecidos como corpus ou corpora são tratado como documentos e cada documento em um corpus pode assumir diferentes características em relação ao tamanho do texto (sequências de caracteres), tipo de conteúdo (assunto abordado), língua na qual é escrito ou tipo de linguagem adotada dentro outros exemplos.
# Preprocessing the corpus
corpus <- tratar_corpus(corpus)
summary(corpus)
## Length Class Mode
## 1 2 PlainTextDocument list
inspect(corpus[[1]])
## <<PlainTextDocument>>
## Metadata: 7
## Content: chars: 268
##
## Em bases dados textuais conhecidos corpus corpora tratado documentos cada documento corpus pode assumir diferentes características relação tamanho texto sequências caracteres tipo conteúdo assunto abordado língua é escrito tipo linguagem adotada dentro outros exemplos
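The cleaned text above still starts with "Em" because the stopword list is lower case and the lower-casing step is disabled. The following base R sketch shows the same kind of pipeline with lower-casing applied first. The function name `clean_text` and the short stopword vector `stop_pt` are hypothetical and stand in for the much longer list returned by `stopwords("portuguese")`.

```r
# Hypothetical, deliberately tiny stopword list (illustration only).
stop_pt <- c("em", "de", "como", "ou", "e", "um")

# Clean a single string: lower-case, strip punctuation and digits,
# then drop stopwords and empty tokens.
clean_text <- function(x, stopwords = stop_pt) {
  x <- tolower(x)                        # lower-case first so "Em" matches "em"
  x <- gsub("[[:punct:]]+", " ", x)      # replace punctuation with a space
  x <- gsub("[[:digit:]]+", " ", x)      # replace digits with a space
  words <- strsplit(x, "\\s+")[[1]]      # split into words
  words <- words[nzchar(words) & !(words %in% stopwords)]
  paste(words, collapse = " ")
}

clean_text("Em bases de dados textuais, conhecidos como corpus ou corpora")
# [1] "bases dados textuais conhecidos corpus corpora"
```

With tm, the analogous change would be to enable the `content_transformer(tolower)` step before `removeWords`.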
# Building the term-document matrix
# (wordLengths keeps terms with at least 2 characters; the tm default minimum is 3)
corpus_tf <- TermDocumentMatrix(corpus, control = list(wordLengths = c(2, Inf)))
# Converting to a regular matrix so it can be manipulated
matriz <- as.matrix(corpus_tf)
# Sorting the terms by decreasing frequency
matriz <- sort(rowSums(matriz), decreasing = TRUE)
# Storing the result in a data.frame
matriz <- data.frame(word = names(matriz), freq = matriz)
head(matriz, n = 10)
# Plot
library(ggplot2)
head(matriz, n = 5) %>%
  ggplot(aes(word, freq)) +
  geom_bar(stat = "identity", color = "black", fill = "#87CEFA") +
  geom_text(aes(hjust = 1.3, label = freq)) +
  coord_flip() +
  labs(title = "5 most frequent words", x = "Words", y = "Number of occurrences")
We can build a dictionary of bigrams, trigrams, and four-grams, collectively called n-grams: sequences of n consecutive words.
library(rJava)
library(RWeka)
BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
TrigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 3, max = 3))
FourgramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 4, max = 4))
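To make clear what these tokenizers produce, here is a minimal base R sketch of n-gram extraction that does not depend on RWeka or Java. The helper `make_ngrams` is a hypothetical name introduced for illustration; it only handles a single whitespace-separated string and is not a replacement for `NGramTokenizer`.

```r
# Return all sequences of n consecutive words found in a single string.
make_ngrams <- function(text, n) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  # Fewer than n words: no n-gram can be formed.
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

make_ngrams("bases dados textuais conhecidos corpus", 2)
# [1] "bases dados"         "dados textuais"      "textuais conhecidos"
# [4] "conhecidos corpus"
```

A text with k words yields k - n + 1 n-grams, so the five-word example gives four bigrams and three trigrams.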
As an example, we build a dictionary of trigrams (three-word sequences) using the tm and RWeka packages, as follows (more details at https://gomesfellipe.github.io/post/2017-12-17-string/string/):
trigram.Tdm <- tm::TermDocumentMatrix(corpus, control = list(tokenize = TrigramTokenizer))
# Converting to a regular matrix so it can be manipulated
matriz <- as.matrix(trigram.Tdm)
# Sorting the trigrams by decreasing frequency
matriz <- sort(rowSums(matriz), decreasing = TRUE)
# Storing the result in a data.frame
matriz <- data.frame(word = names(matriz), freq = matriz)
head(matriz, n=10)
head(matriz, n = 5) %>%
  ggplot(aes(word, freq)) +
  geom_bar(stat = "identity", color = "black", fill = "#87CEFA") +
  geom_text(aes(hjust = 1.3, label = freq)) +
  coord_flip() +
  labs(title = "5 most frequent trigrams", x = "Trigrams", y = "Number of occurrences")
A word cloud is a graphic that shows the most frequent terms in a text. The font size of each word is a function of its frequency in the text: more frequent words are drawn in larger fonts, less frequent words in smaller fonts.
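As a rough illustration of "font size as a function of frequency", the sketch below maps counts linearly onto a size range. The function `font_size`, its default limits, and the example counts are all assumptions made for illustration; a plotting package may use a different mapping.

```r
# Map word frequencies linearly onto [min_size, max_size], so that the
# most frequent word gets max_size and rarer words shrink toward min_size.
font_size <- function(freq, min_size = 0.8, max_size = 4) {
  min_size + (max_size - min_size) * freq / max(freq)
}

# Hypothetical counts (illustration only).
freq <- c(alpha = 50, beta = 25, gamma = 5)
font_size(freq)
# alpha  beta gamma
#  4.00  2.40  1.12
```

A wide size range makes frequent words stand out, but very large sizes can leave no room on the page for the remaining words.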
# install.packages(c("wordcloud", "tm", "textreadr", "tidytext"), dependencies = TRUE)
library(wordcloud)
library(tm)
library(textreadr)
library(tidytext)
library(readr)
library(tidyverse)
arquivoPdf<-"http://objdigital.bn.br/Acervo_Digital/Livros_eletronicos/cortico.pdf"
texto<-read_pdf(arquivoPdf)
texto <- as_tibble(texto)
glimpse(texto)
## Observations: 5,975
## Variables: 3
## $ page_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ element_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ text <chr> "MINISTÉRIO DA CULTURA", "Fundação Biblioteca Nacio...
texto <-as.character(texto$text)
# texto
# Cleaning and tidying the text
texto.corpus <- Corpus(VectorSource(texto))
texto.corpus <- texto.corpus %>%
  tm_map(removePunctuation) %>%  # remove punctuation
  tm_map(removeNumbers) %>%      # remove numbers
  tm_map(stripWhitespace)        # collapse repeated whitespace
texto.corpus <-texto.corpus%>%
  tm_map(tolower) %>%  # convert all letters to lower case
tm_map(removeWords, stopwords("por"))
# Additional words to remove (written with accents so that they match the text)
texto.corpus <- tm_map(texto.corpus, removeWords, c("não", "porque", "então"))
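# Stemming reduces inflected and derived forms of a word to a common stem.
# Note: stemDocument is called without a language argument, so it uses its
# default stemmer rather than a Portuguese-specific one.
texto.corpus <- tm_map(texto.corpus, stemDocument)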
# Word frequencies
texto.counts <- as.matrix(TermDocumentMatrix(texto.corpus))
texto.freq <- sort(rowSums(texto.counts), decreasing = TRUE)
head(texto.freq)
## todo lá casa agora logo outro
## 263 246 241 187 177 177
# Word cloud
library(wordcloud)
set.seed(123)
wordcloud(words = names(texto.freq), freq = texto.freq, scale = c(10, 0.8), max.words = 250, random.order = TRUE)
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : agora could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : então could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : diabo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : quas could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : jerônimo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : marido could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : novo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : romão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pequena could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : quarto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : parecia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : havia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : preciso could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pedreira could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pedra could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fort could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : vendeiro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : trabalhador could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cada could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : lavadeira could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : terra could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : hora could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : velha could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : part could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ponto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : podia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : domingo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mulher could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : saia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tanto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : léoni could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : volta could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : perna could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : lado could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : serviço could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : porém could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : três could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : hoje could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : logo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : miranda could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pataca could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ainda could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : enquanto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mal could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : veze could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : firmo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fundo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pobr could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : bertoleza could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : toda could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : sangu could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : milréi could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : coração could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : espera could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : afin could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : outra could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : silêncio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : vida could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : leocádia could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : café could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : frent could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dona could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : poi could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : algun could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : garrafa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : conta could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ali could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : nunca could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : assim could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : grand could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : janela could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mãos could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : melhor could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : augusta could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ora could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : coisa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tard could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : primeira could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : vez could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : quero could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : estalagem could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : perguntou could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : corpo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mesma could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : qualquer could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cá could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dizer could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : defront could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : nova could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : junto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : rosto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : amor could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ser could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : lá could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tudo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : entrar could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : trist could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : lágrima could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : menina could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : olha could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : ond could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : começou could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : aqui could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : botelho could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : meno could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : joão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mesa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dore could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : costa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : roupa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cama could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : casinha could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : rua could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : sobr could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : modo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cheio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : marciana could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : chão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : braço could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dinheiro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : olho could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pombinha could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : número could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tomar could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : manhã could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : camisa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : alexandr could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : quer could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cabeça could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : primeiro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : gent could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dava could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : pra could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : dentro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : todo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : casa could not be fit on page. It will not be plotted.
## (59 further warnings of the same form omitted; each reports one more word that could not be fit on page)
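The "could not be fit on page" warnings above mean that `wordcloud` ran out of room at the requested font sizes: with `scale = c(10, ...)` the most frequent terms are drawn very large, and the terms that no longer fit are silently dropped from the plot. Below is a minimal sketch of one way to reduce these warnings, assuming the wordcloud package is installed. The toy vector `freq_demo`, the helper `top_terms` and the smaller `scale` values are illustrative assumptions, not part of the original analysis.

```r
library(wordcloud)

# Hypothetical stand-in for texto.freq: a named numeric vector of term counts.
freq_demo <- c(casa = 120, homem = 95, tempo = 80, porta = 60, mundo = 45,
               filho = 30, noite = 22, pedra = 15, sol = 9, voz = 5)

# Hypothetical helper: keep only the n most frequent terms, in decreasing order.
top_terms <- function(freq, n) {
  head(sort(freq, decreasing = TRUE), n)
}

freq_top <- top_terms(freq_demo, 8)

# A smaller maximum font size (first element of scale) leaves room for more words.
set.seed(1)  # word placement is random; fix the seed for a repeatable layout
wordcloud(words = names(freq_top), freq = freq_top,
          scale = c(4, 0.5), min.freq = 1, max.words = 100,
          random.order = FALSE, rot.per = 0.2)
```

Enlarging the plotting device (for example through the chunk's figure dimensions) is another option when the larger fonts are wanted; which combination works best depends on the corpus and the output size.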
library(wesanderson)
## Warning: package 'wesanderson' was built under R version 3.6.1
# Word cloud colored with the wesanderson "Darjeeling1" palette
wordcloud(words = names(texto.freq), freq = texto.freq, scale = c(10, 0.8),
          max.words = 250, random.order = TRUE, rot.per = 0.6,
          colors = wes_palette("Darjeeling1"))
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : toda could not be fit on page. It will not be plotted.
## (186 further warnings of the same form omitted; each reports one more word that could not be fit on page)
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : assim could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : fim could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : frent could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : tanto could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : joão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : durant could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : jerônimo could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : bruno could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mão could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : meio could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : perna could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : homem could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : mesa could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : porém could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : cavouqueiro could not be fit on page. It will not be plotted.
## Warning in wordcloud(words = names(texto.freq), freq = texto.freq, scale =
## c(10, : contra could not be fit on page. It will not be plotted.
#### Another way to do it
# Function to normalize text
NormalizaParaTextMining <- function(texto){
  texto %>%
    chartr(
      old = "áéíóúÁÉÍÓÚýÝàèìòùÀÈÌÒÙâêîôûÂÊÎÔÛãõÃÕñÑäëïöüÄËÏÖÜÿçÇ´`^~¨:.!?&$@#0123456789",
      # chartr() fails if 'old' is longer than 'new': the 51 letters are followed
      # by 23 spaces, one for each symbol and digit at the end of 'old'
      new = paste0("aeiouAEIOUyYaeiouAEIOUaeiouAEIOUaoAOnNaeiouAEIOUycC", strrep(" ", 23)),
      x = .) %>%       # Strips accents and unwanted characters
    str_squish() %>%   # Removes extra whitespace
    tolower()          # Converts to lowercase
}
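To see what this kind of normalization does, here is a minimal base R sketch of the same idea applied to a single string. The helper name `normaliza_exemplo` and the shortened character lists are assumptions made for illustration; it is not the function defined above.

```r
# Hypothetical helper, for illustration only (base R, no packages needed)
normaliza_exemplo <- function(x) {
  old <- "áéíóúâêôãõç.,!?"
  # 11 unaccented letters plus 4 spaces, so 'new' is as long as 'old'
  new <- paste0("aeiouaeoaoc", strrep(" ", 4))
  stopifnot(nchar(old) == nchar(new))
  x <- chartr(old, new, x)          # map each character in 'old' to its pair in 'new'
  x <- gsub("\\s+", " ", trimws(x)) # trim and collapse repeated whitespace
  tolower(x)                        # lowercase
}

resultado <- normaliza_exemplo("João  Romão, lá!")
print(resultado)
```

Checking `nchar(old) == nchar(new)` up front catches the most common mistake with `chartr()`, which is a replacement string that is shorter than the list of characters to replace.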
# List of words to remove
# (tm::stopwords is called explicitly because quanteda, loaded later, masks stopwords())
palavrasRemover <- c(tm::stopwords(kind = "pt"), letters) %>%
  as_tibble() %>%
  rename(Palavra = value) %>%
  mutate(Palavra = NormalizaParaTextMining(Palavra))
arquivoPdf <- "http://objdigital.bn.br/Acervo_Digital/Livros_eletronicos/cortico.pdf"
# Builds a table of words and their frequencies
# (read_pdf() is assumed to come from textreadr and unnest_tokens() from tidytext)
frequenciaPalavras <- arquivoPdf %>%
  read_pdf() %>%
  as_tibble() %>%
  select(text) %>%
  unnest_tokens(Palavra, text) %>%
  mutate(Palavra = NormalizaParaTextMining(Palavra)) %>%
  anti_join(palavrasRemover) %>%
  count(Palavra, sort = TRUE) %>%
  filter(Palavra != "")
## Joining, by = "Palavra"
print(frequenciaPalavras)
## # A tibble: 10,915 x 2
## Palavra n
## <chr> <int>
## 1 la 343
## 2 casa 232
## 3 agora 187
## 4 logo 178
## 5 ainda 163
## 6 tudo 157
## 7 joao 154
## 8 bem 149
## 9 romao 148
## 10 homem 145
## # ... with 10,905 more rows
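The counting steps can be tried without downloading the PDF. The sketch below runs the same tokenize, filter and count sequence on a one-sentence toy input; the objects `textoExemplo`, `palavrasExemplo` and `frequenciaExemplo` are invented for illustration, and the two-word stop list stands in for the full Portuguese list.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy inputs, invented for illustration
textoExemplo <- tibble(text = "A casa de João; a casa de Romão.")
palavrasExemplo <- tibble(Palavra = c("a", "de"))

frequenciaExemplo <- textoExemplo %>%
  unnest_tokens(Palavra, text) %>%               # one lowercase word per row
  anti_join(palavrasExemplo, by = "Palavra") %>% # drop the stop words
  count(Palavra, sort = TRUE)                    # most frequent words first

print(frequenciaExemplo)
```

Passing `by = "Palavra"` to `anti_join()` makes the join column explicit and suppresses the "Joining, by" message.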
library(quanteda)
## Warning: package 'quanteda' was built under R version 3.6.1
## Package version: 1.5.1
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
##
## Attaching package: 'quanteda'
## The following objects are masked from 'package:tm':
##
## as.DocumentTermMatrix, stopwords
## The following object is masked from 'package:utils':
##
## View
library(RColorBrewer)
wordcloud(words = frequenciaPalavras$Palavra,
          freq = frequenciaPalavras$n,
          min.freq = 10,
          max.words = 180,
          random.order = FALSE,
          rot.per = 0.8,
          colors = brewer.pal(8, "Dark2")) # Dark2 has at most 8 colors
## Warning in wordcloud(words = frequenciaPalavras$Palavra, freq =
## frequenciaPalavras$n, : sangue could not be fit on page. It will not be
## plotted.
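Each ColorBrewer palette has a fixed maximum number of colors, so it can help to look that limit up before calling `brewer.pal()`. The sketch below is one way to do it; the variable names `paleta`, `n_max` and `cores` are assumptions for illustration.

```r
library(RColorBrewer)

paleta <- "Dark2"
# brewer.pal.info is a data frame with one row per palette
n_max <- brewer.pal.info[paleta, "maxcolors"]
# Ask for 10 colors, but never more than the palette provides
cores <- brewer.pal(min(10, n_max), paleta)

print(n_max)
print(length(cores))
```

The resulting `cores` vector can be passed as the `colors` argument of `wordcloud()`. As for the "could not be fit on page" warnings, `wordcloud()` emits them when a word has no room left in the plotting area; a smaller `scale` or `max.words`, or a larger graphics device, usually reduces them.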