rm(list = ls())
library(tm)
## Loading required package: NLP
library(magrittr)
library(SnowballC)
# Build a corpus with one document per file in the directory.
docs1 <- Corpus(DirSource("/Users/maxineharlemon/AIOpt"))
docs1
## <<SimpleCorpus>>
## Metadata:  corpus specific: 1, document level (indexed): 0
## Content:  documents: 41
docs1[["Alice_in_Wonderland.csv"]]
## <<PlainTextDocument>>
## Metadata:  7
## Content:  chars: 2876
docs1[[2]]
## <<PlainTextDocument>>
## Metadata:  7
## Content:  chars: 56
# The name selects a single document, so positions 2 and 3 of [1:3] print as NA.
content(docs1["Alice_in_Wonderland.csv"])[1:3]
## Alice_in_Wonderland.csv
## "\"Once upon a time, in a charming English countryside, there lived a curious young girl named Alice. One sunny afternoon, Alice was sitting by a riverbank, feeling a little bored. She glanced up and noticed a white rabbit with pink eyes, dressed in a vest, hurrying by. The rabbit pulled out a pocket watch, saying loudly, <d2>Oh dear! Oh dear! I shall be late!<d3>\"\n\"Very interested, Alice jumped up and followed the rabbit. She watched as he disappeared into a large rabbit hole. Without thinking twice, Alice followed him down the hole and found herself falling down, down, down into a strange and wonderful world.\"\n\"Alice landed softly in a hallway filled with doors of all sizes. On a glass table, she found a small golden key and a bottle labeled <d2>Drink Me.<d3> She sipped the potion and began to shrink until she was just the right size to fit through a tiny door that led to a beautiful garden.\"\n\"Next, she met the Cheshire Cat, a grinning cat who could appear and disappear whenever he wanted. The Cheshire Cat told Alice to go to the March Hare<d5>s house, where she found the Mad Hatter and the Dormouse having a never-ending tea party. The Mad Hatter and the March Hare were quite silly indeed, constantly switching seats and talking in riddles. Alice found their actions amusing but also a bit frustrating.\"\n\"After leaving the tea party, Alice wandered into the garden of the Queen of Hearts. The Queen was a bossy ruler who loved to play croquet with flamingos as hammers and hedgehogs as balls. The Queen was quick to anger, frequently shouting, <d2>Off with their heads!<d3> whenever she was displeased. Despite the Queen<d5>s scary demeanor, Alice bravely stood up to her.\"\n\"During the game, Alice met the King of Hearts and the royal court, including a variety of living playing cards. The Queen said that the Knave of Hearts had stolen her tarts, and a meeting was held to decide if he was guilty. 
The court events were silly and chaotic, with witnesses like the Mad Hatter and the March Hare giving stories that didn<d5>t make sense.\"\n\"As the trial grew more ridiculous, Alice found herself growing larger and larger. The Queen demanded her execution, but Alice, now towering over everyone, simply declared, <d2>You<d5>re nothing but a pack of cards!<d3> At that moment, the entire court rose into the air and came tumbling down upon her.\"\n\"Suddenly, Alice woke up on the riverbank, back in the real world. She realized that her adventures in Wonderland had been a dream. She looked around and saw her sister still reading a book nearby.\"\n\"Alice smiled and thought about all the curious and wonderful characters she had met in her dream. Though it had all been an amazing and unreal adventure, Alice knew she would always remember her journey through Wonderland and the lessons she learned about curiosity, bravery, and standing up for herself.\"\nThe end." 
## <NA>
## NA
## <NA>
## NA
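# Keep only the Alice document, then clean it step by step.
# Note: the literal <d2>, <d3>, <d5> tokens in the raw text lose their
# brackets and digits here, leaving a stray "d" (e.g. "doh", "queend").
docs1 <- docs1["Alice_in_Wonderland.csv"] %>%
        tm_map(removePunctuation) %>%                # drop punctuation
        tm_map(content_transformer(tolower)) %>%     # lower-case
        tm_map(removeNumbers) %>%                    # drop digits
        tm_map(removeWords, stopwords("en")) %>%     # drop English stop words
        tm_map(stripWhitespace) %>%                  # collapse repeated spaces
        tm_map(stemDocument)                         # stem each word (uses SnowballC)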
content(docs1["Alice_in_Wonderland.csv"])[1:3]
## Alice_in_Wonderland.csv
## "upon time charm english countrysid live curious young girl name alic one sunni afternoon alic sit riverbank feel littl bore glanc notic white rabbit pink eye dress vest hurri rabbit pull pocket watch say loud doh dear oh dear shall late interest alic jump follow rabbit watch disappear larg rabbit hole without think twice alic follow hole found fall strang wonder world alic land soft hallway fill door size glass tabl found small golden key bottl label ddrink med sip potion began shrink just right size fit tini door led beauti garden next met cheshir cat grin cat appear disappear whenev want cheshir cat told alic go march hare hous found mad hatter dormous neverend tea parti mad hatter march hare quit silli inde constant switch seat talk riddl alic found action amus also bit frustrat leav tea parti alic wander garden queen heart queen bossi ruler love play croquet flamingo hammer hedgehog ball queen quick anger frequent shout doff headsd whenev displeas despit queend scari demeanor alic brave stood game alic met king heart royal court includ varieti live play card queen said knave heart stolen tart meet held decid guilti court event silli chaotic wit like mad hatter march hare give stori didndt make sens trial grew ridicul alic found grow larger larger queen demand execut alic now tower everyon simpli declar dyoudr noth pack cardsd moment entir court rose air came tumbl upon sudden alic woke riverbank back real world realiz adventur wonderland dream look around saw sister still read book nearbi alic smile thought curious wonder charact met dream though amaz unreal adventur alic knew alway rememb journey wonderland lesson learn curios braveri stand end" 
## <NA>
## NA
## <NA>
## NA
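
The stemmed text contains forms such as "doh", "ddrink" and "queend", which appear to come from the literal `<d2>`, `<d3>` and `<d5>` tokens in the raw document. The following is a minimal sketch of one way to normalise those tokens before cleaning. It assumes, without confirmation from the source files, that `<d2>`/`<d3>` stand for opening/closing double quotes and `<d5>` for an apostrophe; `fix_quote_tokens` and `sample_text` are hypothetical names introduced only for this illustration and are not part of `tm`.

```r
# Hypothetical helper (not part of tm): replace the literal tokens with
# ASCII quotes. The token-to-character mapping is an assumption.
fix_quote_tokens <- function(x) {
  x <- gsub("<d2>|<d3>", "\"", x)  # assumed opening/closing double quotes
  gsub("<d5>", "'", x)             # assumed apostrophe
}

# Illustrative string built from fragments of the raw text shown earlier.
sample_text <- "saying loudly, <d2>Oh dear!<d3> Despite the Queen<d5>s demeanor"
fixed <- fix_quote_tokens(sample_text)
print(fixed)
```

If the assumption holds, the helper could be applied as the first step of the cleaning pipeline with `tm_map(content_transformer(fix_quote_tokens))`, so that `removePunctuation` then sees ordinary quotes instead of bracketed tokens.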