In Text Mining with R, Chapter 2 looks at Sentiment Analysis. In this assignment, you should start by getting the primary example code from chapter 2 working in an R Markdown document. You should provide a citation to this base code. You’re then asked to extend the code in two ways:

- Work with a different corpus of your choosing, and
- Incorporate at least one additional sentiment lexicon (possibly from another R package that you’ve found through research).

As usual, please submit links to both an .Rmd file posted in your GitHub repository and to your code on rpubs.com.
The code chunks and text below are from Chapter 2 of Text Mining with R (Silge and Robinson, 2020).
First, we will load the required libraries and take a look at the different sentiment lexicons.
library(janeaustenr)
library(tidyverse)
## -- Attaching packages ---------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(stringr)
library(tidytext)
library(jsonlite)
##
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
##
## flatten
library(dplyr)
library(ggplot2)
get_sentiments("afinn")
## # A tibble: 2,477 x 2
## word value
## <chr> <dbl>
## 1 abandon -2
## 2 abandoned -2
## 3 abandons -2
## 4 abducted -2
## 5 abduction -2
## 6 abductions -2
## 7 abhor -3
## 8 abhorred -3
## 9 abhorrent -3
## 10 abhors -3
## # ... with 2,467 more rows
get_sentiments("bing")
## # A tibble: 6,786 x 2
## word sentiment
## <chr> <chr>
## 1 2-faces negative
## 2 abnormal negative
## 3 abolish negative
## 4 abominable negative
## 5 abominably negative
## 6 abominate negative
## 7 abomination negative
## 8 abort negative
## 9 aborted negative
## 10 aborts negative
## # ... with 6,776 more rows
get_sentiments("nrc")
## # A tibble: 13,901 x 2
## word sentiment
## <chr> <chr>
## 1 abacus trust
## 2 abandon fear
## 3 abandon negative
## 4 abandon sadness
## 5 abandoned anger
## 6 abandoned fear
## 7 abandoned negative
## 8 abandoned sadness
## 9 abandonment anger
## 10 abandonment fear
## # ... with 13,891 more rows
Let’s look at the words with a joy score from the NRC lexicon. What are the most common joy words in Emma?
tidy_books <- austen_books() %>%
group_by(book) %>%
mutate(
linenumber = row_number(),
chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
ignore_case = TRUE
)))
) %>%
ungroup() %>%
unnest_tokens(word, text)
Now that the text is in a tidy format with one word per row, we are ready to do the sentiment analysis. First, let’s use the NRC lexicon and filter() for the joy words. Next, let’s filter() the data frame with the text from the books for the words from Emma and then use inner_join() to perform the sentiment analysis. What are the most common joy words in Emma?
nrc_joy <- get_sentiments("nrc") %>%
filter(sentiment == "joy")
tidy_books %>%
filter(book == "Emma") %>%
inner_join(nrc_joy) %>%
count(word, sort = TRUE)
## Joining, by = "word"
## # A tibble: 303 x 2
## word n
## <chr> <int>
## 1 good 359
## 2 young 192
## 3 friend 166
## 4 hope 143
## 5 happy 125
## 6 love 117
## 7 deal 92
## 8 found 92
## 9 present 89
## 10 kind 82
## # ... with 293 more rows
Next, we count up how many positive and negative words there are in defined sections of each book. We define an index here to keep track of where we are in the narrative; this index (using integer division) counts up sections of 80 lines of text. Small sections of text may not have enough words in them to get a good estimate of sentiment while really large sections can wash out narrative structure. For these books, using 80 lines works well, but this can vary depending on individual texts, how long the lines were to start with, etc. We then use spread() so that we have negative and positive sentiment in separate columns, and lastly calculate a net sentiment (positive - negative).
jane_austen_sentiment <- tidy_books %>%
inner_join(get_sentiments("bing")) %>%
count(book, index = linenumber %/% 80, sentiment) %>%
spread(sentiment, n, fill = 0) %>%
mutate(sentiment = positive - negative)
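As an aside, spread() has since been superseded in tidyr by pivot_wider(). The same step could be written as below; this is a minimal sketch that assumes tidyr 1.0.0 or later is installed.
jane_austen_sentiment <- tidy_books %>%
  inner_join(get_sentiments("bing")) %>%
  count(book, index = linenumber %/% 80, sentiment) %>%
  # pivot_wider() replaces spread(); values_fill fills missing counts with 0
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(sentiment = positive - negative)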
Now we can plot these sentiment scores across the plot trajectory of each novel.
ggplot(jane_austen_sentiment, aes(index, sentiment, fill = book)) +
geom_col(show.legend = FALSE) +
facet_wrap(~book, ncol = 2, scales = "free_x")
## Comparing the three sentiment dictionaries

With several options for sentiment lexicons, you might want some more information on which one is appropriate for your purposes. Let’s use all three sentiment lexicons and examine how the sentiment changes across the narrative arc of Pride and Prejudice.
pride_prejudice <- tidy_books %>%
filter(book == "Pride & Prejudice")
pride_prejudice
## # A tibble: 122,204 x 4
## book linenumber chapter word
## <fct> <int> <int> <chr>
## 1 Pride & Prejudice 1 0 pride
## 2 Pride & Prejudice 1 0 and
## 3 Pride & Prejudice 1 0 prejudice
## 4 Pride & Prejudice 3 0 by
## 5 Pride & Prejudice 3 0 jane
## 6 Pride & Prejudice 3 0 austen
## 7 Pride & Prejudice 7 1 chapter
## 8 Pride & Prejudice 7 1 1
## 9 Pride & Prejudice 10 1 it
## 10 Pride & Prejudice 10 1 is
## # ... with 122,194 more rows
afinn <- pride_prejudice %>%
inner_join(get_sentiments("afinn")) %>%
group_by(index = linenumber %/% 80) %>%
summarise(sentiment = sum(value)) %>%
mutate(method = "AFINN")
bing_and_nrc <- bind_rows(
pride_prejudice %>%
inner_join(get_sentiments("bing")) %>%
mutate(method = "Bing et al."),
pride_prejudice %>%
inner_join(get_sentiments("nrc") %>%
filter(sentiment %in% c(
"positive",
"negative"
))) %>%
mutate(method = "NRC")
) %>%
count(method, index = linenumber %/% 80, sentiment) %>%
spread(sentiment, n, fill = 0) %>%
mutate(sentiment = positive - negative)
We now have an estimate of the net sentiment (positive - negative) in each chunk of the novel text for each sentiment lexicon. Let’s bind them together and visualize them.
bind_rows(
afinn,
bing_and_nrc
) %>%
ggplot(aes(index, sentiment, fill = method)) +
geom_col(show.legend = FALSE) +
facet_wrap(~method, ncol = 1, scales = "free_y")
Why is the result for the NRC lexicon biased so high in sentiment compared to the Bing et al. result? Let’s look briefly at how many positive and negative words are in these lexicons.
get_sentiments("nrc") %>%
filter(sentiment %in% c(
"positive",
"negative"
)) %>%
count(sentiment)
## # A tibble: 2 x 2
## sentiment n
## <chr> <int>
## 1 negative 3324
## 2 positive 2312
get_sentiments("bing") %>%
count(sentiment)
## # A tibble: 2 x 2
## sentiment n
## <chr> <int>
## 1 negative 4781
## 2 positive 2005
bing_word_counts <- tidy_books %>%
inner_join(get_sentiments("bing")) %>%
count(word, sentiment, sort = TRUE) %>%
ungroup()
## Joining, by = "word"
bing_word_counts
## # A tibble: 2,585 x 3
## word sentiment n
## <chr> <chr> <int>
## 1 miss negative 1855
## 2 well positive 1523
## 3 good positive 1380
## 4 great positive 981
## 5 like positive 725
## 6 better positive 639
## 7 enough positive 613
## 8 happy positive 534
## 9 love positive 495
## 10 pleasure positive 462
## # ... with 2,575 more rows
bing_word_counts %>%
group_by(sentiment) %>%
top_n(10) %>%
ungroup() %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(word, n, fill = sentiment)) +
geom_col(show.legend = FALSE) +
facet_wrap(~sentiment, scales = "free_y") +
labs(
y = "Contribution to sentiment",
x = NULL
) +
coord_flip()
## Selecting by n
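As an aside, top_n() has been superseded by slice_max() in more recent dplyr releases. A minimal sketch of the equivalent selection step, assuming dplyr 1.0.0 or later:
bing_word_counts %>%
  group_by(sentiment) %>%
  # slice_max() keeps the 10 rows with the largest n within each sentiment group
  slice_max(n, n = 10) %>%
  ungroup()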
custom_stop_words <- bind_rows(
tibble(
word = c("miss"),
lexicon = c("custom")
),
stop_words
)
custom_stop_words
## # A tibble: 1,150 x 2
## word lexicon
## <chr> <chr>
## 1 miss custom
## 2 a SMART
## 3 a's SMART
## 4 able SMART
## 5 about SMART
## 6 above SMART
## 7 according SMART
## 8 accordingly SMART
## 9 across SMART
## 10 actually SMART
## # ... with 1,140 more rows
Let’s look at the most common words in Jane Austen’s works as a whole.
library(wordcloud)
## Warning: package 'wordcloud' was built under R version 4.0.3
## Loading required package: RColorBrewer
tidy_books %>%
anti_join(stop_words) %>%
count(word) %>%
with(wordcloud(word, n, max.words = 100))
## Joining, by = "word"
Let’s do the sentiment analysis to tag positive and negative words using an inner join, then find the most common positive and negative words. Until the step where we need to send the data to comparison.cloud(), this can all be done with joins, piping, and dplyr because our data is in tidy format.
library(reshape2)
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
tidy_books %>%
inner_join(get_sentiments("bing")) %>%
count(word, sentiment, sort = TRUE) %>%
acast(word ~ sentiment, value.var = "n", fill = 0) %>%
comparison.cloud(
colors = c("gray20", "gray80"),
max.words = 100
)
## Joining, by = "word"
## Looking at units beyond just words
We may want to tokenize text into sentences, and it makes sense to use a new name for the output column in such a case.
PandP_sentences <- tibble(text = prideprejudice) %>%
unnest_tokens(sentence, text, token = "sentences")
PandP_sentences$sentence[2]
## [1] "however little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or other of their daughters."
austen_chapters <- austen_books() %>%
group_by(book) %>%
unnest_tokens(chapter, text,
token = "regex",
pattern = "Chapter|CHAPTER [\\dIVXLC]"
) %>%
ungroup()
# unnest splits into tokens using a regex pattern
austen_chapters %>%
group_by(book) %>%
summarise(chapters = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 6 x 2
## book chapters
## <fct> <int>
## 1 Sense & Sensibility 51
## 2 Pride & Prejudice 62
## 3 Mansfield Park 49
## 4 Emma 56
## 5 Northanger Abbey 32
## 6 Persuasion 25
Let’s find the number of negative words in each chapter and divide by the total words in each chapter. For each book, which chapter has the highest proportion of negative words?
bingnegative <- get_sentiments("bing") %>%
filter(sentiment == "negative")
wordcounts <- tidy_books %>%
group_by(book, chapter) %>%
summarize(words = n())
## `summarise()` regrouping output by 'book' (override with `.groups` argument)
tidy_books %>%
semi_join(bingnegative) %>%
group_by(book, chapter) %>%
summarize(negativewords = n()) %>%
left_join(wordcounts, by = c("book", "chapter")) %>%
mutate(ratio = negativewords / words) %>%
filter(chapter != 0) %>%
top_n(1) %>%
ungroup()
## Joining, by = "word"
## `summarise()` regrouping output by 'book' (override with `.groups` argument)
## Selecting by ratio
## # A tibble: 6 x 5
## book chapter negativewords words ratio
## <fct> <int> <int> <int> <dbl>
## 1 Sense & Sensibility 43 161 3405 0.0473
## 2 Pride & Prejudice 34 111 2104 0.0528
## 3 Mansfield Park 46 173 3685 0.0469
## 4 Emma 15 151 3340 0.0452
## 5 Northanger Abbey 21 149 2982 0.0500
## 6 Persuasion 4 62 1807 0.0343
I would like to extend my assignment from Week 9, in which I looked at reviews of movies released in 2019. For this assignment, I will perform sentiment analysis on the short summaries of New York Times reviews of movies released in 2019.
url <- "https://api.nytimes.com/svc/movies/v2/reviews/search.json?opening-date=2019-01-01;2020-01-01"
key <- "OkVf8SLjqbsAbQAvVbiJBn6yRY7azROI"
addurl <- paste0(url, "&api-key=")
# fetched using json + key call
data <- fromJSON(paste0(addurl, key))
df <- data$results
knitr:: kable (df)
## Warning in `[<-.data.frame`(`*tmp*`, , j, value = structure(list(type =
## structure(c("article", : provided 3 variables to replace 1 variables
## Warning in `[<-.data.frame`(`*tmp*`, , j, value = structure(list(type =
## structure(c("mediumThreeByTwo210", : provided 4 variables to replace 1 variables
| display_title | mpaa_rating | critics_pick | byline | headline | summary_short | publication_date | opening_date | date_updated | link | multimedia |
|---|---|---|---|---|---|---|---|---|---|---|
| The Devil Has a Name | R | 0 | Ben Kenigsberg | ‘The Devil Has a Name’ Review: A Little Guy Takes On Big Oil | A farmer sues an oil company in this well-meaning but muddled drama directed by Edward James Olmos. | 2020-10-15 | 2019-08-04 | 2020-10-15 11:04:07 | article | mediumThreeByTwo210 |
| The Cuban | | 0 | Glenn Kenny | ‘The Cuban’ Review: Memories Lost and Reignited | Louis Gossett Jr. plays a musician with Alzheimer’s disease whose new nurse helps him reach back into his past. | 2020-07-30 | 2019-12-07 | 2020-07-30 11:04:04 | article | mediumThreeByTwo210 |
| Spark | | 0 | Ben Kenigsberg | ‘Spark’ and ‘The Observer’ Review: A Filmmaker’s Past, and China’s | A pair of documentaries serve as an introduction to Hu Jie, a documentarian whose films memorialize the horrors of the Mao era. | 2020-07-02 | 2019-12-31 | 2020-07-02 15:16:02 | article | mediumThreeByTwo210 |
| Unsettled: Seeking Refuge in America | | 0 | Ben Kenigsberg | ‘Unsettled: Seeking Refuge in America’ Review: Embracing a New Home | A documentary on L.G.B.T.Q. refugees becomes progressively engaging as its subjects’ paths diverge. | 2020-06-29 | 2019-04-01 | 2020-06-29 20:02:02 | article | mediumThreeByTwo210 |
| Parkland Rising | | 0 | Teo Bugbee | ‘Parkland Rising’ Review: A Close-Up on Activism After a Tragedy | A documentary profiles students and parents who became organizers after the school shooting, but doesn’t provide a lot of fresh insight. | 2020-06-04 | 2019-10-04 | 2020-06-04 11:04:03 | article | mediumThreeByTwo210 |
| Citizen K | | 0 | Ben Kenigsberg | ‘Citizen K’ Review: Trying to Pin Down a Russian Oligarch | A detailed documentary on Mikhail Khodorkovsky proves slightly unsatisfying. | 2020-01-14 | 2019-11-22 | 2020-02-12 17:44:01 | article | mediumThreeByTwo210 |
| Ghost Stories | | 0 | Bilal Qureshi | ‘Ghost Stories’ Review: Bollywood Aims for Frights | With this Netflix anthology, four directors from Indian cinema draw horror from a country’s lived reality. | 2020-01-02 | 2019-12-31 | 2020-01-02 12:04:02 | article | mediumThreeByTwo210 |
| One Cut of the Dead | Not Rated | 1 | Elisabeth Vincentelli | ‘One Cut of the Dead’ Review: A Fresh Take on the Zombie Flick | A one-take movie stunt is justified in the Japanese director Shinichiro Ueda’s fast and furious backstage comedy. | 2019-12-25 | 2019-09-24 | 2019-12-25 14:04:02 | article | mediumThreeByTwo210 |
| Clemency | R | 0 | Manohla Dargis | ‘Clemency’ Review: No Place for Mercy | A tremendous Alfre Woodard plays a warden at a prison whose world is upended by the fate of death-row inmates. | 2019-12-25 | 2019-12-27 | 2020-01-17 17:44:02 | article | mediumThreeByTwo210 |
| The 21st Annual Animation Show of Shows | | 0 | Glenn Kenny | Review: Animated Shorts of Every Stripe and Feather | Find a pen-and-ink dog, stop-motion girl and a C.G.I. fox in “The 21st Annual Animation Show of Shows.” | 2019-12-24 | 2019-12-25 | 2020-01-13 17:44:01 | article | mediumThreeByTwo210 |
| What She Said: The Art of Pauline Kael | | 0 | Jeannette Catsoulis | ‘What She Said’ Review: Pauline Kael, Screen Queen | Kael’s distinctively passionate voice, competing with movie fragments, is disastrously muffled, as are those of her admirers and detractors. | 2019-12-24 | 2019-12-25 | 2020-01-15 17:44:02 | article | mediumThreeByTwo210 |
| 1917 | R | 0 | Manohla Dargis | ‘1917’ Review: Paths of Technical Glory | Sam Mendes directs this visually extravagant drama about young British soldiers on a perilous mission in World War I. | 2019-12-24 | 2019-12-25 | 2020-01-24 17:44:02 | article | mediumThreeByTwo210 |
| Spies in Disguise | PG | 0 | Glenn Kenny | ‘Spies in Disguise’ Review: Smug Agent Meets Gadget Geek | Will Smith and Tom Holland are an action odd couple in this animated comedy. | 2019-12-24 | 2019-12-25 | 2020-01-24 17:44:02 | article | mediumThreeByTwo210 |
| The Song of Names | PG-13 | 0 | Ben Kenigsberg | ‘The Song of Names’ Review: A Prodigy, a War and a Mystery | A young violinist goes missing in London in 1951. The eventual answer as to why is powerful. | 2019-12-24 | 2019-12-25 | 2020-01-17 17:44:02 | article | mediumThreeByTwo210 |
| Little Women | PG | 1 | A.O. Scott | ‘Little Women’ Review: This Movie Is Big | Greta Gerwig refreshes a literary classic with the help of a dazzling cast that includes Saoirse Ronan, Florence Pugh, Laura Dern and Meryl Streep. | 2019-12-23 | 2019-12-25 | 2020-01-23 17:44:01 | article | mediumThreeByTwo210 |
| Dabangg 3 | | 0 | Rachel Saltz | ‘Dabangg 3’ Review: A Hero From the School of Knock ’em Hard | In this Bollywood action flick, Salman Khan is a one-man wrecking crew. When not knocking heads, he dances. | 2019-12-22 | 2019-12-20 | 2020-01-09 17:44:01 | article | mediumThreeByTwo210 |
| Invisible Life | R | 1 | Glenn Kenny | ‘Invisible Life’ Review: Sisterhood Is Stronger Than Patriarchy | Two sisters living in 1950s Brazil are kept apart by their father but can’t be spiritually separated. | 2019-12-19 | 2019-12-20 | 2020-01-19 17:44:02 | article | mediumThreeByTwo210 |
| Togo | PG | 0 | Jason Bailey | ‘Togo’ Review: A Man, His Dogs and a Very Bad Storm | Willem Dafoe stars in the latest addition to Disney’s sled dog canon. | 2019-12-19 | 2019-12-20 | 2019-12-19 12:04:02 | article | mediumThreeByTwo210 |
| She’s Missing | | 0 | Jeannette Catsoulis | ‘She’s Missing’ Review: Gone Girl | An ominous atmosphere of impermanence marks this story of a New Mexico waitress who embarks on a perilous search for her vanished friend. | 2019-12-19 | 2019-12-20 | 2019-12-19 12:04:03 | article | mediumThreeByTwo210 |
| Cats | PG | 0 | Manohla Dargis | ‘Cats’ Review: They Dance, They Sing, They Lick Their Digital Fur | Tom Hooper’s movie is not a catastrophe. It’s not even an epic hairball. | 2019-12-19 | 2019-12-20 | 2020-01-19 17:44:02 | article | mediumThreeByTwo210 |
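One note on the chunk that fetched the reviews above: hard-coding an API key in the document makes it visible to anyone who reads the file. A minimal sketch of an alternative, assuming the key has been stored in an environment variable (the name NYT_KEY is my own placeholder):
# read the key from an environment variable instead of embedding it in the .Rmd
key <- Sys.getenv("NYT_KEY")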
We will use the sentimentr package to try to understand the sentiment conveyed in each review summary as a whole. The sentimentr package performs sentiment analysis at the sentence level and adjusts for valence shifters such as negation. It assigns each sentence a score, typically between -1 and 1, indicating whether the sentiment is negative, neutral, or positive.
library(sentimentr)
## Warning: package 'sentimentr' was built under R version 4.0.3
library(data.table)
## Warning: package 'data.table' was built under R version 4.0.3
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:reshape2':
##
## dcast, melt
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following object is masked from 'package:purrr':
##
## transpose
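Before scoring the review summaries, here is a quick illustration of how sentimentr’s valence shifters handle negation; this is a small sketch of my own, not part of the assignment output.
# the negated sentence receives a negative score rather than a positive one
sentiment("The movie was good.")
sentiment("The movie was not good.")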
sentiment <- sentiment_by(df$summary_short)
View(sentiment)  # View() opens the RStudio data viewer; it will not render in the knitted document
The first column (element_id) corresponds to the movies in the order they appear in the table above. word_count is the number of words in each summary. The sentimentr package scores each sentence of a review separately and then reports the overall average score and the standard deviation for the review. Most of the summaries in our case are one sentence long, which is why the sd column is mostly NA.
I want to convert the average sentiment scores into the following categories: positive, neutral, and negative.
# function that assigns a sentiment class based on the average score
sentiment_df <- setDF(sentiment)

get_sentiment_class <- function(ave_sentiment) {
  if (ave_sentiment < -0.3) {
    "Negative"
  } else if (ave_sentiment < 0.3) {
    "Neutral"
  } else {
    "Positive"
  }
}

sentiment_df$ave_sentiment <- sapply(sentiment_df$ave_sentiment, get_sentiment_class)
sentiment_df
## element_id word_count sd ave_sentiment
## 1 1 18 NA Negative
## 2 2 20 NA Neutral
## 3 3 22 NA Neutral
## 4 4 17 NA Neutral
## 5 5 22 NA Positive
## 6 6 9 NA Neutral
## 7 7 17 NA Neutral
## 8 8 19 NA Neutral
## 9 9 21 NA Negative
## 10 10 23 NA Neutral
## 11 11 20 NA Neutral
## 12 12 19 NA Neutral
## 13 13 14 NA Neutral
## 14 14 16 0.21250000 Neutral
## 15 15 24 NA Neutral
## 16 16 19 0.04902903 Neutral
## 17 17 18 NA Neutral
## 18 18 13 NA Neutral
## 19 19 23 NA Neutral
## 20 20 15 0.40130899 Neutral
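As an aside, the same binning could have been done in a vectorised way with dplyr::case_when() instead of the sapply() loop above. A sketch, assuming ave_sentiment still holds the numeric scores and writing the labels to a new column:
sentiment_df <- sentiment_df %>%
  mutate(sentiment_class = case_when(
    ave_sentiment < -0.3 ~ "Negative",
    ave_sentiment < 0.3 ~ "Neutral",
    TRUE ~ "Positive"
  ))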
ggplot(data = sentiment_df, aes(x = ave_sentiment, fill = ave_sentiment)) + geom_bar()
It seems that most reviews were neutral. However, it is also interesting to see that there were more negative reviews than positive ones.
Let’s see whether we get similar results with the AFINN lexicon:
x <- tibble(txt = df$summary_short)
x <- x %>% unnest_tokens(word, txt)
library(plyr)
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following object is masked from 'package:purrr':
##
## compact
y <- join(x, get_sentiments("afinn"), type = "inner")
## Joining by: word
y
## word value
## 1 helps 2
## 2 reach 1
## 3 fresh 1
## 4 justified 2
## 5 furious -3
## 6 comedy 1
## 7 prison -2
## 8 death -2
## 9 stop -1
## 10 passionate 2
## 11 war -2
## 12 odd -2
## 13 comedy 1
## 14 missing -2
## 15 powerful 2
## 16 help 2
## 17 ominous 3
## 18 catastrophe -3
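The same join could also be done with dplyr::inner_join(), which avoids loading plyr after dplyr and the masking warnings shown above; a minimal sketch:
# equivalent join using dplyr; keeps only words that appear in the AFINN lexicon
y <- x %>% inner_join(get_sentiments("afinn"), by = "word")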
y_df <- setDF(y)

get_sentiment_class <- function(value) {
  if (value < -3) {
    "Negative"
  } else if (value < 3) {
    "Neutral"
  } else {
    "Positive"
  }
}

y_df$value <- sapply(y_df$value, get_sentiment_class)
y_df
## word value
## 1 helps Neutral
## 2 reach Neutral
## 3 fresh Neutral
## 4 justified Neutral
## 5 furious Neutral
## 6 comedy Neutral
## 7 prison Neutral
## 8 death Neutral
## 9 stop Neutral
## 10 passionate Neutral
## 11 war Neutral
## 12 odd Neutral
## 13 comedy Neutral
## 14 missing Neutral
## 15 powerful Neutral
## 16 help Neutral
## 17 ominous Positive
## 18 catastrophe Neutral
ggplot(data = y_df, aes(x = value, fill = value)) + geom_bar()
Even with AFINN, most words end up classified as neutral. However, unlike sentimentr, AFINN did not flag any negative reviews; this is partly an artifact of the cut-offs chosen above, since only scores below -3 count as negative and every negative word in these summaries scored between -3 and -1.
I found the sentimentr package more useful for my corpus, since it evaluates the entire sentence and can therefore account for the context in which a word is used. With AFINN, we are also limited to the individual words the lexicon contains. It is also interesting that AFINN gives the word ‘ominous’ a positive value.
## References

Silge, Julia, and David Robinson. “Text Mining with R: 2 Sentiment Analysis with Tidy Data.” 29 Oct. 2020, www.tidytextmining.com/sentiment.html.