Introduction

F. Scott Fitzgerald was a modernist American writer. This research analyzes the most frequent words, word counts, and sentiment of the four novels he published during his lifetime, which is why the posthumously published The Last Tycoon is not analyzed.

Hypotheses

I hypothesize that his novels will mention time and love, two themes common in modernist literature. I also hypothesize that the overall sentiment will be negative, as modernist literature often deals with sad themes.

Research Objectives

This research aims to determine whether the sentiment in Fitzgerald’s novels became more negative as he became a more prominent author, struggled in the spotlight, and lived through the raucous, roaring 1920s.

Methods

After downloading each novel as a text file from Project Gutenberg and Project Gutenberg Australia, I renamed each dataset for ease of use throughout the project: “This Side of Paradise” became “tsop”, “The Beautiful and Damned” became “bad”, and so on. I unnested the text into individual words so I could get a proper word count of each novel, then filtered out the stop words and character names. Next, I created a data table that displays the frequency of the most used words in each novel. I also included a quote from each novel that contextualizes one of the most used words - who doesn’t love a good Fitzgerald quote?
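The steps above can be sketched on a tiny made-up text. This is a minimal illustration, not the exact pipeline used for the novels: the two sentences and the `toy` object are hypothetical, and only the column name `X1` is borrowed from the code in this report.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-line stand-in for a novel (not text from the book);
# X1 mirrors the column name used for the real datasets.
toy <- tibble(X1 = c("Amory looked at the night and the night looked at him.",
                     "Life and love and the night."))

toy_counts <- toy %>%
  unnest_tokens(word, X1) %>%             # one lowercase word per row
  anti_join(stop_words, by = "word") %>%  # drop common English stop words
  filter(!word %in% c("amory")) %>%       # drop character names
  count(word, sort = TRUE)                # frequency table, most used first

toy_counts
```

Counting the rows before the `anti_join()` step gives the raw word count, while the filtered table gives the most used meaningful words.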

Sentiment Analysis Methods

I looked at each novel through three sentiment lexicons: bing, afinn and nrc. I wanted to see if there was a common sentiment throughout his novels and if that sentiment changed at any point. Interestingly enough, the bing and nrc lexicons show contrasting results - this could be because nrc sorts words into eight emotion categories in addition to positive and negative, while bing only uses two overarching sentiment categories. I also looked at the mean afinn value of each novel to gauge its overall tone; afinn scores each word from -5 (most negative) to 5 (most positive).
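To make the difference between a category lexicon and a scored lexicon concrete, here is a small sketch. The two lexicons below are hypothetical miniatures written by hand so the example runs without downloading anything; they are not the real bing or afinn tables, and the scores are illustrative.

```r
library(dplyr)

# Hypothetical miniature lexicons (assumed entries, for illustration only)
toy_bing <- tibble(word = c("love", "beautiful", "afraid", "poor"),
                   sentiment = c("positive", "positive", "negative", "negative"))
toy_afinn <- tibble(word = c("love", "beautiful", "afraid", "poor"),
                    value = c(3, 3, -2, -2))

# Hypothetical tokens; "night" and "eyes" have no sentiment entry and drop out
words <- tibble(word = c("love", "night", "afraid", "poor", "love", "eyes"))

# bing-style: count how many matched words fall into each category
bing_counts <- words %>%
  inner_join(toy_bing, by = "word") %>%
  count(sentiment)

# afinn-style: average the numeric scores of the matched words
afinn_mean <- words %>%
  inner_join(toy_afinn, by = "word") %>%
  summarise(mean_value = mean(value)) %>%
  pull(mean_value)
```

A category count can be tied between positive and negative while the average score still leans one way, because the scored lexicon also weighs how strong each word is.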

Visualizations

The graphs depict the bing lexicon and the nrc lexicon for each novel. Also, I created wordclouds of the most used positive and most used negative words as defined by the afinn lexicon.

“This Side of Paradise”

Published in 1920, F. Scott Fitzgerald’s first novel tells the early life of Amory Blaine. This text was pulled from Project Gutenberg’s American website. The novel is organized into sections represented by stages in Amory’s life.

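library(tidyverse)   # dplyr pipelines and ggplot2 charts
library(tidytext)    # unnest_tokens(), stop_words, get_sentiments()
# Split the raw text (column X1) into one word per row
tsopwords <- tsop %>% unnest_tokens(word, X1)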

Word Count:

count(tsopwords)
## # A tibble: 1 x 1
##       n
##   <int>
## 1 59902

Top Words

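Contractions such as “don’t” and “i’m” slipped past the stop-word filter, apparently because of their curly apostrophes. Setting those aside, the top words are night, people, life, and eyes.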

tsopwords %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words) %>% 
  filter(!word %in% c("amory","rosalind", "dick", "maury", "gloria", "anthony")) %>% 
  arrange(desc(n)) %>%
  head(10) %>% 
  knitr::kable()
## Joining, by = "word"
word n
don’t 137
night 118
i’m 114
people 106
life 99
it’s 90
eyes 87
day 84
you’re 79
love 76
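The contractions in the table above are written with a typographic apostrophe, while the `stop_words` list uses a straight one, which is a likely reason they survive the `anti_join()`. A minimal sketch of one way to handle this, assuming that cause, is to normalize the apostrophe before filtering. The `words` object here is a hypothetical four-word sample.

```r
library(dplyr)
library(tidytext)

# Hypothetical sample of tokens; \u2019 is the curly (typographic) apostrophe
words <- tibble(word = c("don\u2019t", "night", "i\u2019m", "people"))

cleaned <- words %>%
  mutate(word = gsub("\u2019", "'", word, fixed = TRUE)) %>%  # curly -> straight
  anti_join(stop_words, by = "word")                          # contractions now match

cleaned
```

With the apostrophes normalized, only "night" and "people" remain in this sample.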

Quote

“The unwelcome November rain had perversely stolen the day’s last hour and pawned it with that ancient fence, the night.”

Sentiment

Bing

tsop_bing <- tsopwords %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments("bing"))
## Joining, by = "word"
## Joining, by = "word"
ggplot(tsop_bing) + geom_bar(aes(sentiment))

NRC

tsop_nrc <- tsopwords %>%
  anti_join(stop_words) %>%
  inner_join(get_sentiments("nrc"))
## Joining, by = "word"
## Joining, by = "word"
ggplot(tsop_nrc) + geom_bar(aes(sentiment)) 
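One possible reason the bing and nrc charts can disagree is that nrc may list the same word under several categories, so a single word can add to more than one bar. The sketch below restricts a hand-made nrc-style lexicon to its positive and negative labels so it can be compared with a two-category lexicon. The `toy_nrc` entries are hypothetical, not the real nrc assignments.

```r
library(dplyr)

# Hypothetical nrc-style lexicon: one row per word-category pair
toy_nrc <- tibble(
  word      = c("love", "love", "war", "war", "war"),
  sentiment = c("positive", "joy", "negative", "fear", "anger")
)

# Hypothetical tokens from a text
words <- tibble(word = c("love", "war", "war"))

nrc_polarity <- words %>%
  count(word, name = "freq") %>%                         # one row per distinct word
  inner_join(toy_nrc, by = "word") %>%                   # each word matches all its categories
  filter(sentiment %in% c("positive", "negative")) %>%   # keep only the polarity labels
  count(sentiment, wt = freq, name = "n")                # total occurrences per label

nrc_polarity
```

Counting words first and weighting by frequency afterwards keeps the join one-to-many, which avoids duplicated rows inflating the totals.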

Afinn

tsop_afinn <- tsopwords %>%
  anti_join(stop_words) %>%
  inner_join(get_sentiments("afinn"))
## Joining, by = "word"
## Joining, by = "word"

Mean Value:

 mean(tsop_afinn$value)
## [1] -0.09489832

This mean is close to 0, meaning that the afinn sentiment is slightly negative, almost neutral. The overall sentiment for “This Side of Paradise” in the bing lexicon is negative.

Most commonly used positive words - Afinn

“Love” is clearly the most used, with 76 mentions, followed by “god” with 34.

tsop_afinn %>% 
  filter(value > 0) %>% 
  count(word, sort = TRUE) %>% 
  head(10) %>% 
  knitr::kable()
word n
love 76
god 34
kiss 31
beautiful 22
care 22
laughed 22
pretty 20
reached 17
strong 17
matter 16

Most commonly used negative words - Afinn

The 10 most common words with a value below 0 range from “poor” to “war” to “gray”. “Afraid” is the most used, with 30 mentions. “Poor” is mentioned 29 times.

tsop_afinn %>% 
  filter(value < 0) %>% 
  count(word, sort = TRUE) %>% 
  head(10) %>% 
  knitr::kable()
word n
afraid 30
poor 29
gray 25
cried 23
damn 21
dead 20
war 20
bad 17
lost 17
tired 17

Wordcloud

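library(wordcloud2)
# Word cloud of the positive afinn words, sized by frequency
tsop_afinn %>%
  filter(value > 0) %>%
  count(word, sort = TRUE) %>%
  wordcloud2()
# Word cloud of the negative afinn words, sized by frequency
tsop_afinn %>%
  filter(value < 0) %>%
  count(word, sort = TRUE) %>%
  wordcloud2()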

“The Beautiful and Damned”

Published in 1922, Fitzgerald’s second novel “concerns a handsome young married couple who choose to wait for an expected inheritance rather than involve themselves in productive, meaningful lives.” The title was shortened to “bad” in the code. Source for book information: https://www.britannica.com/topic/The-Beautiful-and-Damned

Word Count:

## # A tibble: 1 x 1
##       n
##   <int>
## 1 93859

Top Words

The top words are time, eyes, day, and night.

word n
time 166
eyes 137
day 130
night 114
life 111
sort 96
voice 94
people 85
found 82
half 82

Quote

“Rather nice night, after all. Stars are out and everything. Exceptionally tasty assortment of them.”

Sentiment

Bing

NRC

Afinn

Mean Value:

## [1] -0.2205915

Most commonly used positive words - Afinn

“Matter” is mentioned 65 times and “love” is mentioned 63 times.

word n
matter 65
love 63
beautiful 43
laughed 42
god 40
pretty 35
reached 32
care 30
kiss 27
cool 24

Most commonly used negative words - Afinn

“Cried” is mentioned 58 times and “gray” is mentioned 48 times.

word n
cried 58
gray 48
broken 29
demanded 28
tired 26
broke 25
fire 24
hate 24
war 24
bad 22

Wordcloud

“The Great Gatsby”

“The Great Gatsby”, Fitzgerald’s third novel, was published in 1925. The text was pulled from Project Gutenberg Australia. “Set in Jazz Age New York, the novel tells the tragic story of Jay Gatsby, a self-made millionaire, and his pursuit of Daisy Buchanan, a wealthy young woman whom he loved in his youth. Unsuccessful upon publication, the book is now considered a classic of American fiction and has often been called the Great American Novel.” Source: https://www.britannica.com/topic/The-Great-Gatsby

gatsbywords <- gatsby %>%
  unnest_tokens(word, X1)

Word Count:

## # A tibble: 1 x 1
##       n
##   <int>
## 1 43549

Top Words

The top words are house, eyes, looked, and time.

Quote

“The eyes of Doctor T. J. Eckleburg are blue and gigantic— their retinas are one yard high. They look out of no face, but, instead, from a pair of enormous yellow spectacles which pass over a nonexistent nose”

word n
house 91
eyes 80
looked 79
time 73
car 69
door 67
night 67
moment 63
hand 56
people 56

Sentiment

Bing

NRC

Afinn

Mean Value:

## [1] -0.2263697

Most commonly used positive words - Afinn

The words “love” and “matter” each appear 21 times, while the word “god” is mentioned 20 times.

word n
love 21
matter 21
god 20
loved 19
laughed 14
pretty 14
reached 14
care 13
cool 13
nice 13

Most commonly used negative words - Afinn

The word “miss” appears 32 times and “cried” appears 30 times.

word n
miss 32
cried 30
demanded 24
broke 22
stopped 19
hard 18
crazy 14
war 13
dead 12
stop 12

Wordcloud

“Tender is the Night”

“Tender is the Night” is the final novel Fitzgerald published during his lifetime; it appeared in 1934, six years before his death in 1940. This text was pulled from Project Gutenberg Australia. Arguably his most autobiographical novel, “Tender Is the Night tells the story of Dick and Nicole Diver’s crumbling marriage. Though not well received at the time of its 1934 serial publication, both readers and critics have since recognized the novel as one of the twentieth century’s best. More than a simple story of estrangement and infidelity, Tender Is the Night grapples with the complexity of human relationships and the manipulations and ministrations of those closest to us.” Source: https://study.com/academy/lesson/tender-is-the-night-summary-characters-themes-analysis.html#:~:text=The%20darkest%20and%20most%20autobiographical,of%20the%20twentieth%20century’s%20best.

Word Count:

## # A tibble: 1 x 1
##       n
##   <int>
## 1 61923

Top Words

The top words are time and doctor.

word n
time 113
doctor 92
people 91
looked 81
love 76
girl 70
hotel 67
night 66
day 62
mother 62

Quote

“When you’re older you’ll know what people who love suffer. The agony. It’s better to be cold and young than to love. It’s happened to me before but never like this - so accidental - just when everything was going well.”

Sentiment

Bing

Afinn

Mean Value:

## [1] -0.7811258

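“Tender is the Night” has the mean afinn value farthest from zero, meaning that the afinn lexicon scores this book as the most negative of the four. Part of that gap likely comes from the character name “dick”, which afinn treats as a negative word and which appears 509 times in the negative-word table below.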

NRC

Most commonly used positive words - Afinn

“Love” is clearly the most used, with 76 mentions, followed by “nice” with 36.

word n
love 76
nice 36
laughed 33
matter 32
fine 27
agreed 25
care 21
fun 21
god 21
glad 19

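Most commonly used negative words - Afinn

“Dick” tops the list with 509 mentions, but this is the name of the protagonist, Dick Diver, which afinn happens to score as a negative word. Setting the name aside, “demanded” is mentioned 29 times, and “cried” and “hard” 28 times each.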

word n
dick 509
demanded 29
cried 28
hard 28
leave 26
dead 24
war 23
bad 21
afraid 19
miss 19
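Because a character name dominates the table above, it is worth seeing how much a single frequent name can move a mean. The sketch below uses a hypothetical three-word lexicon and a made-up token list (the scores are assumed, not taken from afinn) and compares the mean with and without the names removed.

```r
library(dplyr)

# Hypothetical afinn-style scores (assumed values, for illustration only)
toy_afinn <- tibble(word = c("dick", "love", "afraid"),
                    value = c(-4, 3, -2))

# Made-up tokens in which the character name is by far the most frequent word
words <- tibble(word = c(rep("dick", 5), "love", "love", "afraid"))

# Names to exclude, taken from the novel's two main characters
character_names <- c("dick", "nicole")

scored <- inner_join(words, toy_afinn, by = "word")

mean_with_names <- mean(scored$value)

mean_without_names <- scored %>%
  filter(!word %in% character_names) %>%  # drop names before averaging
  pull(value) %>%
  mean()
```

In this toy case the mean flips from negative to positive once the name is removed, so applying the same character-name filter used for the top-words tables before averaging would give a fairer comparison between novels.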

Wordcloud

Conclusion

My first hypothesis is supported - Fitzgerald mentions “time” and “love” multiple times in each novel. Another interesting trend was the use of the word “war” throughout the novels, which makes sense because modernist literature flourished after World War I; the war appears to have affected his writing. Fitzgerald also used “cried” and “god” frequently. My second hypothesis is harder to assess because the lexicons sort words into different categories.

Based on the mean afinn value alone, the novels become more negative in order of publication (-0.09, -0.22, -0.23, -0.78), so the third and fourth novels are the most negative. His most negative novel was his last, “Tender is the Night”. This could be because it is his most “autobiographical” novel and was published during an incredibly hard time in the author’s life, although its mean is also pulled down by the character name “dick”, which afinn counts as a negative word. Regardless, Fitzgerald left an incredible impact on American literature.
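The comparison of mean afinn values can be checked by placing the four means reported in this analysis side by side. This is a small sketch: the numbers are copied from the output shown for each novel, and the `afinn_means` table itself is only an illustration.

```r
library(dplyr)

# Mean afinn values as reported in each novel's section
afinn_means <- tibble(
  novel = c("This Side of Paradise", "The Beautiful and Damned",
            "The Great Gatsby", "Tender is the Night"),
  year = c(1920, 1922, 1925, 1934),
  mean_afinn = c(-0.09489832, -0.2205915, -0.2263697, -0.7811258)
)

# Most negative novel first
ranked <- afinn_means %>% arrange(mean_afinn)

# Does the mean drop with every later publication?
means_by_year <- afinn_means %>% arrange(year) %>% pull(mean_afinn)
steadily_more_negative <- all(diff(means_by_year) < 0)

ranked
```

The differences between the second and third novels are very small, so the ordering in the middle of the table should be read with caution.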

“So we beat on, boats against the current, borne back ceaselessly into the past…”