Goal
Identify underlying trends in society's attitude toward mental illness, using the words that appear in lines of books containing mad, madness, crazy, manic, hysterical, or melancholy.
Sentiment Analysis

Examining the Words Used in the Analysis
Methodology
The above analysis uses the words that appear in lines of fiction containing mad, madness, crazy, manic, hysterical, or melancholy. There are 1,161 such lines from 99 books written by 49 different authors over the 12 decades from the 1830s through the 1940s. All the books are from Project Gutenberg. The text data were collected as follows.
- Select the first ten books per decade in the fiction categories (e.g., science fiction and historical fiction), taking the books in ascending order of Gutenberg ID.
- Select the lines of the books that contain mad, madness, crazy, manic, hysterical, or melancholy.
- Tokenize the lines into words.
- Score the sentiment of each word using the AFINN sentiment lexicon. The AFINN lexicon assigns each word a score between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment.
- Take the mean sentiment score of the words in each decade.
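The filtering, tokenizing, scoring, and averaging steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the analysis. The function names, the (year, line) input format, and the regex tokenizer are assumptions made for the example, and TOY_LEXICON holds placeholder scores rather than real AFINN values. The sketch also assumes that words missing from the lexicon are simply dropped before averaging.

```python
import re
from collections import defaultdict
from statistics import mean

# The six search terms named in the methodology.
KEYWORDS = {"mad", "madness", "crazy", "manic", "hysterical", "melancholy"}

# Placeholder scores for illustration only; NOT taken from the AFINN lexicon.
# In practice, load the real AFINN word list into a dict of word -> score.
TOY_LEXICON = {"mad": -3, "grief": -2, "love": 3, "happy": 3, "terrible": -3}


def tokenize(line):
    """Lowercase a line and split it into words (a simple assumed tokenizer)."""
    return re.findall(r"[a-z']+", line.lower())


def mean_sentiment_by_decade(lines, lexicon):
    """Return {decade: mean score} for lines that contain a keyword.

    `lines` is an iterable of (publication_year, text) pairs.
    Words absent from `lexicon` are ignored (an assumption of this sketch).
    """
    scores = defaultdict(list)
    for year, text in lines:
        words = tokenize(text)
        # Keep only lines that contain at least one keyword.
        if not KEYWORDS.intersection(words):
            continue
        decade = year // 10 * 10
        scores[decade].extend(lexicon[w] for w in words if w in lexicon)
    # Skip decades where no word in any kept line had a score.
    return {d: mean(s) for d, s in sorted(scores.items()) if s}


if __name__ == "__main__":
    sample = [
        (1847, "He was mad with grief"),
        (1849, "A happy day"),  # no keyword, so this line is skipped
        (1923, "Her crazy love was terrible"),
    ]
    print(mean_sentiment_by_decade(sample, TOY_LEXICON))
```

Note that the mean is taken over all scored words pooled within a decade, not over per-line or per-book averages, which matches the last step as described.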