Text Analysis of Swami Vivekananda’s Chicago Speech

Author Durgesh
Enrollment No. M2023ANLT008
Program MS Analytics, I Year

0.1 Introduction

In this work, we examine Swami Vivekananda’s historic Chicago Speech, delivered in 1893 at the World’s Parliament of Religions. The speech holds enduring significance for its message of universal spirituality and tolerance. Swami Vivekananda’s eloquent articulation of India’s ancient wisdom captivated the audience and remains an inspiration worldwide.

Analyzing this speech through text mining techniques offers a unique perspective. By dissecting the text, we aim to uncover patterns, themes, and key insights that may have gone unnoticed. Text analysis provides a powerful tool to understand the nuances of Vivekananda’s discourse and the impact it had on the audience. Through this exploration, we hope to shed light on the profound wisdom embedded in his words and its relevance in today’s globalized world.

0.2 Rationale Behind This Work

  1. Historical Significance: Swami Vivekananda’s speech at the World’s Parliament of Religions in Chicago in 1893 is a pivotal moment in history. It is widely regarded as a landmark in introducing Indian philosophy and spirituality to the Western world, making it a compelling subject for analysis.

  2. Cultural Exchange: The speech embodies the idea of cultural exchange and dialogue, making it relevant in today’s globalized world. Understanding the linguistic and thematic elements can shed light on how cultures communicate and share ideas.

  3. Impact on Indian Philosophy: By examining the text, we can gain insights into how Swami Vivekananda presented and represented Indian philosophy and spirituality on an international stage, potentially influencing subsequent developments.

  4. Contemporary Relevance: The themes of religious tolerance, universal brotherhood, and the role of spirituality in addressing societal challenges remain relevant today. Analyzing this speech can provide timeless insights into these topics.

  5. Text Analysis Necessity: Text analysis techniques allow us to dissect the speech at a granular level, unveiling patterns, frequencies, and hidden meanings that may not be evident through casual reading. They enable a structured examination of the language and ideas presented.

0.3 Objectives

  1. Linguistic Analysis: To analyze the speech linguistically, identifying the choice of words, sentence structures, and any specific linguistic patterns that Swami Vivekananda employed to convey his message effectively.

  2. Thematic Exploration: To uncover and explore the key themes and concepts presented in the speech, such as religious pluralism, harmony, and the role of spirituality in human life.

  3. Word Frequency Analysis: To identify and quantify the most frequently used words in the speech, allowing us to discern which concepts or ideas were emphasized the most.

  4. Visual Representation: To visualize the text data through word clouds, bar/frequency charts, or other visualizations, enabling a more intuitive understanding of the speech’s content.

  5. Comparative Analysis: If applicable, to compare this speech with other speeches or writings by Swami Vivekananda to discern any unique characteristics or changes in his communication style over time.

  6. Contextual Understanding: To consider the historical context of the speech, including the audience and the purpose, and how these factors may have influenced the language and content.

0.4 Steps Involved

0.4.1 Step 0: Input Text Source

The speech was downloaded as a PDF from https://arunshanbhag.files.wordpress.com/2009/07/vivekananda_chicagospeech.pdf and then converted to a .txt file.

0.4.2 Step 1: Load the Text, Packages, Libraries

The following libraries are used in this project:

library(tm)
library(slam)
library(stringr)
library(wordcloud)
library(ggplot2)

I loaded the text from the provided file and converted it to lowercase for uniformity in our analysis.

speech_text <- tolower(readLines("C:/Users/Durgesh/Downloads/Swami Vivekananda Chicago Speech Text.txt", warn = FALSE))

0.4.3 Step 2: Combine the Text

Next, I combine the text lines into a single string for ease of processing.

speech_text <- paste(speech_text, collapse = " ")
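As a small illustration of Steps 1 and 2 together, the sketch below reads lines, lowercases them, and collapses them into one string. It uses an in-memory `textConnection` in place of the file on disk, and the two sample lines in `raw_lines` are made up for the example.

```r
# Hypothetical stand-in for the lines of the .txt file
raw_lines <- c("Sisters and Brothers of America,",
               "It fills my heart with joy")

# Read from an in-memory connection instead of a file path
con <- textConnection(raw_lines)
lines_read <- tolower(readLines(con, warn = FALSE))
close(con)

# Collapse the character vector into a single string
one_string <- paste(lines_read, collapse = " ")
print(one_string)
```

After `paste(..., collapse = " ")` the result is a character vector of length one, which is what the later `gsub()` and splitting steps operate on.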

0.4.4 Step 3: Text Cleaning

To clean the text, I replace punctuation, numbers, and special characters (such as typographic quotes) with spaces, so that the analysis focuses on the words themselves.

# Replace ASCII punctuation with spaces
speech_text <- gsub("[[:punct:]]", " ", speech_text)
# Replace digits with spaces
speech_text <- gsub("[[:digit:]]+", " ", speech_text)
# Replace typographic quotes and any remaining brackets with spaces
speech_text <- gsub("“|”|’|‘|\"|!|“|”|\\(|\\)|\\[|\\]|\\{|\\}", " ", speech_text)
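To see the effect of this kind of cleaning on a concrete input, here is a minimal sketch applied to a made-up sentence (`sample_text` is illustrative and not taken from the speech). It also squeezes repeated whitespace, which the pipeline above leaves to the tokenizer.

```r
# Hypothetical sample input; not from the speech
sample_text <- "sisters and brothers of america, it fills my heart (1893)!"

# Replace ASCII punctuation with spaces
cleaned <- gsub("[[:punct:]]", " ", sample_text)
# Replace runs of digits with a space
cleaned <- gsub("[[:digit:]]+", " ", cleaned)
# Collapse runs of whitespace and trim both ends
cleaned <- trimws(gsub("\\s+", " ", cleaned))

print(cleaned)
```

Replacing with a space rather than an empty string avoids gluing neighbouring words together when punctuation sits between them without surrounding spaces.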

0.4.5 Step 4: Tokenization

The text is then tokenized into individual words, creating a list of tokens.

speech_tokens <- unlist(str_split(speech_text, "\\s+"))
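The following sketch shows why empty tokens can appear after splitting on whitespace. It uses base R's `strsplit()` as a stand-in for `str_split()`, so behaviour at the ends of the string may differ slightly from the `stringr` version; the input string is made up.

```r
# Hypothetical cleaned text with a leading space and a double space
sample_text <- " sisters and  brothers of america "

# Base-R stand-in for stringr::str_split(sample_text, "\\s+")
tokens <- unlist(strsplit(sample_text, "\\s+"))

# A leading separator produces an empty first token
print(tokens)
print(tokens[1] == "")
```

Empty strings like the first element here are the reason the next step filters out `""` before counting.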

0.4.6 Step 5: Filtering

The token list is cleaned further by removing empty tokens, stopwords, tokens that contain no letters, and words of one or two characters.

speech_tokens <- speech_tokens[!speech_tokens %in% c("", " ")]
speech_tokens <- speech_tokens[!speech_tokens %in% stopwords("en")]
speech_tokens <- speech_tokens[str_detect(speech_tokens, "[a-z]")]
speech_tokens <- speech_tokens[nchar(speech_tokens) > 2]
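The filtering logic can be traced on a tiny example. The sketch below is base R only: `toy_stopwords` is a small made-up list standing in for `stopwords("en")`, and `grepl()` stands in for `str_detect()`.

```r
# Hypothetical token vector, including an empty string and short words
tokens <- c("", "it", "fills", "my", "heart", "with", "joy", "1893", "a")

# Tiny illustrative stopword list; the report itself uses stopwords("en")
toy_stopwords <- c("it", "my", "with", "a")

tokens <- tokens[tokens != ""]                # drop empty tokens
tokens <- tokens[!tokens %in% toy_stopwords]  # drop stopwords
tokens <- tokens[grepl("[a-z]", tokens)]      # keep tokens containing a letter
tokens <- tokens[nchar(tokens) > 2]           # keep tokens of 3+ characters

print(tokens)
```

Note that the length threshold `> 2` also discards two-letter words, not only single characters.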

0.4.7 Step 6: Word Frequency Analysis

Word frequencies are calculated to identify the most frequently used words in the speech.

word_freq <- table(speech_tokens)
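A minimal sketch of how `table()` produces frequencies, and how the result can be sorted to show the most common words first. The token vector is made up for illustration.

```r
# Hypothetical tokens; not actual counts from the speech
tokens <- c("religion", "truth", "religion", "world", "truth", "religion")

# Count occurrences of each distinct token
freq <- table(tokens)

# Sort so the most frequent tokens come first, then take the top two
freq <- sort(freq, decreasing = TRUE)
top <- head(freq, 2)

print(top)
```

The sorted table keeps the words as names, so `names(freq)` gives the ranking and `as.vector(freq)` gives the counts.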

0.4.8 Step 7: Create a Corpus for Text Mining

I created a Corpus, a fundamental data structure for text mining, to facilitate further analysis.

speech_corpus <- Corpus(VectorSource(speech_tokens))

0.4.9 Step 8: Text Preprocessing

The corpus is preprocessed by lowercasing, removing punctuation, numbers, and stopwords, and applying stemming.

speech_corpus <- tm_map(speech_corpus, content_transformer(tolower))
speech_corpus <- tm_map(speech_corpus, removePunctuation)
speech_corpus <- tm_map(speech_corpus, removeNumbers)
speech_corpus <- tm_map(speech_corpus, removeWords, stopwords("english"))
speech_corpus <- tm_map(speech_corpus, stemDocument)

0.4.10 Step 9: Document-Term Matrix (DTM)

Next, I convert the preprocessed corpus into a Document-Term Matrix (DTM), a numerical representation in which each row is a document, each column is a term, and each cell holds the frequency of that term in that document.

dtm <- DocumentTermMatrix(speech_corpus)
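To make the shape of a document-term matrix concrete, the sketch below builds one by hand in base R for three made-up documents. It does not use `tm`, and the names `docs`, `terms`, and `toy_dtm` are illustrative only.

```r
# Three hypothetical documents
docs <- c(d1 = "truth religion truth", d2 = "world religion", d3 = "truth")

# Split each document into tokens and collect the vocabulary
token_list <- strsplit(docs, " ")
terms <- sort(unique(unlist(token_list)))

# Count each vocabulary term per document, then put documents in rows
toy_dtm <- t(vapply(
  token_list,
  function(tok) as.integer(table(factor(tok, levels = terms))),
  integer(length(terms))
))
colnames(toy_dtm) <- terms

print(toy_dtm)
```

A DTM built by `DocumentTermMatrix()` has the same layout but is stored in a sparse format, since most cells are zero for realistic vocabularies.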

0.4.11 Step 10: Analyze Word Frequency

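Word frequencies are then calculated from the DTM. Because the corpus was built from the token vector, each row of the DTM is a single token and each column is a term, so per-term totals are column sums rather than row sums.

# Sum each term's column to get its total frequency
word_freq <- col_sums(dtm)
word_freq <- sort(word_freq, decreasing = TRUE)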
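In a document-term matrix, rows correspond to documents and columns to terms, so summing along the two dimensions answers different questions. The toy matrix below (made-up counts, base R only) shows the difference.

```r
# Hypothetical 3-document, 3-term matrix
toy_dtm <- matrix(
  c(1, 2, 0,
    1, 0, 1,
    0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("religion", "truth", "world"))
)

# Column sums: total occurrences of each term across all documents
term_freq <- sort(colSums(toy_dtm), decreasing = TRUE)

# Row sums: number of term occurrences in each document (document length)
doc_len <- rowSums(toy_dtm)

print(term_freq)
print(doc_len)
```

For a word-frequency ranking, the column sums are the quantity of interest; row sums only describe how long each document is.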

0.5 Creating a Word Cloud

To visually represent the most frequently used words in the speech, I created a word cloud using the wordcloud2 library in R. A word cloud is a graphical representation that displays words in varying sizes based on their frequencies in the text.
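The original plotting code is not shown, so the following is only a sketch of how such a cloud could be produced. It assumes a named frequency vector like `word_freq` (the counts below are made up) and reshapes it into the two-column data frame that `wordcloud2()` expects. Since wordcloud2 is not among the libraries loaded in Step 1, the call uses an explicit namespace and is skipped if the package is not installed.

```r
# Hypothetical frequencies; not actual counts from the speech
word_freq <- c(religion = 9, truth = 6, world = 4)

# wordcloud2() takes a data frame with a word column and a frequency column
cloud_df <- data.frame(
  word = names(word_freq),
  freq = as.numeric(word_freq),
  stringsAsFactors = FALSE
)

# Build the widget only if the package is available
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(cloud_df, size = 0.8)
  # Printing `cloud` in an interactive session renders the word cloud
}
```

The `size` value is an arbitrary choice for the example and would need tuning for the real frequency range.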

0.6 Creating a Bar Plot

To visualize the most frequent words in the speech, I created a bar plot using the ggplot2 library in R. A bar plot is a suitable choice for displaying the top 20 words and their frequencies, allowing for easy comparison.
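The plotting code itself is not included in the report; the sketch below shows one way a top-20 bar plot could be built with ggplot2. The frequencies are made up, and names such as `n_top` and `plot_df` are illustrative. The plot is only constructed if ggplot2 is installed.

```r
# Hypothetical sorted frequencies; not actual counts from the speech
word_freq <- sort(
  c(religion = 9, truth = 6, world = 4, god = 3, universal = 2),
  decreasing = TRUE
)

# Keep at most the 20 most frequent words
n_top <- 20
top_words <- head(word_freq, n_top)

# Reverse the factor levels so the most frequent word is at the top after flipping
plot_df <- data.frame(
  word = factor(names(top_words), levels = rev(names(top_words))),
  freq = as.numeric(top_words)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(plot_df, ggplot2::aes(x = word, y = freq)) +
    ggplot2::geom_col() +
    ggplot2::coord_flip() +
    ggplot2::labs(x = "Word", y = "Frequency")
  # Printing `p` draws the chart on the active graphics device
}
```

Horizontal bars are used here because word labels are easier to read along the vertical axis than rotated under vertical bars.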

0.7 Interpretation of Findings

Our analysis unveiled the following insights:

0.8 Conclusion

The text analysis of Swami Vivekananda’s Chicago Speech illuminated the enduring relevance of its themes. It serves as a timeless reminder of the importance of universal spirituality, tolerance, and global unity.

0.9 Future Work

For future exploration, I wish to apply: