Introduction

One Piece is a popular manga and anime series created by Eiichiro Oda, and it now has a live-action adaptation as well. It is one of the longest-running and most successful anime franchises in history. It follows the adventures of Monkey D. Luffy and his crew, the Straw Hat Pirates, as they search for the legendary treasure known as the One Piece; finding it would make Luffy the King of the Pirates. Along the way, they navigate complex conflicts, form deep bonds, and challenge oppressive forces.

Significance of One Piece

The popularity and longevity of One Piece make it a remarkable case study for understanding how a narrative can captivate audiences for over two decades. One of the key reasons for its enduring success is the exploration of universal themes such as Friendship, Loyalty, Rebellion, Betrayal, Sacrifice, and Dreams. These themes transcend cultural boundaries, allowing viewers to connect emotionally with the characters and storylines. By examining these themes across pivotal sagas, we can gain insights into how One Piece maintains its emotional resonance and why it continues to engage audiences across generations.

Research Question

How does the emotional tone and thematic depth of One Piece contribute to its enduring popularity and audience engagement across its sagas?

Data Collection

The metadata for this project was collected from Kaggle, which provided data scraped from IMDb. This dataset includes key information such as the season, episode number, episode title, year released, total votes, and average rating for each episode. Since One Piece has over 1,000 episodes, I focused on the most popular episodes based on their average ratings to conduct a meaningful text analysis.

However, obtaining episode transcripts presented several challenges. Popular transcript websites like Subslikescript did not have all the episodes I wanted to analyze, while Forever Dreaming provided paraphrased content instead of accurate transcripts. Additionally, since I needed English-dubbed versions, finding usable transcripts became even more complex.

To overcome these obstacles, I screen-recorded the selected episodes, converted the recordings into MP3 audio files, and used Whisper, an open-source speech recognition model developed by OpenAI, in Python to transcribe the audio into text files. This gave me usable transcripts for the selected episodes and let me proceed with the text analysis, although automatic transcription can misrender proper names (for example, "thoflamingo" appears in the word counts further down).

Importing Data

knitr::opts_chunk$set(echo = TRUE)
#import dataset 
library(readr)
ONE_PIECE <- read_csv("ONE PIECE.csv")
## New names:
## • `` -> `...1`
## Rows: 958 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): trend, name
## dbl (5): ...1, season, episode, start, average_rating
## num (2): rank, total_votes
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(ONE_PIECE)

After importing the dataset, I removed unnecessary columns. I then created a saga variable to group episodes by the saga they belong to. Episodes that did not fit into the main story arcs were categorized as Filler. Filler episodes are not part of the original manga’s storyline; they are often created to give the manga time to advance or to add extra content that doesn’t affect the main plot. While they can provide additional character development or side stories, they are generally considered non-essential to the core narrative.

#install.packages("dplyr")
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# remove the first four columns
Luffy <- ONE_PIECE %>% select(-1, -2, -3, -4)
# add saga (case_when uses the first matching condition, so ranges must not overlap)
Luffy <- Luffy %>%
  mutate(saga = case_when(
    episode >= 1 & episode <= 61 ~ "East Blue",
    episode >= 62 & episode <= 130 ~ "Alabasta",
    episode >= 144 & episode <= 195 ~ "Skypiea",
    episode >= 207 & episode <= 325 ~ "Water 7",
    episode >= 337 & episode <= 381 ~ "Thriller Bark",
    episode >= 385 & episode <= 516 ~ "Summit War",
    episode >= 517 & episode <= 574 ~ "Fish-Man Island",
    episode >= 579 & episode <= 746 ~ "Dressrosa",
    episode >= 751 & episode <= 877 ~ "Whole Cake Island",
    episode >= 892 ~ "Wano",
    TRUE ~ "Filler" 
  ))
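
The saga boundaries above can also be kept in a small lookup table, which makes gaps and overlaps easier to spot. The following is a minimal base-R sketch, not the code used for this report: `saga_ranges` and `assign_saga()` are hypothetical names, and 62 is taken as the first Alabasta episode because episode 61 is already claimed by East Blue.

```r
# Hypothetical lookup table built from the episode ranges in the case_when() call
saga_ranges <- data.frame(
  saga  = c("East Blue", "Alabasta", "Skypiea", "Water 7", "Thriller Bark",
            "Summit War", "Fish-Man Island", "Dressrosa", "Whole Cake Island", "Wano"),
  first = c(1, 62, 144, 207, 337, 385, 517, 579, 751, 892),
  last  = c(61, 130, 195, 325, 381, 516, 574, 746, 877, Inf),
  stringsAsFactors = FALSE
)

# Return the saga for each episode number; anything outside the ranges is "Filler"
assign_saga <- function(episode, ranges = saga_ranges) {
  vapply(episode, function(ep) {
    hit <- which(ep >= ranges$first & ep <= ranges$last)
    if (length(hit) == 0) "Filler" else ranges$saga[hit[1]]
  }, character(1))
}

assign_saga(c(1, 61, 62, 131, 144, 900))
# "East Blue" "East Blue" "Alabasta" "Filler" "Skypiea" "Wano"
```

With this layout, a check such as `all(saga_ranges$first[-1] > head(saga_ranges$last, -1))` would confirm that no two ranges overlap.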

Exploratory Statistics

One Piece is known as the anime that gets better over time, both in animation and story. To explore the trends in its popularity, I analyzed the average episode ratings across different years.


# year avg_rating 
year_stats <- Luffy %>%
  group_by(start) %>%
  summarize(avg_rating = mean(average_rating, na.rm = TRUE)) %>%
  arrange(desc(avg_rating))

#install.packages("ggplot2")
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.2
ggplot(year_stats, aes(x = start, y = avg_rating)) +
  geom_line(color = "purple") + 
  geom_point(color = "red") + 
  labs(title = "Average Ratings Over Time",
       x = "Year",
       y = "Average Rating") +
  scale_x_continuous(breaks = seq(min(year_stats$start), max(year_stats$start), by = 1)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

This line chart displays the average ratings of One Piece episodes from 1999 to 2021. Each point represents the mean rating for episodes released in a given year. The chart reveals several key patterns:

1. Consistency with Gradual Growth: In the early years, the ratings remained relatively stable, hovering around 7.5 to 8.0. This suggests a consistent level of viewer satisfaction during the show’s initial growth phase.

2. Noticeable Peaks and Troughs: There are clear peaks, such as in 2010 and 2015, indicating years where episodes were particularly well-received. Conversely, occasional dips suggest periods of slower pacing or less impactful episodes.

3. Significant Rise in Recent Years: The dramatic increase in ratings around 2021 suggests renewed interest or major improvements, possibly due to enhanced animation quality or compelling story developments.

Overall, this chart highlights One Piece’s enduring appeal and its ability to captivate audiences over decades, with ratings generally climbing as the series progresses despite the occasional dip.
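
One way to put a number on the "climbing" impression is the slope of a simple linear fit of average rating on year. The sketch below is only an illustration: `rating_trend()` is a hypothetical helper, and the toy values are made up rather than taken from the IMDb data.

```r
# Toy yearly averages (made-up numbers, NOT the IMDb data used in this report)
toy_year_stats <- data.frame(
  start      = c(2017, 2018, 2019, 2020, 2021),
  avg_rating = c(8.0, 8.1, 8.3, 8.2, 8.6)
)

# Slope of a linear fit: the average change in rating per year
rating_trend <- function(stats) {
  fit <- lm(avg_rating ~ start, data = stats)
  unname(coef(fit)["start"])
}

rating_trend(toy_year_stats)
# about 0.13 rating points per year for the toy data
```

Calling `rating_trend(year_stats)` on the real summary table would give the corresponding figure for the series, assuming `year_stats` keeps the `start` and `avg_rating` columns created above.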

A similar analysis shows the average rating across sagas.

library(dplyr)
# avg_rating by saga
saga_stats <- Luffy %>%
  group_by(saga) %>%
  summarize(avg_rating = mean(average_rating, na.rm = TRUE)) %>%
  arrange(desc(avg_rating))

ggplot(saga_stats, aes(x = avg_rating, y = reorder(saga, avg_rating), color = saga)) +
  geom_point(size = 5) +
  labs(
    title = "Average Ratings by Saga",
    x = "Average Rating",
    y = "Saga"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    axis.text = element_text(size = 12),
    axis.title = element_text(size = 14),
    legend.position = "none"
  )

This dot plot displays the average episode rating for each saga. Not surprisingly, Filler episodes have the lowest average rating, as they are not essential to the narrative. East Blue and Alabasta are the earliest sagas, and ratings generally rise with each later saga, with Wano, one of the newest sagas, rated highest. The one exception I see is Fish-Man Island: it is the first saga after the time skip, and it has the lowest rating among the actual sagas.
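
Because the sagas differ a lot in length, it can help to report the number of episodes and the spread next to each average. Here is a minimal base-R sketch under that assumption; `saga_summary()` is a hypothetical helper and the toy ratings are made up.

```r
# Toy ratings (made-up numbers, NOT the IMDb data used in this report)
toy_ratings <- data.frame(
  saga           = c("Filler", "Filler", "Wano", "Wano", "Wano", "East Blue"),
  average_rating = c(6.5, 7.0, 9.0, 9.2, 8.8, 7.8),
  stringsAsFactors = FALSE
)

# Episode count, mean, and standard deviation per saga, highest mean first
saga_summary <- function(df) {
  groups <- split(df$average_rating, df$saga)
  out <- data.frame(
    saga       = names(groups),
    n_episodes = vapply(groups, length, integer(1)),
    avg_rating = vapply(groups, mean, numeric(1)),
    sd_rating  = vapply(groups, sd, numeric(1)),  # NA when a saga has one episode
    row.names  = NULL,
    stringsAsFactors = FALSE
  )
  out[order(-out$avg_rating), ]
}

saga_summary(toy_ratings)
```

The same function could be applied to `Luffy`, since it only relies on the `saga` and `average_rating` columns.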

# combined visualization: average rating per year, split by saga
year_saga_stats <- Luffy %>%
  group_by(start, saga) %>%
  summarize(avg_rating = mean(average_rating, na.rm = TRUE), .groups = "drop")

ggplot(year_saga_stats, aes(x = start, y = avg_rating, color = saga)) +
  geom_line(linewidth = 1) +  # `linewidth` replaces `size` for lines since ggplot2 3.4.0
  geom_point(size = 2) +
  labs(
    title = "Average Ratings Over Time by Saga",
    x = "Year",
    y = "Average Rating"
  ) +
  scale_x_continuous(breaks = seq(min(Luffy$start), max(Luffy$start), by = 1)) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5)
  )

This line chart tracks the average episode ratings over time, segmented by saga, combining the two graphs above. It shows that the earlier sagas start with moderate ratings and improve as the series gains momentum. Ratings fluctuate for filler episodes, reflecting their varying quality. More recent sagas show a noticeable upward trend, suggesting that One Piece has maintained or even improved its storytelling quality over the years. This visualization highlights both the series’ growth in popularity and how different sagas have impacted its ratings trajectory.

Top Sagas

The top sagas were identified as Dressrosa, Summit War, Whole Cake Island, and Wano, which are known for their emotional depth, high-stakes battles, and significant plot developments. To explore what makes these sagas stand out, I identify the three highest-rated episodes from each saga. By analyzing these episodes, I aim to uncover key narrative elements, pivotal moments, or character-driven events that contribute to their exceptional ratings.

# identify top ep in top saga 
top_episodes_sagas <- Luffy %>%
  filter(saga %in% c("Summit War", "Whole Cake Island", "Wano","Dressrosa")) %>%
  group_by(saga) %>%
  arrange(desc(average_rating)) %>%
  slice(1:3) 

print(top_episodes_sagas)
## # A tibble: 12 × 6
## # Groups:   saga [4]
##    episode name                           start total_votes average_rating saga 
##      <dbl> <chr>                          <dbl>       <dbl>          <dbl> <chr>
##  1     726 Gear Fourth! Kyoui no Bounce …  2016         328            9.2 Dres…
##  2     719 Kuuchuu Kessen: Zoro Shin His…  2015         262            9.1 Dres…
##  3     663 Luffy Kyougaku: Ace no Ishi o…  2014         201            8.9 Dres…
##  4     483 Kotae o Sagashite: Hiken Ace …  2011         524            9.3 Summ…
##  5     485 Kejime o Tsukeru: Shirohige v…  2011         370            9.3 Summ…
##  6     405 Kesareta Nakama-tachi: Mugiwa…  2009         358            9.2 Summ…
##  7     958 &quot;The Legendary Battle! G…  2021         746            9.4 Wano 
##  8     892 Wano Country! To the Land of …  2019         340            9.2 Wano 
##  9     914 Finally Clashing! The Ferocio…  2019         397            9.2 Wano 
## 10     808 Kanashiki Kettou: Luffy tai S…  2017         571            9.6 Whol…
## 11     870 A Fist of Divine Speed! Anoth…  2019         683            9.5 Whol…
## 12     804 East Blue e: Sanji Ketsui no …  2017         215            9.2 Whol…

Transcript of Top Episodes

I have obtained the transcripts for the 12 episodes I intend to analyze. With the support of ChatGPT, I developed a function to streamline the process of transforming the transcripts into a tidy format. This function efficiently compiles the text into a tibble, tokenizes the text into individual words, and removes stop words and punctuation. By breaking the text down in this way, I can effectively analyze sentiments using the NRC and Bing lexicons, which will be instrumental in identifying recurring themes and emotional tones within these sagas.


#install.packages("tidytext")  
#install.packages("stringr")
#install.packages("dplyr") 
library(tidytext)
## Warning: package 'tidytext' was built under R version 4.4.2
library(stringr)
## Warning: package 'stringr' was built under R version 4.4.2
library(dplyr)
library(readr) 
library(tibble)

# Define the function to process each episode
process_episode <- function(ep_text_file) {
  EP_text <- readLines(ep_text_file, warn = FALSE)
  # Create a tibble from the text, one row per transcript line
  text_df <- tibble(line = seq_along(EP_text), text = EP_text)
  # Tokenize the text into words (unnest_tokens also lowercases them)
  tidy_text <- text_df %>%
    unnest_tokens(word, text)
  # Remove stop words (stop_words is provided by tidytext)
  tidy_text_clean <- tidy_text %>%
    anti_join(stop_words, by = "word")
  # Remove punctuation and numbers, then filter out empty strings
  tidy_text_clean <- tidy_text_clean %>%
    mutate(word = str_replace_all(word, "[^a-zA-Z]", "")) %>%
    filter(word != "")
  # Convert to lowercase for consistency
  tidy_text_clean <- tidy_text_clean %>%
    mutate(word = tolower(word))
  # Perform word frequency analysis
  word_counts <- tidy_text_clean %>%
    count(word, sort = TRUE)
  # Perform NRC sentiment analysis. A word can carry several NRC sentiments
  # and can occur many times, so the join is many-to-many by design.
  nrc_sentiment <- tidy_text_clean %>%
    inner_join(get_sentiments("nrc"), by = "word",
               relationship = "many-to-many") %>%
    count(sentiment, sort = TRUE)
  # Perform Bing sentiment analysis
  bing_sentiment <- tidy_text_clean %>%
    inner_join(get_sentiments("bing"), by = "word") %>%
    count(sentiment, sort = TRUE)
  # Return the results (word counts, NRC sentiment, and Bing sentiment)
  return(list(
    word_counts = word_counts, 
    nrc_sentiment = nrc_sentiment,
    bing_sentiment = bing_sentiment
  ))
}

episodes <- list(
  "Summit War" = c("EP 405.txt", "EP 483.txt", "EP 485.txt"),
  "Whole Cake Island" = c("EP 804.txt", "EP 808.txt", "EP 870.txt"),
  "Wano" = c("EP 892.txt", "EP 914.txt", "EP 958.txt"),
  "Dressrosa" = c("EP 663.txt", "EP 719.txt", "EP 726.txt")
)

# Process each episode in each saga
episode_results <- list()

for (saga in names(episodes)) {
  episode_results[[saga]] <- lapply(episodes[[saga]], process_episode)
}
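# Each element of `episode_results` is a list with one entry per episode, so
# res[[1]], res[[2]], and res[[3]] are the saga's first, second, and third
# episodes in the order listed in `episodes`. The labels below therefore mark
# episodes, not analyses: every entry holds all three result tables, and
# head(res[[1]], 5) keeps that episode's three tables unchanged.
episode_results_summary <- lapply(episode_results, function(res) {
  list(
    Word_Frequency = head(res[[1]], 5),
    NRC_Sentiment = res[[2]],
    Bing_Sentiment = res[[3]]
  )
})
print(episode_results_summary)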
## $`Summit War`
## $`Summit War`$Word_Frequency
## $`Summit War`$Word_Frequency$word_counts
## # A tibble: 212 × 2
##    word        n
##    <chr>   <int>
##  1 stop       15
##  2 luffy      12
##  3 friends     5
##  4 hell        5
##  5 saji        5
##  6 damn        4
##  7 hey         4
##  8 life        4
##  9 sora        4
## 10 captain     3
## # ℹ 202 more rows
## 
## $`Summit War`$Word_Frequency$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        46
##  2 positive        42
##  3 trust           30
##  4 sadness         26
##  5 fear            25
##  6 anger           24
##  7 anticipation    21
##  8 disgust         21
##  9 joy             17
## 10 surprise         9
## 
## $`Summit War`$Word_Frequency$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     39
## 2 positive     17
## 
## 
## $`Summit War`$NRC_Sentiment
## $`Summit War`$NRC_Sentiment$word_counts
## # A tibble: 202 × 2
##    word        n
##    <chr>   <int>
##  1 gonna      14
##  2 die        10
##  3 live        6
##  4 brother     5
##  5 hey         5
##  6 born        4
##  7 save        4
##  8 son         4
##  9 yeah        4
## 10 aces        3
## # ℹ 192 more rows
## 
## $`Summit War`$NRC_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 positive        47
##  2 negative        45
##  3 trust           35
##  4 fear            32
##  5 sadness         30
##  6 anticipation    24
##  7 anger           21
##  8 joy             20
##  9 disgust         18
## 10 surprise        10
## 
## $`Summit War`$NRC_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     52
## 2 positive     18
## 
## 
## $`Summit War`$Bing_Sentiment
## $`Summit War`$Bing_Sentiment$word_counts
## # A tibble: 312 × 2
##    word      n
##    <chr> <int>
##  1 ha       26
##  2 hey       5
##  3 teach     5
##  4 ah        4
##  5 body      4
##  6 real      4
##  7 roger     4
##  8 ya        4
##  9 beard     3
## 10 black     3
## # ℹ 302 more rows
## 
## $`Summit War`$Bing_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        57
##  2 positive        45
##  3 trust           42
##  4 anticipation    31
##  5 fear            28
##  6 anger           27
##  7 sadness         22
##  8 joy             17
##  9 disgust         15
## 10 surprise        15
## 
## $`Summit War`$Bing_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     58
## 2 positive     16
## 
## 
## 
## $`Whole Cake Island`
## $`Whole Cake Island`$Word_Frequency
## $`Whole Cake Island`$Word_Frequency$word_counts
## # A tibble: 234 × 2
##    word      n
##    <chr> <int>
##  1 heh      25
##  2 gonna    20
##  3 fine     16
##  4 sanji    13
##  5 tired     7
##  6 blue      5
##  7 cook      5
##  8 food      5
##  9 stop      5
## 10 yeah      5
## # ℹ 224 more rows
## 
## $`Whole Cake Island`$Word_Frequency$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        50
##  2 positive        47
##  3 sadness         30
##  4 trust           30
##  5 anticipation    27
##  6 fear            25
##  7 joy             17
##  8 anger           16
##  9 disgust         14
## 10 surprise         2
## 
## $`Whole Cake Island`$Word_Frequency$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     45
## 2 positive     39
## 
## 
## $`Whole Cake Island`$NRC_Sentiment
## $`Whole Cake Island`$NRC_Sentiment$word_counts
## # A tibble: 250 × 2
##    word          n
##    <chr>     <int>
##  1 sanji        17
##  2 bluefinch    10
##  3 time          7
##  4 gonna         6
##  5 leave         6
##  6 leaving       5
##  7 pirates       5
##  8 sonsi         5
##  9 uh            5
## 10 cook          4
## # ℹ 240 more rows
## 
## $`Whole Cake Island`$NRC_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 positive        49
##  2 negative        45
##  3 anticipation    33
##  4 trust           31
##  5 anger           20
##  6 joy             20
##  7 fear            19
##  8 sadness         19
##  9 disgust         15
## 10 surprise        15
## 
## $`Whole Cake Island`$NRC_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     32
## 2 positive     26
## 
## 
## $`Whole Cake Island`$Bing_Sentiment
## $`Whole Cake Island`$Bing_Sentiment$word_counts
## # A tibble: 212 × 2
##    word       n
##    <chr>  <int>
##  1 yeah       5
##  2 gonna      4
##  3 luffy      4
##  4 punch      4
##  5 ah         3
##  6 ass        3
##  7 fourth     3
##  8 gear       3
##  9 huh        3
## 10 kick       3
## # ℹ 202 more rows
## 
## $`Whole Cake Island`$Bing_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        49
##  2 positive        32
##  3 fear            26
##  4 anger           23
##  5 anticipation    20
##  6 sadness         16
##  7 trust           15
##  8 joy             12
##  9 surprise        12
## 10 disgust          9
## 
## $`Whole Cake Island`$Bing_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     39
## 2 positive     24
## 
## 
## 
## $Wano
## $Wano$Word_Frequency
## $Wano$Word_Frequency$word_counts
## # A tibble: 312 × 2
##    word         n
##    <chr>    <int>
##  1 blood        5
##  2 sword        4
##  3 blade        3
##  4 business     3
##  5 ha           3
##  6 sir          3
##  7 slasher      3
##  8 stop         3
##  9 yeah         3
## 10 ago          2
## # ℹ 302 more rows
## 
## $Wano$Word_Frequency$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 positive        56
##  2 negative        53
##  3 fear            32
##  4 disgust         28
##  5 sadness         28
##  6 anger           26
##  7 trust           26
##  8 anticipation    24
##  9 joy             24
## 10 surprise        12
## 
## $Wano$Word_Frequency$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     47
## 2 positive     31
## 
## 
## $Wano$NRC_Sentiment
## $Wano$NRC_Sentiment$word_counts
## # A tibble: 104 × 2
##    word        n
##    <chr>   <int>
##  1 left        8
##  2 earlier     7
##  3 luffy       7
##  4 town        7
##  5 gonna       5
##  6 fine        4
##  7 sky         4
##  8 die         3
##  9 food        3
## 10 kairos      3
## # ℹ 94 more rows
## 
## $Wano$NRC_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        31
##  2 positive        23
##  3 fear            15
##  4 anger           14
##  5 sadness         14
##  6 trust           11
##  7 disgust          9
##  8 anticipation     7
##  9 joy              6
## 10 surprise         3
## 
## $Wano$NRC_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     18
## 2 positive     10
## 
## 
## $Wano$Bing_Sentiment
## $Wano$Bing_Sentiment$word_counts
## # A tibble: 514 × 2
##    word        n
##    <chr>   <int>
##  1 pirates    20
##  2 rocks      15
##  3 captain     9
##  4 hundred     9
##  5 crew        8
##  6 pirate      8
##  7 world       8
##  8 ah          7
##  9 roger       7
## 10 time        7
## # ℹ 504 more rows
## 
## $Wano$Bing_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 positive       136
##  2 trust           79
##  3 negative        72
##  4 anticipation    58
##  5 fear            53
##  6 joy             44
##  7 anger           37
##  8 sadness         23
##  9 disgust         18
## 10 surprise        16
## 
## $Wano$Bing_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     62
## 2 positive     48
## 
## 
## 
## $Dressrosa
## $Dressrosa$Word_Frequency
## $Dressrosa$Word_Frequency$word_counts
## # A tibble: 359 × 2
##    word        n
##    <chr>   <int>
##  1 luffy       8
##  2 gonna       7
##  3 hobby       7
##  4 yeah        7
##  5 plan        6
##  6 bellamy     5
##  7 follow      5
##  8 fruit       5
##  9 huh         5
## 10 toys        5
## # ℹ 349 more rows
## 
## $Dressrosa$Word_Frequency$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 positive        79
##  2 negative        76
##  3 anticipation    49
##  4 trust           48
##  5 joy             45
##  6 anger           42
##  7 fear            41
##  8 sadness         33
##  9 surprise        23
## 10 disgust         19
## 
## $Dressrosa$Word_Frequency$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     70
## 2 positive     41
## 
## 
## $Dressrosa$NRC_Sentiment
## $Dressrosa$NRC_Sentiment$word_counts
## # A tibble: 231 × 2
##    word          n
##    <chr>     <int>
##  1 winner       11
##  2 die          10
##  3 riku         10
##  4 king          7
##  5 rua           6
##  6 world         5
##  7 kill          4
##  8 technique     4
##  9 time          4
## 10 worlds        4
## # ℹ 221 more rows
## 
## $Dressrosa$NRC_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        60
##  2 positive        58
##  3 fear            38
##  4 sadness         36
##  5 anticipation    32
##  6 trust           31
##  7 surprise        26
##  8 joy             24
##  9 anger           18
## 10 disgust         12
## 
## $Dressrosa$NRC_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     60
## 2 positive     31
## 
## 
## $Dressrosa$Bing_Sentiment
## $Dressrosa$Bing_Sentiment$word_counts
## # A tibble: 231 × 2
##    word            n
##    <chr>       <int>
##  1 yeah            6
##  2 thoflamingo     5
##  3 ah              3
##  4 ass             3
##  5 battle          3
##  6 center          3
##  7 cover           3
##  8 gonna           3
##  9 gotta           3
## 10 kingdom         3
## # ℹ 221 more rows
## 
## $Dressrosa$Bing_Sentiment$nrc_sentiment
## # A tibble: 10 × 2
##    sentiment        n
##    <chr>        <int>
##  1 negative        47
##  2 positive        33
##  3 trust           27
##  4 fear            26
##  5 anger           23
##  6 sadness         16
##  7 anticipation    15
##  8 disgust         12
##  9 joy              9
## 10 surprise         8
## 
## $Dressrosa$Bing_Sentiment$bing_sentiment
## # A tibble: 2 × 2
##   sentiment     n
##   <chr>     <int>
## 1 negative     29
## 2 positive     23
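
In the printout above, the entries labelled `Word_Frequency`, `NRC_Sentiment`, and `Bing_Sentiment` each hold the full results of one episode, in the order the files are listed in `episodes`. Labelling them by episode makes that explicit. The sketch below shows one way to do it: `label_by_episode()` is a hypothetical helper, and the toy structure only mimics the shape of `episode_results`, with a few counts copied from the printout for illustration.

```r
# Toy stand-in for `episode_results`: one list per saga, one element per episode
toy_results <- list(
  "Summit War" = list(
    list(word_counts = data.frame(word = c("stop", "luffy"), n = c(15L, 12L))),
    list(word_counts = data.frame(word = c("gonna", "die"), n = c(14L, 10L)))
  )
)
toy_files <- list("Summit War" = c("EP 405.txt", "EP 483.txt"))

# Name each episode's results after its transcript file (extension removed)
label_by_episode <- function(results, files) {
  out <- lapply(names(results), function(saga) {
    setNames(results[[saga]], sub("\\.txt$", "", files[[saga]]))
  })
  setNames(out, names(results))
}

labelled <- label_by_episode(toy_results, toy_files)
names(labelled[["Summit War"]])
# "EP 405" "EP 483"
labelled[["Summit War"]][["EP 483"]]$word_counts
```

Applied to the real objects, `label_by_episode(episode_results, episodes)` would give entries such as `$Wano$"EP 958"$bing_sentiment`, assuming the two lists stay in the same order.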

In a previous assignment, we looked at an example of tidy text analysis using term frequency-inverse document frequency (TF-IDF). Applied here, TF-IDF highlights how important a word is within one saga relative to the others: higher TF-IDF scores indicate words that are more specific to a saga.

library(dplyr)
library(ggplot2)
library(tidytext)
library(stringr)
library(forcats)


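# Collect word frequency data for each saga.
# Note: [[1]] selects only the first listed episode of each saga
# (EP 405, EP 804, EP 892, EP 663), so the TF-IDF below is based on
# one episode per saga rather than on all three.
combined_word_counts <- bind_rows(
  lapply(names(episode_results), function(saga) {
    word_counts <- episode_results[[saga]][[1]]$word_counts
    word_counts$saga <- saga  # Add saga column
    word_counts
  })
)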

# Define stopwords to remove
mystopwords <- tibble(word = c("eq", "co", "rc", "ac", "ak", "bn", 
                               "fig", "file", "cg", "cb", "cm",
                               "ab", "_k", "_k_", "_x"))

# Remove stopwords
cleaned_words <- anti_join(combined_word_counts, mystopwords, by = "word")

# Calculate TF-IDF and process data for plotting
plot_words <- cleaned_words %>%
  bind_tf_idf(word, saga, n) %>%  # Calculate TF-IDF
  mutate(word = str_remove_all(word, "_")) %>%  # Clean underscores from words
  group_by(saga) %>% 
  slice_max(tf_idf, n = 15, with_ties = FALSE) %>%  # Select top 15 words per saga
  ungroup() %>%
  mutate(word = fct_reorder(word, tf_idf))  # Reorder for better visualization

# Plot the data
ggplot(plot_words, aes(tf_idf, word, fill = saga)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~saga, ncol = 2, scales = "free") +
  labs(x = "TF-IDF", y = NULL, title = "Top Words by TF-IDF Across Sagas") +
  scale_fill_brewer(palette = "Set2") +  # Use a clean and distinct palette
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    strip.background = element_rect(fill = "grey85", color = "grey30"),  # Grey strip background
    strip.text = element_text(size = 14, face = "bold"),  # Bold facet labels
    axis.text.y = element_text(size = 10),  # Smaller y-axis text
    axis.text.x = element_text(size = 10),  # Consistent x-axis text
    panel.grid.major = element_blank(),  # Simplify gridlines
    panel.grid.minor = element_blank()
  )
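
The chunk above reads `episode_results[[saga]][[1]]`, that is, the first listed episode of each saga. To base the scores on all three episodes, the per-episode counts would first have to be pooled per saga. The base-R sketch below shows that pooling step followed by a TF-IDF calculation done by hand. It uses made-up toy counts and hypothetical names, and it assumes the natural-log definition of IDF that `bind_tf_idf()` uses.

```r
# Toy per-episode word counts for two sagas (made-up numbers)
toy_counts <- data.frame(
  saga    = c("A", "A", "A", "B", "B", "B"),
  episode = c(1, 1, 2, 1, 2, 2),
  word    = c("luffy", "sword", "luffy", "luffy", "cook", "cook"),
  n       = c(3, 1, 2, 4, 2, 3),
  stringsAsFactors = FALSE
)

# 1. Pool the counts over all episodes of a saga (one row per saga-word pair)
pooled <- aggregate(n ~ saga + word, data = toy_counts, FUN = sum)

# 2. Term frequency: a word's share of all counted words in its saga
pooled$tf <- pooled$n / ave(pooled$n, pooled$saga, FUN = sum)

# 3. Inverse document frequency: ln(number of sagas / sagas containing the word)
n_sagas <- length(unique(pooled$saga))
sagas_with_word <- ave(rep(1, nrow(pooled)), pooled$word, FUN = sum)
pooled$idf <- log(n_sagas / sagas_with_word)

# 4. TF-IDF: zero for words shared by every saga, higher for saga-specific words
pooled$tf_idf <- pooled$tf * pooled$idf
pooled[order(-pooled$tf_idf), ]
```

In the toy data, "luffy" appears in both sagas and scores zero, while "cook" and "sword" appear in one saga each and rank highest. This is why character names that recur across every saga tend to drop out of a TF-IDF plot.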

Dressrosa

The words “Doflamingo,” “toys,” and “underground” highlight the saga’s focus on Doflamingo’s ruthless control over Dressrosa and his ties to the black market. Terms like “fruit” and “barrier” connect to key plot points, such as Sugar’s Hobby-Hobby Fruit, which allowed her to turn citizens into toys, enslaving them as part of the larger scheme to produce SMILE fruits.

Summit War

“Crew,” “friends,” and “erased” reflect powerful themes of camaraderie and loss, marking the saga as the last before the time skip and the crew’s two-year separation. “Kuma,” “Sora,” and “bastard” underscore pivotal emotional moments, such as Kuma’s secret role as a double agent for both the Warlords and the Revolutionary Army.

Whole Cake Island

The words “Sanji,” “cook,” and “father” center on Sanji’s personal struggles with his family and his significant role in the story. Terms like “food” and “speak” emphasize the saga’s unique focus on diplomacy, alliances, and the importance of cuisine.

Wano

“Sword,” “blade,” and “blood” reflect the saga’s samurai-centric themes and intense battles. Meanwhile, words like “funny” and “folks” mask the tragic truth of the SMILE fruits, whose devastating effects are revealed in this arc.

The TF-IDF analysis highlights how each saga’s vocabulary reflects its core themes and narrative focus. Summit War and Whole Cake Island stand out with highly specific terms, emphasizing their distinct and emotionally charged storylines. In contrast, Dressrosa and Wano prioritize words tied to combat, leadership, and strategy, aligning with their action-packed and transformative arcs.
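To make the ranking concrete, here is a minimal sketch of how TF-IDF scores arise, using a made-up word count table for two sagas. The saga names, words, and counts are hypothetical and are not taken from the transcripts. A word that appears in every saga gets an inverse document frequency of ln(2/2) = 0, so only saga-specific words can rank highly.

```r
library(dplyr)
library(tidytext)

# Hypothetical word counts for two sagas (illustrative values only)
toy_counts <- tibble(
  saga = c("Saga A", "Saga A", "Saga B", "Saga B"),
  word = c("sword", "crew", "crew", "cook"),
  n    = c(3, 1, 2, 2)
)

# tf  = n / total words in the saga
# idf = ln(number of sagas / number of sagas containing the word)
toy_tf_idf <- toy_counts %>%
  bind_tf_idf(word, saga, n) %>%
  arrange(desc(tf_idf))

toy_tf_idf
```

In this toy table “crew” scores 0 in both sagas because it appears in both, while “sword” and “cook” rise to the top of their own sagas.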

Bing & NRC Sentiment

By aggregating the Bing sentiment scores, I will be able to see the distribution of positive and negative sentiments across major sagas, providing a comparative overview of emotional tones in each arc.

knitr::opts_chunk$set(echo = TRUE)

# aggregate Bing sentiment results by saga
combined_bing_sentiment <- bind_rows(
  lapply(names(episode_results), function(saga) {
    bind_rows(lapply(episode_results[[saga]], `[[`, "bing_sentiment")) %>%
      group_by(sentiment) %>%
      summarize(total_count = sum(n)) %>%
      mutate(saga = saga)
  })
)

# positive vs negative sentiment by saga
ggplot(combined_bing_sentiment, aes(x = saga, y = total_count, fill = sentiment)) +
  geom_col(position = "dodge") +
  scale_fill_viridis_d(option = "D", begin = 0.2, end = 0.8) + 
  labs(
    title = "Positive vs Negative Sentiment Across Sagas",
    x = "Saga",
    y = "Sentiment Count"
  ) +
  theme_light() +
  theme(
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 12),
    axis.text.y = element_text(size = 12),
    axis.title = element_text(size = 14, face = "bold"),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10)
  )

The chart shows that negative sentiment consistently outweighs positive sentiment across all sagas, which aligns with the intense and often dramatic events that unfold. Dressrosa has the highest negative sentiment, possibly due to its dark themes of oppression, slavery, and rebellion. Summit War is a close second, reflecting the climactic and tragic events of this arc, including the loss of key characters. Despite its focus on family conflict and betrayal, Whole Cake Island has a relatively high positive sentiment, possibly reflecting lighter moments involving Sanji’s character arc and the unique culinary themes. While still predominantly negative, Wano shows a more balanced sentiment, likely due to moments of humor and cultural exploration alongside the grim revelations and battles.
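Raw counts depend on how much dialogue each saga's transcripts contain, so one way to check this reading is to compare the share of each sentiment within a saga. The sketch below assumes a table shaped like combined_bing_sentiment; toy_bing and its counts are placeholder values, not the project's results.

```r
library(dplyr)

# Placeholder counts in the same shape as combined_bing_sentiment
toy_bing <- tibble(
  saga        = rep(c("Saga A", "Saga B"), each = 2),
  sentiment   = rep(c("negative", "positive"), times = 2),
  total_count = c(60, 40, 30, 10)
)

# Share of each sentiment within its own saga
bing_share <- toy_bing %>%
  group_by(saga) %>%
  mutate(share = total_count / sum(total_count)) %>%
  ungroup()

bing_share
```

In this toy table Saga A has more negative words in absolute terms (60 versus 30), but Saga B is proportionally more negative (0.75 versus 0.60), which is the kind of difference a raw-count chart can hide.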

knitr::opts_chunk$set(echo = TRUE)

library(tidyr)
library(ggplot2)

# Combine NRC sentiment results for each saga
combined_nrc_sentiment <- bind_rows(
  lapply(names(episode_results), function(saga) {
    bind_rows(lapply(episode_results[[saga]], `[[`, "nrc_sentiment")) %>%
      group_by(sentiment) %>%
      summarize(total_count = sum(n)) %>%
      mutate(saga = saga)
  })
) %>%
  filter(!sentiment %in% c("positive", "negative"))  # Exclude positive and negative sentiments

library(RColorBrewer)

ggplot(combined_nrc_sentiment, aes(x = sentiment, y = total_count, fill = saga)) +
  geom_col(position = "dodge") +
  scale_fill_brewer(palette = "Spectral") + 
  labs(
    title = "Key Sentiments Across Sagas",
    x = "Sentiment",
    y = "Count"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 12),
    axis.text.y = element_text(size = 12),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10)
  )

By breaking down emotional tones into these distinct key sentiments, the chart highlights the emotional complexity and narrative focus of each saga. Dressrosa stands out with high levels of anger, trust, and anticipation, which together reflect its themes of rebellion, hope, and camaraderie against oppression. Summit War is marked by significant fear, sadness, and anticipation. Wano features a balanced distribution, with peaks in anticipation, fear, and trust, aligning with the tension-filled buildup and alliances against Kaido. Whole Cake Island emphasizes trust and anticipation, highlighting the saga’s focus on alliances and resolutions, while joy reflects its culinary whimsy alongside underlying family conflicts.

knitr::opts_chunk$set(echo = TRUE)

theme_mapping <- tibble(
  sentiment = c("trust", "joy", "anticipation", "surprise","anger", "fear", "sadness", "disgust"),
  theme = c("Bonds & Loyalty","Family & Unity","Dreams & Ambition","Freedom & Discovery","Oppression & Resistance", "Tyranny & Control","Sacrifice & Loss","Betrayal & Deception")
)

# map themes to df
thematic_sentiments <- combined_nrc_sentiment %>%
  inner_join(theme_mapping, by = "sentiment") %>%
  group_by(theme, saga) %>%
  summarize(total_count = sum(total_count), .groups = "drop")


ggplot(thematic_sentiments, aes(x = saga, y = theme, size = total_count, color = theme)) +
  geom_point(alpha = 0.7) +
  scale_size_continuous(range = c(3, 15)) +
  labs(
    title = "Bubble Chart of Themes Across Sagas",
    x = "Saga",
    y = "Theme",
    size = "Total Count"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 12)
  )

Using sentiment analysis as a foundation, themes were derived by mapping specific emotions to broader narrative elements. This bubble chart, a visualization I had not seen used in class, demonstrates the distribution of these themes across the different sagas.

Bonds & Loyalty

Bonds & Loyalty stands out as one of the most prominent themes across the sagas. This reflects the core relationships between the Straw Hat crew and their allies, emphasizing trust and camaraderie as central to the story. Time and again, the crew demonstrates unwavering loyalty, whether by risking their lives for one another or standing together against overwhelming odds.

Tyranny & Control

Tyranny & Control highlights the darker aspects of One Piece, such as systemic discrimination, enslavement, and injustice. Arcs like “Dressrosa” and “Wano” showcase oppressive regimes, shedding light on the suffering of those under authoritarian rule. These sagas explore the consequences of such tyranny, as well as the resilience and determination of those who rise against it. By focusing on these struggles, the series provides a critique of power dynamics and the fight for freedom, adding a layer of depth to its narrative.

Oppression & Resistance

Oppression & Resistance further complements the narrative of One Piece by showcasing the ongoing battles against corrupt systems and oppressive forces. Whether it is the Revolutionary Army’s defiance of the World Government or the Straw Hats liberating oppressed communities, resistance is a central motif. This theme is especially prominent in arcs like “Summit War,” where characters’ sacrifices emphasize the high stakes and moral weight of their struggles against injustice.

Dreams & Ambition

Dreams & Ambition is another recurring theme, representing the driving force behind many characters’ motivations. From Luffy’s quest to become the Pirate King to the personal aspirations of each crew member, this theme underscores the importance of determination and perseverance. This ambition is evident across sagas, particularly in arcs that delve into character backstories, such as Whole Cake Island, reminding viewers that dreams are a source of strength even in the face of adversity.

Regression

I use a regression to explore my research question: how much of the variation in average rating does theme explain? In this analysis, the themes serve as independent (predictor) variables, while the average rating of each episode is the dependent variable. The goal is to see whether narrative elements drive One Piece’s enduring appeal.
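Because theme counts are measured per saga while ratings are measured per episode, combining the two repeats each episode once for every theme in its saga. The sketch below illustrates that row multiplication with hypothetical tables; toy_themes and toy_episodes are stand-ins, not the project's data. It assumes dplyr 1.1.0 or later, where left_join() accepts the relationship argument.

```r
library(dplyr)

# Hypothetical saga-level theme counts: 2 themes for one saga
toy_themes <- tibble(
  saga        = c("Saga A", "Saga A"),
  theme       = c("Theme 1", "Theme 2"),
  total_count = c(10, 20)
)

# Hypothetical episode-level ratings: 3 episodes in the same saga
toy_episodes <- tibble(
  saga           = rep("Saga A", 3),
  episode        = c(1, 2, 3),
  average_rating = c(9.0, 9.1, 9.3)
)

# Declaring the relationship documents that the duplication is intended
toy_joined <- toy_themes %>%
  left_join(toy_episodes, by = "saga", relationship = "many-to-many")

nrow(toy_joined)  # 2 themes x 3 episodes
```

Each toy episode ends up in the joined table twice, once per theme, which is worth keeping in mind when reading any model fitted to data joined this way.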

knitr::opts_chunk$set(echo = TRUE)
#combine theme and ep rating
theme_ep <- thematic_sentiments %>%
  left_join(top_episodes_sagas, by = "saga")
## Warning in left_join(., top_episodes_sagas, by = "saga"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
head(theme_ep)
## # A tibble: 6 × 8
##   theme         saga  total_count episode name  start total_votes average_rating
##   <chr>         <chr>       <int>   <dbl> <chr> <dbl>       <dbl>          <dbl>
## 1 Betrayal & D… Dres…          43     726 Gear…  2016         328            9.2
## 2 Betrayal & D… Dres…          43     719 Kuuc…  2015         262            9.1
## 3 Betrayal & D… Dres…          43     663 Luff…  2014         201            8.9
## 4 Betrayal & D… Summ…          54     483 Kota…  2011         524            9.3
## 5 Betrayal & D… Summ…          54     485 Keji…  2011         370            9.3
## 6 Betrayal & D… Summ…          54     405 Kesa…  2009         358            9.2
theme_ep$theme <- as.factor(theme_ep$theme)

theme_rating_model <- lm(average_rating ~ theme + total_count, data = theme_ep)
summary(theme_rating_model)
## 
## Call:
## lm(formula = average_rating ~ theme + total_count, data = theme_ep)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.39794 -0.10289  0.00768  0.09228  0.29546 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   9.676429   0.081208 119.156  < 2e-16 ***
## themeBonds & Loyalty          0.473108   0.099314   4.764 7.54e-06 ***
## themeDreams & Ambition        0.332276   0.082847   4.011 0.000128 ***
## themeFamily & Unity           0.143033   0.066968   2.136 0.035503 *  
## themeFreedom & Discovery     -0.085820   0.064331  -1.334 0.185675    
## themeOppression & Resistance  0.222251   0.072457   3.067 0.002878 ** 
## themeSacrifice & Loss         0.226652   0.072817   3.113 0.002509 ** 
## themeTyranny & Control        0.374085   0.087433   4.279 4.81e-05 ***
## total_count                  -0.008802   0.001431  -6.149 2.30e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1538 on 87 degrees of freedom
## Multiple R-squared:  0.303,  Adjusted R-squared:  0.2389 
## F-statistic: 4.727 on 8 and 87 DF,  p-value: 7.77e-05

This model indicates that several themes have a significant positive association with ratings. For instance, rows with the theme “Bonds & Loyalty” average 0.47 points higher in rating than the baseline theme, “Betrayal & Deception,” holding total_count constant, while “Dreams & Ambition” and “Tyranny & Control” add 0.33 and 0.37 points, respectively. On the other hand, “Freedom & Discovery” does not significantly affect ratings (p = 0.19), suggesting it may not resonate as strongly with audiences. Additionally, total_count, the number of theme-related sentiment words in the saga, shows a small but significant negative effect, with each additional count lowering ratings by 0.0088 points on average (p < 0.001). The model explains 30% of the variation in episode ratings (adjusted R-squared = 0.24), which points to the importance of thematic focus in driving viewer engagement while showing that other factors also contribute to the anime’s long success. One caveat: because the join pairs every episode with every theme in its saga, total_count is measured per saga rather than per episode, so these coefficients are best read as associations rather than episode-level effects.
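Comparisons against a baseline theme come from how lm() codes a factor: the first level is absorbed into the intercept, and every other coefficient is a difference from that level. In the output above, Betrayal & Deception is the one theme missing from the coefficient table, which makes it the baseline. A minimal sketch with made-up data (the group labels and values are hypothetical):

```r
# Made-up values for three groups (illustrative only)
toy <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 2)),
  y     = c(1, 3, 4, 6, 7, 9)
)

# "A" is the first factor level, so it becomes the baseline
fit <- lm(y ~ group, data = toy)
coef(fit)

# relevel() picks a different baseline without changing the fitted values
toy$group_b <- relevel(toy$group, ref = "B")
fit_b <- lm(y ~ group_b, data = toy)
coef(fit_b)
```

With group means of 2, 5, and 8, the first fit reports an intercept of 2 and differences of 3 and 6; after releveling, the intercept becomes 5 and the differences become -3 and 3. The same data support both descriptions, so a coefficient only has meaning relative to the chosen baseline.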

MANOVA

A one-way ANOVA tests one independent variable (IV) with two or more groups on a single dependent variable (DV). A factorial ANOVA tests two or more IVs on a single DV. A MANOVA tests two or more DVs against one or more IVs.

MANOVA, Multivariate Analysis of Variance, is a statistical test used to determine if there are any differences between groups in terms of multiple dependent variables at the same time.

In my test, the dependent variables (DVs) are average_rating and total_count, and the independent variables (IVs) are theme and saga.

knitr::opts_chunk$set(echo = TRUE)

#multiple ratings by theme
manova_model <- manova(cbind(average_rating, total_count) ~ theme + saga, data = theme_ep)
summary(manova_model)
##           Df  Pillai approx F num Df den Df    Pr(>F)    
## theme      7 0.89789   9.8928     14    170 5.982e-16 ***
## saga       3 0.81049  19.3052      6    170 < 2.2e-16 ***
## Residuals 85                                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The MANOVA uses Pillai’s trace, a statistic that starts at 0 and cannot exceed the number of dependent variables (2 here); higher values indicate a stronger contribution of the independent variables (theme and saga) to the model. In this analysis:

Theme has a Pillai’s trace value of 0.898, indicating a strong relationship with the dependent variables, average_rating and total_count. Saga has a Pillai’s trace value of 0.810, also suggesting a strong influence on the dependent variables. The F-tests for both independent variables evaluate whether they have statistically significant effects on the dependent variables:

The approximate F-statistic is 9.89 for theme and 19.31 for saga. Both are highly statistically significant (p < 0.001), well below an alpha level of 0.05. These results indicate that theme and saga are significantly and strongly associated with average_rating and total_count taken together.

However, theme and saga do not account for all of the variation in the dependent variables: the model leaves 85 residual degrees of freedom, and variance remains within the groups defined by theme and saga. This suggests that while theme and saga are important contributors, additional factors may be influencing the average ratings and theme counts, warranting further investigation.
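For anyone who wants to try the same workflow on data that runs immediately, the sketch below fits a MANOVA on R's built-in iris data set, which is only a stand-in and has nothing to do with One Piece. It then prints one univariate ANOVA per dependent variable with summary.aov(), a common way to see which outcome drives a multivariate effect.

```r
# Two dependent variables, one grouping factor, using built-in data
fit <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# Multivariate test; Pillai's trace is also the default for summary()
manova_summary <- summary(fit, test = "Pillai")
manova_summary

# Follow-up: one univariate ANOVA table per dependent variable
univariate <- summary.aov(fit)
univariate
```

Passing test = "Wilks", "Hotelling-Lawley", or "Roy" to summary() reports the other standard multivariate statistics for the same fit.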

Pairwise

Pairwise comparisons are typically run after a MANOVA to understand specifically which groups of an independent variable differ on a dependent variable.

knitr::opts_chunk$set(echo = TRUE)
pairwise_theme <- pairwise.t.test(theme_ep$average_rating, theme_ep$theme)
pairwise_theme
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  theme_ep$average_rating and theme_ep$theme 
## 
##                         Betrayal & Deception Bonds & Loyalty Dreams & Ambition
## Bonds & Loyalty         1                    -               -                
## Dreams & Ambition       1                    1               -                
## Family & Unity          1                    1               1                
## Freedom & Discovery     1                    1               1                
## Oppression & Resistance 1                    1               1                
## Sacrifice & Loss        1                    1               1                
## Tyranny & Control       1                    1               1                
##                         Family & Unity Freedom & Discovery
## Bonds & Loyalty         -              -                  
## Dreams & Ambition       -              -                  
## Family & Unity          -              -                  
## Freedom & Discovery     1              -                  
## Oppression & Resistance 1              1                  
## Sacrifice & Loss        1              1                  
## Tyranny & Control       1              1                  
##                         Oppression & Resistance Sacrifice & Loss
## Bonds & Loyalty         -                       -               
## Dreams & Ambition       -                       -               
## Family & Unity          -                       -               
## Freedom & Discovery     -                       -               
## Oppression & Resistance -                       -               
## Sacrifice & Loss        1                       -               
## Tyranny & Control       1                       1               
## 
## P value adjustment method: holm

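These results show whether statistically significant differences exist between pairs of themes in terms of average rating. Every Holm-adjusted p-value is 1; for example, the comparison of Bonds & Loyalty with Betrayal & Deception is 1. A p-value of 1 is far above the 0.05 threshold, so none of the 28 pairwise comparisons shows a significant difference in average rating between themes. This follows from how theme_ep was built: every episode is paired with every theme in its saga, so each theme group contains the same set of ratings. The Holm adjustment controls the chance of false positives (type I errors) across the multiple comparisons; it does not turn a value of 1 into evidence of a difference. Unlike the regression, this test does not adjust for total_count.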
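Two small checks help when reading a table like this one. The sketch below uses made-up numbers. First it shows how Holm's method adjusts raw p-values upward. Then it shows that groups containing exactly the same ratings produce an adjusted p-value of 1, which is the largest possible value and the opposite of a significant result.

```r
# Holm adjustment on three hypothetical raw p-values
raw_p  <- c(0.01, 0.04, 0.03)
holm_p <- p.adjust(raw_p, method = "holm")
holm_p

# Three hypothetical groups that contain exactly the same ratings
ratings <- rep(c(9.2, 9.1, 8.9), times = 3)
groups  <- rep(c("A", "B", "C"), each = 3)

# Group means are equal, so every pairwise t statistic is 0 and p = 1
same_groups <- pairwise.t.test(ratings, groups, p.adjust.method = "holm")
same_groups$p.value
```

Holm multiplies the smallest raw p-value by the number of tests, the next smallest by one fewer, and so on, never letting an adjusted value fall below an earlier one, so the three raw values become 0.03, 0.06, and 0.06.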

Conclusion

In conclusion, the emotional tone and thematic depth of One Piece are key factors in its enduring popularity and audience engagement. The series’ ability to balance complex emotional narratives with rich, evolving themes has contributed to its growth over the years, as evidenced by the steadily increasing average ratings across its sagas. Themes like Bonds & Loyalty, Tyranny & Control, and Dreams & Ambition resonate strongly with viewers, driving higher ratings and engagement. The statistical analysis further underscores the significant impact of both theme and saga on episode ratings, revealing how these elements work together to create a compelling, emotionally charged viewing experience that keeps audiences invested. However, while theme and saga play substantial roles, additional factors may also influence ratings, suggesting areas for further exploration in understanding the show’s long-term success.

Further Analysis

This project offers numerous opportunities for expansion and deeper exploration:

Granular Episode-Level Analysis

Conduct detailed analyses at the individual episode level, by arc, and across entire sagas to uncover more specific patterns in emotional tone and thematic development.

Character-Centric Analysis

Explore the emotional depth and thematic contributions of key characters, examining how their personal arcs influence audience engagement and ratings.

Comparative Analysis with Other Anime

Compare One Piece’s emotional tone and thematic elements with those of other long-running anime series, such as Bleach, to identify unique storytelling strategies and viewer engagement trends.

Time-Series Analysis of Anime Trends

Perform a time-series analysis to examine how trends in emotional tone, themes, and audience ratings have evolved over time, offering insights into the broader cultural and industry shifts within the anime landscape.

Citations

GeeksforGeeks. “Understanding TF-IDF (Term Frequency-Inverse Document Frequency).” Accessed December 6, 2024. https://www.geeksforgeeks.org/understanding-tf-idf-term-frequency-inverse-document-frequency/.

One Piece Wiki. “One Piece Wiki.” Fandom. Accessed December 6, 2024. https://onepiece.fandom.com/wiki/One_Piece_Wiki.

Ratingraph. “One Piece Ratings.” Accessed December 6, 2024. https://www.ratingraph.com/tv-shows/one-piece-ratings-17673/.

Silge, Julia, and David Robinson. Text Mining with R: A Tidy Approach. O’Reilly Media. Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. Accessed December 6, 2024. https://www.tidytextmining.com/.

Sonkin, Phillip. Sentiment Analysis: Welcome to Text Mining with R. Accessed December 6, 2024. https://bookdown.org/psonkin18/berkshire/sentiment.html.

Stack Exchange. “What Particular Measure to Use: Multiple Regression or MANOVA?” Cross Validated, August 2, 2012. Accessed December 6, 2024. https://stats.stackexchange.com/questions/69145/what-particular-measure-to-use-multiple-regression-or-manova.

Statistics How To. “Pillai’s Trace.” Accessed December 6, 2024. https://www.statisticshowto.com/pillais-trace/.

Statistics Solutions. “One-Way MANOVA.” Accessed December 6, 2024. https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/one-way-manova/.

ChatGPT Prompts

How to create a function to process text files for sentiment analysis?

How affective are bubble charts?

What are possible solutions to this error: Error in if (anova_p_value <- summary(theme_anova)[[1]]["Pr(>F)"][1] < : the condition has length > 1

I want to examine theme and average rating would a linear regression or multivariable regression would be the best?