Pikachu

How I selected data

The original dataset provided for this assignment, encompassing a comprehensive array of Pokémon statistics, did not include a temporal dimension that would allow for a direct analysis of trends and patterns over time. As per the assignment’s guidelines, the absence of time-encoded data necessitated an alternative approach to explore the time aspect of Pokémon’s influence and popularity. In alignment with the prescribed course of action for such scenarios, I turned to external sources to enrich the analysis.

To this end, I selected Wikipedia page view data for Pikachu, one of the most iconic characters of the Pokémon franchise, as a proxy for temporal analysis. This choice rests on the premise that Wikipedia page traffic is a reasonable indicator of public interest and engagement, and so provides a lens through which to examine fluctuations in Pikachu’s popularity from January 1 to November 9, 2023. The following sections present a time series analysis of these page views and consider what may have driven audience interest over that period.

We will start by plotting the raw data to see the overall pattern of page views before moving on to regression and seasonal analysis.

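library(tidyverse)  # read_csv(), ggplot()
library(lubridate)  # ymd()
library(tsibble)    # as_tsibble()
library(zoo)        # rollmean()
library(forecast)   # msts(), autoplot() for STL objects

pikachu_pageviews <- read_csv('~/Downloads/pageviews-pikachu-20230101-20231109.csv')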
## Rows: 313 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (1): Pikachu
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# https://pageviews.wmcloud.org/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=0&start=2023-01-01&end=2023-11-09&pages=Pikachu

# Converting the date column (read_csv() already parsed it as a date, so this is only a safeguard)
pikachu_pageviews$Date <- ymd(pikachu_pageviews$Date)

# Converting to a tsibble for time series analysis
pikachu_tsibble <- as_tsibble(pikachu_pageviews, index = Date)
ggplot(pikachu_tsibble, aes(x = Date, y = Pikachu)) +
  geom_line() +
  labs(title = "Daily Wikipedia Page Views for Pikachu", x = "Date", y = "Page Views")

The plot above displays the daily Wikipedia page views for Pikachu from January 1 to November 9, 2023 (313 daily observations). The data shows variability over time, with some noticeable spikes that may correspond to specific events or releases related to Pikachu or the Pokémon franchise.

Regression Analysis Insights:

  1. The model’s R-squared value is approximately 0.025, which indicates that only about 2.5% of the variability in page views is explained by the time variable. This suggests that there may not be a strong linear trend in the data.
  2. The coefficient for DateNum is approximately -0.8548, indicating a slight decrease in page views over time (roughly 0.85 fewer views per day, if DateNum is measured in days). Combined with the low R-squared value, this means the downward trend is weak and accounts for very little of the day-to-day variation.
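
The code that produced these regression figures does not appear in this write-up. The following is a minimal sketch of how such a fit could be obtained, assuming DateNum is simply the date converted to a number of days. The data frame `toy` is simulated stand-in data, not the Pikachu page views, so its output will not match the values reported above.

```r
# Sketch only: `toy` is simulated stand-in data, not the real page views.
set.seed(1)
toy <- data.frame(
  Date = seq(as.Date("2023-01-01"), by = "day", length.out = 60)
)
toy$Pikachu <- round(9000 - 2 * seq_len(60) + rnorm(60, sd = 400))

# Assumption: DateNum is the date expressed as days since 1970-01-01.
toy$DateNum <- as.numeric(toy$Date)

# Ordinary least squares: page views as a linear function of time.
fit <- lm(Pikachu ~ DateNum, data = toy)

slope     <- unname(coef(fit)["DateNum"])      # change in views per day
r_squared <- summary(fit)$r.squared            # share of variance explained
p_value   <- summary(fit)$coefficients["DateNum", "Pr(>|t|)"]

print(c(slope = slope, r_squared = r_squared, p_value = p_value))
```

To apply this to the real series, `toy` would be replaced by the `pikachu_pageviews` data frame; the p-value for the slope is what would support or contradict any statement about statistical significance.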

Apply smoothing to detect seasonality

# Calculate the 7-day rolling average
pikachu_tsibble$RollingMean <- rollmean(pikachu_tsibble$Pikachu, 7, fill = NA)

# Plot the rolling average
ggplot(pikachu_tsibble, aes(x = Date)) +
  geom_line(aes(y = Pikachu), alpha = 0.5) +
  geom_line(aes(y = RollingMean), color = "#FF674D") +
  labs(title = "7-Day Rolling Average of Wikipedia Page Views for Pikachu", x = "Date", y = "Page Views")

Decomposing the series to illustrate seasonality and trend:

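# Build a time series object with a weekly (7-day) seasonal period
pageviews_ts <- msts(pikachu_tsibble$Pikachu, seasonal.periods = 7)

# STL decomposition; s.window = "periodic" assumes the same weekly pattern in every week
decomposed_components <- stl(pageviews_ts, s.window = "periodic")

# Plot the decomposed components
autoplot(decomposed_components)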

Rolling Average Insights:

  1. The 7-day rolling average smooths out day-of-week variation, since each window covers every weekday once. What remains shows how the weekly level of page views changes, with some weeks having higher averages than others. This could reflect events that raise or lower interest in Pikachu for several days at a time.
  2. The seasonal decomposition plot breaks down the time series into three components: trend, seasonal, and residual. The trend component shows the overall direction of the data, the seasonal component highlights repeating patterns over the 7-day cycle, and the residual component captures the irregularities that are not explained by the trend or seasonal components.
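
As a small illustration of what a centered 7-day mean does to a weekly cycle, the sketch below applies the same `rollmean()` call to a toy series. The series and its day-of-week effects are made up for the example and are not taken from the page view data.

```r
library(zoo)

# Toy series: a constant level of 100 plus a fixed day-of-week effect
# whose seven values sum to zero (hypothetical numbers).
weekly_effect <- c(-15, -5, 0, 5, 10, 15, -10)
x <- 100 + rep(weekly_effect, times = 4)   # 28 days

# Centered 7-day mean, the same call used for RollingMean.
smoothed <- rollmean(x, 7, fill = NA)

# The first and last 3 values are NA: a centered window of 7 needs
# 3 neighbours on each side.
print(head(smoothed, 5))

# Every complete window contains each weekday exactly once, so the
# day-of-week effect averages out and only the level remains.
print(range(smoothed, na.rm = TRUE))
```

This is why a 7-day window is a natural choice for daily data: it removes a stable weekly cycle rather than displaying it, leaving slower changes in level visible.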

Seasonal Decomposition Insights:

  1. The trend component from the seasonal decomposition suggests that there isn’t a strong trend: the page views do not show a pronounced long-term increase or decrease within the analyzed timeframe.
  2. The seasonal component may indicate some weekly patterns, which could be explored to understand what drives interest in Pikachu on a regular weekly basis. Because s.window = "periodic" forces an identical weekly pattern in every week, the size of this component relative to the residuals matters more than its presence.
  3. The residuals show us the noise or the data’s randomness after the trend and seasonal components have been accounted for.
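
One way to put a number on how much the weekly component matters is the "strength of seasonality" measure described in Hyndman and Athanasopoulos's forecasting textbook: max(0, 1 - Var(remainder) / Var(seasonal + remainder)). The sketch below computes it on a simulated series; the series and its parameters are hypothetical, and this measure was not part of the analysis above.

```r
# Sketch only: simulated stand-in series with a weekly pattern plus noise.
set.seed(42)
n <- 7 * 20
weekly <- rep(c(-300, -100, 0, 100, 200, 300, -200), length.out = n)
toy_ts <- ts(8000 + weekly + rnorm(n, sd = 250), frequency = 7)

# Same decomposition settings as used for the page view series.
parts <- stl(toy_ts, s.window = "periodic")$time.series
seasonal  <- as.numeric(parts[, "seasonal"])
remainder <- as.numeric(parts[, "remainder"])

# Values near 0 mean the weekly pattern is small relative to the noise;
# values near 1 mean it dominates the detrended series.
strength <- max(0, 1 - var(remainder) / var(seasonal + remainder))
print(strength)
```

Running the same calculation on `decomposed_components$time.series` would indicate whether the weekly pattern in the page views is substantial or mostly noise.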

Illustrating seasonality using ACF or PACF

# ACF and PACF plots
acf(pikachu_tsibble$Pikachu, main = "ACF for Pikachu Page Views")

pacf(pikachu_tsibble$Pikachu, main = "PACF for Pikachu Page Views")

ACF Plot Insights:

The ACF plot shows a significant autocorrelation at the first lag that then tapers off, which is typical for a time series with short-term persistence but not necessarily a clear longer-term pattern. This suggests that the number of page views today has a strong relationship with the number of page views yesterday, but this relationship diminishes as the lags increase.

PACF Plot Insights:

The PACF plot shows a significant spike at the first lag, which then drops off sharply. At lag 1 there are no intermediate lags to control for, so this spike is simply the correlation between today’s page views and yesterday’s. The sharp drop afterwards indicates that, once yesterday’s views are accounted for, earlier days add little further information.

What This Means for Pikachu’s Wikipedia Page:

The strong initial correlation in both the ACF and PACF suggests that daily page views for Pikachu are influenced by the views from the previous day, which may reflect ongoing discussions, news, or events related to Pikachu that maintain interest over consecutive days.

The lack of additional significant lags in the PACF plot suggests that beyond this immediate effect, there’s no strong evidence of weekly or monthly cycles in the page views based on this year’s data.
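
Reading significance off the plots can be checked numerically. The sketch below compares the autocorrelations at lags 1, 7, and 14 with the approximate 95% bound of 1.96 / sqrt(n), which is what the dashed lines in R's ACF plots represent under the default settings. The input is a simulated AR(1) series used as a stand-in, with a made-up coefficient of 0.6, so the numbers are illustrative only.

```r
# Sketch only: simulated AR(1) stand-in, same length as the page view series.
set.seed(7)
n <- 313
toy <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# Autocorrelations without plotting; drop lag 0 from the ACF so that
# position k corresponds to lag k in both vectors.
acf_vals  <- acf(toy, lag.max = 28, plot = FALSE)$acf[-1]
pacf_vals <- as.numeric(pacf(toy, lag.max = 28, plot = FALSE)$acf)

# Approximate 95% bound for the autocorrelation of white noise.
bound <- 1.96 / sqrt(n)

lags <- c(1, 7, 14)
check <- data.frame(
  lag              = lags,
  acf              = acf_vals[lags],
  pacf             = pacf_vals[lags],
  pacf_significant = abs(pacf_vals[lags]) > bound
)
print(check)
```

For the real data, `toy` would be replaced by `pikachu_tsibble$Pikachu`. A weekly cycle would show up as values beyond the bound at lags 7 and 14.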


Story Behind the Data: Pikachu’s Wikipedia Page Views in 2023

Throughout 2023, the interest in Pikachu, as represented by Wikipedia page views, exhibited intriguing patterns that, upon further analysis, reveal the impact of various events and the enduring popularity of this iconic Pokémon character.

The linear regression analysis indicated a slight downward trend in page views that explains only about 2.5% of their variability. While at first glance this may suggest a waning interest, it’s important to consider the broader context of the Pokémon franchise. In March, coinciding with a minor dip in page views, The Pokémon Company launched a new limited-time Tera Raid Battle event in Pokémon Scarlet and Violet, which may have temporarily shifted fans’ focus away from Pikachu to new game features and characters.

The 7-day rolling average smoothed out daily fluctuations and made week-to-week changes in the level of page views easier to see. Notably, there were spikes in page views during holiday weeks and summer months, times when school holidays could lead to increased online activity among younger fans.

Seasonal decomposition didn’t show strong seasonality, which aligns with the year-round appeal of Pikachu. However, specific peaks could correlate with international Pokémon events, such as Pokémon Day on February 27th, which celebrates the franchise’s anniversary and often features special Pikachu-related content across games and media.

The ACF and PACF plots revealed a significant correlation in page views from one day to the next, suggesting that daily events, such as episodes of the Pokémon anime featuring Pikachu, could drive consistent, short-term interest.

To further understand the drivers behind Pikachu’s popularity, future research could explore:

  1. Correlation with Media Releases: Aligning page view data with the dates of Pokémon movie releases, new anime episodes, and game updates could help identify how these events impact Pikachu’s popularity.

  2. Comparative Analysis with Other Pokémon: Analyzing the page views of other Pokémon may provide insights into how Pikachu’s popularity compares to others in the franchise.

  3. Longitudinal Study: Investigating Pikachu’s page views over multiple years could provide a broader view of the character’s enduring popularity and how it has evolved.

  4. Multivariate Analysis: Incorporating other variables, such as social media mentions, game sales, or search engine trends, could provide a more comprehensive view of what drives interest in Pikachu.
