Introduction

This document presents an analysis of YouTube video performance metrics, focusing on key indicators such as views, watch time, subscribers, and impressions. By visualizing trends and conducting a time series decomposition, this analysis aims to identify patterns in video performance and provide insights for optimizing content strategy and audience engagement.

Preprocessing

Load the dataset

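# Load the packages used throughout the analysis
library(dplyr)      # %>%, select(), mutate(), group_by(), summarise(), arrange()
library(lubridate)  # dmy(), floor_date(), year(), month()
library(ggplot2)    # static plots
library(plotly)     # ggplotly(), subplot(), layout()

# Load the dataset
Youtube <- read.csv("C:/Users/user/Documents/youtube.csv")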
head(Youtube)
##       Content
## 1       Total
## 2 O1CJJ4EoBDM
## 3 hCBy20w5aI8
## 4 5IoWGS9zshA
## 5 vqXEtnCl4hk
## 6 IoMNrDAqrdo
##                                                              Video.title
## 1                                                                       
## 2      DOWNLOAD any VIDEO on the INTERNET through TELEGRAM without Bots!
## 3                       How Pinterest is the secret weapon for designers
## 4 SECRET to adding GLOW LIGHT to your  GRAPHIC DESIGNS with Smartphone 🤯
## 5                              BLEND Anything On Any Suface In PIXELLAB.
## 6                3 pro background design edits in pixellab || Nachristos
##   Video.publish.time  Views Watch.time..hours. Subscribers Impressions
## 1                    218288          6323.1041        8026     2395955
## 2           9-Aug-23  38617           326.7360          54      182291
## 3          23-May-24  11201           432.6788         356      117939
## 4          23-Feb-23  10871           520.9378         432       99877
## 5          20-Mar-23   7215           311.2541         269       64728
## 6          21-Aug-20   7070           195.2356         125       89626
##   Impressions.click.through.rate....
## 1                               4.71
## 2                               5.27
## 3                               6.59
## 4                               5.63
## 5                               6.45
## 6                               5.34
tail(Youtube)
##        Content
## 83 kyQDgSlCiMc
## 84 tOPs3A0OEgk
## 85 fQFBJuefzSE
## 86 -0OsO2OGxBA
## 87 _gZV5qc77q8
## 88 Om1o7q7dAgU
##                                                                                    Video.title
## 83                                             FREE AI Tool That's About to Change Everything!
## 84                                      Minimal logo design using Pixellab? Yes it's possible!
## 85                                        Professional Facebook Cover Art design with Pixellab
## 86                                       FASTEST METHOD to create SUNBURST EFFECT in Photoshop
## 87 5 mind-blowing gadgets i got from Amazon    #amazon #amazonprime #amazongadgets #nachristos
## 88                              4 Simple Tricks to Instantly Improve Your Flyer Design Skills!
##    Video.publish.time Views Watch.time..hours. Subscribers Impressions
## 83          30-Aug-24   377             6.9085           1        3525
## 84          11-Sep-20   357             7.6698           7       12068
## 85          11-Jul-20   345            10.2699           7       11549
## 86          22-Feb-24   328             3.4294           0        9260
## 87           1-Mar-24   327             2.0048           0        6826
## 88          14-Sep-24   228            10.8319           1        2330
##    Impressions.click.through.rate....
## 83                               5.62
## 84                               1.44
## 85                               1.73
## 86                               2.20
## 87                               2.23
## 88                               5.54
str(Youtube)
## 'data.frame':    88 obs. of  8 variables:
##  $ Content                           : chr  "Total" "O1CJJ4EoBDM" "hCBy20w5aI8" "5IoWGS9zshA" ...
##  $ Video.title                       : chr  "" "DOWNLOAD any VIDEO on the INTERNET through TELEGRAM without Bots!" "How Pinterest is the secret weapon for designers" "SECRET to adding GLOW LIGHT to your  GRAPHIC DESIGNS with Smartphone 🤯" ...
##  $ Video.publish.time                : chr  "" "9-Aug-23" "23-May-24" "23-Feb-23" ...
##  $ Views                             : int  218288 38617 11201 10871 7215 7070 6984 6646 6621 6070 ...
##  $ Watch.time..hours.                : num  6323 327 433 521 311 ...
##  $ Subscribers                       : int  8026 54 356 432 269 125 110 273 152 97 ...
##  $ Impressions                       : int  2395955 182291 117939 99877 64728 89626 61144 57919 71049 69133 ...
##  $ Impressions.click.through.rate....: num  4.71 5.27 6.59 5.63 6.45 5.34 8.57 6.29 6.2 6.17 ...
colnames(Youtube)
## [1] "Content"                            "Video.title"                       
## [3] "Video.publish.time"                 "Views"                             
## [5] "Watch.time..hours."                 "Subscribers"                       
## [7] "Impressions"                        "Impressions.click.through.rate...."
# Drop the first row (the "Total" summary) and the first column (the video ID)
Youtube <- Youtube[-1, ] %>% select(-1)
# Convert 'Video.publish.time' from text such as "9-Aug-23" to Date (day-month-year)
Youtube$Video.publish.time <- dmy(Youtube$Video.publish.time)
View(Youtube)
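
The first row is dropped because it is a channel-wide "Total" row; leaving it in would count every video twice in later sums. A minimal sketch of that check in base R follows. The data frame `toy`, its video IDs, and its figures are invented for illustration and are not taken from the real export.

```r
# Toy stand-in for the exported file: row 1 is the "Total" summary row.
# The video IDs and figures below are made up for illustration.
toy <- data.frame(
  Content = c("Total", "vidA", "vidB", "vidC"),
  Views = c(600L, 100L, 200L, 300L),
  Subscribers = c(12L, 2L, 4L, 6L),
  stringsAsFactors = FALSE
)

is_total <- toy$Content == "Total"
per_video <- toy[!is_total, ]

# Compare the summary row with the column sums of the per-video rows.
metric_cols <- c("Views", "Subscribers")
matches_total <- colSums(per_video[metric_cols]) == unlist(toy[is_total, metric_cols])
print(matches_total)

# Keeping the summary row would count every video twice in later sums.
sum(toy$Views)        # 1200 (Total row included)
sum(per_video$Views)  # 600  (Total row dropped)
```

In a real export the summary row may not match the per-video sums exactly, for example if the export lists only a subset of videos, so a mismatch is a prompt to investigate rather than an error in itself.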

Summarize Data by Year and Month

# Summarize data by publish year (each date is floored to 1 January of its year)
Youtube_year <- Youtube %>%
  mutate(year = floor_date(Video.publish.time, "year")) %>%
  group_by(year) %>%
  summarise(
    total_views = sum(Views, na.rm = TRUE),
    total_watch_time = sum(Watch.time..hours., na.rm = TRUE),
    total_subscribers = sum(Subscribers, na.rm = TRUE),
    total_impressions = sum(Impressions, na.rm = TRUE)
  )

# Summarize data by publish month (each date is floored to the first day of its month)
Youtube_monthly <- Youtube %>%
  mutate(month = floor_date(Video.publish.time, "month")) %>%
  group_by(month) %>%
  summarise(
    total_views = sum(Views, na.rm = TRUE),
    total_watch_time = sum(Watch.time..hours., na.rm = TRUE),
    total_subscribers = sum(Subscribers, na.rm = TRUE),
    total_impressions = sum(Impressions, na.rm = TRUE)
  )
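
A grouped summary like the one above only contains months in which at least one video was published; months without uploads are absent rather than zero. The sketch below shows one way to make those gaps explicit in base R. The data frame `monthly` and its values are hypothetical, and filling with zero is an assumption that suits "totals for videos published in that month".

```r
# Hypothetical monthly summary with gaps: no rows for Feb or Apr 2023.
monthly <- data.frame(
  month = as.Date(c("2023-01-01", "2023-03-01", "2023-05-01")),
  total_views = c(120, 300, 80)
)

# Every calendar month between the first and last observed month.
all_months <- data.frame(
  month = seq(min(monthly$month), max(monthly$month), by = "month")
)

# Left join onto the full calendar, then treat missing months as zero.
monthly_full <- merge(all_months, monthly, by = "month", all.x = TRUE)
monthly_full$total_views[is.na(monthly_full$total_views)] <- 0

print(monthly_full)
```

With one row per calendar month, line plots no longer draw a straight segment across empty months, and any later conversion to a regular monthly series lines up with the calendar.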

Visualization

Totals are grouped by each video's publish date, so each point is the combined performance of the videos published in that month or year. The monthly plots fluctuate sharply, with several spikes concentrated in 2023. The yearly plots show a strong peak across all metrics in 2023, followed by a decline in 2024. Watch time (green), subscribers (purple), and impressions (orange) follow similar patterns, suggesting that 2023 was a standout year for growth, while 2024 shows reduced activity.

# Load gridExtra package
library(gridExtra)

# Plot for monthly watch time
plot_monthly_watch_time <- ggplot(Youtube_monthly, aes(x = month, y = total_watch_time)) +
  geom_line(color = "green") +
  labs(title = "Trend of Monthly Watch Time", x = "Month", y = "Total Watch Time (hours)") +
  theme_minimal()

# Plot for yearly watch time
plot_yearly_watch_time <- ggplot(Youtube_year, aes(x = year, y = total_watch_time)) +
  geom_line(color = "green") +
  labs(title = "Trend of Yearly Watch Time", x = "Year", y = "Total Watch Time (hours)") +
  theme_minimal()

# Plot for monthly subscribers
plot_monthly_subscribers <- ggplot(Youtube_monthly, aes(x = month, y = total_subscribers)) +
  geom_line(color = "purple") +
  labs(title = "Trend of Monthly Subscribers", x = "Month", y = "Total Subscribers") +
  theme_minimal()

# Plot for yearly subscribers
plot_yearly_subscribers <- ggplot(Youtube_year, aes(x = year, y = total_subscribers)) +
  geom_line(color = "purple") +
  labs(title = "Trend of Yearly Subscribers", x = "Year", y = "Total Subscribers") +
  theme_minimal()

# Plot for monthly impressions
plot_monthly_impressions <- ggplot(Youtube_monthly, aes(x = month, y = total_impressions)) +
  geom_line(color = "orange") +
  labs(title = "Trend of Monthly Impressions", x = "Month", y = "Total Impressions") +
  theme_minimal()

# Plot for yearly impressions
plot_yearly_impressions <- ggplot(Youtube_year, aes(x = year, y = total_impressions)) +
  geom_line(color = "orange") +
  labs(title = "Trend of Yearly Impressions", x = "Year", y = "Total Impressions") +
  theme_minimal()

# Combine the plots in a grid (3 rows)
grid.arrange(
  plot_monthly_watch_time, plot_yearly_watch_time,
  plot_monthly_subscribers, plot_yearly_subscribers,
  plot_monthly_impressions, plot_yearly_impressions,
  nrow = 3
)
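
The claim that every metric peaks in the same year can also be checked numerically rather than by eye. The sketch below uses a hypothetical data frame `yearly`, shaped like `Youtube_year` but with invented numbers and a plain integer `year` column (the real column is a Date).

```r
# Hypothetical yearly totals shaped like Youtube_year; the numbers are invented.
yearly <- data.frame(
  year = 2020:2024,
  total_watch_time = c(400, 650, 900, 2100, 1200),
  total_subscribers = c(300, 500, 800, 2500, 900),
  total_impressions = c(2e5, 3e5, 4e5, 9e5, 5e5)
)

metric_cols <- setdiff(names(yearly), "year")

# For each metric, the year in which the total is largest.
peak_year <- sapply(yearly[metric_cols], function(x) yearly$year[which.max(x)])
print(peak_year)

# TRUE when every metric peaks in the same year.
same_peak <- length(unique(peak_year)) == 1
print(same_peak)
```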

Analyze Performance by Video Title: Average Performance Metrics

The top 30 videos by views, watch time, and subscribers show which types of content resonate most with the audience and what viewers prefer and expect. By focusing on these high-performing topics, the creator can narrow the channel's niche and target its audience more effectively.

# Ensure that 'Youtube' is a data frame before summarising
if (is.data.frame(Youtube)) {
  # Group by Video.title and calculate averages
  category_performance <- Youtube %>%
    group_by(Video.title) %>%
    summarise(
      avg_views = mean(Views, na.rm = TRUE),
      avg_watch_time = mean(Watch.time..hours., na.rm = TRUE),
      avg_subscribers = mean(Subscribers, na.rm = TRUE)
    )
} else {
  stop("The 'Youtube' object is not a data frame.")
}

# Filter for the top 30 video titles by average views
top_30_performance <- category_performance %>%
  arrange(desc(avg_views)) %>%  # Sort by avg_views in descending order
  head(30)  # Select the top 30

# Plot Average Views for Top 30 Video Titles
ggplot(top_30_performance, aes(x = reorder(Video.title, avg_views), y = avg_views)) +
  geom_bar(stat = "identity", fill = "blue") +
  coord_flip() +  # Flip coordinates for better readability
  labs(title = "Top 30 Video Titles by Average Views", x = "Video Title", y = "Average Views") +
  theme_minimal()

# Filter for the top 30 video titles by average watch time
top_30_watch_time <- category_performance %>%
  arrange(desc(avg_watch_time)) %>%  # Sort by avg_watch_time in descending order
  head(30)  # Select the top 30

# Plot Average Watch Time for Top 30 Video Titles
ggplot(top_30_watch_time, aes(x = reorder(Video.title, avg_watch_time), y = avg_watch_time)) +
  geom_bar(stat = "identity", fill = "green") +
  coord_flip() +  # Flip coordinates for better readability
  labs(title = "Top 30 Video Titles by Average Watch Time", x = "Video Title", y = "Average Watch Time (hours)") +
  theme_minimal()

# Filter for the top 30 video titles by average subscribers
top_30_subscribers <- category_performance %>%
  arrange(desc(avg_subscribers)) %>%  # Sort by avg_subscribers in descending order
  head(30)  # Select the top 30

# Plot Average Subscribers for Top 30 Video Titles
ggplot(top_30_subscribers, aes(x = reorder(Video.title, avg_subscribers), y = avg_subscribers)) +
  geom_bar(stat = "identity", fill = "purple") +
  coord_flip() +  # Flip coordinates for better readability
  labs(title = "Top 30 Video Titles by Average Subscribers", x = "Video Title", y = "Average Subscribers") +
  theme_minimal()

# Recompute the per-title averages (same summary as above) for the top-10 plots
category_performance <- Youtube %>%
  group_by(Video.title) %>%
  summarise(
    avg_views = mean(Views, na.rm = TRUE),
    avg_watch_time = mean(Watch.time..hours., na.rm = TRUE),
    avg_subscribers = mean(Subscribers, na.rm = TRUE)
  )

# Filter for the top 10 video titles by average views
top_10_performance <- category_performance %>%
  arrange(desc(avg_views)) %>%
  head(10)

# Filter for the top 10 video titles by average watch time
top_10_watch_time <- category_performance %>%
  arrange(desc(avg_watch_time)) %>%
  head(10)

# Filter for the top 10 video titles by average subscribers
top_10_subscribers <- category_performance %>%
  arrange(desc(avg_subscribers)) %>%
  head(10)

# Plot Average Views for Top 10 Video Titles
plot_avg_views <- ggplot(top_10_performance, aes(x = reorder(Video.title, avg_views), y = avg_views)) +
  geom_bar(stat="identity", fill="blue") +
  coord_flip() +
  labs(title="Top 10 Video Titles by Average Views", x="Video Title", y="Average Views") +
  theme_minimal()

# Display the plot
plot_avg_views
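
The sort-then-take-the-top pattern above is repeated for each metric and each cutoff. As a sketch of how it could be factored out, the base R helper below does the same ranking for any column. `top_n_by` is a hypothetical name, not a dplyr function or part of the original script, and the data frame `perf` is invented.

```r
# Return the n rows of df with the largest values in the column named metric.
# top_n_by is a hypothetical helper, not part of dplyr or the original script.
top_n_by <- function(df, metric, n = 10) {
  ranked <- df[order(df[[metric]], decreasing = TRUE), , drop = FALSE]
  head(ranked, n)
}

# Invented per-title averages shaped like category_performance.
perf <- data.frame(
  Video.title = c("A", "B", "C", "D"),
  avg_views = c(50, 400, 120, 10),
  avg_watch_time = c(9.5, 3.2, 7.1, 0.4),
  stringsAsFactors = FALSE
)

top_n_by(perf, "avg_views", n = 2)$Video.title       # "B" "C"
top_n_by(perf, "avg_watch_time", n = 2)$Video.title  # "A" "C"
```

Passing the column name as a string keeps the helper free of non-standard evaluation, at the cost of not catching a misspelled column until it runs.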

Interactive Plot

The plot shows the top 10 video titles ranked by average views, average watch time, and average subscribers. The video “SECRET to adding GLOW LIGHT to your GRAPHIC DESIGNS with Smartphone” ranks highest in both watch time and subscribers, indicating high viewer engagement and conversion. The video on downloading through Telegram leads in views but shows lower engagement in watch time and subscriber gain. Overall, the plot suggests that graphic design content tends to perform best, attracting both views and subscribers.

# Create interactive plot for top 10 average views
plot_avg_views_interactive <- ggplot(top_10_performance, aes(x = reorder(Video.title, avg_views), y = avg_views)) +
  geom_bar(stat = "identity", fill = "blue") +
  coord_flip() +
  labs(x = "Video Title", y = "Average Views") +
  theme_minimal()

plot_avg_views_interactive <- ggplotly(plot_avg_views_interactive)

# Create interactive plot for top 10 average watch time
plot_avg_watch_time_interactive <- ggplot(top_10_watch_time, aes(x = reorder(Video.title, avg_watch_time), y = avg_watch_time)) +
  geom_bar(stat = "identity", fill = "green") +
  coord_flip() +
  labs(x = "Video Title", y = "Average Watch Time (hours)") +
  theme_minimal()

plot_avg_watch_time_interactive <- ggplotly(plot_avg_watch_time_interactive)

# Create interactive plot for top 10 average subscribers
plot_avg_subscribers_interactive <- ggplot(top_10_subscribers, aes(x = reorder(Video.title, avg_subscribers), y = avg_subscribers)) +
  geom_bar(stat = "identity", fill = "purple") +
  coord_flip() +
  labs(x = "Video Title", y = "Average Subscribers") +
  theme_minimal()

plot_avg_subscribers_interactive <- ggplotly(plot_avg_subscribers_interactive)

# Combine interactive plots with individual titles.
# After coord_flip() the horizontal axis carries the metric values; views, hours,
# and subscribers differ in unit and scale, so that axis is not shared.
combined_plot <- subplot(
  plot_avg_views_interactive,
  plot_avg_watch_time_interactive,
  plot_avg_subscribers_interactive,
  nrows = 3,
  shareX = FALSE,
  shareY = TRUE,
  titleX = TRUE
) %>%
  layout(
    annotations = list(
      list(text = "Average Views", xref='paper', yref='paper', x=0.5, y=1.05, showarrow=FALSE, font=list(size=14)),
      list(text = "Watch Time", xref='paper', yref='paper', x=0.5, y=0.65, showarrow=FALSE, font=list(size=14)),
      list(text = "Subscribers", xref='paper', yref='paper', x=0.5, y=0.25, showarrow=FALSE, font=list(size=14))
    )
  )

# Display the combined plot
combined_plot
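
The contrast between reach and engagement can be made concrete by normalising by views. The sketch below uses the figures for the Telegram and glow-light videos as printed in the head(Youtube) output earlier in this document; the shortened labels and the derived column names are introduced here for illustration, and reading the Subscribers column as subscribers gained from each video is an assumption.

```r
# Figures copied from the head(Youtube) output earlier in this document.
videos <- data.frame(
  title = c("Telegram download", "Glow light"),  # shortened labels
  views = c(38617, 10871),
  watch_hours = c(326.7360, 520.9378),
  subscribers = c(54, 432),
  stringsAsFactors = FALSE
)

# Normalise by views so videos of different reach can be compared.
videos$minutes_per_view <- videos$watch_hours * 60 / videos$views
videos$subs_per_1000_views <- videos$subscribers / videos$views * 1000

print(videos[, c("title", "minutes_per_view", "subs_per_1000_views")], digits = 3)
```

On these figures the Telegram video averages roughly 0.5 minutes of watch time per view and about 1.4 subscribers per 1,000 views, against roughly 2.9 minutes and about 39.7 subscribers per 1,000 views for the glow-light video.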

Time Series

The time series decomposition of monthly total views highlights several key observations. The observed values have leveled off since 2023, indicating a stable performance trend. The trend component initially pointed to growth but has recently shown signs of decline, suggesting a slowdown. Seasonal variations are evident, with regular fluctuations likely tied to recurring events or viewer behavior patterns. The random component remains relatively stable, with no major irregularities or anomalies affecting the data. Overall, the analysis points to a period of steady performance with predictable seasonal cycles and a potential shift in momentum.

# Load the forecast package (install it once beforehand with install.packages("forecast")).
# Note: decompose(), used below, is provided by base R's stats package.
library(forecast)

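# Convert to time series object (monthly frequency), starting at the first publish month.
# ts() treats the rows as consecutive months, so Youtube_monthly needs one row per month.
first_month <- min(Youtube_monthly$month)
views_ts <- ts(
  Youtube_monthly$total_views,
  frequency = 12,
  start = c(year(first_month), month(first_month))
)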

# Decompose time series
decomp_views <- decompose(views_ts)

# Plot decomposition
plot(decomp_views)
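
To make the four panels of the decomposition plot easier to read, the sketch below runs decompose() on a synthetic monthly series and confirms that, in the default additive model, trend, seasonal, and random components add back up to the observed values. The series, its seasonal pattern, and the names `demo_ts` and `demo_decomp` are invented for illustration.

```r
# Synthetic monthly series: linear trend plus a repeating 12-month pattern.
# The values are invented purely to illustrate the decomposition.
set.seed(1)
n_months <- 48
trend_part <- seq(100, 500, length.out = n_months)
season_part <- rep(c(30, 10, -20, -40, -10, 0, 20, 40, 10, -10, -20, -10), times = 4)
noise_part <- rnorm(n_months, sd = 5)

demo_ts <- ts(trend_part + season_part + noise_part, frequency = 12, start = c(2020, 1))

# Additive decomposition (the default for decompose()).
demo_decomp <- decompose(demo_ts)

# Where the trend is defined, the three components add back up to the data.
rebuilt <- demo_decomp$trend + demo_decomp$seasonal + demo_decomp$random
max_gap <- max(abs(rebuilt - demo_ts), na.rm = TRUE)
print(max_gap)

# The moving-average trend is undefined for the first and last six months.
n_missing_trend <- sum(is.na(demo_decomp$trend))
print(n_missing_trend)
```

Because the trend is a centered moving average, the first and last six months have no trend or random value, so conclusions about the most recent months rest on the observed panel rather than on the trend panel. decompose() also needs at least two full years of data.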

Conclusion

In summary, while the channel has grown in the past, recent trends indicate a need for proactive measures to sustain and improve viewer engagement. By understanding audience behavior, planning content around seasonal trends, and addressing the factors behind the decline in performance, the channel has potential for renewed growth. Future analyses should continue to monitor these metrics closely and adapt as conditions change.

Recommendations

  • Optimize for Seasonal Trends: Analyze past seasonal patterns and schedule content releases strategically to capitalize on periods of higher viewer engagement.

  • Leverage Popular Videos: The video about downloading videos through Telegram has the highest views but lower engagement in terms of watch time and subscribers. Consider revisiting this content to make it more engaging and potentially drive conversions.

  • Content Optimization: Analyze what makes top-performing videos successful (e.g., detailed tutorials or engaging delivery) and replicate those elements in future content.

  • Continuous Monitoring: Regularly review key performance metrics to adapt content strategies based on real-time data and shifting audience preferences.