Global Trends in YouTube Popularity
Introduction
YouTube is one of the most widely used social media platforms, with millions of videos posted and watched every day. Its content creation, social influence, and the money associated with it continue to grow. This study analyzes YouTube data to explore the factors that drive channel success and popularity. By examining key metrics for the most successful YouTube channels, this research aims to uncover patterns and trends that those channels share.
The Data
The data for this research comes from a Kaggle data set published by the user Nidula Elgiriyewithana. A link to the data can be found here: https://www.kaggle.com/datasets/nelgiriyewithana/global-youtube-statistics-2023/data. This data set contains information on the top 995 YouTube channels in the world in 2023 by subscriber count. Below is a data dictionary describing all the variables in the data. These variables provide information on the factors that the top YouTube channels have in common. Note that some of the values may have changed since this data set was published.
| Variable | Description |
|---|---|
| Rank | Position of the YouTube channel based on the number of subscribers |
| Youtuber | Name of the YouTube channel |
| Subscribers | Number of subscribers to the channel |
| video views | Total views across all videos on the channel |
| category | Category or niche of the channel |
| title | Title of the YouTube channel |
| uploads | Total number of videos uploaded on the channel |
| Country | Country where the YouTube channel originates |
| Abbreviation | Abbreviation of the country |
| channel_type | Type of the YouTube channel (e.g., individual, brand) |
| video_views_rank | Ranking of the channel based on total video views |
| country_rank | Ranking of the channel based on the number of subscribers within its country |
| channel_type_rank | Ranking of the channel based on its type (individual or brand) |
| video_views_for_last_30_days | Total video views in the last 30 days |
| lowest_monthly_earnings | Lowest estimated monthly earnings from the channel |
| highest_monthly_earnings | Highest estimated monthly earnings from the channel |
| lowest_yearly_earnings | Lowest estimated yearly earnings from the channel |
| highest_yearly_earnings | Highest estimated yearly earnings from the channel |
| subscribers_for_last_30_days | Number of new subscribers gained in the last 30 days |
| created_year | Year when the YouTube channel was created |
| created_month | Month when the YouTube channel was created |
| created_date | Exact date of the YouTube channel’s creation |
| Gross tertiary educated enrollment (%) | Percentage of the population enrolled in tertiary education in the country |
| Population | Total population of the country |
| Unemployment rate | Unemployment rate in the country |
| Urban_population | Percentage of the population living in urban areas |
| Latitude | Latitude coordinate of the country’s location |
| Longitude | Longitude coordinate of the country’s location |
Below is a table of summary statistics for a few of the numerical variables in the data set. Note that the minimum for some variables is 0. This is because some channels created by YouTube itself, such as YouTube Music and YouTube Movies, have large enough subscriber counts to make this list. These channels do not post any videos of their own, but they have large subscriber counts and are thus included in the data.
| Statistics | Min | Q1 | Median | Mean | Q3 | Max |
|---|---|---|---|---|---|---|
| Subscribers | 12300000 | 14500000 | 17700000 | 22982412 | 24600000 | 245000000 |
| video views | 0 | 4.288e+09 | 7.761e+09 | 1.104e+10 | 1.355e+10 | 2.280e+11 |
| highest_yearly_earnings | 0 | 521750 | 2600000 | 7081814 | 7300000 | 163400000 |
| lowest_yearly_earnings | 0 | 32650 | 159500 | 442257 | 455100 | 10200000 |
| highest_monthly_earnings | 0 | 43500 | 212700 | 589808 | 606800 | 13600000 |
| lowest_monthly_earnings | 0 | 2700 | 13300 | 36886 | 37900 | 8509000 |
| created_year | 1970 | 2009 | 2013 | 2013 | 2016 | 2022 |
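The columns of the summary table can be reproduced with a few lines of standard-library Python. The sketch below is illustrative only: the function name `five_number_summary` and the sample values are hypothetical, and the quartile method (`inclusive`) is an assumption, since the report does not state how its quartiles were computed.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, mean, Q3 and max for a list of numbers."""
    ordered = sorted(values)
    # "inclusive" treats the values as the full population; other quartile
    # conventions can give slightly different Q1 and Q3 values.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(ordered),
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical values, not taken from the data set
sample_values = [10, 20, 30, 40, 100]
summary = five_number_summary(sample_values)
print(summary)
```

Comparing the mean with the median in such a summary is a quick way to spot the right skew visible in the table, where a few very large channels pull the mean above the median.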
Analysis of Top YouTube Channels
The following analysis uses the data outlined above to identify key trends among the top YouTube channels, along with any commonalities that could point to why a YouTube channel is popular.
The Top 10 Channels By Subscriber Count
Below is a bar chart showcasing the top 10 YouTube channels by subscriber count in this data set. It gives a preliminary idea of who the biggest channels are and how many subscribers they have.
Top 10 Countries by Number of Channels
One of the variables in the data set is country. Using this variable, we can see where in the world these YouTube channels are located and where the platform is most popular. Below is a bar chart showcasing the top 10 countries by number of channels.
The chart ranks countries by the number of channels they have in the data, that is, which countries have the most channels among the roughly top 1000 channels by subscriber count. By far the biggest outlier is the United States, which makes up more than 30% of the channels in the top 1000. India is the next largest. This highlights how dominant these two countries are in the world of YouTube. Note that “nan” stands for channels that have no country listed or are unaffiliated with a country.
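A ranking like the one described above can be computed by counting channels per country. This is a minimal sketch with hypothetical sample values and a hypothetical helper name (`top_countries`); it assumes missing countries are kept under the label "nan" so that they stay visible, as they are in the chart.

```python
from collections import Counter


def top_countries(countries, n=10):
    """Count channels per country; return the top n as (country, count, share)."""
    # Missing countries are labelled "nan" instead of being dropped
    labels = [country if country else "nan" for country in countries]
    counts = Counter(labels)
    total = len(labels)
    return [(country, count, count / total)
            for country, count in counts.most_common(n)]


# Hypothetical sample, not taken from the data set
sample_countries = ["United States", "India", "United States", None,
                    "Brazil", "United States", "India", "United States"]
ranking = top_countries(sample_countries, n=3)
for country, count, share in ranking:
    print(f"{country}: {count} channels ({share:.0%})")
```

Reporting the share next to the raw count makes statements such as "more than 30% of the channels" easy to check directly.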
Distribution of Subscribers Among Top 10 Countries
Using the countries identified above, a closer look at subscriber counts provides more context for each country. It shows whether the countries with the most channels also have the highest subscriber counts.
Above is a box and whisker plot for the top 10 countries identified earlier. Note that the subscriber counts on the y-axis are shown on a log scale for easier readability. While there was a large disparity in the number of channels from each country, the typical number of subscribers per channel does not differ drastically between countries. India has the widest spread, since it is home to some large outliers such as T-Series, the world's #1 channel. This shows that countries with fewer major channels can still have high subscriber counts.
What kinds of videos are watched the most?
Two variables in the data set can help answer this question: category and type. These variables describe what kind of content a channel publishes, and they are used here to identify what kinds of videos people are watching.
Top Categories by Number of Channels
Above is a bar graph that showcases the top categories by number of channels in the top 1000 YouTube channels. As before, “nan” essentially means “NA”: no category is listed for the channel. Entertainment and music clearly dominate, and things like music videos are very prominent on YouTube. Just after these two come people and blogs, and gaming. These four categories dominate the top 1000 and are a common denominator among many successful YouTube channels.
Top Type by Average Subscriber Count
Above is a similar bar graph showing average subscriber count by type. It tells a slightly different story than the graph on categories. Sports is actually the highest type by average subscriber count, although it has fewer channels in the top 1000. Music is still high on the list, making it a common factor among the top channels with high subscriber counts. Entertainment and games fall further down the list, but they still make up a large portion of the channels overall.
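Because a type with only a few channels can post a high average, it helps to compute the mean together with the group size. The sketch below is one way to do that; the helper name `mean_subscribers_by_type` and the sample rows are hypothetical and not taken from the data set.

```python
from collections import defaultdict


def mean_subscribers_by_type(rows):
    """Group (channel_type, subscribers) pairs; return (type, (mean, count)) sorted by mean."""
    groups = defaultdict(list)
    for channel_type, subscribers in rows:
        groups[channel_type].append(subscribers)
    averages = {
        channel_type: (sum(values) / len(values), len(values))
        for channel_type, values in groups.items()
    }
    # Highest average first, mirroring the order of a sorted bar chart
    return sorted(averages.items(), key=lambda item: item[1][0], reverse=True)


# Hypothetical rows: (channel_type, subscribers in millions)
sample_rows = [("Sports", 40), ("Music", 30), ("Music", 20),
               ("Entertainment", 10), ("Entertainment", 20), ("Entertainment", 15)]
by_type = mean_subscribers_by_type(sample_rows)
for channel_type, (mean, count) in by_type:
    print(f"{channel_type}: mean={mean:.1f}M over {count} channels")
```

In this toy sample the type with a single channel ranks first, which is the same effect to keep in mind when reading the sports result.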
The Age of Channels
Data is also provided on the year each channel was created. This can be used to show whether older channels are more likely to have more subscribers.
Above is a histogram of the creation year of the top 1000 YouTube channels. The data has two peaks, around 2006 and 2014. YouTube was created in the mid-2000s, so many of these channels have been around since essentially the beginning of the platform. The other peak is in the mid-2010s. This could reflect an increasingly digital age, as smartphones and social media became more and more relevant in this period. Overall, it looks as if YouTube channels can be successful regardless of when they were created, but most of them are at least 10 years old.
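The counts behind such a histogram can be tallied per year. The sketch below is an illustration with hypothetical sample years and a hypothetical helper name (`year_histogram`). It also assumes that years before YouTube's 2005 launch, such as the 1970 minimum in the summary table, are data errors and should be excluded; the report does not say how it handled them.

```python
from collections import Counter

YOUTUBE_LAUNCH_YEAR = 2005  # YouTube was founded in 2005


def year_histogram(years):
    """Count channels per creation year, skipping missing or implausible years."""
    valid = [year for year in years
             if year is not None and year >= YOUTUBE_LAUNCH_YEAR]
    return dict(sorted(Counter(valid).items()))


# Hypothetical sample, not taken from the data set
sample_years = [1970, 2006, 2006, 2011, 2014, 2014, 2014, None]
histogram = year_histogram(sample_years)
for year, count in histogram.items():
    print(year, "#" * count)  # simple text bar per year
```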
The Number of Uploads by Subscriber
Does the number of uploads affect the number of subscribers a channel can gain? Analyzing this can show whether the channels that post the most also gain the most subscribers.
Above is a scatter plot of the number of uploads against subscribers. Both axes use a log scale for easier readability. The graph shows no clear relationship between the number of uploads and the number of subscribers. Since this data comes from the top 1000 channels, some of the most viral videos in history have been posted by these channels. This means some channels, especially the music ones, may have a lower number of uploads. A small number of videos that go viral can therefore lead to the same number of subscribers as a channel that posts frequently.
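One way to put a number on what the scatter plot shows is the Pearson correlation of the log-transformed values. The sketch below is an assumption about how this could be checked, not the method used in this report; the helper name `log_log_correlation` and the sample values are hypothetical. Channels with zero uploads are dropped because the logarithm of zero is undefined.

```python
import math


def log_log_correlation(uploads, subscribers):
    """Pearson correlation between log10(uploads) and log10(subscribers)."""
    # Keep only pairs where both values are positive, since log10(0) is undefined
    pairs = [(math.log10(u), math.log10(s))
             for u, s in zip(uploads, subscribers) if u > 0 and s > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, perfectly related sample; real data would be far noisier
sample_uploads = [0, 10, 100, 1000]
sample_subscribers = [5, 100, 1000, 10000]
r = log_log_correlation(sample_uploads, sample_subscribers)
print(round(r, 3))
```

A value near 0 on the real data would support the reading that upload volume alone says little about subscriber count, while a value near 1 or -1 would contradict it.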
Sentiment Analysis of Top YouTubers
Now that the channel data has been analyzed, a closer look can be taken at the sentiment toward some of the top YouTubers. This is done by using the YouTube API to collect 200 comments from each of four top channels: MrBeast, PewDiePie, WWE, and Justin Bieber.
These four were chosen because they leave their comment sections turned on, their viewers primarily comment in English, and they represent different categories. MrBeast falls under entertainment and people and blogs, PewDiePie falls under gaming, WWE falls under entertainment and sports, and Justin Bieber falls under music. These are four of the top categories and types identified earlier. By analyzing their comment sections, we can identify whether anything makes their channels stand out. Comments were taken from each channel's most popular and most recent posts.
After the data was collected, a few changes had to be made to get it ready for sentiment analysis. First, the data was extracted from the nested data frames returned by the API call to retrieve just the fields needed for analysis. Then a column was added identifying the name of the channel.
From there, the comments were split into individual words, and stop words were removed so that they would not skew the results. Both the Bing and NRC lexicons are used for this analysis.
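The word-level split and stop-word removal described above can be sketched as follows. This is a simplified illustration: the tiny `STOP_WORDS` set and the helper name `tokenize` are hypothetical, and a real analysis would rely on a full stop-word list from a text-mining library.

```python
import re

# Tiny illustrative stop-word list; an assumption, not the list used in the analysis
STOP_WORDS = {"the", "is", "a", "and", "this", "i", "to", "of", "it"}


def tokenize(comment):
    """Lowercase a comment, split it into words, and drop stop words."""
    words = re.findall(r"[a-z']+", comment.lower())
    return [word for word in words if word not in STOP_WORDS]


tokens = tokenize("This is the BEST video and I love it!")
print(tokens)
```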
Positive and Negative Sentiment Analysis
The first sentiment analysis checks whether the comments on these four channels contain more positive or negative words. It uses the Bing lexicon.
The above bar graph shows that, among scorable words, the overall sentiment is more positive than negative. The biggest difference is for Justin Bieber, whose comments were far more positive than negative, pointing to a more positive attitude toward a musical artist than toward a personality like MrBeast, who had the fewest positive words. PewDiePie, the only gamer on the list, had a similar number of positive and negative words, pointing toward a worse public perception than the other channels.
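A Bing-style analysis labels each word as positive or negative and counts the labels per channel. The sketch below illustrates the idea with a hypothetical five-word lexicon (`TOY_LEXICON`), hypothetical channel names, and a hypothetical helper (`sentiment_counts`); the actual Bing lexicon contains several thousand words, and words missing from the lexicon are simply not scored.

```python
from collections import Counter

# Hypothetical stand-in for the Bing lexicon (word -> "positive" or "negative")
TOY_LEXICON = {
    "love": "positive", "great": "positive", "amazing": "positive",
    "bad": "negative", "boring": "negative",
}


def sentiment_counts(tokens_by_channel):
    """Count positive and negative words per channel, ignoring unscorable words."""
    counts = {}
    for channel, tokens in tokens_by_channel.items():
        labels = Counter(TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON)
        counts[channel] = {
            "positive": labels["positive"],
            "negative": labels["negative"],
            "net": labels["positive"] - labels["negative"],
        }
    return counts


# Hypothetical tokens, not real comments
sample_tokens = {
    "Channel A": ["love", "great", "video", "bad"],
    "Channel B": ["boring", "bad", "amazing"],
}
scores = sentiment_counts(sample_tokens)
print(scores)
```

The net score (positive minus negative) is one simple way to compare channels when each has the same number of comments, as is the case here with 200 per channel.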
Emotional Sentiments
Using the NRC lexicon, an analysis using emotional sentiments can be created for each channel.
Above is a bar graph showcasing the emotional sentiments for each channel. Again, Justin Bieber dominates the positive emotions like joy. This could point to more overall positivity toward musical artists who are well liked. There is much more anger and anticipation in the WWE comments, pointing to how an entertainment giant can build up big events on its social media. PewDiePie again had more negative language. This could be a result of the gaming community's lack of formality and willingness to post harsh comments. Overall, these sentiment analyses show that the top YouTube channels are often perceived positively by their viewers, but the sentiment does seem to vary by channel and maybe even by genre as a whole.
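Unlike the Bing lexicon, the NRC lexicon can assign a single word to several emotions at once. The sketch below shows how emotion shares could be tallied for one channel. The lexicon `TOY_EMOTIONS` and the helper `emotion_shares` are hypothetical, and normalizing counts to shares is an assumption that makes channels with different numbers of scorable words easier to compare.

```python
from collections import Counter

# Hypothetical stand-in for the NRC lexicon (word -> list of emotions)
TOY_EMOTIONS = {
    "love": ["joy", "positive"],
    "fight": ["anger", "negative"],
    "soon": ["anticipation"],
    "win": ["joy", "anticipation", "positive"],
}


def emotion_shares(tokens):
    """Return each emotion's share of all emotion tags found in the tokens."""
    counts = Counter()
    for token in tokens:
        # A word contributes one tag for every emotion it is associated with
        counts.update(TOY_EMOTIONS.get(token, []))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: count / total for emotion, count in counts.items()}


# Hypothetical tokens, not real comments
shares = emotion_shares(["fight", "soon", "win", "tonight"])
for emotion, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{emotion}: {share:.0%}")
```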
Conclusion
A lot of factors go into what makes a YouTube channel popular. This document aimed to identify trends among the top YouTube channels in the world. Many of these channels are from the US, but a large share, including the world's most subscribed channel, are from India. Certain categories like entertainment, gaming, people and blogs, and music dominated the top subscriber counts. These factors were the most convincing commonalities among these YouTube channels. The comments on four of the top channels show that sentiment can differ between channels, and that the comments on a channel like Justin Bieber's are far more positive than those on a channel like PewDiePie's. Overall, there is a lot that goes into what makes a YouTube channel popular, but certain trends can be identified.