According to this article, YouTube is the second most visited website in the United States with over 400 hours of content posted every minute.
The target users are anyone with an interest in online video sharing. They could be users posting random content, categorized content on a channel or live streaming. Users can also be visitors to the site browsing for content, either specific or recommended.
In terms of subject-specific content, here are the top interests for male and female users.
A look at the age distribution of users.
The key goals of users are to easily find the content they are searching for and to easily share the videos they have uploaded. They want good recommendations for similar videos and often participate by commenting on videos or channels.
Given that YouTube is the second most visited website in the United States and maintains the largest online video sharing platform, you could say that YouTube is succeeding in helping users accomplish their goals. Users can easily upload content and share it on the YouTube website itself or as embedded video on other web pages.
Based on the analysis that we will get into below, it seems that YouTube could fine-tune its recommendation system to reflect more of the user's search history.
Let’s take a look at my YouTube recommendations found on the homepage. For each recommendation, we can see a thumbnail, a title, an uploader, a number of views and how recently it was posted.
Let’s study each recommendation and see if it makes sense.
Since soccer is the top male-dominated category and also one of my interests, it makes sense that I am recommended soccer videos. It makes sense to recommend a song I’ve searched for before and might want to listen to again. It also makes sense to show me a song or an interview similar to one I’ve listened to before.
A quick look at my recent YouTube history reveals that I watch a lot of data science videos. The recent data science content eclipses all other content. What seems to be missing is that my data-science-heavy search and view history is not reflected in my top recommendations.
YouTube uses deep neural network recommendations instead of matrix factorization approaches because the user-video matrices are immensely sparse, which complicates computation. There are two networks at play here:
One network generates candidate recommendations by processing the following information:
- IDs of videos being watched
- search history
- user-level demographics
It outputs a few hundred videos that might broadly be applicable to the user. The emphasis here is on precise relevance to the user, even if it forgoes content which may be widely popular but irrelevant.
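To make the candidate generation step more concrete, here is a minimal sketch of the general idea of retrieving candidates by embedding similarity. It is not YouTube's implementation: the video names, the 2-dimensional embeddings and the functions `user_vector` and `generate_candidates` are all made up for illustration, and the user vector is simply the average of the watched videos' embeddings rather than the output of a trained network.

```python
from typing import Dict, List

# Hypothetical toy embeddings; a real system would learn these during training.
VIDEO_EMBEDDINGS: Dict[str, List[float]] = {
    "soccer_highlights": [1.0, 0.0],
    "soccer_interview": [0.9, 0.1],
    "pandas_tutorial": [0.0, 1.0],
    "neural_nets_intro": [0.1, 0.9],
    "cooking_pasta": [0.5, 0.5],
}


def user_vector(watched: List[str]) -> List[float]:
    """Summarize a watch history as the average of the watched videos' embeddings."""
    if not watched:
        raise ValueError("watch history must not be empty")
    dims = len(VIDEO_EMBEDDINGS[watched[0]])
    total = [0.0] * dims
    for video_id in watched:
        for i, value in enumerate(VIDEO_EMBEDDINGS[video_id]):
            total[i] += value
    return [value / len(watched) for value in total]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(watched: List[str], n: int) -> List[str]:
    """Return the n unseen videos whose embeddings best match the user vector."""
    user = user_vector(watched)
    unseen = [v for v in VIDEO_EMBEDDINGS if v not in watched]
    unseen.sort(key=lambda v: dot(user, VIDEO_EMBEDDINGS[v]), reverse=True)
    return unseen[:n]


candidates = generate_candidates(["pandas_tutorial"], n=2)
print(candidates)
```

With only a data science video in the history, the two candidates closest to the user vector are the other data science video and the mixed-topic one, while the soccer videos are left out. A production system would return a few hundred candidates from a corpus of millions.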
A second network ranks these candidates, taking a richer set of features for each video and scoring each item. The goal here is to have high recall: it’s okay for some recommendations to not be very relevant as long as the most relevant items are present.
The networks are trained using hold-out data: given a user's history up to time t, the system is asked to predict what the user would like to watch at time t+1. This is important to consider given the episodic nature of some YouTube content.
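As a rough illustration of this "predict the next watch" setup, the sketch below turns a chronologically ordered watch history into training pairs in which the input only ever contains videos watched before the label. The function name `next_watch_examples` and the episode IDs are hypothetical, and the actual pipeline described in the paper involves many more features.

```python
from typing import List, Tuple


def next_watch_examples(history: List[str]) -> List[Tuple[List[str], str]]:
    """Build (history up to time t, video watched at time t+1) pairs.

    The history is assumed to be ordered from oldest to newest, so the
    input side of each pair never leaks information from the future.
    """
    examples = []
    for t in range(1, len(history)):
        examples.append((history[:t], history[t]))
    return examples


history = ["episode_1", "episode_2", "episode_3"]
examples = next_watch_examples(history)
for inputs, label in examples:
    print(inputs, "->", label)
```

For an episodic series this ordering matters: the model learns that episode 2 tends to follow episode 1, instead of being asked to guess an earlier episode from later ones.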
The objective of the ranking system here is to maximize the expected watch time for any given recommendation. Covington et al. (from the paper linked below) chose to maximize watch time rather than the probability of a click, because “clickbait” titles are common in videos. Predicted watch time is modeled using weighted logistic regression.
For further details, please see Deep Neural Networks for YouTube Recommendations by Google and How YouTube Recommends Videos.
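The watch-time objective can be sketched with a small weighted logistic regression, which is the variant the paper describes: clicked impressions are weighted by their watch time, unclicked impressions get a weight of 1, and the exponential of the learned logit is used as the watch-time score. Everything else below is an assumption for illustration: the single "matches the user's interests" feature, the toy impressions, the plain gradient descent loop and the function names.

```python
import math
from typing import List, Tuple

# Hypothetical impressions: (interest-match feature, clicked?, seconds watched).
IMPRESSIONS: List[Tuple[float, bool, float]] = [
    (1.0, True, 300.0),
    (1.0, True, 200.0),
    (1.0, False, 0.0),
    (0.0, True, 20.0),
    (0.0, False, 0.0),
    (0.0, False, 0.0),
    (0.0, False, 0.0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(impressions: List[Tuple[float, bool, float]],
          steps: int = 5000, lr: float = 0.5) -> Tuple[float, float]:
    """Fit a one-feature logistic regression with per-example weights."""
    w, b = 0.0, 0.0
    # Positives are weighted by watch time, negatives by 1.
    weights = [watch if clicked else 1.0 for _, clicked, watch in impressions]
    total = sum(weights)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for (x, clicked, _), weight in zip(impressions, weights):
            error = sigmoid(w * x + b) - (1.0 if clicked else 0.0)
            grad_w += weight * error * x
            grad_b += weight * error
        # Normalize by the total weight so the step size stays stable.
        w -= lr * grad_w / total
        b -= lr * grad_b / total
    return w, b


def expected_watch_time(x: float, w: float, b: float) -> float:
    """Score an impression by the learned odds, exp(logit)."""
    return math.exp(w * x + b)


w, b = train(IMPRESSIONS)
relevant_score = expected_watch_time(1.0, w, b)
irrelevant_score = expected_watch_time(0.0, w, b)
print(relevant_score > irrelevant_score)
```

Because long watches pull the odds up much more than short ones, a video that gets clicked and then abandoned after a few seconds scores poorly, which is the intended effect against clickbait.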
Given what we now know about how YouTube recommends videos, my personal homepage recommendations make sense. The majority of the content is something I have watched or would watch. However, I would expect my more recent search history to carry more weight in my recommendations.
Based on my recent searches and interests, I would expect more content regarding:
- Data science
- Drone videos
- Construction engineering methods and equipment (related to my job)
- Recipes
- Music
Since this is information that the candidate generation system takes as input, I suspect the system has room for improvement if its outputs do not reflect some of these topics. However, it is also possible that the ranking system is scoring the candidate recommendations inadequately.
To provide users with content more tailored to their interests, YouTube could offer a menu option where preferences can be stated explicitly. This would reduce the reliance on past searches or channel subscriptions, since users are not always inclined to indicate their preferences that way.
How YouTube Recommends Videos: https://towardsdatascience.com/how-youtube-recommends-videos-b6e003a5ab2f
Deep Neural Networks for YouTube Recommendations: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45530.pdf
The demographics of YouTube, in 5 charts: https://digiday.com/media/demographics-youtube-5-charts/