Assignment 11 - Recommender System Discussion

According to this article, YouTube is the second most visited website in the United States with over 400 hours of content posted every minute.

Scenario Design Analysis

Who are your target users?

The target users are anyone with an interest in online video sharing. They could be users posting random content, categorized content on a channel or live streaming. Users can also be visitors to the site browsing for content, either specific or recommended.

In terms of subject-specific content, here are the top interests for male and female viewers.

[Charts: Male Viewership and Female Viewership]

A look at the age distribution of users.

https://digiday.com/media/demographics-youtube-5-charts/

What are their key goals?

The users' key goals are to easily find the content they are searching for and to easily share the video content they have uploaded. They also want good recommendations for similar videos, and they often comment on videos or channels.

How can you help them accomplish their goals?

Given that YouTube is the second most visited website in the United States and maintains the largest online video sharing platform, you could say that YouTube is succeeding in helping users accomplish their goals. They can easily upload content and share it on the YouTube website itself or as embedded video in other web pages.

Based on the analysis below, it seems that YouTube could fine-tune its recommendation system to reflect more of the user's search history.

Reverse Engineering

Let’s take a look at the YouTube recommendations found on my homepage. For each recommendation, we can see a thumbnail, a title, an uploader, a number of views and how recently it was posted.

Let’s study each recommendation and see if it makes sense.

  • Soccer videos
    • These are 4/12 of my recommendations.
    • This makes sense as I occasionally search for this kind of content.
    • There is a video on the player Ibrahimovic. I recall searching specifically for this player, so this makes sense.
    • The other three videos are more random. Two of the soccer videos are about Manchester United, which is not a team that I explicitly search for. Coincidentally, all players shown on the video thumbnails are current or former Manchester United players.
  • Cedric Villani
    • He is a French mathematician, winner of the Fields Medal in 2010 and now a member of parliament for the majority party and currently pursuing a bid for mayor of Paris. I’ve explicitly searched for him in the past for his “vulgarization” of science videos. He is always dressed in a particular way with a spider brooch. Very interesting guy.
  • Interview with Ray Dalio:
    • A legendary hedge fund manager who created a company with a very particular work culture of radical transparency. He shares his research, work and life principles for free on his website.
    • This makes sense. I recall searching for two interviews with him, one was an interview by Bloomberg.
  • A Game of Thrones fan theory video:
    • This makes some sense, as I have searched for similar content by this uploader, Alt Shift X, in the past, but I would consider it quite dated.
  • Watchmen Explained:
    • This is totally random except that it is from the same uploader, Alt Shift X.
  • Songs
    • A song by De La Soul: Somewhat random. I recall searching for an old Dr. Dre song so that could explain the link.
    • A French song. I have searched for this specific song before.
  • A video about a recent Netflix release, The King. This seems more like promoted content by Netflix.

It makes sense that I am recommended soccer videos, given that soccer is the top male-dominated category and one of my interests. It makes sense to recommend a song I’ve searched for before and might want to listen to again. It also makes sense to show me a song or an interview similar to one I’ve listened to before.

A quick look at my recent YouTube history reveals that I watch a lot of data science videos; this recent content eclipses all other content. What seems to be missing is any reflection of my data-science-heavy search and viewing history in my top recommendations.

How YouTube Recommends Videos

YouTube uses deep neural networks for its recommendations instead of matrix factorization approaches because the underlying user-video matrices are immensely sparse, which complicates computation. There are two networks at play here:

One network generates candidate recommendations by processing the following information:
- IDs of videos being watched
- search history
- user-level demographics

It outputs a few hundred videos that might broadly be applicable to the user. The emphasis here is on precise relevance to the user, even if it forgoes content which may be widely popular but irrelevant.

A second network ranks these candidates: it takes a richer set of features for each video and scores each item. The goal here is to have high recall - it’s okay for some recommendations to not be very relevant as long as the most relevant items are present.
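
To make the two-stage funnel concrete, here is a minimal Python sketch. It is not YouTube's implementation: the hand-written two-dimensional vectors stand in for learned embeddings, the `expected_watch_minutes` table stands in for the ranking network, and every name and number is made up for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec, video_vecs, k):
    """Stage 1: cheaply score the whole corpus and keep the top k video IDs."""
    by_similarity = sorted(
        video_vecs,
        key=lambda vid: dot(user_vec, video_vecs[vid]),
        reverse=True,
    )
    return by_similarity[:k]


def rank_candidates(candidates, rich_score):
    """Stage 2: reorder only the shortlist using a richer scoring function."""
    return sorted(candidates, key=rich_score, reverse=True)


# Hypothetical toy corpus: dimension 0 ~ "sports/tech", dimension 1 ~ "food".
user_vec = [1.0, 0.0]
video_vecs = {
    "soccer_a": [0.9, 0.1],
    "soccer_b": [0.8, 0.0],
    "ds_talk": [0.7, 0.2],
    "cooking": [0.0, 1.0],
}

# Hypothetical stand-in for the ranking network's output.
expected_watch_minutes = {
    "soccer_a": 1.0,
    "soccer_b": 4.0,
    "ds_talk": 9.0,
    "cooking": 20.0,
}

candidates = generate_candidates(user_vec, video_vecs, k=3)
ranked = rank_candidates(candidates, expected_watch_minutes.get)
print(candidates)
print(ranked)
```

Note that the cooking video never reaches the ranking stage, even though the stand-in ranker would score it highest: the second stage can only reorder what the first stage lets through.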

The networks are trained using hold-out data: given a user's history up to time t, the system is asked to predict what the user would like to watch at time t+1. This is important given the episodic nature of some YouTube content.
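
A small sketch of how such "predict the next watch" examples could be built from one user's history. The function name `next_watch_examples` and the episode IDs are hypothetical; the only point is that each label comes strictly after everything in its context, so no future information leaks into the input.

```python
def next_watch_examples(history, min_context=1):
    """Build (context, label) pairs from a chronologically ordered history.

    The context holds the videos watched up to time t and the label is the
    video watched at time t+1.
    """
    examples = []
    for t in range(min_context, len(history)):
        examples.append((history[:t], history[t]))
    return examples


# Hypothetical episodic series watched in order.
history = ["s01e01", "s01e02", "s01e03", "s01e04"]
examples = next_watch_examples(history)

for context, label in examples:
    print(context, "->", label)
```

With episodic content this ordering matters: a model that has only seen episodes 1 and 2 should be asked about episode 3, not about a randomly held-out episode from the middle of the series.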

The objective of the ranking system here is to maximize the expected watch time for any given recommendation. Covington et al. (authors of the paper linked below) chose to maximize watch time rather than click probability because of the prevalence of “clickbait” video titles. Predicted watch time is modeled using weighted logistic regression.
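
As I understand the paper, the trick is in the weighting: clicked impressions are weighted by their watch time and unclicked impressions get unit weight, so the odds the model learns approximate expected watch time. The sketch below checks this on made-up data with the simplest possible model, an intercept with no features, which is my simplification and not the paper's network. For that model the learned odds should equal total watch time divided by the number of unclicked impressions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_bias_weighted_logistic(impressions, lr=1.0, steps=500):
    """Fit an intercept-only logistic model with per-example weights.

    impressions: list of (clicked, watch_minutes) pairs.
    Clicked examples are weighted by watch time, unclicked ones by 1.
    """
    bias = 0.0
    total_weight = sum(m if clicked else 1.0 for clicked, m in impressions)
    for _ in range(steps):
        p = sigmoid(bias)
        grad = 0.0
        for clicked, minutes in impressions:
            weight = minutes if clicked else 1.0
            label = 1.0 if clicked else 0.0
            # Gradient of the weighted log loss with respect to the bias.
            grad += weight * (p - label)
        bias -= lr * grad / total_weight
    return bias


# Hypothetical data: 3 clicked impressions (watch minutes) and 7 unclicked.
impressions = [(True, 2.0), (True, 0.5), (True, 5.0)] + [(False, 0.0)] * 7

bias = fit_bias_weighted_logistic(impressions)
learned_odds = math.exp(bias)

total_watch = sum(m for clicked, m in impressions if clicked)
unclicked = sum(1 for clicked, _ in impressions if not clicked)
closed_form_odds = total_watch / unclicked

print(learned_odds, closed_form_odds)
```

When clicks are rare, the number of unclicked impressions is close to the total number of impressions, so these odds are close to the mean watch time per impression.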

For further details, please see Deep Neural Networks for YouTube Recommendations by Google and How YouTube Recommends Videos.

Improvements and Recommendations

Given what we now know about how YouTube recommends videos, my personal homepage recommendations make sense. The majority of the content is something I have watched or would watch. However, I would expect my more recent search history to carry more weight in my recommendations.
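
One simple way to express "recent history should weigh more" is an exponential decay on the age of each event. This is only an illustration of my suggestion, not something the paper or YouTube describes in this form; the `half_life_days` parameter, the topic labels and the ages are all invented.

```python
from collections import defaultdict


def topic_scores(events, half_life_days):
    """Sum a decayed weight per topic.

    events: list of (topic, age_in_days) pairs.
    An event loses half of its weight every half_life_days.
    """
    scores = defaultdict(float)
    for topic, age_days in events:
        scores[topic] += 0.5 ** (age_days / half_life_days)
    return dict(scores)


# Hypothetical history: many old soccer views, a few very recent data science views.
events = [
    ("soccer", 200),
    ("soccer", 180),
    ("soccer", 150),
    ("soccer", 120),
    ("data science", 3),
    ("data science", 1),
]

flat = topic_scores(events, half_life_days=1e9)    # effectively no decay
decayed = topic_scores(events, half_life_days=30)  # recent events dominate

print(max(flat, key=flat.get))
print(max(decayed, key=decayed.get))
```

Without decay the older but more numerous soccer views win; with a 30-day half-life the two recent data science views outweigh them.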

Based on my recent searches and interests, I would expect more content regarding:
- Data science
- Drone videos
- Construction engineering methods and equipment (related to my job)
- Recipes
- Music

Since search history is an input to the candidate generation system, I suspect that system has room for improvement if its outputs do not reflect some of these topics. However, it is also possible that the ranking system is scoring the candidate recommendations inadequately.

To provide users with content more tailored to their interests, YouTube could offer a menu option where preferences can be stated explicitly. This would reduce the reliance on past searches or on the types of channels a user subscribes to, since users are not always inclined to indicate their preferences that way.
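
A rough sketch of how stated preferences might be combined with the existing scores. Everything here is hypothetical, including the function `rerank_with_preferences`, the `boost` factor and the example scores; it only shows that a modest multiplier on explicitly chosen topics can change the ordering without discarding the model's own ranking.

```python
def rerank_with_preferences(candidates, stated_topics, boost=1.5):
    """Multiply the score of candidates whose topic the user explicitly chose.

    candidates: list of (title, topic, score) tuples.
    Returns the adjusted candidates sorted from highest to lowest score.
    """
    adjusted = [
        (title, topic, score * boost if topic in stated_topics else score)
        for title, topic, score in candidates
    ]
    return sorted(adjusted, key=lambda c: c[2], reverse=True)


# Hypothetical ranked output and user-stated preferences.
candidates = [
    ("Match highlights", "soccer", 0.80),
    ("Intro to gradient boosting", "data science", 0.60),
    ("Drone footage of the Alps", "drones", 0.50),
]
stated_topics = {"data science", "drones"}

reranked = rerank_with_preferences(candidates, stated_topics)
titles = [title for title, _, _ in reranked]
print(titles)
```

In this toy case the data science video moves to the top, while the soccer video still outranks the drone video because its original score was much higher.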