Google News Recommender Algorithm

Scenario Design:

1) Who are your target users?

Google News targets people who:

  1. look for their news online
  2. are open to information from various sources
  3. read many articles

2) What are their goals?

Ideally, a reader wants the most accurate news from credible sources on topics that interest them.

In practice, it seems that readers too often look for articles that validate their position, drawn from a subset of sources that shares their world view.

3) How can you help them accomplish those goals?

To help each user find the most accurate news from credible sources on relevant topics, Google News could:

  • assign a credibility index to each news source
  • assign topics/keywords to every article, old and new
  • determine the topics of interest for each user
  • suggest articles with relevant topics and the highest possible credibility score
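
The four steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not Google's method: the Article type, the suggest_credible function, the source names, and the credibility values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    title: str
    source: str
    topics: frozenset  # topics/keywords assigned to the article


def suggest_credible(articles, user_topics, credibility, limit=3):
    """Keep articles that share a topic with the user, most credible source first."""
    relevant = [a for a in articles if a.topics & user_topics]
    # Sources missing from the (hypothetical) credibility table score 0.0.
    relevant.sort(key=lambda a: credibility.get(a.source, 0.0), reverse=True)
    return relevant[:limit]


articles = [
    Article("Budget vote delayed", "Source A", frozenset({"politics"})),
    Article("Budget vote: what we know", "Source B", frozenset({"politics", "economy"})),
    Article("Cup final preview", "Source A", frozenset({"sports"})),
]
credibility = {"Source A": 0.6, "Source B": 0.9}  # made-up scores in [0, 1]

picks = suggest_credible(articles, frozenset({"politics"}), credibility)
for article in picks:
    print(article.source, "-", article.title)
```

The sports article is filtered out because it shares no topic with the user, and the remaining two are ordered by the credibility of their source.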

To provide each user with articles they are most likely to be interested in and agree strongly with, Google News could:

  • track the most commonly clicked news sources for each user
  • assign topics/keywords to every article, old and new
  • determine the topics of interest for each user
  • suggest articles on relevant topics from the most agreeable sources for each user

Reverse Engineering the Algorithm

Google News breaks down the news into categories on the left banner (Top Stories, For You, U.S., World, Local, Technology, etc.). This indicates that every article is tagged with a category or keyword. The “Local” option indicates that each article is also tagged with a geographical location that is cross-referenced against the user’s location.

The user is offered the option of selecting topics that help filter the articles shown. These include both general topics (like Sports) and specific ones (like particular teams or people). This suggests that, in addition to the overall category associated with an article, Google News finds the topics mentioned within the article and attaches additional tags for them.

Within the preferences a user can specify, there is a section for preferred sources and sources to avoid. When combined with the articles the user tends to click, this can be aggregated into a Source Preference Index for each user.
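
As a rough sketch of how such an index might be aggregated, the example below combines click history with the explicit source preferences. The function name, the [0, 1] scale, and the rule that explicit choices override click shares are assumptions made for illustration.

```python
from collections import Counter


def source_preference_index(clicked_sources, preferred=(), avoided=()):
    """Map each source to a score in [0, 1] (hypothetical scale)."""
    clicks = Counter(clicked_sources)
    total = sum(clicks.values())
    # Baseline: each source's share of the user's clicks.
    index = {src: n / total for src, n in clicks.items()} if total else {}
    # Assumption: explicit preferences override what the clicks suggest.
    for src in preferred:
        index[src] = 1.0
    for src in avoided:
        index[src] = 0.0
    return index


index = source_preference_index(
    ["Source A", "Source A", "Source B", "Source C"],
    preferred=["Source B"],
    avoided=["Source C"],
)
print(index)
```

Here Source A keeps its click share of 0.5, while the preferred and avoided sources are pinned to the top and bottom of the scale.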

The user is also offered three review-like options on each article they are shown:

  1. Hide all stories from [News Source]
  2. More stories like this
  3. Fewer stories like this

The first option here would affect the Source Preference Index, lowering the priority of the [News Source] in question (most likely lowering it to the minimum).

The other two indicate that, in addition to a Source Preference Index, Google News compiles some sort of Topic Interest Index. This index would most likely be populated initially from the topics the user marks as interesting, then updated continually based on selections of these two options and, potentially, on which articles the user chooses to open or ignore.
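
One way such an update could work is sketched below. The step size, the neutral starting value of 0.5 for unseen topics, and the function name are all assumptions; the real update rule is not visible from the outside.

```python
def update_topic_interest(index, article_topics, feedback, step=0.2):
    """Return a new index after 'more' or 'fewer' feedback, clamped to [0, 1]."""
    delta = {"more": step, "fewer": -step}[feedback]
    updated = dict(index)  # leave the caller's index untouched
    for topic in article_topics:
        current = updated.get(topic, 0.5)  # assumed neutral default
        updated[topic] = min(1.0, max(0.0, current + delta))
    return updated


# Initial index built from topics the user marked as interesting.
interest = {"sports": 1.0, "celebrities": 0.5}
interest = update_topic_interest(interest, ["celebrities", "music"], "fewer")
print(interest)
```

A "Fewer stories like this" click lowers every topic tagged on the article, including topics the user never rated, while unrelated topics such as sports are left alone.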

Top Stories appears to simply be an aggregation of the most visited articles at the time, though based on the article sources in my feed, I suspect that the Source Preference Index mentioned above is in use here as well. The presence of this section indicates that overall hits are tracked for each article as a measure of popularity.

The place where the recommender algorithm should play the largest role is the “For You” section of the news feed. This section always shows fairly recent articles, which is no surprise in today’s constant news cycle. The four articles listed here are most likely the ones that score highest on a combination of:

  1. Number of views
  2. Topic Interest Index
  3. Source Preference Index

The weighting of the three criteria is open to debate, though I suspect it follows the order in which they are listed above.
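
A weighted sum is one simple way to combine the three criteria. The sketch below is a guess at the general shape, not the actual algorithm: the weights (0.5, 0.3, 0.2), the view-count normalization, and the sample articles are invented for illustration.

```python
def for_you_score(views, topic_interest, source_preference, max_views,
                  weights=(0.5, 0.3, 0.2)):
    """Weighted sum of popularity, topic interest, and source preference."""
    # Scale raw views into [0, 1] so all three terms are comparable.
    popularity = views / max_views if max_views else 0.0
    w_views, w_topic, w_source = weights
    return (w_views * popularity
            + w_topic * topic_interest
            + w_source * source_preference)


# title -> (views, Topic Interest Index, Source Preference Index); made-up data
candidates = {
    "Article X": (10_000, 0.1, 0.2),
    "Article Y": (6_000, 0.9, 0.9),
    "Article Z": (8_000, 0.6, 0.1),
}
max_views = max(views for views, _, _ in candidates.values())

ranked = sorted(
    candidates,
    key=lambda title: for_you_score(*candidates[title], max_views),
    reverse=True,
)
print(ranked[:4])  # the section shows four articles
```

Even with views weighted most heavily, Article Y ranks first here because it scores well on both personal indices, which is the behavior one would expect from a personalized section.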

Recommendations

If I had to make one recommendation, it would be to add something along the lines of a Credibility Index. The mandate of news organizations is to inform and educate the public, not to validate its beliefs. This index would be used to prioritize news sources, or perhaps even individual writers. It could also drive a separate section offering articles on topics that interest the user, drawn from credible news sources outside their normal reading preferences.

A suggestion to improve the current approach would be to show the user a few of the topics/keywords associated with an article when they choose to see more or fewer stories like it. This would increase the resolution at which they could adjust the Topic Interest Index. If, for example, I choose “Fewer stories like this” on an article titled “Where Kylie Jenner and Drake Really Stand Amid Romance Rumors”, I would like the option to specify whether it is Kylie Jenner, Drake, or celebrity relationships that I don’t want to see in my news feed. (It’s all three, so in this case I hope the algorithm lowers all three of those topics in my Topic Interest Index.)
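
The finer-grained feedback suggested here could look something like the sketch below. The see_fewer function, the step size, and the starting scores are hypothetical; the point is only that the user's selection decides which tags are lowered.

```python
def see_fewer(index, article_tags, selected=None, step=0.3):
    """Lower interest for the selected tags; with no selection, lower all tags."""
    if selected is None:
        targets = list(article_tags)
    else:
        targets = [tag for tag in article_tags if tag in selected]
    updated = dict(index)
    for tag in targets:
        # Unseen tags start from an assumed neutral 0.5; scores never go below 0.
        updated[tag] = max(0.0, updated.get(tag, 0.5) - step)
    return updated


tags = ["Kylie Jenner", "Drake", "celebrity relationships"]
interest = {"Drake": 0.8, "Kylie Jenner": 0.5, "celebrity relationships": 0.5}

# The user objects to two of the three tags but still wants Drake news.
partial = see_fewer(interest, tags, selected={"Kylie Jenner", "celebrity relationships"})
# No selection: fall back to lowering every tag on the article.
everything = see_fewer(interest, tags)
print(partial)
print(everything)
```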