From the message board assignment: “You may also choose one of the three non-personalized recommenders (below) we went over in class and describe the technique and which of the three you prefer to use.”
I chose Rotten Tomatoes, as it is my favorite of the three choices (Metacritic, Rotten Tomatoes, and IMDB), at least as a recommender.
Per the link provided, Rotten Tomatoes provides three types of ratings:
- the Tomatometer score, the percentage of approved critics’ reviews that are positive;
- the Audience Score, the percentage of users who rated the title 3.5 stars or higher (out of 5);
- the Certified Fresh designation, awarded to titles that maintain a consistently high Tomatometer score across a minimum number of reviews.
This is not a particularly complicated system, nor does it use the techniques we have discussed in this class, but I still prefer it because of its ease of use. If you are perusing a list of movies or TV shows, you can quickly refer to the icons. If you are looking at a specific movie or TV show, you can see both the critics’ assessment and the regular viewers’ assessment in one place. You can also spot a polarizing movie at a glance: since the audience score only counts ratings of 3.5 stars or higher, a 60%+ audience score paired with an average star rating well below 3.5 means most users rated it 3.5 or above while the rest rated it very low, the love-it-or-hate-it phenomenon.
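To make that arithmetic concrete, here is a toy sketch in Python. All of the star ratings are made up, and the only assumption is the definition above: the audience score counts the share of ratings at 3.5 stars or higher.

```python
# Hypothetical star ratings: most viewers liked the movie, but the
# dissenters rated it very harshly.
ratings = [5.0, 4.5, 4.0, 4.0, 3.5, 3.5, 1.0, 1.0, 0.5, 0.5]

# Audience score = share of ratings at 3.5 stars or higher.
audience_score = 100 * sum(r >= 3.5 for r in ratings) / len(ratings)
average_stars = sum(ratings) / len(ratings)

print(f"Audience score: {audience_score:.0f}%")      # 60%
print(f"Average rating: {average_stars:.2f} stars")  # 2.75, well below 3.5
```

A 60% audience score with a 2.75-star average is exactly the love-it-or-hate-it pattern: the four dissenters pull the mean far below the 3.5-star cutoff that the score itself ignores.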
You can quickly get a good summary of critics’ and users’ ratings. For example, for the 2020 movie “Shirley”, I can see immediately that critics love it (87% score and Certified Fresh) while the audience is so-so or at least mixed (58% score). I can then click “See Score Details” to quickly drill into the numbers:
[Screenshot: the “Score Details” panel for “Shirley”]
It should be noted that 90% of the time I go to Metacritic, Rotten Tomatoes, or IMDB, I am going there to look up a specific movie. I rarely go there to browse and find a movie.
The article included in the assignment (see references), “Wisdom of the crowd? IMDb users gang up on Christian Bale’s new movie before it even opens”, does not specify whether the Turkish internet trolls who submitted bogus reviews to IMDB did so one by one or via some kind of automated process.
Regardless, I think the first step in eliminating or reducing the possibility of attacks on recommendation systems is technical. Your website has to be set up so that data cannot be injected into it and so that it cannot be taken over by automation. If you make it so ratings are almost certainly entered manually, then you reduce the problem to monitoring user behavior, which can be accomplished via data science methodologies. Is there a surge in profiles being created in a particular region? Is there a proper level of validation before someone is eligible to submit ratings? Is the flow of ratings reasonable, or is there a surge in rating volume?
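As an illustration of that kind of monitoring, here is a minimal Python sketch. The traffic numbers, the window, and the threshold are all invented for the example; a real site would tune them and run the check per title and per region.

```python
# Flag hours whose incoming rating count is far above the trailing mean,
# a crude stand-in for the surge detection described above.
from statistics import mean, stdev

def flag_surges(hourly_counts, window=24, threshold=3.0):
    """Return indices of hours whose count exceeds the trailing mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(hourly_counts)):
        history = hourly_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (hourly_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# 48 hours of ordinary traffic, then a suspicious burst of ratings.
counts = [20, 22, 19, 21, 18, 25, 23, 20] * 6 + [400]
print(flag_surges(counts))  # -> [48]
```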
Perhaps more important is how the ratings get into the system. Do they register in real time, or is there a temporary buffer before they count? Or can ratings flagged as suspect be diverted into a temporary buffer automatically when the detection metrics call for it?
If all ratings (or perhaps only suspect ratings) were put into a temporary buffer and then eventually released, they could then be fed into the system via streaming data technologies.
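Here is a minimal sketch of that quarantine flow in Python. Every name in it (submit, release_aged, publish, the hold period) is hypothetical; in production, publish would hand off to a streaming pipeline rather than print.

```python
import time
from collections import deque

HOLD_SECONDS = 3600  # hypothetical hold period before suspect ratings register

quarantine = deque()  # (received_at, rating) pairs awaiting release

def submit(rating, is_suspect):
    """Route a rating: suspect ones wait in quarantine, the rest go live."""
    if is_suspect(rating):
        quarantine.append((time.time(), rating))
    else:
        publish(rating)

def release_aged():
    """Forward quarantined ratings whose hold period has elapsed."""
    now = time.time()
    while quarantine and now - quarantine[0][0] >= HOLD_SECONDS:
        _, rating = quarantine.popleft()
        publish(rating)

def publish(rating):
    # Stand-in for the real ingestion path (e.g., a stream producer).
    print("registered:", rating)

# Example: quarantine anything from a brand-new account (a made-up rule).
submit({"account_age_days": 0, "stars": 0.5},
       is_suspect=lambda r: r["account_age_days"] < 1)
release_aged()  # nothing prints yet; the hold period has not elapsed
```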
There are also some more low-tech rules that could be adopted. For example, should someone be able to rate a movie before it has even been screened or released?
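That rule is easy to sketch (the dates and field names here are hypothetical): simply reject any rating whose submission date precedes the title’s release date.

```python
from datetime import date

def is_rateable(release_date: date, submitted_on: date) -> bool:
    """A rating only counts once the movie has actually been released."""
    return submitted_on >= release_date

print(is_rateable(date(2020, 6, 5), date(2020, 5, 1)))  # False: pre-release
print(is_rateable(date(2020, 6, 5), date(2020, 6, 6)))  # True
```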