Research Approach: A few R packages are available for building recommendation systems, and the most commonly used is recommenderlab. I have used recommenderlab throughout this course so far, and have performed the following analyses with it: item-based collaborative filtering, user-based collaborative filtering, and SVD. In this project, I apply all three methods to the MovieLense dataset and compare their performance.
User-Based Collaborative Filtering (UBCF) vs. Item-Based Collaborative Filtering (IBCF)
This gif illustrates the most commonly used recommendation system model: collaborative filtering. Collaborative filtering answers the question, "What items do users with interests similar to yours like?"
[Figures: collaborative filtering illustrations]
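To make the idea concrete, here is a minimal sketch using a small hypothetical rating matrix (not the MovieLense data) of the similarity computation behind both flavours: UBCF compares users (rows), IBCF compares items (columns).
# Toy user x item rating matrix (hypothetical values; 0 means "not rated").
toy <- matrix(c(5, 4, 0, 1,
                4, 5, 1, 0,
                1, 0, 5, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("user", 1:3), paste0("item", 1:4)))

# Cosine similarity between two rating vectors.
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# UBCF: users 1 and 2 rate the same items highly, so they are close neighbours
# and each can receive the other's well-rated items as recommendations.
cosine(toy["user1", ], toy["user2", ])

# IBCF: items 1 and 2 are liked by the same users, so a fan of item1 is
# recommended item2.
cosine(toy[, "item1"], toy[, "item2"])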
The dataset I chose for this project is the MovieLense dataset. It already ships with the recommenderlab package, so we will use that dataset and explore it first before applying SVD (Singular Value Decomposition) and the collaborative filtering models.
library(recommenderlab)
## Loading required package: Matrix
## Loading required package: arules
##
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
##
## abbreviate, write
## Loading required package: proxy
##
## Attaching package: 'proxy'
## The following object is masked from 'package:Matrix':
##
## as.matrix
## The following objects are masked from 'package:stats':
##
## as.dist, dist
## The following object is masked from 'package:base':
##
## as.matrix
## Loading required package: registry
## Registered S3 methods overwritten by 'registry':
## method from
## print.registry_field proxy
## print.registry_entry proxy
library(ggplot2)
library(tidyverse)
## -- Attaching packages -------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v tibble 3.0.0 v dplyr 0.8.5
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## v purrr 0.3.4
## -- Conflicts ----------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x tidyr::expand() masks Matrix::expand()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x tidyr::pack() masks Matrix::pack()
## x dplyr::recode() masks arules::recode()
## x tidyr::unpack() masks Matrix::unpack()
library(pander)
Let's load the dataset first.
data(MovieLense)
movielense <- MovieLense # Loading the movie dataset
print(paste0("The dimensions of dataset : (Users x Movies)", nrow(movielense), " x ",ncol(movielense)))
## [1] "The dimensions of dataset : (Users x Movies)943 x 1664"
Let's see the different ratings given by users.
mvector <- as.vector(movielense@data)
mvector <- mvector[mvector != 0]
unique(mvector)
## [1] 5 4 3 1 2
mvector <- factor(mvector)
qplot(mvector, fill = I("blue"), col = I("red")) + ggtitle("Histogram of Ratings") +
  xlab("Ratings") + ylab("Count")
### Top Ten Movies
movie_watched <- data.frame(
movie_name = names(colCounts(movielense)),
watched_times = colCounts(movielense)
)
top_ten_movies <- movie_watched[order(movie_watched$watched_times, decreasing = TRUE), ][1:10, ]
ggplot(top_ten_movies) + aes(x=movie_name, y=watched_times) +
geom_bar(stat = "identity",fill = "firebrick4", color = "dodgerblue2") + xlab("Movie Tile") + ylab("Count") +
theme(axis.text = element_text(angle = 40, hjust = 1))
### Average Movie Rating Histogram
qplot(colMeans(movielense)) + stat_bin(binwidth =0.25,fill=I("blue"), col=I("red")) +
xlim(0,5)+
xlab("Average Rating") + ylab("Count") +
ggtitle("Average Ratings Counts Histogram")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 2 rows containing missing values (geom_bar).
## Warning: Removed 2 rows containing missing values (geom_bar).
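Before looking at individual entries, it is worth checking how sparse the rating matrix is; the sketch below reuses the objects defined above (mvector holds the non-zero ratings) and is only a rough exploratory check.
# Share of user/movie cells that actually contain a rating.
n_ratings <- length(mvector)                      # non-zero ratings counted above
n_cells   <- nrow(movielense) * ncol(movielense)  # 943 x 1664 possible cells
round(n_ratings / n_cells, 4)                     # only a small fraction of cells are filled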
We would now like to take a peek into the database. Below are the ratings for the first few users and movies, the first and last rows of the MovieLense metadata, and a look at which movies the first user has rated.
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
# View the data: a 10 x 10 example
y<-as.matrix(movielense@data[1:10,1:10])
y %>% kable (caption ="DataExample") %>% kable_styling ("striped", full_width=TRUE)
Toy Story (1995) | GoldenEye (1995) | Four Rooms (1995) | Get Shorty (1995) | Copycat (1995) | Shanghai Triad (Yao a yao yao dao waipo qiao) (1995) | Twelve Monkeys (1995) | Babe (1995) | Dead Man Walking (1995) | Richard III (1995) |
---|---|---|---|---|---|---|---|---|---|
5 | 3 | 4 | 3 | 3 | 5 | 4 | 1 | 5 | 3 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 2 | 4 | 4 | 0 |
0 | 0 | 0 | 5 | 0 | 0 | 5 | 5 | 5 | 4 |
0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 5 | 4 | 0 | 0 | 0 |
4 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 4 | 0 |
moviemeta <- MovieLenseMeta
pander(head(moviemeta),caption = "First few Rows within Movie Meta Data ")
title | year |
---|---|
Toy Story (1995) | 1995 |
GoldenEye (1995) | 1995 |
Four Rooms (1995) | 1995 |
Get Shorty (1995) | 1995 |
Copycat (1995) | 1995 |
Shanghai Triad (Yao a yao yao dao waipo qiao) (1995) | 1995 |
Action | Adventure | Animation | Children’s | Comedy | Crime | Documentary |
---|---|---|---|---|---|---|
0 | 0 | 1 | 1 | 1 | 0 | 0 |
1 | 1 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 1 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Thriller | War | Western |
---|---|---|
0 | 0 | 0 |
1 | 0 | 0 |
1 | 0 | 0 |
0 | 0 | 0 |
1 | 0 | 0 |
0 | 0 | 0 |
pander(tail(moviemeta), caption = "Last few Rows within Movie Meta Data")
 | title | year |
---|---|---|
1676 | War at Home, The (1996) | 1996 |
1677 | Sweet Nothing (1995) | 1996 |
1678 | Mat’ i syn (1997) | 1998 |
1679 | B. Monkey (1998) | 1998 |
1681 | You So Crazy (1994) | 1994 |
1682 | Scream of Stone (Schrei aus Stein) (1991) | 1996 |
 | unknown | Action | Adventure | Animation | Children’s | Comedy |
---|---|---|---|---|---|---|
1676 | 0 | 0 | 0 | 0 | 0 | 0 |
1677 | 0 | 0 | 0 | 0 | 0 | 0 |
1678 | 0 | 0 | 0 | 0 | 0 | 0 |
1679 | 0 | 0 | 0 | 0 | 0 | 0 |
1681 | 0 | 0 | 0 | 0 | 0 | 1 |
1682 | 0 | 0 | 0 | 0 | 0 | 0 |
 | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror |
---|---|---|---|---|---|---|
1676 | 0 | 0 | 1 | 0 | 0 | 0 |
1677 | 0 | 0 | 1 | 0 | 0 | 0 |
1678 | 0 | 0 | 1 | 0 | 0 | 0 |
1679 | 0 | 0 | 0 | 0 | 0 | 0 |
1681 | 0 | 0 | 0 | 0 | 0 | 0 |
1682 | 0 | 0 | 1 | 0 | 0 | 0 |
 | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western |
---|---|---|---|---|---|---|---|
1676 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1677 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1678 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1679 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
1681 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1682 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
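Since the metadata stores one binary indicator column per genre, a quick exploratory summary of how many movies fall into each genre can be computed directly from it (a small sketch; the non-genre columns are assumed to be title, year, and url):
genre_cols <- !(names(moviemeta) %in% c("title", "year", "url"))
sort(colSums(moviemeta[, genre_cols]), decreasing = TRUE)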
# Look at the first 8 ratings of the first user
head(as(movielense[1,], "list")[[1]], 8)
## Toy Story (1995)
## 5
## GoldenEye (1995)
## 3
## Four Rooms (1995)
## 4
## Get Shorty (1995)
## 3
## Copycat (1995)
## 3
## Shanghai Triad (Yao a yao yao dao waipo qiao) (1995)
## 5
## Twelve Monkeys (1995)
## 4
## Babe (1995)
## 1
# look at the last 4 ratings of the first user
tail(as(movielense[1,], "list")[[1]], 4)
## Full Monty, The (1997) Gattaca (1997) Starship Troopers (1997)
## 5 5 2
## Good Will Hunting (1997)
## 3
As the dataset is quite large, we trim it down to users who have rated more than 30 movies and movies that have been rated by more than 60 users.
movielense <- movielense [rowCounts(movielense) > 30, colCounts(movielense) > 60]
print(paste0("Number of Rows after filtering : ", nrow(movielense)))
## [1] "Number of Rows after filtering : 726"
print(paste0("Number of Columns after filtering : ", ncol(movielense)))
## [1] "Number of Columns after filtering : 529"
set.seed(2020)#seed as year
n_folds <- 10
to_keep <- 15
threshold <- 3
e <- evaluationScheme(movielense, method="cross-validation",k = n_folds, train=0.8, given=to_keep, goodRating=threshold)
print(e)
## Evaluation scheme with 15 items given
## Method: 'cross-validation' with 10 run(s).
## Good ratings: >=3.000000
## Data set: 726 x 529 rating matrix of class 'realRatingMatrix' with 74956 ratings.
training <- getData(e, "train")
known <- getData(e, "known")
unknown <- getData(e, "unknown")
print(paste0("Traing data has ", nrow(training)," rows"))
## [1] "Traing data has 648 rows"
print(paste0("Known Testing data has ", nrow(known)," rows"))
## [1] "Known Testing data has 78 rows"
print(paste0("Unknown Testing data has ", nrow(unknown)," rows"))
## [1] "Unknown Testing data has 78 rows"
We choose Singular Value Decomposition (SVD) as the first model and train it so that we can use it to recommend movies.
training_time <- system.time({
model_svd <- Recommender(data = training, method = "SVD")
})
print("Model training time : ")
## [1] "Model training time : "
print(training_time)
## user system elapsed
## 0.06 0.00 0.06
print(model_svd)
## Recommender of type 'SVD' for 'realRatingMatrix'
## learned using 648 users.
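Optionally, the fitted model can be inspected with getModel(), and the registered defaults for the SVD method (number of latent factors, normalization, etc.) can be looked up in the recommender registry; the exact parameter names and defaults may differ between recommenderlab versions, so treat the commented call below as a sketch.
str(getModel(model_svd), max.level = 1)                              # what the SVD model stores
recommenderRegistry$get_entry("SVD", dataType = "realRatingMatrix")  # registered defaults
# Example of overriding the number of latent factors (parameter name assumed from the registry entry):
# model_svd_k20 <- Recommender(data = training, method = "SVD", parameter = list(k = 20))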
predicted_top_ten_movies_svd <- predict(object = model_svd, newdata = known, n = 10)
predicted_top_ten_movies_df_svd <- data.frame(users = sort(rep(1:length(predicted_top_ten_movies_svd@items),
predicted_top_ten_movies_svd@n)),
ratings = unlist(predicted_top_ten_movies_svd@ratings),
index = unlist(predicted_top_ten_movies_svd@items))
predicted_top_ten_movies_df_svd$title <- predicted_top_ten_movies_svd@itemLabels[predicted_top_ten_movies_df_svd$index]
predicted_top_ten_movies_df_svd$year <- MovieLenseMeta$year[predicted_top_ten_movies_df_svd$index]
predicted_top_ten_movies_df_svd <- predicted_top_ten_movies_df_svd %>% group_by(users) %>% top_n(5,ratings)
predicted_top_ten_movies_df_svd[predicted_top_ten_movies_df_svd$users %in% (1:10), ]
## # A tibble: 50 x 5
## # Groups: users [10]
## users ratings index title year
## <int> <dbl> <int> <chr> <dbl>
## 1 1 3.51 43 Pulp Fiction (1994) 1994
## 2 1 3.44 79 Fargo (1996) 1993
## 3 1 3.42 104 2001: A Space Odyssey (1968) 1996
## 4 1 3.40 37 Star Wars (1977) 1994
## 5 1 3.39 225 Leaving Las Vegas (1995) 1996
## 6 2 4.25 43 Pulp Fiction (1994) 1994
## 7 2 4.23 166 Back to the Future (1985) 1986
## 8 2 4.22 176 Field of Dreams (1989) 1986
## 9 2 4.22 19 Braveheart (1995) 1995
## 10 2 4.21 196 Jerry Maguire (1996) 1989
## # ... with 40 more rows
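One caveat: the index column above refers to columns of the filtered rating matrix, not to rows of MovieLenseMeta, so indexing MovieLenseMeta$year by it can pair a title with the wrong year (e.g. Fargo (1996) is shown with 1993 above). A hedged alternative, done here on an illustrative copy, is to match on the title string instead:
svd_top_with_year <- predicted_top_ten_movies_df_svd   # illustrative copy
svd_top_with_year$year <- MovieLenseMeta$year[match(svd_top_with_year$title,
                                                    MovieLenseMeta$title)]
head(svd_top_with_year)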
svd_prediction <- predict(object = model_svd, newdata = known, n = 10, type = "ratings")
print("Acuracy Matrix SVD :")
## [1] "Acuracy Matrix SVD :"
print(calcPredictionAccuracy(x = svd_prediction, data = unknown, byUser = FALSE))
## RMSE MSE MAE
## 0.9986768 0.9973553 0.7918828
training_time <- system.time({
model_ibcf_cosine <- Recommender(data = training, method = "IBCF", parameter = list(method = "Cosine"))
})
print("Model training time : ")
## [1] "Model training time : "
print(training_time)
## user system elapsed
## 1.00 0.15 1.20
print(model_ibcf_cosine)
## Recommender of type 'IBCF' for 'realRatingMatrix'
## learned using 648 users.
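IBCF does most of its work at training time, precomputing an item-to-item similarity matrix, which is why its training step is noticeably slower than SVD's. Here is a quick, version-dependent peek at what the model object stores (recent recommenderlab versions keep the similarity matrix in $sim, truncated to the k = 30 most similar items per movie by default):
str(getModel(model_ibcf_cosine), max.level = 1)
dim(getModel(model_ibcf_cosine)$sim)   # item x item similarity matrix, if stored under $sim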
predicted_top_ten_movies_ibcf_cosine <- predict(object = model_ibcf_cosine, newdata = known, n = 10)
predicted_top_ten_movies_df_ibcf_cosine <- data.frame(users = sort(rep(1:length(predicted_top_ten_movies_ibcf_cosine@items),
predicted_top_ten_movies_ibcf_cosine@n)),
ratings = unlist(predicted_top_ten_movies_ibcf_cosine@ratings),
index = unlist(predicted_top_ten_movies_ibcf_cosine@items))
predicted_top_ten_movies_df_ibcf_cosine$title <- predicted_top_ten_movies_ibcf_cosine@itemLabels[predicted_top_ten_movies_df_ibcf_cosine$index]
predicted_top_ten_movies_df_ibcf_cosine$year <- MovieLenseMeta$year[predicted_top_ten_movies_df_ibcf_cosine$index]
predicted_top_ten_movies_df_ibcf_cosine <- predicted_top_ten_movies_df_ibcf_cosine %>% group_by(users) %>% top_n(5,ratings)
predicted_top_ten_movies_df_ibcf_cosine[predicted_top_ten_movies_df_ibcf_cosine$users %in% (1:10), ]
## # A tibble: 93 x 5
## # Groups: users [10]
## users ratings index title year
## <int> <dbl> <int> <chr> <dbl>
## 1 1 5 3 Four Rooms (1995) 1995
## 2 1 5 14 Mr. Holland's Opus (1995) 1994
## 3 1 5 29 Net, The (1995) 1995
## 4 1 5 38 Legends of the Fall (1994) 1995
## 5 1 5 51 While You Were Sleeping (1995) 1994
## 6 1 5 59 Firm, The (1993) 1994
## 7 1 5 67 Sleepless in Seattle (1993) 1994
## 8 1 5 80 Heavy Metal (1981) 1993
## 9 1 5 85 Truth About Cats & Dogs, The (1996) 1994
## 10 1 5 88 Rock, The (1996) 1993
## # ... with 83 more rows
ibcf_prediction <- predict(object = model_ibcf_cosine, newdata = known, n = 10, type = "ratings")
print("Acuracy Matrix IBCF :")
## [1] "Acuracy Matrix IBCF :"
print(calcPredictionAccuracy(x = ibcf_prediction, data = unknown, byUser = FALSE))
## RMSE MSE MAE
## 1.444221 2.085774 1.092323
training_time <- system.time({
model_ubcf_cosine <- Recommender(data = training, method = "UBCF", parameter = list(method = "Cosine"))
})
print("Model training time : ")
## [1] "Model training time : "
print(training_time)
## user system elapsed
## 0 0 0
print(model_ubcf_cosine)
## Recommender of type 'UBCF' for 'realRatingMatrix'
## learned using 648 users.
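UBCF, by contrast, is lazy: training essentially just stores the (normalized) ratings, which is why the training time above is close to zero, and the neighbourhood search happens at prediction time. The registry entry lists the tunable parameters (similarity method and neighbourhood size nn); the commented call is a hedged example of changing them.
recommenderRegistry$get_entry("UBCF", dataType = "realRatingMatrix")   # documented parameters and defaults
# model_ubcf_pearson <- Recommender(data = training, method = "UBCF",
#                                   parameter = list(method = "pearson", nn = 30))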
predicted_top_ten_movies_ubcf_cosine <- predict(object = model_ubcf_cosine, newdata = known, n = 10)
predicted_top_ten_movies_df_ubcf_cosine <- data.frame(users = sort(rep(1:length(predicted_top_ten_movies_ubcf_cosine@items),
predicted_top_ten_movies_ubcf_cosine@n)),
ratings = unlist(predicted_top_ten_movies_ubcf_cosine@ratings),
index = unlist(predicted_top_ten_movies_ubcf_cosine@items))
predicted_top_ten_movies_df_ubcf_cosine$title <- predicted_top_ten_movies_ubcf_cosine@itemLabels[predicted_top_ten_movies_df_ubcf_cosine$index]
predicted_top_ten_movies_df_ubcf_cosine$year <- MovieLenseMeta$year[predicted_top_ten_movies_df_ubcf_cosine$index]
predicted_top_ten_movies_df_ubcf_cosine <- predicted_top_ten_movies_df_ubcf_cosine %>% group_by(users) %>% top_n(5,ratings)
predicted_top_ten_movies_df_ubcf_cosine[predicted_top_ten_movies_df_ubcf_cosine$users %in% (1:10), ]
## # A tibble: 50 x 5
## # Groups: users [10]
## users ratings index title year
## <int> <dbl> <int> <chr> <dbl>
## 1 1 3.78 79 Fargo (1996) 1993
## 2 1 3.76 37 Star Wars (1977) 1994
## 3 1 3.63 43 Pulp Fiction (1994) 1994
## 4 1 3.62 143 Return of the Jedi (1983) 1965
## 5 1 3.54 149 Godfather: Part II, The (1974) 1996
## 6 2 4.57 37 Star Wars (1977) 1994
## 7 2 4.54 79 Fargo (1996) 1993
## 8 2 4.48 143 Return of the Jedi (1983) 1965
## 9 2 4.46 77 Silence of the Lambs, The (1991) 1993
## 10 2 4.44 212 Contact (1997) 1988
## # ... with 40 more rows
ubcf_prediction <- predict(object = model_ubcf_cosine, newdata = known, n = 10, type = "ratings")
print("Acuracy Matrix UBCF :")
## [1] "Acuracy Matrix UBCF :"
print(calcPredictionAccuracy(x = ubcf_prediction, data = unknown, byUser = FALSE))
## RMSE MSE MAE
## 0.9970938 0.9941960 0.7911661
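The aggregate numbers above hide how the error varies across users; setting byUser = TRUE returns one row of RMSE/MSE/MAE per test user, which gives a quick picture of that spread (exploratory sketch).
per_user_acc <- calcPredictionAccuracy(x = ubcf_prediction, data = unknown, byUser = TRUE)
head(per_user_acc)
summary(per_user_acc[, "RMSE"])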
models_evaluation <- list(
SVD = list(name = "SVD"),
IBCF = list(name = "IBCF", param = list(method = "cosine")),
UBCF = list(name = "UBCF", param = list(method = "cosine"))
)
lerror <- evaluate(x = e, method = models_evaluation, type = "ratings")
## SVD run fold/sample [model time/prediction time]
## 1 [0.07sec/0.01sec]
## 2 [0.22sec/0sec]
## 3 [0.2sec/0.02sec]
## 4 [0.21sec/0sec]
## 5 [0.25sec/0.02sec]
## 6 [0.05sec/0.02sec]
## 7 [0.05sec/0.02sec]
## 8 [0.06sec/0.01sec]
## 9 [0.04sec/0.01sec]
## 10 [0.04sec/0.02sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [1.04sec/0.02sec]
## 2 [1.05sec/0.01sec]
## 3 [1.08sec/0sec]
## 4 [1.27sec/0.02sec]
## 5 [0.96sec/0.19sec]
## 6 [1.07sec/0.19sec]
## 7 [1.08sec/0.03sec]
## 8 [1.1sec/0.02sec]
## 9 [1.18sec/0.02sec]
## 10 [0.84sec/0sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.01sec/0.13sec]
## 2 [0.01sec/0.13sec]
## 3 [0sec/0.13sec]
## 4 [0sec/0.14sec]
## 5 [0.02sec/0.12sec]
## 6 [0sec/0.15sec]
## 7 [0sec/0.17sec]
## 8 [0.02sec/0.12sec]
## 9 [0sec/0.14sec]
## 10 [0.02sec/0.3sec]
mdlcmp <- as.data.frame(sapply(avg(lerror), rbind))
cmpMdl <- as.data.frame(t(as.matrix(mdlcmp)))
colnames(cmpMdl) <- c("RMSE", "MSE", "MAE")
pander(cmpMdl, caption = "Model Comparison")
 | RMSE | MSE | MAE |
---|---|---|---|
SVD | 1.037 | 1.077 | 0.8232 |
IBCF | 1.469 | 2.162 | 1.116 |
UBCF | 1.034 | 1.07 | 0.8197 |
rmse_ubcf<- calcPredictionAccuracy(x = ubcf_prediction, data = unknown, byUser = FALSE)
print (rmse_ubcf)
## RMSE MSE MAE
## 0.9970938 0.9941960 0.7911661
rmse_ibcf <- calcPredictionAccuracy(x = ibcf_prediction, data = unknown, byUser = FALSE)
print(rmse_ibcf)
## RMSE MSE MAE
## 1.444221 2.085774 1.092323
rmse_svd <- calcPredictionAccuracy(x = svd_prediction, data = unknown, byUser = FALSE)
print (rmse_svd)
## RMSE MSE MAE
## 0.9986768 0.9973553 0.7918828
library(ggplot2)
library(dplyr)
comparison <- rbind(rmse_ibcf, rmse_ubcf, rmse_svd)
comparison <- data.frame(comparison, row.names = NULL)
comparison <- cbind(model = c('IBCF', 'UBCF', 'SVD'), comparison)
comparison %>% gather('measure', 'value', -1) %>%
  ggplot(aes(x = measure, y = value, fill = model)) +
  geom_bar(stat = 'identity', position = position_dodge())
Item-based collaborative filtering performs the worst, with the largest RMSE (root mean square error). Singular value decomposition and user-based collaborative filtering perform similarly.
n_recommendations <- c(1, 3, 5, 8, 10, 15, 20, 25)
results <- evaluate(x = e, method = models_evaluation, n = n_recommendations)
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.03sec]
## 2 [0.06sec/0.02sec]
## 3 [0.05sec/0.01sec]
## 4 [0.05sec/0.03sec]
## 5 [0.04sec/0.03sec]
## 6 [0.04sec/0.03sec]
## 7 [0.06sec/0.01sec]
## 8 [0.04sec/0.04sec]
## 9 [0.04sec/0.04sec]
## 10 [0.05sec/0.03sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.91sec/0.01sec]
## 2 [1sec/0.03sec]
## 3 [0.84sec/0.02sec]
## 4 [1.13sec/0.03sec]
## 5 [1.12sec/0.02sec]
## 6 [1sec/0.03sec]
## 7 [1.15sec/0.03sec]
## 8 [1.06sec/0.04sec]
## 9 [1.07sec/0.03sec]
## 10 [0.88sec/0.03sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0sec/0.16sec]
## 2 [0.02sec/0.15sec]
## 3 [0sec/0.33sec]
## 4 [0sec/0.15sec]
## 5 [0sec/0.15sec]
## 6 [0.01sec/0.14sec]
## 7 [0sec/0.16sec]
## 8 [0sec/0.14sec]
## 9 [0sec/0.16sec]
## 10 [0.02sec/0.14sec]
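The ROC and precision/recall plots below are drawn from the confusion matrices accumulated over the ten folds; avg() exposes the averaged counts (TP/FP/FN/TN) and derived rates per recommendation list length n, so the curve values can also be inspected numerically (shown here for SVD as an example).
avg(results[["SVD"]])   # averaged confusion-matrix statistics per n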
plot(results, y="ROC", annotate = 1, legend ="topleft")
title ("ROC Curve")
plot (results, y ='prec/rec', annotate=1)
title ("Precision-Recall")
The ROC (receiver operating characteristic) curve reveals that singular value decomposition has the best area under the curve, followed by user-based collaborative filtering, while item-based collaborative filtering has the worst. The same holds for the precision-recall figure, where SVD ranks best and IBCF ranks worst.
Singular value decomposition performs better than the collaborative filtering family (UBCF and IBCF) in this movie setting. It is not surprising that, as the figure below suggests, many well-known tech companies relied on singular value decomposition in their recommendation systems until very recently.
[Figure: recommendation systems used by major tech companies]