Unsupervised Learning: Project 3

Recommendation System with Association Rules

Introduction

The aim of this paper is to derive possible movie recommendations from users’ ratings with the help of association rules. Association rule mining is a non-linear, unsupervised learning technique used to examine and establish relations between variables in large data sets, uncover patterns in behaviour, and more (Djenouri et al., 2018). Its most popular application is arguably market basket analysis, which identifies products that are sold together. Association rules are also often used in recommendation systems, and this is the application we are going to implement.

Association Rules Measures

We can distinguish three common measures used to judge the quality of association rules: support, confidence and lift.

Support

Simply put, support measures how often the joint itemset appears in the database in use. It is usually expressed as a fraction of all transactions.
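In symbols, one common way to write this is shown below. The notation is an assumption for illustration and is not taken from the original: D is the set of all transactions, t is a single transaction, and X and Y are the two itemsets of a rule.

```latex
\mathrm{supp}(X \Rightarrow Y) \;=\; \frac{\left|\{\, t \in D : X \cup Y \subseteq t \,\}\right|}{|D|}
```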

Confidence

A percentage value showing how frequently the rule’s head (the consequent) appears amongst all the groups that contain the rule’s body (the antecedent). It indicates how reliable such a rule is (IBM, 2021a). The higher the confidence, the stronger the rule.
X itemset - antecedent
Y itemset - consequent
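For a rule X => Y, confidence is commonly written in terms of support as follows. The supp notation is an assumption used for illustration only:

```latex
\mathrm{conf}(X \Rightarrow Y) \;=\; \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
```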

Lift

The ratio of the confidence of the rule to its expected confidence (the support of Y). The higher it is, the higher the chance of co-occurrence of X and Y. Lift can take any value between 0 and infinity:

Value greater than 1 - X and Y are positively dependent on one another
Value equal to 1 - X and Y are independent of each other and we are not able to derive any rule from them
Value less than 1 - X and Y are negatively dependent on each other. X has a negative effect on the appearance of Y (IBM, 2021b).
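To make the three measures concrete, here is a minimal base R sketch that computes them by hand for one rule. The baskets and the helper `support()` are hypothetical and are not part of the movies data or of the arules package.

```r
# Toy baskets (hypothetical data, not from the movies dataset)
baskets <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "D")
)

# Hypothetical helper: share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

x <- "A"
y <- "B"

supp_xy <- support(c(x, y), baskets)      # joint support of {A, B}
conf_xy <- supp_xy / support(x, baskets)  # confidence of A => B
lift_xy <- conf_xy / support(y, baskets)  # lift of A => B

round(c(support = supp_xy, confidence = conf_xy, lift = lift_xy), 4)
```

In this toy example the lift comes out below 1, so A and B appear together slightly less often than independence would suggest, even though the confidence is fairly high.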

Now, let’s move on to the dataset preparation.

Data description

The data is derived from Kaggle and consists of two datasets: movies and ratings (Karthik, 2020). It covers a wide spectrum of genres, with movies ranging from “Toy Story” to “Inception”, and a broad group of users. Since association rules are well suited to big data sets, let’s use one that can be considered as such: our movies list has over 30,000 titles, and the ratings count surpasses 2 million.

First, let’s investigate the movies dataset.

#Loading all the packages first
library(arules)

## Loading required package: Matrix

## 
## Attaching package: 'arules'

## The following objects are masked from 'package:base':
## 
##     abbreviate, write

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:arules':
## 
##     intersect, recode, setdiff, setequal, union

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(reshape2)
library(Matrix)
library(stringr)
library(stringdist)
library(ggplot2)
library(arulesViz)

## Loading required package: grid

movies <- read.csv("movies.csv", stringsAsFactors = FALSE)
head(movies)

##   movieId                              title
## 1       1                   Toy Story (1995)
## 2       2                     Jumanji (1995)
## 3       3            Grumpier Old Men (1995)
## 4       4           Waiting to Exhale (1995)
## 5       5 Father of the Bride Part II (1995)
## 6       6                        Heat (1995)
##                                        genres
## 1 Adventure|Animation|Children|Comedy|Fantasy
## 2                  Adventure|Children|Fantasy
## 3                              Comedy|Romance
## 4                        Comedy|Drama|Romance
## 5                                      Comedy
## 6                       Action|Crime|Thriller

As we can see, we have 3 variables: movieId, title and genres. First, let us separate the year of release from the actual title for better clarity of the dataset.

movies$year <- as.numeric(str_sub(str_trim(movies$title), start = -5, end = -2))

## Warning: NAs introduced by coercion

head(movies)

##   movieId                              title
## 1       1                   Toy Story (1995)
## 2       2                     Jumanji (1995)
## 3       3            Grumpier Old Men (1995)
## 4       4           Waiting to Exhale (1995)
## 5       5 Father of the Bride Part II (1995)
## 6       6                        Heat (1995)
##                                        genres year
## 1 Adventure|Animation|Children|Comedy|Fantasy 1995
## 2                  Adventure|Children|Fantasy 1995
## 3                              Comedy|Romance 1995
## 4                        Comedy|Drama|Romance 1995
## 5                                      Comedy 1995
## 6                       Action|Crime|Thriller 1995

Now we have a fourth variable, year. However, some movies have no release year in the expected position at the end of the title, so the coercion introduced NA values into the dataset. We will now remove those movies, and thereby the NAs, before the analysis.
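The position-based extraction assumes every title ends in exactly "(YYYY)". As an alternative, a pattern-based variant could be sketched as below. The helper name `extract_year` is hypothetical, and whether to keep titles such as "… (1983))" is a judgment call the original does not discuss.

```r
# Hypothetical helper: take the last four-digit number in parentheses as the year
extract_year <- function(title) {
  pattern  <- "^.*\\(([0-9]{4})\\).*$"
  has_year <- grepl(pattern, title)
  year     <- rep(NA_real_, length(title))
  # Only convert titles that actually contain a "(YYYY)" group
  year[has_year] <- as.numeric(sub(pattern, "\\1", title[has_year]))
  year
}

titles <- c(
  "Toy Story (1995)",
  "Babylon 5",
  "Mona and the Time of Burning Love (Mona ja palavan rakkauden aika) (1983))",
  "The Naked Truth (1957) (Your Past Is Showing)",
  "Big Bang Theory, The (2007-)"
)

extract_year(titles)
```

Because nothing is coerced from a non-numeric string, this version produces NA for titles without a year without raising a coercion warning.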

movies_without_year <- which(is.na(movies$year))
movies$title[movies_without_year]

##  [1] "Babylon 5"                                                                 
##  [2] "Millions Game, The (Das Millionenspiel)"                                   
##  [3] "Bicycle, Spoon, Apple (Bicicleta, cullera, poma)"                          
##  [4] "Mona and the Time of Burning Love (Mona ja palavan rakkauden aika) (1983))"
##  [5] "Diplomatic Immunity (2009– )"                                              
##  [6] "Big Bang Theory, The (2007-)"                                              
##  [7] "Brazil: In the Shadow of the Stadiums"                                     
##  [8] "Slaying the Badger"                                                        
##  [9] "Tatort: Im Schmerz geboren"                                                
## [10] "National Theatre Live: Frankenstein"                                       
## [11] "The Court-Martial of Jackie Robinson"                                      
## [12] "In Our Garden"                                                             
## [13] "Stephen Fry In America - New World"                                        
## [14] "Two: The Story of Roman & Nyro"                                            
## [15] "Li'l Quinquin"                                                             
## [16] "A Year Along the Abandoned Road"                                           
## [17] "Body/Cialo"                                                                
## [18] "Polskie gówno"                                                             
## [19] "The Third Reich: The Rise & Fall"                                          
## [20] "My Own Man"                                                                
## [21] "Moving Alan"                                                               
## [22] "Michael Laudrup - en Fodboldspiller"                                       
## [23] "Doli Saja Ke Rakhna"                                                       
## [24] "The Dead Lands"                                                            
## [25] "C'mon, Let's Live a Little"                                                
## [26] "For a Book of Dollars"                                                     
## [27] "Bad Boys 3"                                                                
## [28] "Señorita Justice"                                                          
## [29] "Red Victoria"                                                              
## [30] "Vaastupurush"                                                              
## [31] "Sierra Leone's Refugee All Stars"                                          
## [32] "L'uomo della carità"                                                       
## [33] "Filmage: The Story of Descendents/All"                                     
## [34] "About Sarah"                                                               
## [35] "Swallows and Amazons"                                                      
## [36] "Ready Player One"                                                          
## [37] "Los tontos y los estúpidos"                                                
## [38] "The Naked Truth (1957) (Your Past Is Showing)"                             
## [39] "Disaster Playground"                                                       
## [40] "Nice Guy"                                                                  
## [41] "OMG, I'm a Robot!"                                                         
## [42] "KillerSaurus"                                                              
## [43] "Viva"                                                                      
## [44] "Ollaan vapaita"                                                            
## [45] "Fakta Ladh Mhana"                                                          
## [46] "Sentimentalnyy roman"                                                      
## [47] "Yedyanchi Jatra"                                                           
## [48] "Dhadakebaaz"                                                               
## [49] "Ittefaq"                                                                   
## [50] "Elämältä kaiken sain"                                                      
## [51] "Dil Kya Kare"                                                              
## [52] "Hogi Pyar Ki Jeet"                                                         
## [53] "Monk by Blood"                                                             
## [54] "I Am Syd Stone"                                                            
## [55] "Alone With People"                                                         
## [56] "38 Parrots"                                                                
## [57] "The Adventures of Sherlock Holmes and Doctor Watson"                       
## [58] "The Adventures of Sherlock Holmes and Doctor Watson: The Treasures of Agra"
## [59] "The Republic "                                                             
## [60] "A Fare to Remember"                                                        
## [61] "The Code"                                                                  
## [62] "101次求婚"                                                                 
## [63] "S: Saigo no Keikan - Dakkan: Recovery of Our Future"                       
## [64] "Vrijdag"                                                                   
## [65] "Aimy in a Cage"                                                            
## [66] "Trophy Kids"                                                               
## [67] "Jasne Błękitne Okna"                                                       
## [68] "Mr. Kuka's Advice"                                                         
## [69] "Hundra"

We have 69 titles for which no release year could be extracted. Let’s discard them from the dataset.

movies <- movies[-movies_without_year, ]

#Checking for NAs one more time as a precaution
sum(is.na(movies))

## [1] 0

Now we have no NAs, and we can extract the bare title from the title variable.

movies$title <- str_sub(str_trim(movies$title), start = 1, end = -8)

As you can see, the movies dataset also has a genres variable. However, multiple genres are stacked into one cell. Let’s look at the unique genres in the table; to do so, we first need to extract the individual genres from the stacked cells.

uniqueGenres <- unique(unlist(str_split(movies$genres, "\\|")))
uniqueGenres

##  [1] "Adventure"          "Animation"          "Children"          
##  [4] "Comedy"             "Fantasy"            "Romance"           
##  [7] "Drama"              "Action"             "Crime"             
## [10] "Thriller"           "Horror"             "Mystery"           
## [13] "Sci-Fi"             "IMAX"               "Documentary"       
## [16] "War"                "Musical"            "Western"           
## [19] "Film-Noir"          "(no genres listed)"

We have 20 different genres. However, we can see that two of them are not really genre-defining: “(no genres listed)” and “IMAX”.
Let’s see how many entries have no genres and discard them from the dataset.

movies %>% filter(str_detect(genres, fixed("(no genres listed)"))) %>% nrow()

## [1] 1104

movies <- movies[! movies$genres == "(no genres listed)", ]

#Check how many entries without genres we have now
movies %>% filter(str_detect(genres, fixed("(no genres listed)"))) %>% nrow()

## [1] 0

uniqueGenres <- uniqueGenres[! uniqueGenres == "IMAX"]
uniqueGenres

##  [1] "Adventure"          "Animation"          "Children"          
##  [4] "Comedy"             "Fantasy"            "Romance"           
##  [7] "Drama"              "Action"             "Crime"             
## [10] "Thriller"           "Horror"             "Mystery"           
## [13] "Sci-Fi"             "Documentary"        "War"               
## [16] "Musical"            "Western"            "Film-Noir"         
## [19] "(no genres listed)"

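Now we have 18 real genres left. The vector still has 19 entries because the “(no genres listed)” placeholder was not removed from it, but that entry no longer matches any movie. Next, we will create a binary variable for each genre, indicating whether a movie belongs to it. After that, we can drop the genres variable from the movies data set.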

for(genre in uniqueGenres) {
  movies[str_c("genre_", genre)] = ifelse(str_detect(movies$genres, genre), 1, 0)
}

  
head(movies, 5)

##   movieId                       title
## 1       1                   Toy Story
## 2       2                     Jumanji
## 3       3            Grumpier Old Men
## 4       4           Waiting to Exhale
## 5       5 Father of the Bride Part II
##                                        genres year genre_Adventure
## 1 Adventure|Animation|Children|Comedy|Fantasy 1995               1
## 2                  Adventure|Children|Fantasy 1995               1
## 3                              Comedy|Romance 1995               0
## 4                        Comedy|Drama|Romance 1995               0
## 5                                      Comedy 1995               0
##   genre_Animation genre_Children genre_Comedy genre_Fantasy genre_Romance
## 1               1              1            1             1             0
## 2               0              1            0             1             0
## 3               0              0            1             0             1
## 4               0              0            1             0             1
## 5               0              0            1             0             0
##   genre_Drama genre_Action genre_Crime genre_Thriller genre_Horror
## 1           0            0           0              0            0
## 2           0            0           0              0            0
## 3           0            0           0              0            0
## 4           1            0           0              0            0
## 5           0            0           0              0            0
##   genre_Mystery genre_Sci-Fi genre_Documentary genre_War genre_Musical
## 1             0            0                 0         0             0
## 2             0            0                 0         0             0
## 3             0            0                 0         0             0
## 4             0            0                 0         0             0
## 5             0            0                 0         0             0
##   genre_Western genre_Film-Noir genre_(no genres listed)
## 1             0               0                        0
## 2             0               0                        0
## 3             0               0                        0
## 4             0               0                        0
## 5             0               0                        0

#Discarding the genres variable
movies <- select(movies, -genres)
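The loop above relies on str_detect, which treats each genre name as a regular expression. A sketch of an alternative that splits the genre string and tests exact membership is given below in base R; the toy data frame and its rows are hypothetical.

```r
# Toy stand-in for the movies data frame (hypothetical rows)
toy <- data.frame(
  title  = c("Toy Story", "Heat", "Unknown"),
  genres = c("Adventure|Animation|Comedy", "Action|Crime|Thriller", "(no genres listed)"),
  stringsAsFactors = FALSE
)

# Split each pipe-separated string into a character vector of genres
genre_list <- strsplit(toy$genres, "|", fixed = TRUE)

# Drop the placeholder so it never becomes an indicator column
all_genres <- setdiff(unique(unlist(genre_list)), "(no genres listed)")

# One 0/1 column per genre, using exact membership instead of pattern matching
for (g in all_genres) {
  toy[[paste0("genre_", g)]] <- as.integer(
    vapply(genre_list, function(x) g %in% x, logical(1))
  )
}

toy[, -2]
```

Exact membership avoids surprises with names that contain regex metacharacters, such as the parentheses in “(no genres listed)”.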

Let’s see the genre distribution now.

genreDist <- colSums(movies[, 4:21])
genreDistDF <- data.frame(genre = names(genreDist),count = genreDist)
genreDistDF$genre <- str_sub(str_trim(genreDistDF$genre), start = 7, end = -1)
genreDistDF

##                         genre count
## genre_Adventure     Adventure  2762
## genre_Animation     Animation  1386
## genre_Children       Children  1607
## genre_Comedy           Comedy 10115
## genre_Fantasy         Fantasy  1689
## genre_Romance         Romance  4875
## genre_Drama             Drama 15765
## genre_Action           Action  4441
## genre_Crime             Crime  3443
## genre_Thriller       Thriller  5297
## genre_Horror           Horror  3363
## genre_Mystery         Mystery  1836
## genre_Sci-Fi           Sci-Fi  2152
## genre_Documentary Documentary  3033
## genre_War                 War  1345
## genre_Musical         Musical  1051
## genre_Western         Western   779
## genre_Film-Noir     Film-Noir   338

ggplot(genreDistDF, aes(x = reorder(genre, -count), y = count, fill = genre)) +
  geom_bar(stat = "identity") +
  ggtitle("Distribution of Genres") +
  theme(legend.position = "none", axis.text.x = element_text(angle = 90)) +
  xlab("Genre") +
  ylab("Count")

Now we can see the distribution of genres in our dataset. The majority of the movies are in the drama or comedy genre. After comedy, there is a substantial drop in the movie counts.
Now that we have a better understanding of the movies dataset and have it in the correct format, let’s move on to the second dataset, ratings.

Association Rules from Ratings Dataset

After reading it in, we can see that the ratings dataset consists of over 2 million observations; let’s reduce it to 1,000,000 observations to make it easier to work with. Then we will drop the variables we do not need, timestamp and rating, as the rating value itself holds little interest for this study. The act of rating matters more, as it shows an interest in watching a given movie rather than necessarily liking it. For appeal-based recommendations, we would need another project :).

ratings <- read.csv("ratings.csv")

#Choosing the first 1,000,000 observations
ratings <- ratings[1:1000000, ]
ratings <- select(ratings, userId, movieId)

head(ratings, 5)

##   userId movieId
## 1      1     169
## 2      1    2471
## 3      1   48516
## 4      2    2571
## 5      2  109487

Now, let us align this dataset with the movies dataset by removing the ratings of movies that no longer exist in the final movies dataset.

ratings <- ratings %>% filter(! movieId %in% movies)

dim(ratings)

## [1] 1000000       2

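We can see that we still have 1000000 observations left. Note that, as written, the filter compares movieId against the movies data frame itself rather than its movieId column, and negates the test, so no rows are removed. Filtering on movieId %in% movies$movieId would be needed to actually drop the ratings of discarded movies.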
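The intent described above, keeping only ratings for movies that survived the cleaning, can be sketched on toy data as follows. The data frames and ids are hypothetical, and base R subsetting is used here in place of dplyr.

```r
# Toy data (hypothetical ids)
movies_toy  <- data.frame(movieId = c(1, 2, 3), title = c("A", "B", "C"))
ratings_toy <- data.frame(userId = c(1, 1, 2, 2), movieId = c(1, 99, 2, 3))

# Keep only ratings whose movieId still exists in the movies table
kept <- ratings_toy[ratings_toy$movieId %in% movies_toy$movieId, ]

nrow(kept)
```

The rating for movieId 99 is dropped because no movie with that id exists in the toy movies table.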

Apriori algorithm

Now we can move on to frequent itemset mining with the Apriori algorithm. First, we need to build a user-item matrix of 1/0 values indicating whether or not a user has seen (rated) a movie. Because most entries of such a matrix would be 0s, we represent it as a transactions object, which stores the data in sparse format.

matrix1 <- as(split(ratings[ , "movieId"], ratings[ , "userId"]), "transactions")

matrix1

## transactions in sparse format with
##  10790 transactions (rows) and
##  17292 items (columns)
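To illustrate what the transactions object encodes, the following base R sketch builds the per-user baskets and a small dense 0/1 matrix from hypothetical ratings. The dense form is for illustration only; the transactions class stores the same information sparsely.

```r
# Toy ratings (hypothetical ids)
ratings_toy <- data.frame(
  userId  = c(1, 1, 2, 2, 2, 3),
  movieId = c(10, 20, 10, 20, 30, 30)
)

# One "basket" of movieIds per user, using the same split() idea as the report
baskets <- split(ratings_toy$movieId, ratings_toy$userId)

# Dense 0/1 incidence matrix: rows are users, columns are movies
counts    <- unclass(table(ratings_toy$userId, ratings_toy$movieId))
incidence <- (counts > 0) * 1L

incidence
```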

Now that we have a matrix with 10790 transactions (number of raters) and 17292 items (number of movies), we can move on to finding frequent pairs of films that users watch.
We can assume that if items X and Y are often viewed together, there ought to be some underlying connection between the two that helps to explain a viewer’s choice behaviour. Such a finding can help in recommending movie X if the user has watched item Y, and the other way around.

Now is the time to set the support and confidence values. Let’s set support to 0.01, so that a pair must have been watched by at least 1% of the 10790 users (about 108 users), and confidence, i.e. the share of users who watched film X that also watched film Y, to 75%. We also limit rules to two items (maxlen = 2). We can then run Apriori with these parameters.
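The link between the relative support threshold and an absolute number of users is simple arithmetic, sketched below. The variable names are illustrative. Note that arules reports an absolute minimum support count of 107 in its log, while the smallest count among the mined rules is 108.

```r
n_transactions <- 10790  # number of users in the transactions object
min_support    <- 0.01   # relative support threshold

# Smallest whole number of users whose share reaches the support threshold
min_users <- ceiling(min_support * n_transactions)
min_users
```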

ruleParameters <- list(supp = 0.01, conf = 0.75, maxlen = 2)

associationRules <- apriori(matrix1, parameter = ruleParameters)

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##        0.75    0.1    1 none FALSE            TRUE       5    0.01      1
##  maxlen target  ext
##       2  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 107 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[17292 item(s), 10790 transaction(s)] done [0.52s].
## sorting and recoding items ... [1968 item(s)] done [0.03s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2

## Warning in apriori(matrix1, parameter = ruleParameters): Mining stopped (maxlen
## reached). Only patterns up to a length of 2 returned!

##  done [0.35s].
## writing ... [5896 rule(s)] done [0.01s].
## creating S4 object  ... done [0.01s].

summary(associationRules)

## set of 5896 rules
## 
## rule length distribution (lhs + rhs):sizes
##    2 
## 5896 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       2       2       2       2       2       2 
## 
## summary of quality measures:
##     support          confidence        coverage            lift       
##  Min.   :0.01001   Min.   :0.7500   Min.   :0.01057   Min.   : 2.330  
##  1st Qu.:0.01196   1st Qu.:0.7667   1st Qu.:0.01492   1st Qu.: 2.776  
##  Median :0.01576   Median :0.7875   Median :0.01974   Median : 3.750  
##  Mean   :0.02286   Mean   :0.7965   Mean   :0.02885   Mean   : 4.198  
##  3rd Qu.:0.02382   3rd Qu.:0.8174   3rd Qu.:0.02966   3rd Qu.: 4.811  
##  Max.   :0.20816   Max.   :0.9561   Max.   :0.27711   Max.   :54.682  
##      count       
##  Min.   : 108.0  
##  1st Qu.: 129.0  
##  Median : 170.0  
##  Mean   : 246.7  
##  3rd Qu.: 257.0  
##  Max.   :2246.0  
## 
## mining info:
##     data ntransactions support confidence
##  matrix1         10790    0.01       0.75

set.seed(240)
plot(associationRules, method = "graph", measure = "support", shading = "lift", main = "Association Rules Graph")

## Warning: plot: Too many rules supplied. Only plotting the best 100 rules using
## 'support' (change control parameter max if needed)

With this chunk of code we created 5896 rules. From the summary we can also read the lift statistics for all the rules combined. Lift, as a reminder, is the dependency measure that compares how often X and Y occur together with what we would expect if they were independent. Our minimum lift is over 2 (2.330), so all the rules show a positive dependency.
Looking at the plot, we can see that the rules cover a substantial number of items; however, the strongest associations are between the items (movieIds) 7153, 4993 and 5952.

arulesViz::plotly_arules(associationRules, method = "matrix", measure=c("support","confidence"))

## Warning: 'arulesViz::plotly_arules' is deprecated.
## Use 'plot' instead.
## See help("Deprecated")

## Warning: plot: Too many rules supplied. Only plotting the best 1000 rules using
## lift (change parameter max if needed)

Here we can see the associations between particular movies graphically, rule by rule, with the lift value indicated. Because all our lift values are above 1 and a few of them are very high, only a small number of rules fall at the top of the lift range.

Because we have such a large number of rules, let us keep only those whose lift is above the 3rd quartile (4.811).

associationRules <- subset(associationRules, lift >= 4.811)
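Rather than hard-coding the quartile, the threshold can be computed from the lift values themselves. The sketch below uses a hypothetical rule table in base R; for an arules rules object, the lift values would be read from quality(rules)$lift instead.

```r
# Toy rule table (hypothetical lift values)
rules_toy <- data.frame(
  rule = c("a=>b", "b=>c", "c=>d", "d=>e", "e=>f"),
  lift = c(2.5, 3.0, 4.0, 6.0, 10.0)
)

# Compute the 75th percentile instead of typing it in by hand
q3 <- unname(quantile(rules_toy$lift, probs = 0.75))

# Keep the rules at or above the third quartile
top_rules <- rules_toy[rules_toy$lift >= q3, ]
top_rules
```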

summary(associationRules)

## set of 1474 rules
## 
## rule length distribution (lhs + rhs):sizes
##    2 
## 1474 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       2       2       2       2       2       2 
## 
## summary of quality measures:
##     support          confidence        coverage            lift       
##  Min.   :0.01001   Min.   :0.7500   Min.   :0.01103   Min.   : 4.813  
##  1st Qu.:0.01140   1st Qu.:0.7680   1st Qu.:0.01446   1st Qu.: 5.191  
##  Median :0.01423   Median :0.7902   Median :0.01807   Median : 6.190  
##  Mean   :0.01733   Mean   :0.7974   Mean   :0.02177   Mean   : 6.826  
##  3rd Qu.:0.01863   3rd Qu.:0.8177   3rd Qu.:0.02345   3rd Qu.: 7.339  
##  Max.   :0.13411   Max.   :0.9328   Max.   :0.16636   Max.   :54.682  
##      count       
##  Min.   : 108.0  
##  1st Qu.: 123.0  
##  Median : 153.5  
##  Mean   : 186.9  
##  3rd Qu.: 201.0  
##  Max.   :1447.0  
## 
## mining info:
##     data ntransactions support confidence
##  matrix1         10790    0.01       0.75

associationRules <- as(associationRules, "data.frame")
tail(associationRules, 5)

##                 rules    support confidence   coverage     lift count
## 5753 {2916} => {1580} 0.06830399  0.7790698 0.08767377 5.166664   737
## 5804 {2115} => {1291} 0.07367933  0.8103976 0.09091752 5.908236   795
## 5834 {1200} => {1214} 0.09434662  0.7965571 0.11844300 6.014592  1018
## 5845 {7153} => {5952} 0.13410565  0.8226265 0.16302132 4.944925  1447
## 5846 {5952} => {7153} 0.13410565  0.8061281 0.16635774 4.944925  1447

From the tail of the associationRules data frame we can see that each rule is stored as a single string containing the two movieId values. So let’s split them up.

rules <- sapply(associationRules$rules, function(x){
    x = gsub("[\\{\\}]", "", regmatches(x, gregexpr("\\{.*\\}", x))[[1]])
    x = gsub("=>",",",x)
    x = str_replace_all(x," ","")
    return( x )
})

rules <- as.character(rules)
rules <- str_split(rules, ",")

associationRules$movieLeftSide <- sapply( rules, "[[", 1)
associationRules$movieRightSide <- sapply( rules, "[[", 2)

associationRules$movieLeftSide <- as.numeric(associationRules$movieLeftSide)
associationRules$movieRightSide <- as.numeric(associationRules$movieRightSide)
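As a more compact sketch of the same parsing step, the rule labels can be cleaned and split in base R as shown below. It assumes, as is the case here with maxlen = 2, that each side of a rule holds exactly one item; the variable names are illustrative.

```r
# Rule labels in the same format as the rules column
labels_toy <- c("{2916} => {1580}", "{7153} => {5952}")

# Strip braces and spaces, then split on the arrow
clean <- gsub("[{} ]", "", labels_toy)
parts <- strsplit(clean, "=>", fixed = TRUE)

# First element is the left-hand side, second is the right-hand side
lhs_id <- as.numeric(vapply(parts, `[`, character(1), 1))
rhs_id <- as.numeric(vapply(parts, `[`, character(1), 2))

data.frame(lhs_id, rhs_id)
```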

Now, let’s get rid of the rules variable.

associationRules$rules <- NULL

Now we can join the two data frames, associationRules and movies, to get the titles and genres of the movies on the left- and right-hand sides of the rules respectively.

associationRules <- associationRules %>% left_join(movies, by = c("movieLeftSide" = "movieId"))

associationRules$movieLeftSide <- NULL
columnNames <- colnames(associationRules)
columnNames[5] <- str_c("Left_", columnNames[5])
columnNames[7:25] <- str_c("Left_", columnNames[7:25])


colnames(associationRules) <- columnNames

#Now the same for right hand side movies
associationRules <- associationRules %>% left_join(movies, by = c("movieRightSide" = "movieId"))

associationRules$movieRightSide <- NULL
columnNames <- colnames(associationRules)
columnNames[26:45] <- str_c("Right_", columnNames[26:45])
colnames(associationRules) <- columnNames
colnames(associationRules)

##  [1] "support"                          "confidence"                      
##  [3] "coverage"                         "lift"                            
##  [5] "Left_count"                       "Left_title"                      
##  [7] "Left_year"                        "Left_genre_Adventure"            
##  [9] "Left_genre_Animation"             "Left_genre_Children"             
## [11] "Left_genre_Comedy"                "Left_genre_Fantasy"              
## [13] "Left_genre_Romance"               "Left_genre_Drama"                
## [15] "Left_genre_Action"                "Left_genre_Crime"                
## [17] "Left_genre_Thriller"              "Left_genre_Horror"               
## [19] "Left_genre_Mystery"               "Left_genre_Sci-Fi"               
## [21] "Left_genre_Documentary"           "Left_genre_War"                  
## [23] "Left_genre_Musical"               "Left_genre_Western"              
## [25] "genre_Film-Noir.x"                "Right_genre_(no genres listed).x"
## [27] "Right_title"                      "Right_year"                      
## [29] "Right_genre_Adventure"            "Right_genre_Animation"           
## [31] "Right_genre_Children"             "Right_genre_Comedy"              
## [33] "Right_genre_Fantasy"              "Right_genre_Romance"             
## [35] "Right_genre_Drama"                "Right_genre_Action"              
## [37] "Right_genre_Crime"                "Right_genre_Thriller"            
## [39] "Right_genre_Horror"               "Right_genre_Mystery"             
## [41] "Right_genre_Sci-Fi"               "Right_genre_Documentary"         
## [43] "Right_genre_War"                  "Right_genre_Musical"             
## [45] "Right_genre_Western"              "genre_Film-Noir.y"               
## [47] "genre_(no genres listed).y"
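The output shows that renaming by column position mislabels a few columns: count becomes "Left_count", and the Film-Noir and "(no genres listed)" columns end up with .x/.y suffixes or the wrong prefix. This does not affect the columns used later, but a name-based approach avoids the problem. The sketch below uses base R merge() as a stand-in for left_join; the helper `with_prefix` and the toy data are hypothetical.

```r
# Toy inputs (hypothetical values)
rules_toy  <- data.frame(lhs = c(1, 2), rhs = c(2, 3), lift = c(5, 4))
movies_toy <- data.frame(
  movieId = c(1, 2, 3),
  title   = c("A", "B", "C"),
  year    = c(1995, 1996, 1997),
  stringsAsFactors = FALSE
)

# Hypothetical helper: prefix every column name except the join key
with_prefix <- function(df, prefix, key = "movieId") {
  rename <- names(df) != key
  names(df)[rename] <- paste0(prefix, names(df)[rename])
  df
}

# Prefix before joining, so no positional renaming is needed afterwards
joined <- merge(rules_toy, with_prefix(movies_toy, "Left_"),
                by.x = "lhs", by.y = "movieId", all.x = TRUE)
joined <- merge(joined, with_prefix(movies_toy, "Right_"),
                by.x = "rhs", by.y = "movieId", all.x = TRUE)

names(joined)
```

Note that merge() may reorder rows, so any sorting by lift should happen after the joins.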

Recommending Movies

Now that we have attached titles and genres to the rules, we can look at what we have achieved. Let’s look at the rules with the highest lift, i.e. the strongest positive dependency between the two movies and thereby the strongest recommendations.

associationRules <- arrange(associationRules, desc(lift))
associationRules <- select(associationRules, Left_title, Left_year, Right_title, Right_year, support, confidence, lift)

head(associationRules)

##                                  Left_title Left_year
## 1   Manon of the Spring (Manon des sources)      1986
## 2 Fantastic Four: Rise of the Silver Surfer      2007
## 3      Hobbit: The Desolation of Smaug, The      2013
## 4                 Resident Evil: Apocalypse      2004
## 5                                    Saw II      2005
## 6          Hunger Games: Catching Fire, The      2013
##                          Right_title Right_year    support confidence     lift
## 1                   Jean de Florette       1986 0.01028730  0.7551020 54.68155
## 2                     Fantastic Four       2005 0.01037998  0.7567568 36.13011
## 3 Hobbit: An Unexpected Journey, The       2012 0.01251158  0.7758621 34.88147
## 4                      Resident Evil       2002 0.01075070  0.7581699 33.12006
## 5                                Saw       2004 0.01028730  0.8283582 30.92728
## 6                  Hunger Games, The       2012 0.01399444  0.7905759 29.41488
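A table like this can be turned into a simple lookup: given a movie the user has watched, return the right-hand-side titles of the matching rules, strongest lift first. The sketch below is one possible way to do it. The function `recommend` and the toy rows are hypothetical, and only the column names mirror the data frame above.

```r
# Toy rule table with the same column names as the final data frame (hypothetical rows)
rules_toy <- data.frame(
  Left_title  = c("Film A", "Film A", "Film D"),
  Right_title = c("Film C", "Film B", "Film E"),
  lift        = c(12.0, 30.0, 5.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: top-n recommendations for one watched title
recommend <- function(rules, watched, n = 3) {
  hits <- rules[rules$Left_title == watched, ]
  hits <- hits[order(-hits$lift), ]  # strongest association first
  head(hits$Right_title, n)
}

recommend(rules_toy, "Film A")
```

A title with no matching rule simply returns an empty character vector.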

head(associationRules, 150)

##                                                        Left_title Left_year
## 1                         Manon of the Spring (Manon des sources)      1986
## 2                       Fantastic Four: Rise of the Silver Surfer      2007
## 3                            Hobbit: The Desolation of Smaug, The      2013
## 4                                       Resident Evil: Apocalypse      2004
## 5                                                          Saw II      2005
## 6                                Hunger Games: Catching Fire, The      2013
## 7                        Three Colors: White (Trzy kolory: Bialy)      1994
## 8                             Transformers: Revenge of the Fallen      2009
## 9                        Three Colors: White (Trzy kolory: Bialy)      1994
## 10                            Captain America: The Winter Soldier      2014
## 11                                                     Iron Man 3      2013
## 12                             Captain America: The First Avenger      2011
## 13                                        Amazing Spider-Man, The      2012
## 14                            Captain America: The Winter Soldier      2014
## 15                                                  Scary Movie 2      2001
## 16                                             X-Men: First Class      2011
## 17                                        Star Trek Into Darkness      2013
## 18               Birdman: Or (The Unexpected Virtue of Ignorance)      2014
## 19                                               Edge of Tomorrow      2014
## 20                       Pirates of the Caribbean: At World's End      2007
## 21                                           Beverly Hills Cop II      1987
## 22                                     Chronicles of Riddick, The      2004
## 23                      Harry Potter and the Order of the Phoenix      2007
## 24                                        Ice Age 2: The Meltdown      2006
## 25                                                    Thunderball      1965
## 26                                                 28 Weeks Later      2007
## 27                                                    Death Proof      2007
## 28                      Fantastic Four: Rise of the Silver Surfer      2007
## 29                                         Lars and the Real Girl      2007
## 30                                                        Shooter      2007
## 31                                           Terminator Salvation      2009
## 32                                              Quantum of Solace      2008
## 33                    Kiki's Delivery Service (Majo no takkyûbin)      1989
## 34  Nausicaä of the Valley of the Wind (Kaze no tani no Naushika)      1984
## 35                            Harry Potter and the Goblet of Fire      2005
## 36                        Grave of the Fireflies (Hotaru no haka)      1988
## 37                                               Dawn of the Dead      1978
## 38                                    Hellboy II: The Golden Army      2008
## 39            For a Few Dollars More (Per qualche dollaro in più)      1965
## 40                                                     Iron Man 2      2010
## 41                                                     Iron Man 3      2013
## 42             Laputa: Castle in the Sky (Tenkû no shiro Rapyuta)      1986
## 43                                           Incredible Hulk, The      2008
## 44                Fistful of Dollars, A (Per un pugno di dollari)      1964
## 45                      Harry Potter and the Order of the Phoenix      2007
## 46                                        Mission: Impossible III      2006
## 47                          My Neighbor Totoro (Tonari no Totoro)      1988
## 48                                           Terminator Salvation      2009
## 49                             Captain America: The First Avenger      2011
## 50                                                Shrek the Third      2007
## 51                                Wallace & Gromit: A Close Shave      1995
## 52                                                 Broken Flowers      2005
## 53                      Fantastic Four: Rise of the Silver Surfer      2007
## 54                         Harry Potter and the Half-Blood Prince      2009
## 55                                                   Spider-Man 3      2007
## 56                                                           Thor      2011
## 57                    Howl's Moving Castle (Hauru no ugoku shiro)      2004
## 58                                                        Hancock      2008
## 59                       Grand Day Out with Wallace and Gromit, A      1989
## 60                                        Ice Age 2: The Meltdown      2006
## 61                            Captain America: The Winter Soldier      2014
## 62                                       X-Men Origins: Wolverine      2009
## 63                                        Amazing Spider-Man, The      2012
## 64                                              Quantum of Solace      2008
## 65                                        Star Trek Into Darkness      2013
## 66                                                     Just Cause      1995
## 67                                        Matrix Revolutions, The      2003
## 68                            Transformers: Revenge of the Fallen      2009
## 69                                                         Wanted      2008
## 70                                          X-Men: The Last Stand      2006
## 71                                                Michael Clayton      2007
## 72                                                 Animatrix, The      2003
## 73                                                     Prometheus      2012
## 74                                              I Heart Huckabees      2004
## 75                                 Rise of the Planet of the Apes      2011
## 76                                                           Hulk      2003
## 77                                                        Bananas      1971
## 78                           Mission: Impossible - Ghost Protocol      2011
## 79                                                 Tropic Thunder      2008
## 80                                                   Spider-Man 3      2007
## 81                                               Superman Returns      2006
## 82                                                      Daredevil      2003
## 83                                             Star Trek: Nemesis      2002
## 84                                                 Gone Baby Gone      2007
## 85                            Harry Potter and the Goblet of Fire      2005
## 86                                                    Death Proof      2007
## 87                                                  Planet Terror      2007
## 88                      Harry Potter and the Order of the Phoenix      2007
## 89                                                            F/X      1986
## 90                                                 Animatrix, The      2003
## 91                                                  Reign of Fire      2002
## 92                                                 Animatrix, The      2003
## 93                                                     Grindhouse      2007
## 94                                           Terminator Salvation      2009
## 95                                                         Zodiac      2007
## 96                                                      Town, The      2010
## 97                                         History of Violence, A      2005
## 98                                                Ruthless People      1986
## 99                                               Eastern Promises      2007
## 100                               Who's Afraid of Virginia Woolf?      1966
## 101                                                    Underworld      2003
## 102                  Star Wars: Episode III - Revenge of the Sith      2005
## 103                                    Chronicles of Riddick, The      2004
## 104                                            Star Trek: Nemesis      2002
## 105                                                 Fountain, The      2006
## 106                                             American Gangster      2007
## 107                                                   Van Helsing      2004
## 108                                                      Sunshine      2007
## 109                                     Resident Evil: Apocalypse      2004
## 110                            Terminator 3: Rise of the Machines      2003
## 111                                                         Mimic      1997
## 112                       Harry Potter and the Chamber of Secrets      2002
## 113                                       Mission: Impossible III      2006
## 114                     Harry Potter and the Order of the Phoenix      2007
## 115                     Fantastic Four: Rise of the Silver Surfer      2007
## 116                                               Ruthless People      1986
## 117                                                   Hard Target      1993
## 118                                                    Prometheus      2012
## 119                                                State and Main      2000
## 120                           Harry Potter and the Goblet of Fire      2005
## 121                                                   Cloud Atlas      2012
## 122                                                  Man of Steel      2013
## 123                                                        Looper      2012
## 124                                                   Source Code      2011
## 125                                                       Super 8      2011
## 126                                                  Tron: Legacy      2010
## 127                                       Airplane II: The Sequel      1982
## 128                                                       Gravity      2013
## 129                                               Ruthless People      1986
## 130                                                 Planet Terror      2007
## 131                                Rise of the Planet of the Apes      2011
## 132                                                   Death Proof      2007
## 133                                             Scanner Darkly, A      2006
## 134                                         Bourne Supremacy, The      2004
## 135                           Captain America: The Winter Soldier      2014
## 136                                                     Limitless      2011
## 137                                                    Iron Man 3      2013
## 138                                       Star Trek Into Darkness      2013
## 139                                                    Innerspace      1987
## 140                                              Superman Returns      2006
## 141                                                The Lego Movie      2014
## 142                                       Cabin in the Woods, The      2012
## 143                                       Mission: Impossible III      2006
## 144              Birdman: Or (The Unexpected Virtue of Ignorance)      2014
## 145                                                     Town, The      2010
## 146                                               Wild Bunch, The      1969
## 147                                                          Moon      2009
## 148                                                        Primer      2004
## 149                                         Peggy Sue Got Married      1986
## 150                                              Moonrise Kingdom      2012
##                                                                                 Right_title
## 1                                                                          Jean de Florette
## 2                                                                            Fantastic Four
## 3                                                        Hobbit: An Unexpected Journey, The
## 4                                                                             Resident Evil
## 5                                                                                       Saw
## 6                                                                         Hunger Games, The
## 7                                                 Three Colors: Blue (Trois couleurs: Bleu)
## 8                                                                              Transformers
## 9                                                 Three Colors: Red (Trois couleurs: Rouge)
## 10                                                                            Avengers, The
## 11                                                                            Avengers, The
## 12                                                                            Avengers, The
## 13                                                                            Avengers, The
## 14                                                                   Dark Knight Rises, The
## 15                                                                              Scary Movie
## 16                                                                            Avengers, The
## 17                                                                                Star Trek
## 18                                                                             Interstellar
## 19                                                                             Interstellar
## 20                                               Pirates of the Caribbean: Dead Man's Chest
## 21                                                                        Beverly Hills Cop
## 22                                                                                 I, Robot
## 23                                                      Harry Potter and the Goblet of Fire
## 24                                                                                  Ice Age
## 25                                                                               Goldfinger
## 26                                                                            28 Days Later
## 27                                                                   No Country for Old Men
## 28                                                                                      300
## 29                                                                                     Juno
## 30                                                                    Bourne Ultimatum, The
## 31                                                                                   Avatar
## 32                                                                            Casino Royale
## 33                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 34                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 35                                                 Harry Potter and the Prisoner of Azkaban
## 36                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 37                                                                        Shaun of the Dead
## 38                                                                                 Iron Man
## 39                       Good, the Bad and the Ugly, The (Buono, il brutto, il cattivo, Il)
## 40                                                                                 Iron Man
## 41                                                                                 Iron Man
## 42                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 43                                                                                 Iron Man
## 44                       Good, the Bad and the Ugly, The (Buono, il brutto, il cattivo, Il)
## 45                                                 Harry Potter and the Prisoner of Azkaban
## 46                                                                            Casino Royale
## 47                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 48                                                                                 Iron Man
## 49                                                                                 Iron Man
## 50                                                                                  Shrek 2
## 51                                                     Wallace & Gromit: The Wrong Trousers
## 52                                                                      Lost in Translation
## 53                                                                                 Iron Man
## 54                                                 Harry Potter and the Prisoner of Azkaban
## 55                                                                             Spider-Man 2
## 56                                                                                 Iron Man
## 57                                            Spirited Away (Sen to Chihiro no kamikakushi)
## 58                                                                                 Iron Man
## 59                                                     Wallace & Gromit: The Wrong Trousers
## 60                                                                                  Shrek 2
## 61                                                                                 Iron Man
## 62                                                                                 Iron Man
## 63                                                                                 Iron Man
## 64                                                                                 Iron Man
## 65                                                                                 Iron Man
## 66                                                                              Client, The
## 67                                                                     Matrix Reloaded, The
## 68                                                                                 Iron Man
## 69                                                                                 Iron Man
## 70                                                                         X2: X-Men United
## 71                                                                            Departed, The
## 72                                                                     Matrix Reloaded, The
## 73                                                                                 Iron Man
## 74                                                                      Lost in Translation
## 75                                                                                 Iron Man
## 76                                                                         X2: X-Men United
## 77                                                                               Annie Hall
## 78                                                                                 Iron Man
## 79                                                                                 Iron Man
## 80                                                                                 Iron Man
## 81                                                                             Spider-Man 2
## 82                                                                         X2: X-Men United
## 83                                             Star Wars: Episode II - Attack of the Clones
## 84                                                                            Departed, The
## 85                                                  Harry Potter and the Chamber of Secrets
## 86                                                                                 Sin City
## 87                                                                                 Sin City
## 88                                                  Harry Potter and the Chamber of Secrets
## 89                                                                            Lethal Weapon
## 90                                                                                 Sin City
## 91                                             Star Wars: Episode II - Attack of the Clones
## 92                                                                           V for Vendetta
## 93                                                                                 Sin City
## 94                                                                     Matrix Reloaded, The
## 95                                                                            Departed, The
## 96                                                                            Departed, The
## 97                                                                                 Sin City
## 98                                                                            Lethal Weapon
## 99                                                                            Departed, The
## 100                                                                           Graduate, The
## 101                                                                    Matrix Reloaded, The
## 102                                            Star Wars: Episode II - Attack of the Clones
## 103                                                                    Matrix Reloaded, The
## 104                                                                    Matrix Reloaded, The
## 105                                                                          V for Vendetta
## 106                                                                           Departed, The
## 107                                                                    Matrix Reloaded, The
## 108                                                                          V for Vendetta
## 109                                                                    Matrix Reloaded, The
## 110                                                                    Matrix Reloaded, The
## 111                                                                                Face/Off
## 112 Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone)
## 113                                                                          V for Vendetta
## 114 Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone)
## 115                                                                          V for Vendetta
## 116                                                                    Fish Called Wanda, A
## 117                                                                          Demolition Man
## 118                                                                               Inception
## 119                                                              O Brother, Where Art Thou?
## 120 Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone)
## 121                                                                               Inception
## 122                                                                               Inception
## 123                                                                               Inception
## 124                                                                               Inception
## 125                                                                               Inception
## 126                                                                               Inception
## 127                                                                               Airplane!
## 128                                                                               Inception
## 129                                                                                     Big
## 130                                                                       Kill Bill: Vol. 2
## 131                                                                               Inception
## 132                                                                       Kill Bill: Vol. 2
## 133                                                                            Donnie Darko
## 134                                                                    Bourne Identity, The
## 135                                                                               Inception
## 136                                                                               Inception
## 137                                                                               Inception
## 138                                                                               Inception
## 139                                                              Back to the Future Part II
## 140                                                                           Batman Begins
## 141                                                                               Inception
## 142                                                                               Inception
## 143                                                                           Batman Begins
## 144                                                                               Inception
## 145                                                                               Inception
## 146                                                                                  Psycho
## 147                                                                               Inception
## 148                                                                            Donnie Darko
## 149                                                                 When Harry Met Sally...
## 150                                                                               Inception
##     Right_year    support confidence      lift
## 1         1986 0.01028730  0.7551020 54.681550
## 2         2005 0.01037998  0.7567568 36.130112
## 3         2012 0.01251158  0.7758621 34.881466
## 4         2002 0.01075070  0.7581699 33.120055
## 5         2004 0.01028730  0.8283582 30.927284
## 6         2012 0.01399444  0.7905759 29.414876
## 7         1993 0.01770158  0.7764228 26.851287
## 8         2007 0.01037998  0.7724138 26.208632
## 9         1994 0.01742354  0.7642276 22.905601
## 10        2012 0.01260426  0.8343558 21.034344
## 11        2012 0.01390176  0.8333333 21.008567
## 12        2012 0.01399444  0.8074866 20.356964
## 13        2012 0.01241891  0.7976190 20.108200
## 14        2012 0.01149212  0.7607362 19.405068
## 15        2000 0.01269694  0.7828571 19.285453
## 16        2012 0.01909175  0.7545788 19.023142
## 17        2009 0.01436515  0.8288770 18.104419
## 18        2014 0.01149212  0.7948718 17.980433
## 19        2014 0.01696015  0.7530864 17.035225
## 20        2006 0.02187210  0.7892977 16.633832
## 21        1984 0.01380908  0.7720207 16.495255
## 22        2004 0.01232623  0.7556818 15.987857
## 23        2005 0.02659870  0.7798913 15.847509
## 24        2002 0.01121409  0.8120805 15.345620
## 25        1964 0.01529194  0.7932692 15.284598
## 26        2002 0.01612604  0.8365385 15.195707
## 27        2007 0.01047266  0.7533333 14.161092
## 28        2007 0.01028730  0.7500000 13.880789
## 29        2007 0.01010195  0.8384615 13.854518
## 30        2007 0.01000927  0.8000000 13.466459
## 31        2009 0.01019462  0.7746479 13.438024
## 32        2006 0.01612604  0.8613861 13.164811
## 33        2001 0.01075070  0.8345324 12.808825
## 34        2001 0.01380908  0.8324022 12.776131
## 35        2004 0.04096386  0.8323917 12.757822
## 36        2001 0.01093605  0.8251748 12.665201
## 37        2004 0.01010195  0.7569444 12.662683
## 38        2008 0.01037998  0.8682171 12.540913
## 39        1966 0.01705283  0.7965368 12.528618
## 40        2008 0.02511585  0.8658147 12.506212
## 41        2008 0.01436515  0.8611111 12.438272
## 42        2001 0.01399444  0.8074866 12.393714
## 43        2008 0.01436515  0.8563536 12.369552
## 44        1966 0.02168675  0.7826087 12.309545
## 45        2004 0.02715477  0.7961957 12.203056
## 46        2006 0.01223355  0.7951807 12.152975
## 47        2001 0.02307692  0.7830189 12.018170
## 48        2008 0.01084337  0.8239437 11.901408
## 49        2008 0.01427247  0.8235294 11.895425
## 50        2004 0.01028730  0.7872340 11.880077
## 51        1993 0.04096386  0.7906977 11.767763
## 52        2003 0.01102873  0.8150685 11.741774
## 53        2008 0.01112141  0.8108108 11.711712
## 54        2004 0.02177943  0.7605178 11.656232
## 55        2004 0.01992586  0.8113208 11.625698
## 56        2008 0.01575533  0.8018868 11.582809
## 57        2001 0.02224282  0.7523511 11.547466
## 58        2008 0.01455051  0.7969543 11.511562
## 59        1993 0.02845227  0.7713568 11.479917
## 60        2004 0.01047266  0.7583893 11.444783
## 61        2008 0.01195551  0.7914110 11.431493
## 62        2008 0.01529194  0.7894737 11.403509
## 63        2008 0.01223355  0.7857143 11.349206
## 64        2008 0.01464319  0.7821782 11.298130
## 65        2008 0.01353105  0.7807487 11.277481
## 66        1994 0.01260426  0.7555556 11.198413
## 67        2003 0.04559778  0.8395904 11.184174
## 68        2008 0.01037998  0.7724138 11.157088
## 69        2008 0.01130677  0.7672956 11.083159
## 70        2003 0.02446710  0.7719298 11.002804
## 71        2006 0.01075070  0.8169014 10.976795
## 72        2003 0.01084337  0.8239437 10.975743
## 73        2008 0.01177016  0.7559524 10.919312
## 74        2003 0.01130677  0.7577640 10.916253
## 75        2008 0.01316033  0.7553191 10.910165
## 76        2003 0.01751622  0.7651822 10.906626
## 77        1977 0.01010195  0.7622378 10.893438
## 78        2008 0.01158480  0.7530120 10.876841
## 79        2008 0.01371640  0.7512690 10.851664
## 80        2008 0.01844300  0.7509434 10.846960
## 81        2004 0.01464319  0.7523810 10.781129
## 82        2003 0.01714551  0.7551020 10.762947
## 83        2002 0.01102873  0.7933333 10.740360
## 84        2006 0.01139944  0.7987013 10.732238
## 85        2002 0.03707136  0.7532957 10.694816
## 86        2005 0.01093605  0.7866667 10.676897
## 87        2005 0.01084337  0.7852349 10.657465
## 88        2002 0.02557924  0.7500000 10.648026
## 89        1987 0.01112141  0.7741935 10.627924
## 90        2005 0.01028730  0.7816901 10.609354
## 91        2002 0.01010195  0.7730496 10.465754
## 92        2006 0.01047266  0.7957746 10.445752
## 93        2005 0.01445783  0.7684729 10.429966
## 94        2003 0.01028730  0.7816901 10.412885
## 95        2006 0.01816497  0.7747036 10.409778
## 96        2006 0.01019462  0.7746479 10.409030
## 97        2005 0.01362373  0.7656250 10.391313
## 98        1987 0.01010195  0.7517241 10.319470
## 99        2006 0.01260426  0.7640449 10.266557
## 100       1967 0.01121409  0.7610063 10.264072
## 101       2003 0.01501390  0.7641509 10.179245
## 102       2002 0.04365153  0.7511962 10.169895
## 103       2003 0.01241891  0.7613636 10.142116
## 104       2003 0.01056534  0.7600000 10.123951
## 105       2006 0.01028730  0.7708333 10.118360
## 106       2006 0.01974050  0.7526502 10.113444
## 107       2003 0.01575533  0.7555556 10.064746
## 108       2006 0.01000927  0.7659574 10.054356
## 109       2003 0.01065802  0.7516340 10.012507
## 110       2003 0.02826691  0.7512315 10.007146
## 111       1997 0.01130677  0.7770701  9.899157
## 112       2001 0.05514365  0.7828947  9.891609
## 113       2006 0.01158480  0.7530120  9.884428
## 114       2001 0.02659870  0.7798913  9.853662
## 115       2006 0.01028730  0.7500000  9.844891
## 116       1988 0.01056534  0.7862069  9.795811
## 117       1993 0.01084337  0.7500000  9.680024
## 118       2010 0.01417980  0.9107143  9.643383
## 119       2000 0.01000927  0.8244275  9.627243
## 120       2001 0.03744208  0.7608286  9.612811
## 121       2010 0.01186284  0.9078014  9.612539
## 122       2010 0.01075070  0.8992248  9.521723
## 123       2010 0.01974050  0.8987342  9.516528
## 124       2010 0.02113068  0.8976378  9.504918
## 125       2010 0.01047266  0.8897638  9.421542
## 126       2010 0.01019462  0.8870968  9.393301
## 127       1980 0.01445783  0.7878788  9.362568
## 128       2010 0.01964782  0.8796680  9.314640
## 129       1988 0.01019462  0.7586207  9.312306
## 130       2004 0.01130677  0.8187919  9.251063
## 131       2010 0.01519926  0.8723404  9.237049
## 132       2004 0.01130677  0.8133333  9.189389
## 133       2001 0.01204819  0.8227848  9.180815
## 134       2002 0.05468026  0.8477011  9.174218
## 135       2010 0.01306766  0.8650307  9.159648
## 136       2010 0.01631140  0.8627451  9.135446
## 137       2010 0.01436515  0.8611111  9.118144
## 138       2010 0.01492122  0.8609626  9.116571
## 139       1989 0.01056534  0.7651007  9.111961
## 140       2005 0.01640408  0.8428571  9.094429
## 141       2010 0.01010195  0.8582677  9.088036
## 142       2010 0.01112141  0.8571429  9.076125
## 143       2005 0.01288230  0.8373494  9.035000
## 144       2010 0.01232623  0.8525641  9.027641
## 145       2010 0.01121409  0.8521127  9.022861
## 146       1960 0.01084337  0.7597403  9.018259
## 147       2010 0.02437442  0.8511327  9.012484
## 148       2001 0.01167748  0.8076923  9.012410
## 149       1989 0.01102873  0.8095238  8.995635
## 150       2010 0.01612604  0.8487805  8.987577

Comment: We can now see that the rules with the highest lift establish a sequel/prequel or same-universe connection between the inputs. That was rather predictable, as series are very often watched as a whole. However, we also looked at the first 150 observations to look for other kinds of associations.

A more interesting example of a successful association is one that links movies from different universes, series, directors, etc.

Such a scenario can be found in line 121, where “Cloud Atlas” is associated with “Inception” with a lift of over 9. Both movies revolve around dystopia and a dreamlike atmosphere. However, they star different actors and their plots are not connected. This indicates a pattern of preference for a particular theme: here, a take on science fiction that is more emotional and technological than that of some other films.
Another example is line 29, which connects Juno and Lars and the Real Girl, both deeply emotional films with an original take on their subject matter.

Some other associations:

For example, line 52 connects Lost in Translation with Broken Flowers. Both star the same actor (Bill Murray), although the movies are connected neither by plot nor by genre.
Another case is the association of movies released in the same decade, as many viewers are fans of a specific decade, especially an older one. In line 146, we have Psycho and The Wild Bunch, which sit at opposite ends of the spectrum, the former being a horror movie and the latter a western. Both were made in the 1960s, however, so a fan of that decade may find both interesting regardless of their differences.
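The same-decade idea can be checked mechanically rather than by reading the rules one by one. The sketch below is only an illustration and not part of the original analysis: `rules_sample` is a hypothetical stand-in for `associationRules`, built from four rows printed elsewhere in this report, and `decade()` is a helper introduced here.

```r
library(dplyr)

# Hypothetical stand-in for associationRules: four rows copied from
# outputs printed in this report, with the same column names.
rules_sample <- data.frame(
  Left_title  = c("Prometheus", "Tron: Legacy",
                  "Dawn of the Dead", "Terminator Salvation"),
  Left_year   = c(2012, 2010, 1978, 2009),
  Right_title = c("Inception", "Inception",
                  "Shaun of the Dead", "Terminator, The"),
  Right_year  = c(2010, 2010, 2004, 1984),
  support     = c(0.01417980, 0.01019462, 0.01010195, 0.01019462),
  confidence  = c(0.9107143, 0.8870968, 0.7569444, 0.7746479),
  lift        = c(9.643383, 9.393301, 12.662683, 5.484548),
  stringsAsFactors = FALSE
)

# Helper (assumption): map a release year to its decade, e.g. 1969 -> 1960
decade <- function(year) (year %/% 10) * 10

# Keep only the rules whose two movies were released in the same decade
same_decade <- rules_sample %>%
  mutate(Left_decade  = decade(Left_year),
         Right_decade = decade(Right_year)) %>%
  filter(Left_decade == Right_decade) %>%
  arrange(desc(lift))

same_decade
```

On the full rule set, such a filter would also return many sequels released a few years apart, so it narrows the search rather than isolating "fans of a decade" on its own.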

Additional cases

Now that we have established the basic association rules, we can apply endless variations to explore the recommendations more deeply, since going one by one through a dataset of almost 1500 rules would be highly tedious.
For example, we can check whether there are rules in which older movies lead the audience to newer titles.

Year Association

# Rules whose antecedent predates 1990 and whose consequent was released after 2000
associationRules %>%
    filter(Left_year < 1990 & Right_year > 2000) %>%
    arrange(desc(lift)) %>%
    head(25)

##                                                      Left_title Left_year
## 1                   Kiki's Delivery Service (Majo no takkyûbin)      1989
## 2 Nausicaä of the Valley of the Wind (Kaze no tani no Naushika)      1984
## 3                       Grave of the Fireflies (Hotaru no haka)      1988
## 4                                              Dawn of the Dead      1978
## 5            Laputa: Castle in the Sky (Tenkû no shiro Rapyuta)      1986
## 6                         My Neighbor Totoro (Tonari no Totoro)      1988
## 7                                                      WarGames      1983
## 8                                          Bourne Identity, The      1988
## 9                       Grave of the Fireflies (Hotaru no haka)      1988
##                                     Right_title Right_year    support
## 1 Spirited Away (Sen to Chihiro no kamikakushi)       2001 0.01075070
## 2 Spirited Away (Sen to Chihiro no kamikakushi)       2001 0.01380908
## 3 Spirited Away (Sen to Chihiro no kamikakushi)       2001 0.01093605
## 4                             Shaun of the Dead       2004 0.01010195
## 5 Spirited Away (Sen to Chihiro no kamikakushi)       2001 0.01399444
## 6 Spirited Away (Sen to Chihiro no kamikakushi)       2001 0.02307692
## 7                               Minority Report       2002 0.01566265
## 8                              Dark Knight, The       2008 0.01065802
## 9        Lord of the Rings: The Two Towers, The       2002 0.01065802
##   confidence      lift
## 1  0.8345324 12.808825
## 2  0.8324022 12.776131
## 3  0.8251748 12.665201
## 4  0.7569444 12.662683
## 5  0.8074866 12.393714
## 6  0.7830189 12.018170
## 7  0.7824074  7.106209
## 8  0.7986111  6.679856
## 9  0.8041958  4.834135

As we can see, the majority of the titles on this short list are of Japanese origin. My thought is that Japanese animation has enough power to draw viewers into its world, no matter what the year of production is. Let’s try the other way around now.

associationRules %>% 
    filter(Left_year > 2000 & Right_year < 1990) %>%
    arrange(desc(lift)) %>% 
    head(25)

##                           Left_title Left_year     Right_title Right_year
## 1 Terminator 3: Rise of the Machines      2003 Terminator, The       1984
## 2               Terminator Salvation      2009 Terminator, The       1984
## 3                 Star Trek: Nemesis      2002 Terminator, The       1984
##      support confidence     lift
## 1 0.03012048  0.8004926 5.667530
## 2 0.01019462  0.7746479 5.484548
## 3 0.01047266  0.7533333 5.333640

Here we also have only a few rules, all from the science fiction genre, leading our viewers back to the original Terminator. Since the year-based filters yield only a handful of rules in our dataset, let’s try out the last modification.
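Instead of testing one pair of year thresholds at a time, the direction of every rule can be summarised in one pass. The following is a minimal sketch under stated assumptions: `rules_sample` is a hypothetical stand-in for `associationRules` made of four rows printed in this report, and the `year_gap` and `direction` columns are introduced here for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for associationRules (rows copied from this report)
rules_sample <- data.frame(
  Left_title  = c("My Neighbor Totoro (Tonari no Totoro)", "WarGames",
                  "Terminator 3: Rise of the Machines", "Tron: Legacy"),
  Left_year   = c(1988, 1983, 2003, 2010),
  Right_title = c("Spirited Away (Sen to Chihiro no kamikakushi)",
                  "Minority Report", "Terminator, The", "Inception"),
  Right_year  = c(2001, 2002, 1984, 2010),
  lift        = c(12.018170, 7.106209, 5.667530, 9.393301),
  stringsAsFactors = FALSE
)

# Label each rule by whether it points to a newer or an older movie
gap_summary <- rules_sample %>%
  mutate(
    year_gap  = Right_year - Left_year,
    direction = case_when(
      year_gap > 0 ~ "older -> newer",
      year_gap < 0 ~ "newer -> older",
      TRUE         ~ "same year"
    )
  ) %>%
  count(direction, name = "n_rules")

gap_summary
```

Run on the full `associationRules` data frame, a table like this would show how the rules are distributed across directions before any thresholds such as 1990 or 2000 are chosen.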

Title Association

We can also use association rules to recommend a potential movie based on a given title. Let’s explore the connections of “Inception”, the amazing science fiction movie directed by the great Christopher Nolan.

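# Rules with Inception as the antecedent (left-hand side); str_detect() is from stringr
InceptionLeft <- associationRules %>%
    filter(str_detect(Left_title, "Inception")) %>%
    head(20)

# Rules with Inception as the consequent (right-hand side)
InceptionRight <- associationRules %>%
    filter(str_detect(Right_title, "Inception"))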

InceptionLeft

## [1] Left_title  Left_year   Right_title Right_year  support     confidence 
## [7] lift       
## <0 rows> (or 0-length row.names)

InceptionRight

##                                                  Left_title Left_year
## 1                                                Prometheus      2012
## 2                                               Cloud Atlas      2012
## 3                                              Man of Steel      2013
## 4                                                    Looper      2012
## 5                                               Source Code      2011
## 6                                                   Super 8      2011
## 7                                              Tron: Legacy      2010
## 8                                                   Gravity      2013
## 9                            Rise of the Planet of the Apes      2011
## 10                      Captain America: The Winter Soldier      2014
## 11                                                Limitless      2011
## 12                                               Iron Man 3      2013
## 13                                  Star Trek Into Darkness      2013
## 14                                           The Lego Movie      2014
## 15                                  Cabin in the Woods, The      2012
## 16         Birdman: Or (The Unexpected Virtue of Ignorance)      2014
## 17                                                Town, The      2010
## 18                                                     Moon      2009
## 19                                         Moonrise Kingdom      2012
## 20                                           Shutter Island      2010
## 21                                   Adjustment Bureau, The      2011
## 22                                   Dark Knight Rises, The      2012
## 23                                       X-Men: First Class      2011
## 24                               X-Men: Days of Future Past      2014
## 25                                                     Argo      2012
## 26                                                 Kick-Ass      2010
## 27                                                     Hugo      2011
## 28                       Hobbit: An Unexpected Journey, The      2012
## 29                                                  Skyfall      2012
## 30                                                     Thor      2011
## 31                              Scott Pilgrim vs. the World      2010
## 32                                 Wolf of Wall Street, The      2013
## 33                     Mission: Impossible - Ghost Protocol      2011
## 34                                             Fighter, The      2010
## 35                         Hunger Games: Catching Fire, The      2013
## 36                     Hobbit: The Desolation of Smaug, The      2013
## 37                       Sherlock Holmes: A Game of Shadows      2011
## 38                                                    Drive      2011
## 39                                               Black Swan      2010
## 40                                                True Grit      2010
## 41                                       Dallas Buyers Club      2013
## 42                                                Moneyball      2011
## 43                                                127 Hours      2010
## 44                                      Social Network, The      2010
## 45                                               Life of Pi      2012
## 46                                          Sherlock Holmes      2009
## 47                                         Edge of Tomorrow      2014
## 48                       Captain America: The First Avenger      2011
## 49                                               Ex Machina      2015
## 50                                Grand Budapest Hotel, The      2014
## 51                                                      Her      2013
## 52 Girl with the Dragon Tattoo, The (Män som hatar kvinnor)      2009
## 53                                  Amazing Spider-Man, The      2012
## 54                                  Silver Linings Playbook      2012
## 55                                  Guardians of the Galaxy      2014
## 56                                        Midnight in Paris      2011
## 57                                                    50/50      2011
## 58                                        Fantastic Mr. Fox      2009
## 59                                        Hunger Games, The      2012
## 60                                               Iron Man 2      2010
## 61                                         Hurt Locker, The      2008
## 62                                         Django Unchained      2012
## 63                                       Mad Max: Fury Road      2015
## 64                                            Up in the Air      2009
## 65                                            Despicable Me      2010
## 66                                            Avengers, The      2012
## 67                         Girl with the Dragon Tattoo, The      2011
## 68                                              Toy Story 3      2010
## 69                                               District 9      2009
##    Right_title Right_year    support confidence     lift
## 1    Inception       2010 0.01417980  0.9107143 9.643383
## 2    Inception       2010 0.01186284  0.9078014 9.612539
## 3    Inception       2010 0.01075070  0.8992248 9.521723
## 4    Inception       2010 0.01974050  0.8987342 9.516528
## 5    Inception       2010 0.02113068  0.8976378 9.504918
## 6    Inception       2010 0.01047266  0.8897638 9.421542
## 7    Inception       2010 0.01019462  0.8870968 9.393301
## 8    Inception       2010 0.01964782  0.8796680 9.314640
## 9    Inception       2010 0.01519926  0.8723404 9.237049
## 10   Inception       2010 0.01306766  0.8650307 9.159648
## 11   Inception       2010 0.01631140  0.8627451 9.135446
## 12   Inception       2010 0.01436515  0.8611111 9.118144
## 13   Inception       2010 0.01492122  0.8609626 9.116571
## 14   Inception       2010 0.01010195  0.8582677 9.088036
## 15   Inception       2010 0.01112141  0.8571429 9.076125
## 16   Inception       2010 0.01232623  0.8525641 9.027641
## 17   Inception       2010 0.01121409  0.8521127 9.022861
## 18   Inception       2010 0.02437442  0.8511327 9.012484
## 19   Inception       2010 0.01612604  0.8487805 8.987577
## 20   Inception       2010 0.03632994  0.8484848 8.984447
## 21   Inception       2010 0.01075070  0.8467153 8.965710
## 22   Inception       2010 0.03317887  0.8463357 8.961690
## 23   Inception       2010 0.02131603  0.8424908 8.920978
## 24   Inception       2010 0.01519926  0.8410256 8.905463
## 25   Inception       2010 0.01658943  0.8403756 8.898580
## 26   Inception       2010 0.02381835  0.8371336 8.864250
## 27   Inception       2010 0.01019462  0.8333333 8.824010
## 28   Inception       2010 0.01853568  0.8333333 8.824010
## 29   Inception       2010 0.01835032  0.8319328 8.809180
## 30   Inception       2010 0.01631140  0.8301887 8.790712
## 31   Inception       2010 0.01807229  0.8297872 8.786461
## 32   Inception       2010 0.01936979  0.8260870 8.747280
## 33   Inception       2010 0.01269694  0.8253012 8.738960
## 34   Inception       2010 0.01037998  0.8235294 8.720199
## 35   Inception       2010 0.01455051  0.8219895 8.703893
## 36   Inception       2010 0.01325301  0.8218391 8.702300
## 37   Inception       2010 0.01603336  0.8199052 8.681823
## 38   Inception       2010 0.01603336  0.8199052 8.681823
## 39   Inception       2010 0.02780352  0.8174387 8.655705
## 40   Inception       2010 0.01603336  0.8160377 8.640871
## 41   Inception       2010 0.01121409  0.8120805 8.598969
## 42   Inception       2010 0.01288230  0.8081395 8.557238
## 43   Inception       2010 0.01167748  0.8076923 8.552502
## 44   Inception       2010 0.02826691  0.8047493 8.521340
## 45   Inception       2010 0.01334569  0.8044693 8.518374
## 46   Inception       2010 0.02808156  0.8037135 8.510372
## 47   Inception       2010 0.01807229  0.8024691 8.497195
## 48   Inception       2010 0.01390176  0.8021390 8.493700
## 49   Inception       2010 0.01501390  0.8019802 8.492018
## 50   Inception       2010 0.02131603  0.8013937 8.485808
## 51   Inception       2010 0.01631140  0.8000000 8.471050
## 52   Inception       2010 0.01556997  0.8000000 8.471050
## 53   Inception       2010 0.01241891  0.7976190 8.445839
## 54   Inception       2010 0.01640408  0.7972973 8.442432
## 55   Inception       2010 0.02150139  0.7972509 8.441940
## 56   Inception       2010 0.01353105  0.7934783 8.401993
## 57   Inception       2010 0.01102873  0.7933333 8.400458
## 58   Inception       2010 0.01177016  0.7888199 8.352666
## 59   Inception       2010 0.02113068  0.7862069 8.324997
## 60   Inception       2010 0.02279889  0.7859425 8.322198
## 61   Inception       2010 0.01723818  0.7848101 8.310207
## 62   Inception       2010 0.03113994  0.7832168 8.293336
## 63   Inception       2010 0.01594069  0.7818182 8.278526
## 64   Inception       2010 0.01427247  0.7777778 8.235743
## 65   Inception       2010 0.01482854  0.7766990 8.224320
## 66   Inception       2010 0.03067655  0.7733645 8.189012
## 67   Inception       2010 0.01399444  0.7704082 8.157708
## 68   Inception       2010 0.02789620  0.7678571 8.130695
## 69   Inception       2010 0.03438369  0.7540650 7.984653

Interestingly, Inception does not appear on the left-hand side of any rule. On the right-hand side, however, because Inception is a very popular movie, we have a much larger scope to work with, 69 rules to be exact. Moreover, all but one of the lift values are above 8 (the lowest is about 7.98), indicating a strong dependency between the two titles in each rule.
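The empty left-hand result can be explained with the measures themselves. The sketch below takes the figures printed for the rule Prometheus => Inception and, using only the standard definitions confidence = support(X, Y) / support(X) and lift = confidence / support(Y), recovers the individual supports and the confidence of the reversed rule. The variable names are introduced here for illustration.

```r
# Figures copied from row 1 of InceptionRight (Prometheus => Inception)
support_xy <- 0.01417980  # share of users who rated both movies
confidence <- 0.9107143   # P(Inception | Prometheus)
lift       <- 9.643383

# lift = confidence / support(Y), hence support(Inception) = confidence / lift
support_inception <- confidence / lift

# confidence = support(X, Y) / support(X), hence support(Prometheus):
support_prometheus <- support_xy / confidence

# Confidence of the reversed rule Inception => Prometheus
reverse_confidence <- support_xy / support_inception

round(c(support_inception  = support_inception,
        support_prometheus = support_prometheus,
        reverse_confidence = reverse_confidence), 4)
```

Under these figures the reversed rule has a confidence of roughly 0.15, far below the confidence of any rule printed in this section (all are 0.75 or higher). Lift is symmetric, so it would be unchanged. Because Inception was rated by so many users, only a small share of them rated any one other title, which is presumably why no rule with Inception as the antecedent survived the mining step.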
We can also spot a few different tendencies in how Inception is associated with other titles. For example, movies such as Cloud Atlas, Source Code or Limitless base their connection on plot, themes and genre. Looper and Shutter Island are interesting cases that connect through both themes and cast: Inception and Shutter Island both deal with mental illness, while Inception and Looper both revolve around fight scenes and action-packed sequences. The pairs also share actors, as Leonardo DiCaprio stars in the first pair and Joseph Gordon-Levitt in the second. We also see an association by director in the pair with The Dark Knight Rises (Batman saga), which, like Inception, was directed by Christopher Nolan.
It is fascinating how many kinds of connection one movie can create, providing something worthwhile for everyone.
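To turn such rules into actual suggestions, one option is to look up every rule whose antecedent the user has already seen and return the consequents they have not. The sketch below is one possible way to do this, not the method used in this report: `recommend_for()` and `rules_sample` are hypothetical names, the sample rows are copied from outputs printed above, and titles are matched exactly as they appear in `Left_title`.

```r
library(dplyr)

# Hypothetical stand-in for associationRules (rows copied from this report)
rules_sample <- data.frame(
  Left_title  = c("Cloud Atlas", "Source Code",
                  "Dawn of the Dead", "Terminator Salvation"),
  Right_title = c("Inception", "Inception",
                  "Shaun of the Dead", "Terminator, The"),
  confidence  = c(0.9078014, 0.8976378, 0.7569444, 0.7746479),
  lift        = c(9.612539, 9.504918, 12.662683, 5.484548),
  stringsAsFactors = FALSE
)

# Hypothetical helper: suggest unseen consequents for a set of watched titles
recommend_for <- function(rules, watched, n = 5) {
  rules %>%
    # keep rules triggered by a watched movie that point to an unseen one
    filter(Left_title %in% watched, !Right_title %in% watched) %>%
    # strongest rule first, then one row per suggested title
    arrange(desc(lift)) %>%
    distinct(Right_title, .keep_all = TRUE) %>%
    select(Right_title, because_of = Left_title, confidence, lift) %>%
    head(n)
}

recs <- recommend_for(rules_sample,
                      watched = c("Cloud Atlas", "Source Code", "Dawn of the Dead"))
recs
```

Ranking by lift favours strong but possibly niche links; ranking by confidence or support instead would favour widely rated titles such as Inception.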

Conclusion

Implementing association rules can be extremely useful in the process of recommendation and of establishing behaviour patterns. It is a great technique for extracting useful information from a dataset that is hard to read or too vast in its dimensions. The algorithm transforms the provided dataset into readable and useful outputs, comprehensible even to a beginner. This can be a major advantage in a business environment, where presenting conclusions and outputs to someone who is not familiar with the workings of such algorithms can sometimes be a tedious procedure. Association rules rely on elementary probability and statistics, so almost anyone can understand their implications.
The measure can be modified and implemented in many ways, depending on the user’s interest. A deeper look into the outputs can establish additional rules for a more detailed analysis.
Applying association rules to the dataset of movie ratings produced a set of very interesting rules, in which the dependency between two movies can rest on different factors, or on a combination of several of them. Although the presented technique is not a sophisticated method for establishing a general recommendation pattern, it revealed underlying relationships between the movies. Such an approach can also be incorporated in many activities, for instance in behaviour analysis, product suggestion or a marketing campaign.

References

Djenouri, Y., Belhadi, A., Fournier-Viger, P., & Fujita, H. (2018). Mining diversified association rules in big datasets: A cluster/GPU/genetic approach. Information Sciences, 459, 117-134. Retrieved from https://www.sciencedirect.com/science/article/abs/pii/S0020025518303980

Fjällström, P. (2016). A way to compare measures in association rule mining. Retrieved from https://www.diva-portal.org/smash/get/diva2:956424/FULLTEXT01.pdf

IBM. (2021a). Confidence in an association rule. Retrieved from https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.im.model.doc/c_confidence_in_an_association_rule.html

IBM. (2021b). Lift in an association rule. Retrieved from https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.1.0/com.ibm.im.model.doc/c_lift_in_an_association_rule.html

Karthik, B. (2020). Movie Recommendation System. Retrieved from https://www.kaggle.com/bandikarthik/movie-recommendation-system?select=ratings.csv