Introduction
I first created a survey of 6 movies which I have 5 persons responding. After saving the survey in csv, I stored the results in MySQL database.
Approach
I will analyze the rating of each movie to know what is the most rated movie. Then, go through each friend’s rating of individual movies and overall rating of movies. This will help me to know what type of the movie the friend likes and if he or she is a fun of movies. Then, after I will recommend a movie if necessary.
# Load the results from MySQL to R
library(tidyverse)
## -- Attaching packages --------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Loading required package: DBI
# Accessing MySQL data base with password stored in a settings file
#rmariadb.settingsfile <- "C:\\data\\credentials.cnf"
#rmariadb.db <- "credentials"
#mydb <- dbConnect(RMariaDB::MariaDB(),default.file = rmariadb.settingsfile,group = rmariadb.db)
mydb <-dbConnect(RMariaDB::MariaDB(), user='root', password='mypassword', dbname ='labs', host='localhost')
# Store the result in a dataframe movies
movies<- dbGetQuery(mydb, "SELECT * FROM movies")
library(tidyverse)
glimpse(movies)
## Rows: 30
## Columns: 3
## $ Movie_title <chr> "American son", "The last thing he wanted", "Spenser co...
## $ Friend_name <chr> "Batila", "Batila", "Batila", "Batila", "Batila", "Bati...
## $ Rating <int> 5, 4, 5, 3, 4, 5, NA, 4, 3, NA, 4, 5, NA, 5, 4, 4, 5, 5...
Handle missing values
# Replace all missing values (a move not seen by a person) by 0
movies[is.na(movies)] = 0
## Movie_title Friend_name Rating
## Length:30 Length:30 Min. :0.000
## Class :character Class :character 1st Qu.:4.000
## Mode :character Mode :character Median :4.000
## Mean :3.633
## 3rd Qu.:5.000
## Max. :5.000
Best Movie by Average rating
movies %>%
group_by(Movie_title) %>%
summarise(mean_rating = mean(Rating)) %>%
arrange(desc(mean_rating))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 6 x 2
## Movie_title mean_rating
## <chr> <dbl>
## 1 Inside Man Most Wanted 4.8
## 2 Extraction 4.4
## 3 The last thing he wanted 4.2
## 4 Spenser confidential 3.4
## 5 Six Underground 3.2
## 6 American son 1.8
movies %>%
group_by(Movie_title) %>%
summarise(mean_rating = mean(Rating)) %>%
ggplot(., aes(y = Movie_title, x= mean_rating, fill = mean_rating)) + geom_bar(stat = 'identity')
## `summarise()` ungrouping output (override with `.groups` argument)

Individual rating by movie
# Batila's preference
batila_movie <- movies %>%
filter(Friend_name == "Batila")
ggplot(data = batila_movie, aes( y = Movie_title, x = Rating, fill = Rating)) + geom_bar(stat = 'identity')

# Musengie's preference
musengie_movie <- movies %>%
filter(Friend_name == "Musengie")
ggplot(data = musengie_movie, aes( y = Movie_title, x = Rating, fill = Rating)) + geom_bar(stat = 'identity')

# Jonathan's preference
jonathan_movie <- movies %>%
filter(Friend_name == "Jonathan")
ggplot(data = jonathan_movie, aes( y = Movie_title, x = Rating, fill = Rating)) + geom_bar(stat = 'identity')

# Herve's preference
herve_movie <- movies %>%
filter(Friend_name == "Herve")
ggplot(data = herve_movie, aes( y = Movie_title, x = Rating, fill = Rating)) + geom_bar(stat = 'identity')

MaV’s preference
MaV_movie <- movies %>%
filter(Friend_name == "MaV")
ggplot(data = MaV_movie, aes( y = Movie_title, x = Rating, fill = Rating)) + geom_bar(stat = 'identity')
## Each friend’s overall rating
movies %>%
group_by(Friend_name) %>%
summarise(mean_rating = mean(Rating)) %>%
ggplot(., aes(y = Friend_name, x= mean_rating, fill = mean_rating)) + geom_bar(stat = 'identity')
## `summarise()` ungrouping output (override with `.groups` argument)

Conclusion & recommendation
I realized that Batila has watched all the ovies and his overall rating for those movies were very high. Nothing I can recommend to him from my list of six movies. Musengie had watched 4 out 6 movies but her rating is kind of the average so I would to recommend a very good movie to her. When it comes to movie, “American son” had an average low rating just because it was watched only by two two people although the friends who’d seen, like it. It is a good option to recommend to others.
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
# Construct a table frame for friends and recommended movies
f <- c("Musengie", "Jonathan", "Herve", "MaV")
m <- c("Six Underground", "American son", "American son", "Spencer confidential")
df <- data.frame(f,m)
colnames(df) <- c("Friend's name", "Recommended Movie")
# Movie recomendation table
df %>%
kbl(caption = "Movie recommendation", align = 'c') %>%
kable_material(c("striped", "hover")) %>%
row_spec(0, color = "indigo")
Movie recommendation
|
Friend’s name
|
Recommended Movie
|
|
Musengie
|
Six Underground
|
|
Jonathan
|
American son
|
|
Herve
|
American son
|
|
MaV
|
Spencer confidential
|
LS0tDQp0aXRsZTogIkRBVEE2MDcgQVNTMjogU1FMIGFuZCBSIg0KYXV0aG9yOiAiSmVyZWQgQXRha3kiDQpkYXRlOiAiOS8xMS8yMDIwIg0Kb3V0cHV0OiANCiAgb3BlbmludHJvOjpsYWJfcmVwb3J0OiBkZWZhdWx0DQogIGh0bWxfZG9jdW1lbnQ6DQogICAgbnVtYmVyX3NlY3Rpb25zOiB5ZXMNCi0tLQ0KDQpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0NCmtuaXRyOjpvcHRzX2NodW5rJHNldChlY2hvID0gVFJVRSkNCmBgYA0KDQoNCiMjIEludHJvZHVjdGlvbg0KDQoNCjxzdHlsZT4NCmRpdi5hcXVhbWFyaW5lIHsgYmFja2dyb3VuZC1jb2xvcjojN2ZmZmQ0OyBib3JkZXItcmFkaXVzOiAxMHB4OyBwYWRkaW5nOiA1cHg7fQ0KPC9zdHlsZT4NCjxkaXYgY2xhc3MgPSAiYXF1YW1hcmluZSI+DQoNCg0KSSBmaXJzdCBjcmVhdGVkIGEgc3VydmV5IG9mIDYgbW92aWVzIHdoaWNoIEkgaGF2ZSA1IHBlcnNvbnMgcmVzcG9uZGluZy4NCkFmdGVyIHNhdmluZyB0aGUgc3VydmV5IGluIGNzdiwgSSBzdG9yZWQgdGhlIHJlc3VsdHMgaW4gTXlTUUwgZGF0YWJhc2UuDQoNCg0KPC9kaXY+IFxoZmlsbFxicmVhaw0KDQoNCg0KIyMgQXBwcm9hY2gNCg0KDQo8c3R5bGU+DQpkaXYuYXF1YW1hcmluZSB7IGJhY2tncm91bmQtY29sb3I6IzdmZmZkNDsgYm9yZGVyLXJhZGl1czogMTBweDsgcGFkZGluZzogNXB4O30NCjwvc3R5bGU+DQo8ZGl2IGNsYXNzID0gImFxdWFtYXJpbmUiPg0KDQoNCkkgd2lsbCBhbmFseXplIHRoZSByYXRpbmcgb2YgZWFjaCBtb3ZpZSB0byBrbm93IHdoYXQgaXMgdGhlIG1vc3QgcmF0ZWQgbW92aWUuDQpUaGVuLCBnbyB0aHJvdWdoIGVhY2ggZnJpZW5kJ3MgcmF0aW5nIG9mIGluZGl2aWR1YWwgbW92aWVzIGFuZCBvdmVyYWxsIHJhdGluZyANCm9mIG1vdmllcy4gVGhpcyB3aWxsIGhlbHAgbWUgdG8ga25vdyB3aGF0IHR5cGUgb2YgdGhlIG1vdmllIHRoZSBmcmllbmQgbGlrZXMgDQphbmQgaWYgaGUgb3Igc2hlIGlzIGEgZnVuIG9mIG1vdmllcy4gVGhlbiwgYWZ0ZXIgSSB3aWxsIHJlY29tbWVuZCBhIG1vdmllIA0KaWYgbmVjZXNzYXJ5LiANCg0KPC9kaXY+IFxoZmlsbFxicmVhaw0KDQoNCmBgYHtyfQ0KDQojIExvYWQgdGhlIHJlc3VsdHMgZnJvbSBNeVNRTCB0byBSDQpsaWJyYXJ5KHRpZHl2ZXJzZSkNCmxpYnJhcnkoUk15U1FMKQ0KDQpgYGANCg0KYGBge3IgTVNRTH0NCg0KIyBBY2Nlc3NpbmcgTXlTUUwgZGF0YSBiYXNlIHdpdGggcGFzc3dvcmQgc3RvcmVkIGluIGEgc2V0dGluZ3MgZmlsZQ0KDQojcm1hcmlhZGIuc2V0dGluZ3NmaWxlIDwtICJDOlxcZGF0YVxcY3JlZGVudGlhbHMuY25mIg0KI3JtYXJpYWRiLmRiIDwtICJjcmVkZW50aWFscyINCiNteWRiIDwtIGRiQ29ubmVjdChSTWFyaWFEQjo6TWFyaWFEQigpLGRlZmF1bHQuZmlsZSA9IHJtYXJpYWRiLnNldHRpbmdzZmlsZSxncm91cCA9IHJtYXJpYWRiLmRiKQ0KDQpteWRiIDwtZGJDb25uZWN0KFJNYXJpYURCOjpNYXJpYURCKCksIHVzZXI9J3Jvb3QnLCBwYXNzd29yZD0nbXlwYXNzd29yZCcsIGRibmFtZSA9J2xhYnMnLCBob3N0PSdsb2NhbGhvc3QnKQ0KDQpgYGANCg0KYGBge3J9DQojIFN0b3JlIHRoZSByZXN1bHQgaW4gYSBkYXRhZnJhbWUgbW92aWVzDQoNCm1vdmllczwtIGRiR2V0UXVlcnkobXlkYiwgIlNFTEVDVCAqIEZST00gbW92aWVzIikNCmBgYA0KDQpgYGB7ciBwb3B1bGF0ZS1kYXRhLWZyYW1lfQ0KbGlicmFyeSh0aWR5dmVyc2UpDQpnbGltcHNlKG1vdmllcykNCg0KYGBgDQoNCiMjIEhhbmRsZSBtaXNzaW5nIHZhbHVlcw0KDQpgYGB7ciBtaXNzaW5nLWRhdGF9DQojIFJlcGxhY2UgYWxsIG1pc3NpbmcgdmFsdWVzIChhIG1vdmUgbm90IHNlZW4gYnkgYSBwZXJzb24pIGJ5IDANCg0KbW92aWVzW2lzLm5hKG1vdmllcyldID0gMA0KYGBgDQoNCg0KYGBge3Igc3VtbWFyaXNlLW1vdmllc30NCnN1bW1hcnkobW92aWVzKQ0KDQpgYGANCiMjIEJlc3QgTW92aWUgYnkgQXZlcmFnZSByYXRpbmcNCg0KDQpgYGB7ciBtb3ZpZXMtcmF0aW5nLWF2ZXJhZ2V9DQoNCm1vdmllcyAlPiUNCiAgZ3JvdXBfYnkoTW92aWVfdGl0bGUpICU+JQ0KICBzdW1tYXJpc2UobWVhbl9yYXRpbmcgPSBtZWFuKFJhdGluZykpICU+JQ0KICBhcnJhbmdlKGRlc2MobWVhbl9yYXRpbmcpKQ0KICANCmBgYA0KDQoNCmBgYHtyIG1vdmllcy1yYXRpbmctYXZlcmFnZS1wbG90fQ0KDQptb3ZpZXMgJT4lDQogIGdyb3VwX2J5KE1vdmllX3RpdGxlKSAlPiUNCiAgc3VtbWFyaXNlKG1lYW5fcmF0aW5nID0gbWVhbihSYXRpbmcpKSAlPiUNCiAgZ2dwbG90KC4sIGFlcyh5ID0gTW92aWVfdGl0bGUsIHg9IG1lYW5fcmF0aW5nLCBmaWxsID0gbWVhbl9yYXRpbmcpKSArIGdlb21fYmFyKHN0YXQgPSAnaWRlbnRpdHknKQ0KICANCmBgYA0KDQoNCiMjIEluZGl2aWR1YWwgcmF0aW5nIGJ5IG1vdmllDQoNCmBgYHtyIEJhdGlsYS1wcmVmZXJlbmNlfQ0KDQojIEJhdGlsYSdzIHByZWZlcmVuY2UNCg0KYmF0aWxhX21vdmllIDwtIG1vdmllcyAlPiUNCiAgZmlsdGVyKEZyaWVuZF9uYW1lID09ICJCYXRpbGEiKQ0KDQpnZ3Bsb3QoZGF0YSA9IGJhdGlsYV9tb3ZpZSwgYWVzKCB5ID0gTW92aWVfdGl0bGUsIHggPSBSYXRpbmcsIGZpbGwgPSBSYXRpbmcpKSArIGdlb21fYmFyKHN0YXQgPSAnaWRlbnRpdHknKQ0KDQoNCg0KYGBgDQoNCmBgYHtyIE11c2VuZ2llLXByZWZlcmVuY2V9DQoNCiMgTXVzZW5naWUncyBwcmVmZXJlbmNlDQoNCm11c2VuZ2llX21vdmllIDwtIG1vdmllcyAlPiUNCiAgZmlsdGVyKEZyaWVuZF9uYW1lID09ICJNdXNlbmdpZSIpDQoNCmdncGxvdChkYXRhID0gbXVzZW5naWVfbW92aWUsIGFlcyggeSA9IE1vdmllX3RpdGxlLCB4ID0gUmF0aW5nLCBmaWxsID0gUmF0aW5nKSkgKyBnZW9tX2JhcihzdGF0ID0gJ2lkZW50aXR5JykNCg0KDQoNCmBgYA0KDQoNCg0KYGBge3IgSm9uYXRoYW4tcHJlZmVyZW5jZX0NCg0KIyBKb25hdGhhbidzIHByZWZlcmVuY2UNCg0Kam9uYXRoYW5fbW92aWUgPC0gbW92aWVzICU+JQ0KICBmaWx0ZXIoRnJpZW5kX25hbWUgPT0gIkpvbmF0aGFuIikNCg0KZ2dwbG90KGRhdGEgPSBqb25hdGhhbl9tb3ZpZSwgYWVzKCB5ID0gTW92aWVfdGl0bGUsIHggPSBSYXRpbmcsIGZpbGwgPSBSYXRpbmcpKSArIGdlb21fYmFyKHN0YXQgPSAnaWRlbnRpdHknKQ0KDQoNCg0KYGBgDQoNCg0KDQpgYGB7ciBIZXJ2ZS1wcmVmZXJlbmNlfQ0KDQojIEhlcnZlJ3MgcHJlZmVyZW5jZQ0KDQpoZXJ2ZV9tb3ZpZSA8LSBtb3ZpZXMgJT4lDQogIGZpbHRlcihGcmllbmRfbmFtZSA9PSAiSGVydmUiKQ0KDQpnZ3Bsb3QoZGF0YSA9IGhlcnZlX21vdmllLCBhZXMoIHkgPSBNb3ZpZV90aXRsZSwgeCA9IFJhdGluZywgZmlsbCA9IFJhdGluZykpICsgZ2VvbV9iYXIoc3RhdCA9ICdpZGVudGl0eScpDQoNCg0KYGBgDQoNCg0KIyMgTWFWJ3MgcHJlZmVyZW5jZQ0KDQpgYGB7ciBNYVYtcHJlZmVyZW5jZX0NCk1hVl9tb3ZpZSA8LSBtb3ZpZXMgJT4lDQogIGZpbHRlcihGcmllbmRfbmFtZSA9PSAiTWFWIikNCg0KZ2dwbG90KGRhdGEgPSBNYVZfbW92aWUsIGFlcyggeSA9IE1vdmllX3RpdGxlLCB4ID0gUmF0aW5nLCBmaWxsID0gUmF0aW5nKSkgKyBnZW9tX2JhcihzdGF0ID0gJ2lkZW50aXR5JykNCg0KDQpgYGANCiMjIEVhY2ggZnJpZW5kJ3Mgb3ZlcmFsbCByYXRpbmcNCg0KYGBge3J9DQoNCm1vdmllcyAlPiUNCiAgZ3JvdXBfYnkoRnJpZW5kX25hbWUpICU+JQ0KICBzdW1tYXJpc2UobWVhbl9yYXRpbmcgPSBtZWFuKFJhdGluZykpICU+JQ0KICBnZ3Bsb3QoLiwgYWVzKHkgPSBGcmllbmRfbmFtZSwgeD0gbWVhbl9yYXRpbmcsIGZpbGwgPSBtZWFuX3JhdGluZykpICsgZ2VvbV9iYXIoc3RhdCA9ICdpZGVudGl0eScpDQpgYGANCg0KDQoNCiMjIENvbmNsdXNpb24gJiByZWNvbW1lbmRhdGlvbg0KDQo8c3R5bGU+DQpkaXYuYXF1YW1hcmluZSB7IGJhY2tncm91bmQtY29sb3I6IzdmZmZkNDsgYm9yZGVyLXJhZGl1czogMTBweDsgcGFkZGluZzogNXB4O30NCjwvc3R5bGU+DQo8ZGl2IGNsYXNzID0gImFxdWFtYXJpbmUiPg0KDQoNCkkgcmVhbGl6ZWQgdGhhdCBCYXRpbGEgaGFzIHdhdGNoZWQgYWxsIHRoZSAgb3ZpZXMgYW5kIGhpcyBvdmVyYWxsIHJhdGluZyANCmZvciB0aG9zZSBtb3ZpZXMgd2VyZSB2ZXJ5IGhpZ2guIE5vdGhpbmcgSSBjYW4gcmVjb21tZW5kIHRvIGhpbSBmcm9tIG15IA0KbGlzdCBvZiBzaXggbW92aWVzLiBNdXNlbmdpZSBoYWQgd2F0Y2hlZCA0IG91dCA2IG1vdmllcyBidXQgaGVyIHJhdGluZyBpcyBraW5kDQpvZiB0aGUgYXZlcmFnZSBzbyBJIHdvdWxkIHRvIHJlY29tbWVuZCBhIHZlcnkgZ29vZCBtb3ZpZSB0byBoZXIuDQpXaGVuIGl0IGNvbWVzIHRvIG1vdmllLCAiQW1lcmljYW4gc29uIiBoYWQgYW4gYXZlcmFnZSBsb3cgcmF0aW5nIGp1c3QgYmVjYXVzZSANCml0IHdhcyB3YXRjaGVkIG9ubHkgYnkgdHdvIHR3byBwZW9wbGUgYWx0aG91Z2ggdGhlIGZyaWVuZHMgd2hvJ2Qgc2VlbiwgbGlrZSBpdC4NCkl0IGlzIGEgZ29vZCBvcHRpb24gdG8gcmVjb21tZW5kIHRvIG90aGVycy4NCg0KPC9kaXY+IFxoZmlsbFxicmVhaw0KDQoNCg0KYGBge3J9DQpsaWJyYXJ5KGthYmxlRXh0cmEpDQoNCiMgQ29uc3RydWN0IGEgdGFibGUgZnJhbWUgZm9yIGZyaWVuZHMgYW5kIHJlY29tbWVuZGVkIG1vdmllcw0KDQpmIDwtIGMoIk11c2VuZ2llIiwgIkpvbmF0aGFuIiwgIkhlcnZlIiwgIk1hViIpDQptIDwtIGMoIlNpeCBVbmRlcmdyb3VuZCIsICJBbWVyaWNhbiBzb24iLCAiQW1lcmljYW4gc29uIiwgIlNwZW5jZXIgY29uZmlkZW50aWFsIikNCg0KDQpkZiA8LSBkYXRhLmZyYW1lKGYsbSkNCg0KDQpjb2xuYW1lcyhkZikgPC0gYygiRnJpZW5kJ3MgbmFtZSIsICJSZWNvbW1lbmRlZCBNb3ZpZSIpDQoNCg0KICANCiMgTW92aWUgcmVjb21lbmRhdGlvbiB0YWJsZQ0KDQpkZiAlPiUNCiAga2JsKGNhcHRpb24gPSAiTW92aWUgcmVjb21tZW5kYXRpb24iLCBhbGlnbiA9ICdjJykgJT4lDQogIGthYmxlX21hdGVyaWFsKGMoInN0cmlwZWQiLCAiaG92ZXIiKSkgJT4lDQogIHJvd19zcGVjKDAsIGNvbG9yID0gImluZGlnbyIpDQoNCmBgYA0KDQoNCg0KDQo=