Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe.
Your deliverables should include your SQL scripts and your R Markdown code, posted to GitHub.
SQL Code: DROP DATABASE IF EXISTS movie; CREATE database movie;
CREATE TABLE users (userID int not null auto_increment, firstname varchar(20), lastname varchar(20), primary key (userID));
INSERT INTO users (firstname, lastname) VALUES (‘Fawad’,‘Younus’), (‘Saad’,‘Younus’), (‘Mariam’,‘Shabbir’), (‘Wajidul’,‘Hussain’), (‘Atib’,‘Ali’);
CREATE TABLE movies ( movieID int not null auto_increment, title varchar (30), genre varchar(20), RottenTom float, primary key (movieID));
INSERT INTO movies (title, genre, RottenTom) VALUES (‘Crazy Rich Asians’,‘Action’,9.3), (‘The Meg’,‘Action’,4.6), (‘Mission Impossible’,‘Action’,9.7), (‘Christopher Robin’,‘Family’,7.0), (‘The Nun’,‘Horror’,2.8), (‘Incredibles 2’,‘Animation’,9.4);
CREATE TABLE reviews( userID int, movieID int, userrating int);
INSERT INTO reviews (userID, movieID, userrating) VALUES (1,1,4),(1,2,4),(1,3,3),(1,4,4),(1,5,4),(1,6,5), (2,1,3),(2,2,3),(2,3,2),(2,4,5),(2,5,5),(2,6,3), (3,1,5),(3,2,5),(3,3,5),(3,4,4),(3,5,5),(3,6,3), (4,1,4),(4,2,3),(4,3,2),(4,4,2),(4,5,4),(4,6,3), (5,1,5),(5,2,3),(5,3,3),(5,4,4),(5,5,4),(5,6,3);
SELECT users.firstname, users.lastname, movies.title, movies.genre, movies.RottenTom, reviews.userrating from reviews JOIN users on users.userID = reviews.userID JOIN movies on reviews.movieID = movies.movieID ORDER BY title;
Load Library
library(RMySQL)
## Loading required package: DBI
Pulling SQL database into R studio:
rmysql.settingsfile<-"/etc/my.cnf"
con <- dbConnect(RMySQL::MySQL(), default.file=rmysql.settingsfile, dbname = "sys", username='root', password='Veloster76')
sql <- "SELECT users.firstname, users.lastname, movies.title, movies.genre, movies.RottenTom, reviews.userrating from reviews JOIN users on users.userID = reviews.userID JOIN movies on reviews.movieID = movies.movieID ORDER BY title;"
res <- dbGetQuery(con, sql)
res
## firstname lastname title genre RottenTom userrating
## 1 Fawad Younus Christopher Robin Family 7.0 4
## 2 Saad Younus Christopher Robin Family 7.0 5
## 3 Mariam Shabbir Christopher Robin Family 7.0 4
## 4 Wajidul Hussain Christopher Robin Family 7.0 2
## 5 Atib Ali Christopher Robin Family 7.0 4
## 6 Fawad Younus Crazy Rich Asians Action 9.3 4
## 7 Saad Younus Crazy Rich Asians Action 9.3 3
## 8 Mariam Shabbir Crazy Rich Asians Action 9.3 5
## 9 Wajidul Hussain Crazy Rich Asians Action 9.3 4
## 10 Atib Ali Crazy Rich Asians Action 9.3 5
## 11 Fawad Younus Incredibles 2 Animation 9.4 5
## 12 Saad Younus Incredibles 2 Animation 9.4 3
## 13 Mariam Shabbir Incredibles 2 Animation 9.4 3
## 14 Wajidul Hussain Incredibles 2 Animation 9.4 3
## 15 Atib Ali Incredibles 2 Animation 9.4 3
## 16 Fawad Younus Mission Impossible Action 9.7 3
## 17 Saad Younus Mission Impossible Action 9.7 2
## 18 Mariam Shabbir Mission Impossible Action 9.7 5
## 19 Wajidul Hussain Mission Impossible Action 9.7 2
## 20 Atib Ali Mission Impossible Action 9.7 3
## 21 Fawad Younus The Meg Action 4.6 4
## 22 Saad Younus The Meg Action 4.6 3
## 23 Mariam Shabbir The Meg Action 4.6 5
## 24 Wajidul Hussain The Meg Action 4.6 3
## 25 Atib Ali The Meg Action 4.6 3
## 26 Fawad Younus The Nun Horror 2.8 4
## 27 Saad Younus The Nun Horror 2.8 5
## 28 Mariam Shabbir The Nun Horror 2.8 5
## 29 Wajidul Hussain The Nun Horror 2.8 4
## 30 Atib Ali The Nun Horror 2.8 4
Mean of reviews vs RT:
View1=aggregate(res[, 6], list(res$title, res$RottenTom), mean)
colnames(View1) <- c("Move_Title", "User_Rank","RT_Rank")
View1
## Move_Title User_Rank RT_Rank
## 1 The Nun 2.8 4.4
## 2 The Meg 4.6 3.6
## 3 Christopher Robin 7.0 3.8
## 4 Crazy Rich Asians 9.3 4.2
## 5 Incredibles 2 9.4 3.4
## 6 Mission Impossible 9.7 3.0