Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe.
Connecting to MySql and reading in the reviews and movies tables from the movies database.
library(DBI)
library(RMySQL)
movie_db = dbConnect(MySQL(), user='root', password=dbPassword, dbname='Movies', host='localhost')
reviews <- dbGetQuery(movie_db, "SELECT * FROM reviewers ORDER BY movie_ID, UsersName")
movies <- dbGetQuery(movie_db, "SELECT movies.Title, AVG(movies.Rotten_Rating), AVG(reviewers.Reviewer_Rating) FROM movies INNER JOIN reviewers ON movies.movie_ID = reviewers.movie_ID GROUP BY movies.Title")Each person gave a rating and a review for each of the 10 recent popular movies.
str(reviews)## 'data.frame': 60 obs. of 4 variables:
## $ movie_ID : int 1 1 1 1 1 1 2 2 2 2 ...
## $ UsersName : chr "Hank" "Jessica" "Kelvin" "Rose" ...
## $ Reviewer_Rating: num 5 5 4 3 1 4 2 4 2 4 ...
## $ Review : chr "Impossible no more!" "Love Deadpool" "Nothing is impossible anymore." "Action Packed!" ...
colnames(reviews) <- c('MovieID', 'ReviewerName', 'Rating', 'Review')
reviews## MovieID ReviewerName Rating Review
## 1 1 Hank 5 Impossible no more!
## 2 1 Jessica 5 Love Deadpool
## 3 1 Kelvin 4 Nothing is impossible anymore.
## 4 1 Rose 3 Action Packed!
## 5 1 Sarah 1 Not a fan
## 6 1 Tristan 4 Entertainment Machine
## 7 2 Hank 2 Again?
## 8 2 Jessica 4 Pretty Good
## 9 2 Kelvin 2 Same old same old
## 10 2 Rose 4 Bring on the sequel!
## 11 2 Sarah 3 Same movie all over again
## 12 2 Tristan 2 70s Disaster movie run amock
## 13 3 Hank 4 Great!
## 14 3 Jessica 3 Nice
## 15 3 Kelvin 3 Not great, but not bad.
## 16 3 Rose 5 Girl power
## 17 3 Sarah 5 Love it!
## 18 3 Tristan 4 Slick and smart
## 19 4 Hank 2 Boring!
## 20 4 Hank 5 When is the next movie?
## 21 4 Jessica 2 OK
## 22 4 Kelvin 3 Interesting new take
## 23 4 Rose 1 Hate Star Wars
## 24 4 Sarah 1 Skip it
## 25 4 Tristan 5 Hilarious
## 26 5 Jessica 4 Wakanda Forever!
## 27 5 Kelvin 2 Where will it end?
## 28 5 Rose 4 Who is Captain Marvel?????
## 29 5 Sarah 3 Marvel all over again
## 30 5 Tristan 4 Quick Witted
## 31 6 Hank 5 Awesome!
## 32 6 Jessica 5 Irreverent
## 33 6 Kelvin 4 Love Deadpool!
## 34 6 Rose 5 Deadpools gonna fix the timeline!
## 35 6 Sarah 4 Blows my mind
## 36 6 Tristan 4 Depths of human depravity
## 37 7 Hank 4 Strong performances
## 38 7 Jessica 1 Just ok
## 39 7 Kelvin 4 Sticks with you.
## 40 7 Rose 4 Intelligent
## 41 7 Sarah 5 Perfect acting!
## 42 7 Tristan 2 Perposterous!
## 43 8 Hank 2 Not scary
## 44 8 Jessica 3 Gothic horror
## 45 8 Kelvin 5 Mesmerizing!
## 46 8 Rose 4 Spooky
## 47 8 Sarah 3 Unnerving
## 48 8 Tristan 4 Slow burn
## 49 9 Hank 2 Dry and Stodgy
## 50 9 Jessica 2 Never comes alive
## 51 9 Kelvin 1 Obvious
## 52 9 Rose 4 Gentle, quiet film
## 53 9 Sarah 2 Generic
## 54 9 Tristan 4 Small, sad, sweet.
## 55 10 Hank 4 Cheeky!
## 56 10 Jessica 5 Inspiring
## 57 10 Kelvin 3 Intriging portrait
## 58 10 Rose 5 Riveting!
## 59 10 Sarah 4 Memorable
## 60 10 Tristan 4 Informative
Each movie is listed with the Rotten Tomatoes average rating and the sample reviewers average ratings.
str(movies)## 'data.frame': 10 obs. of 3 variables:
## $ Title : chr "Mission Impossible: Fallout" "Jurrassic World: Fallen Kingdom" "Oceans 8" "Solo: A Star Wars Story" ...
## $ AVG(movies.Rotten_Rating) : num 4.3 3.3 3 3.5 4.5 ...
## $ AVG(reviewers.Reviewer_Rating): num 3.67 2.83 4 2.71 3.4 ...
colnames(movies) <- c('Title', 'RottenTomatesRating', 'AvgReviewerRating')
movies## Title RottenTomatesRating AvgReviewerRating
## 1 Mission Impossible: Fallout 4.3 3.666667
## 2 Jurrassic World: Fallen Kingdom 3.3 2.833333
## 3 Oceans 8 3.0 4.000000
## 4 Solo: A Star Wars Story 3.5 2.714286
## 5 Avengers: Infinity War 4.5 3.400000
## 6 Deadppol 2 4.1 4.500000
## 7 The Wife 3.8 3.333333
## 8 The Little Stranger 3.4 3.500000
## 9 The Bookshop 3.0 2.500000
## 10 RBG 3.8 4.166667
We plot the ratings side by side to compare the Rotten Tomatoes ratings with our sample reviewers ratings.
colours = c("red","green")
par(mar=c(5,9,4,4))
barplot(t(as.matrix(movies[2:3])),
main="Movie Ratings", xlab = "Average Rating",
names.arg = movies$Title, horiz = TRUE, las = 1,
cex.lab = 1, cex.main = 1.4,
cex.names = 0.5,
beside=TRUE, col=colours)
par(xpd=TRUE)
legend(-2,-4, c("Rotten Tomatoes","Personal Reviewer"),
cex=0.8, bty="n", fill=colours)