Assignment 2 : SQL and R

Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe. Your deliverables should include your SQL scripts and your R Markdown code, posted to GitHub.

Import Library

library(RMySQL)
## Loading required package: DBI

Read the database from MySQL

moviedb = dbConnect(MySQL(),user = 'root', password = pwd, dbname = 'data607', host='localhost')
dbListTables(moviedb)
## [1] "friends"     "movierating"
dbListFields(moviedb, 'Friends')
## [1] "friend_id"   "friend_name"
dbListFields(moviedb, 'MovieRating')
## [1] "movie_name"   "friend_id"    "movie_rating"

Join the tables in SQL and storing the result into R

ratings <- dbGetQuery(moviedb, 'SELECT m.movie_name, f.friend_name, m.movie_rating FROM movierating m 
          LEFT JOIN friends f on m.friend_id = f.friend_id 
          ORDER BY m.movie_rating desc')

Check if there are any null values in the dataset

any(is.null(ratings))
## [1] FALSE

According to the result, there are no NULL values and we can move on.

Display the Movie Rating table by ratings

library(knitr)
library(kableExtra)
kable(ratings, "html", caption = "Ratings of 6 Movies from 6 Friends") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "bordered", "condensed")) %>%
  column_spec(3, bold = T) %>%
  row_spec(row=0, background = "yellow") %>%
  scroll_box(width = "600px", height = "400px")
Ratings of 6 Movies from 6 Friends
movie_name friend_name movie_rating
Toy Story 4 Karen 5
Detective Pikachu Karen 5
Detective Pikachu Paul 5
Alita Cindy 5
Glass Jack 4
Glass Cindy 4
Black Panther Paul 4
Glass Paul 4
Toy Story 4 Louis 4
Toy Story 4 Paul 4
Alita Karen 4
Black Panther Karen 4
Glass Louis 4
Aquaman Jack 4
Black Panther Jack 4
Alita Chloe 4
Alita Paul 3
Alita Louis 3
Black Panther Louis 3
Black Panther Chloe 3
Alita Jack 3
Glass Karen 3
Aquaman Cindy 3
Detective Pikachu Cindy 3
Aquaman Paul 2
Aquaman Louis 2
Toy Story 4 Chloe 2
Aquaman Chloe 2
Aquaman Karen 2
Detective Pikachu Chloe 2
Glass Chloe 2
Toy Story 4 Jack 2
Detective Pikachu Louis 1
Detective Pikachu Jack 1
Toy Story 4 Cindy 1
Black Panther Cindy 1

library(ggplot2)
ggplot(ratings, aes(x=reorder(movie_name, movie_rating),  y=movie_rating, fill = friend_name))  +
    geom_bar(stat="identity")  +
    geom_text(aes(label=movie_rating, legend="Friends"), size=3, position=position_stack(vjust=0.5)) +
    ggtitle("Ratings of 6 Movies by 6 Friends") +
    labs(x="Movies", y="Ratings") + guides(fill=guide_legend(title="Friends"))
## Warning: Ignoring unknown aesthetics: legend

Get the average rating of each movie and sort in descending order

average <- aggregate(movie_rating~movie_name, data=ratings, FUN=mean)
average[order(-average$movie_rating),]

Conclusion

From the two tables generated above, the ratings got from 6 Friends sorting from high to low is: Alita, Glass, Black Panther, Toy Story 4, Detective Pikachu, and Aquaman.

Disconnect my database

dbDisconnect(moviedb)
## [1] TRUE