Description

Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe.

Load data into R

library(RMySQL)
## Loading required package: DBI
drv <- dbDriver("MySQL")
con <- dbConnect(drv, user="root", password = "password", dbname="movie", host="localhost")

Show two tables of data

#All the data
sql<-"select a.id, a.title,a.genre,b.personName,b.score
      from movie a inner join review b
      on a.id=b.movieID
     
      order by title
"
df <- data.frame(dbGetQuery(con, sql))
df
##    id                     title       genre personName score
## 1   5           Blink Of An Eye Documentary       paul     4
## 2   5           Blink Of An Eye Documentary      Linda     4
## 3   5           Blink Of An Eye Documentary      Stacy     3
## 4   5           Blink Of An Eye Documentary       Andy     3
## 5   5           Blink Of An Eye Documentary        Tom     3
## 6   1        Crazy Rich Asians      Romance       paul     5
## 7   1        Crazy Rich Asians      Romance      Linda     4
## 8   1        Crazy Rich Asians      Romance      Stacy     4
## 9   1        Crazy Rich Asians      Romance       Andy     3
## 10  1        Crazy Rich Asians      Romance        Tom     4
## 11  2                 Good Boys   Adventure       paul     4
## 12  2                 Good Boys   Adventure      Linda     3
## 13  2                 Good Boys   Adventure      Stacy     3
## 14  2                 Good Boys   Adventure       Andy     3
## 15  2                 Good Boys   Adventure        Tom     3
## 16  6                 Overcomer       Drama       paul     2
## 17  6                 Overcomer       Drama      Linda     1
## 18  6                 Overcomer       Drama      Stacy     2
## 19  6                 Overcomer       Drama       Andy     2
## 20  6                 Overcomer       Drama        Tom     1
## 21  4             Scary Stories      Horror       paul     3
## 22  4             Scary Stories      Horror      Linda     4
## 23  4             Scary Stories      Horror      Stacy     5
## 24  4             Scary Stories      Horror       Andy     4
## 25  4             Scary Stories      Horror        Tom     4
## 26  3 Spider-Man: Far from Home      Sci-Fi       paul     5
## 27  3 Spider-Man: Far from Home      Sci-Fi      Linda     4
## 28  3 Spider-Man: Far from Home      Sci-Fi      Stacy     5
## 29  3 Spider-Man: Far from Home      Sci-Fi       Andy     5
## 30  3 Spider-Man: Far from Home      Sci-Fi        Tom     5
#Table with the average rating 

sql2<-"select a.id, a.title, a.genre,avg(score)
      from movie a inner join review b
      on a.id=b.movieID
      group by title
      order by id
      "
df2 <- data.frame(dbGetQuery(con, sql2))
## Warning in .local(conn, statement, ...): Decimal MySQL column 3 imported as
## numeric
df2
##   id                     title       genre avg.score.
## 1  1        Crazy Rich Asians      Romance        4.0
## 2  2                 Good Boys   Adventure        3.2
## 3  3 Spider-Man: Far from Home      Sci-Fi        4.8
## 4  4             Scary Stories      Horror        4.0
## 5  5           Blink Of An Eye Documentary        3.4
## 6  6                 Overcomer       Drama        1.6

Analysis of the data

We use the bar to compare the average score of the rating.

summary(df)
##        id         title              genre            personName       
##  Min.   :1.0   Length:30          Length:30          Length:30         
##  1st Qu.:2.0   Class :character   Class :character   Class :character  
##  Median :3.5   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :3.5                                                           
##  3rd Qu.:5.0                                                           
##  Max.   :6.0                                                           
##      score    
##  Min.   :1.0  
##  1st Qu.:3.0  
##  Median :4.0  
##  Mean   :3.5  
##  3rd Qu.:4.0  
##  Max.   :5.0
dim(df)
## [1] 30  5
barplot(df2$avg.score.~df2$id, xlab="Movies",ylab="Score",main="Average score of the Rating",names.arg = c("CrazyRich", "GoodBoys", "SpiderMan", "Scary", "BlinkEye", "Overcomer"))