Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe.
library(RMySQL)
## Loading required package: DBI
drv <- dbDriver("MySQL")
con <- dbConnect(drv, user="root", password = "password", dbname="movie", host="localhost")
#All the data
sql<-"select a.id, a.title,a.genre,b.personName,b.score
from movie a inner join review b
on a.id=b.movieID
order by title
"
df <- data.frame(dbGetQuery(con, sql))
df
## id title genre personName score
## 1 5 Blink Of An Eye Documentary paul 4
## 2 5 Blink Of An Eye Documentary Linda 4
## 3 5 Blink Of An Eye Documentary Stacy 3
## 4 5 Blink Of An Eye Documentary Andy 3
## 5 5 Blink Of An Eye Documentary Tom 3
## 6 1 Crazy Rich Asians Romance paul 5
## 7 1 Crazy Rich Asians Romance Linda 4
## 8 1 Crazy Rich Asians Romance Stacy 4
## 9 1 Crazy Rich Asians Romance Andy 3
## 10 1 Crazy Rich Asians Romance Tom 4
## 11 2 Good Boys Adventure paul 4
## 12 2 Good Boys Adventure Linda 3
## 13 2 Good Boys Adventure Stacy 3
## 14 2 Good Boys Adventure Andy 3
## 15 2 Good Boys Adventure Tom 3
## 16 6 Overcomer Drama paul 2
## 17 6 Overcomer Drama Linda 1
## 18 6 Overcomer Drama Stacy 2
## 19 6 Overcomer Drama Andy 2
## 20 6 Overcomer Drama Tom 1
## 21 4 Scary Stories Horror paul 3
## 22 4 Scary Stories Horror Linda 4
## 23 4 Scary Stories Horror Stacy 5
## 24 4 Scary Stories Horror Andy 4
## 25 4 Scary Stories Horror Tom 4
## 26 3 Spider-Man: Far from Home Sci-Fi paul 5
## 27 3 Spider-Man: Far from Home Sci-Fi Linda 4
## 28 3 Spider-Man: Far from Home Sci-Fi Stacy 5
## 29 3 Spider-Man: Far from Home Sci-Fi Andy 5
## 30 3 Spider-Man: Far from Home Sci-Fi Tom 5
#Table with the average rating
sql2<-"select a.id, a.title, a.genre,avg(score)
from movie a inner join review b
on a.id=b.movieID
group by title
order by id
"
df2 <- data.frame(dbGetQuery(con, sql2))
## Warning in .local(conn, statement, ...): Decimal MySQL column 3 imported as
## numeric
df2
## id title genre avg.score.
## 1 1 Crazy Rich Asians Romance 4.0
## 2 2 Good Boys Adventure 3.2
## 3 3 Spider-Man: Far from Home Sci-Fi 4.8
## 4 4 Scary Stories Horror 4.0
## 5 5 Blink Of An Eye Documentary 3.4
## 6 6 Overcomer Drama 1.6
We use the bar to compare the average score of the rating.
summary(df)
## id title genre personName
## Min. :1.0 Length:30 Length:30 Length:30
## 1st Qu.:2.0 Class :character Class :character Class :character
## Median :3.5 Mode :character Mode :character Mode :character
## Mean :3.5
## 3rd Qu.:5.0
## Max. :6.0
## score
## Min. :1.0
## 1st Qu.:3.0
## Median :4.0
## Mean :3.5
## 3rd Qu.:4.0
## Max. :5.0
dim(df)
## [1] 30 5
barplot(df2$avg.score.~df2$id, xlab="Movies",ylab="Score",main="Average score of the Rating",names.arg = c("CrazyRich", "GoodBoys", "SpiderMan", "Scary", "BlinkEye", "Overcomer"))