Our task is to choose six recent movies. Ask at least five people that we know to rate each of these movie that they have seen on a scale of 1 to 5. Take the results and store them in a SQL database. Load the information into an R dataframe.
Connect to MySQL database and load data
library(DBI)
con <- dbConnect(RMySQL::MySQL(),
user='root',
password='password',
dbname='movies',
host='127.0.0.1'
)
movies_df <- dbGetQuery(con, "SELECT * FROM movies;")
head(movies_df)
## name genre rating1 rating2 rating3 rating4 rating5
## 1 Padman Drama 5 4 4 3 1
## 2 The Ritual Horror 3 3 2 4 2
## 3 A Quite Place Sci-Fi 4 3 4 5 3
## 4 Ready Player One Action 4 3 4 5 3
## 5 The Last Airbender Action 1 1 1 1 2
## 6 Coco Animation 5 4 4 3 3
Which movie has the lowest average rating?
library(tidyverse)
# Add a new column of the average ratings
movies_df$avg_rating <- ((movies_df$rating1+movies_df$rating2+movies_df$rating3+movies_df$rating4+movies_df$rating5)/5)
# Visual Analysis
g <- ggplot(movies_df, aes(name, avg_rating)) +
geom_bar(stat = 'identity', aes(fill=movies_df$genre)) +
labs(title="Average Movie Rating", x="Name", y="Rating") +
ylim(0, 5) +
theme_light() +
scale_fill_discrete(name='Genre') +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
g