Introduction

The focus of this assignment is to create an R dataframe that shows movie ratings by the gender of the people who rated them. The data will be sourced from the movies_db database in MySQL and combined with a CSV file of population data, located on GitHub. The final R dataframe will have the following columns:

title gender_person rating

library(RMySQL)
library(tidyverse)
library(dplyr)
library(DBI)

Getting and Preparing the Data

Step 1. Connect to MySQL and retrieve the movie ratings stored in a database table.

mydb = dbConnect(MySQL(), user='root', password='Albania777', dbname='movies_db', host='localhost')
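
Writing the password into the script exposes it to anyone who reads the knitted report. A minimal sketch of one alternative, assuming the credentials are exported as environment variables; the variable names MYSQL_USER and MYSQL_PASSWORD and the helper get_db_settings() are hypothetical and not part of the original setup:

```r
# Hypothetical helper: collect connection settings from environment variables
# instead of writing them into the script. The variable names are assumptions.
get_db_settings <- function() {
  list(
    user     = Sys.getenv("MYSQL_USER", unset = "root"),
    password = Sys.getenv("MYSQL_PASSWORD", unset = ""),
    dbname   = "movies_db",
    host     = "localhost"
  )
}

settings <- get_db_settings()

# Connect only when a password is set and RMySQL is installed,
# so the sketch can also be run on a machine without a database.
if (nzchar(settings$password) && requireNamespace("RMySQL", quietly = TRUE)) {
  mydb <- do.call(DBI::dbConnect, c(list(RMySQL::MySQL()), settings))
}
```

The variables can be set in the shell or in a user-level .Renviron file, which keeps the secret out of the source and out of version control.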

Run the query below and store the results in a dataframe called movies.df

movies.df <- dbGetQuery(mydb, "select title, gender_person,rating from movies_observ")
names(movies.df) 
## [1] "title"         "gender_person" "rating"
summary(movies.df)
##     title           gender_person          rating     
##  Length:12          Length:12          Min.   :2.000  
##  Class :character   Class :character   1st Qu.:2.750  
##  Mode  :character   Mode  :character   Median :3.500  
##                                        Mean   :3.417  
##                                        3rd Qu.:4.000  
##                                        Max.   :5.000
print(movies.df)
##                 title gender_person rating
## 1       The Lion King             F      5
## 2       The Lion King             M      3
## 3      A star is Born             F      5
## 4      A star is Born             M      3
## 5  Mission Impossible             M      2
## 6  Mission Impossible             F      4
## 7      Captain Marvel             M      3
## 8      Captain Marvel             F      4
## 9             Aladdin             F      4
## 10            Aladdin             M      4
## 11           Frozen 2             F      2
## 12           Frozen 2             M      2
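
Since the goal is to compare ratings by gender, a per-gender summary is a natural companion to the raw listing. The sketch below rebuilds the 12 rows printed above by hand so that it runs without a MySQL connection; the name rating_by_gender is introduced here for illustration only:

```r
library(dplyr)

# Rebuild the 12 rows printed above so the example runs without MySQL.
movies.df <- data.frame(
  title = rep(c("The Lion King", "A star is Born", "Mission Impossible",
                "Captain Marvel", "Aladdin", "Frozen 2"), each = 2),
  gender_person = c("F", "M", "F", "M", "M", "F", "M", "F", "F", "M", "F", "M"),
  rating = c(5, 3, 5, 3, 2, 4, 3, 4, 4, 4, 2, 2),
  stringsAsFactors = FALSE
)

# Average rating and number of ratings for each gender.
rating_by_gender <- movies.df %>%
  group_by(gender_person) %>%
  summarise(mean_rating = mean(rating), n_ratings = n())

print(rating_by_gender)
```

With these 12 rows the mean rating is 4.0 for F and about 2.83 for M, each based on six ratings.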
qplot(title, rating, data = movies.df, xlab = "Movie", ylab = "Rating", main = "Individual Movie Rating by Gender") + facet_wrap(~gender_person) + theme(axis.text.x = element_text(angle = 90, hjust = 1))

ggplot(movies.df, aes(x = reorder(title, rating), y = rating, fill = title)) + geom_bar(stat = "identity") + 
  ggtitle("Movie Cumulative Ratings") +  labs(x = "Movie", y = "Rating") +  coord_flip()
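
Because geom_bar(stat = "identity") stacks the two rows for each title, each bar's length is the sum of the female and male rating. The sketch below computes those totals directly, which makes it easy to check the chart against numbers. It rebuilds the 12 rows from the printed output so it runs on its own; the column names rating_F, rating_M, and total are introduced here for illustration:

```r
library(dplyr)
library(tidyr)

# Rebuild the 12 rows from the printed output so the example runs without MySQL.
movies.df <- data.frame(
  title = rep(c("The Lion King", "A star is Born", "Mission Impossible",
                "Captain Marvel", "Aladdin", "Frozen 2"), each = 2),
  gender_person = c("F", "M", "F", "M", "M", "F", "M", "F", "F", "M", "F", "M"),
  rating = c(5, 3, 5, 3, 2, 4, 3, 4, 4, 4, 2, 2),
  stringsAsFactors = FALSE
)

# One row per title, with the F and M ratings side by side and their sum.
ratings_wide <- movies.df %>%
  pivot_wider(names_from = gender_person, values_from = rating,
              names_prefix = "rating_") %>%
  mutate(total = rating_F + rating_M) %>%
  arrange(desc(total))

print(ratings_wide)
```

The wide layout also shows at a glance where the two groups disagree, for example a rating of 5 from F against 3 from M for The Lion King.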

dbDisconnect(mydb)
## [1] TRUE