Introduction:

This rpub contains data for several movies with 5 user reviews for each. All users are rating the same movies.

This R markdown uses Azure MySQL to retrieve data. RPubs to Azure MySQL Connection The MySQL Database Schema looks as follows: There is a movies table, a users table, and a movieratings table. There is a many to many relationship between users and movieratings.

### Install Required Packages
install.packages("DBI", repos = "http://cran.us.r-project.org")
install.packages("tidyverse",repos = "http://cran.us.r-project.org")
install.packages("magrittr", repos = "http://cran.us.r-project.org")
install.packages("dplyr", repos = "http://cran.us.r-project.org")

### Load Required Packages
library(DBI)
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.3.0
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
library(dplyr)

In this section we will make a connection to a Azure MySQL hosting the schema referenced above

### Create SQL Connection
con <- dbConnect(RMariaDB::MariaDB(), user='rreadonly@cunyspsmysql', password="", dbname='moviereviews' ,host='cunyspsmysql.mysql.database.azure.com')

### Get a list of all SQL tables
dbListTables(con)
## [1] "movierating"       "movies"            "users"            
## [4] "users_movierating"

In this section we will pull the revelent information into a dataframe.

## Gather all important fields from the SQL DB for analysis
mysql_movies <- dbGetQuery(con, 'select u.username, u.firstname, u.lastname, m.title, m.category, m.length, mr.rating, mr.review from users as u
join users_movierating as umr on u.userid = umr.userid
                           join movierating as mr on umr.movieratingid = mr.movieratingid
                           join movies as m on mr.movieid = m.movieid;')

## Let's look at the first 10 rows
head(mysql_movies, n= 10)
##    username firstname lastname                       title  category
## 1  RussellS   Russell        S Mission Impossible: Fallout    action
## 2  RussellS   Russell        S      Avengers: Infinity War    action
## 3  RussellS   Russell        S             The Quiet Place    horror
## 4  RussellS   Russell        S                  DeadPool 2    comedy
## 5  RussellS   Russell        S               Incredibles 2 animation
## 6  RussellS   Russell        S             The Equalizer 2     crime
## 7     JoelH      Joel        H Mission Impossible: Fallout    action
## 8     JoelH      Joel        H      Avengers: Infinity War    action
## 9     JoelH      Joel        H             The Quiet Place    horror
## 10    JoelH      Joel        H                  DeadPool 2    comedy
##    length rating
## 1     148      5
## 2     149      2
## 3      90      5
## 4     119      5
## 5     118      5
## 6     121      3
## 7     148      4
## 8     149      5
## 9      90      4
## 10    119      3
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              review
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                       The best thrilling movie ever made! With the best movie character ever portrayed: Tom Cruise as Ethan Hunt!
## 2                                                                                                                                                                                                                                                                                                                                                              At stake, as ever, is the fate of the universe. Why must it always be the universe? What's wrong with the fate of Hackensack? Doesn't anyone care what happens to South Dakota, or Denmark, or Peru?
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                          This is a movie about the sound of fear, but it gives us a great deal more to listen to.
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Deadpool 2 is an R-rated, potty-mouthed splatterfest and a funny one.
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       How great to see them back in this funny, exciting, if less audacious film.
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Once it's over, you won't necessarily be itching for an Equalizer 3.
## 7  While Mission Impossible will certainly require rewatching a couple of times to fully fathom the convolutions of the plot, its nonetheless one very well constructed and directed peice of movie making. The tension never lets up and it kept me intrigued right through to the final action set piece that's so ludicrous I didn't know whether to laugh, cheer or throw something at the screen. The accompanying score goes a long way to making it more acceptable and to be fair this is still my favourite movie in the franchise and I do like them all.
## 8                                                                                                                                                                                                                                                                                                                           Never has the Marvel Cinematic Universe seemed like more of a universe, in ways both good and bad. Infinity War-the title is almost too apt-is far from a perfect movie, but it is probably close to the best movie it could have been.
## 9                                                                                                                                                                                                                                                                                                                                                                                                                                                      A Quiet Place is an undoubtedly taxing affair for the nerves; fortunately, it's also a deeply affecting one.
## 10                                                                                                                                                                                                                                                                                                                                                  For all the impulsive flamboyance of Deadpool's patter, the liberating power of personal virtue, and the disinhibiting promise of second chances, Deadpool 2 feels narrowly impersonal and oppressively unfree.
## Check the dimensions of the data
dim(mysql_movies)
## [1] 30  8
## Let's view our table headers
names(mysql_movies)
## [1] "username"  "firstname" "lastname"  "title"     "category"  "length"   
## [7] "rating"    "review"

I want to group and create graphs based on the rating and create a chart. For this section, I will use dplyr

## Group the titles by name and average the scores  using dplyr.
avg_movie_rating <- mysql_movies %>% group_by(title) %>% summarise(AvgRating=mean(rating))

Graph: Average Movie Review for each title

## Graph movie titles vs the average score. 
gg <- ggplot(data = avg_movie_rating) + geom_point(mapping =aes(x= title, y=AvgRating, color = AvgRating))  
gg <- gg + theme(axis.text.x = element_text(angle = 90), axis.line = element_line(color = "blue", size = 1, linetype = "solid"))  + xlab("Movie Title") + ylab("Average User Review")
gg + ggtitle("Average Movie Score Review")

##

Now, let’s create a boxplot to determine how movie ratings compared for each category

## Group Movies by category and average score using dplyr
avg_movie_rating <- mysql_movies %>% group_by(title) %>% summarise(AvgRating=mean(rating))
gg2 <- ggplot(data = mysql_movies, aes(x =category, y=rating)) + geom_boxplot(fill='#A4A4A4', color='blue')
gg2 <- gg2 + theme(axis.text.x = element_text(angle = 0))  + xlab("Movie Category") + ylab("User Rating")
gg2 + ggtitle("User Rating per movie category")

Summary

Based on the sample of movies selected, it appears that animation and horror had a good year. We would need a much larger sample to determine which movies were the most popular.

dbDisconnect(con)