In this week’s assignment, I had chosen 10 recent movies and had asked my family and friends to give a 1 to 5 rating. The information is saved in an SQL that has been uploaded to GitHub, which can be downloaded.
https://raw.githubusercontent.com/jcp9010/MSDA/master/movies.sql
Below is connecting the SQL database to R via package RMySQL.
library(RMySQL)
## Loading required package: DBI
#Creating a connection with the MySQL Database and showing available tables
#To access my MySQL, there is an and password associated on my local machine.
#For reproducibility, the user = "root", password = "MSDA" may need to be replaced with your user and password or deleted from your device, depending on the way your MySQL was set up.
con <- dbConnect(RMySQL::MySQL(), dbname = "movies", user = "root", password = "MSDA", host = "localhost")
dbListTables(con)
## [1] "complete_ratings" "movies" "ratings"
## [4] "reviewers" "sample"
# Assigning data.frame variables in R from MySQL
reviewers.query <- "SELECT * FROM reviewers;"
movies.query <- "SELECT * FROM movies;"
ratings.query <- "SELECT * FROM ratings;"
complete_ratings.query <- "SELECT * FROM complete_ratings;"
reviewers <- dbGetQuery(con, reviewers.query)
movies <- dbGetQuery(con, movies.query)
ratings <- dbGetQuery(con, ratings.query)
# In this case, I will demonstrate another way of obtaining a query from MySQL into R using the dbSendQuery.
rs <- dbSendQuery(con, complete_ratings.query)
complete_ratings <- dbFetch(rs)
dbHasCompleted(rs)
## [1] TRUE
dbClearResult(rs)
## [1] TRUE
# Now to close out the con connection.
dbDisconnect(con)
## [1] TRUE
complete_ratings
## name movie_name review
## 1 S.D Star Wars: The Force Awakens 5
## 2 S.D Mad Max: Fury Road 4
## 3 S.D Interstellar 5
## 4 S.D The Wolf of Wall Street NA
## 5 S.D Rogue One: A Star Wars Story NA
## 6 S.D The Accountant 5
## 7 S.D Keeping Up with the Joneses NA
## 8 S.D Ben-Hur 3
## 9 S.D Casion Royale 4
## 10 S.D Bridge of Spies 5
## 11 A.D Star Wars: The Force Awakens 5
## 12 A.D Mad Max: Fury Road NA
## 13 A.D Interstellar NA
## 14 A.D The Wolf of Wall Street 3
## 15 A.D Rogue One: A Star Wars Story NA
## 16 A.D The Accountant 4
## 17 A.D Keeping Up with the Joneses 2
## 18 A.D Ben-Hur 3
## 19 A.D Casion Royale 4
## 20 A.D Bridge of Spies 4
## 21 J.P Star Wars: The Force Awakens 5
## 22 J.P Mad Max: Fury Road 2
## 23 J.P Interstellar 5
## 24 J.P The Wolf of Wall Street 2
## 25 J.P Rogue One: A Star Wars Story 5
## 26 J.P The Accountant 4
## 27 J.P Keeping Up with the Joneses 3
## 28 J.P Ben-Hur 4
## 29 J.P Casion Royale 4
## 30 J.P Bridge of Spies 5
## 31 P.P Star Wars: The Force Awakens NA
## 32 P.P Mad Max: Fury Road 2
## 33 P.P Interstellar 3
## 34 P.P The Wolf of Wall Street NA
## 35 P.P Rogue One: A Star Wars Story 5
## 36 P.P The Accountant 4
## 37 P.P Keeping Up with the Joneses 4
## 38 P.P Ben-Hur 2
## 39 P.P Casion Royale 3
## 40 P.P Bridge of Spies 5
## 41 C.G. Star Wars: The Force Awakens 4
## 42 C.G. Mad Max: Fury Road 5
## 43 C.G. Interstellar 4
## 44 C.G. The Wolf of Wall Street 5
## 45 C.G. Rogue One: A Star Wars Story NA
## 46 C.G. The Accountant NA
## 47 C.G. Keeping Up with the Joneses NA
## 48 C.G. Ben-Hur 3
## 49 C.G. Casion Royale 4
## 50 C.G. Bridge of Spies 4
## 51 C.H. Star Wars: The Force Awakens 4
## 52 C.H. Mad Max: Fury Road NA
## 53 C.H. Interstellar 4
## 54 C.H. The Wolf of Wall Street 1
## 55 C.H. Rogue One: A Star Wars Story 4
## 56 C.H. The Accountant 3
## 57 C.H. Keeping Up with the Joneses 4
## 58 C.H. Ben-Hur NA
## 59 C.H. Casion Royale 5
## 60 C.H. Bridge of Spies 5
## 61 Craig Star Wars: The Force Awakens 4
## 62 Craig Mad Max: Fury Road NA
## 63 Craig Interstellar 4
## 64 Craig The Wolf of Wall Street 1
## 65 Craig Rogue One: A Star Wars Story 4
## 66 Craig The Accountant 3
## 67 Craig Keeping Up with the Joneses 4
## 68 Craig Ben-Hur NA
## 69 Craig Casion Royale 5
## 70 Craig Bridge of Spies 5
## 71 Jason Star Wars: The Force Awakens 5
## 72 Jason Mad Max: Fury Road 4
## 73 Jason Interstellar 4
## 74 Jason The Wolf of Wall Street 5
## 75 Jason Rogue One: A Star Wars Story 5
## 76 Jason The Accountant 3
## 77 Jason Keeping Up with the Joneses 4
## 78 Jason Ben-Hur 4
## 79 Jason Casion Royale 5
## 80 Jason Bridge of Spies 5
With the available information, this is an opportunity to perform some analysis of the information that is provided.
Below is some information to help characterize the age of the reviewers.
summary(reviewers$age)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 26.00 28.00 30.50 37.25 41.75 60.00
hist(reviewers$age, xlab = "Age", main = "Histogram of Reviewer's Age", breaks = 20)
Below are the number of movie genres for the movies reviewed.
genres <- table(movies$genre)
genres
##
## Action Comedy Crime Drama Sci-Fi
## 4 1 1 1 3
The next step is to find the average rating of each movie.
mov <- table(complete_ratings$movie_name, complete_ratings$review)
rating.scale <- c(1,2,3,4,5)
movie.sum <- vector()
for (i in 1:nrow(mov)){
total <- 0
j <- 1
for (j in 1:5){
total <- total + mov[i,j]*rating.scale[j]
j <- j + 1
}
movie.sum <- c(movie.sum, total)
}
movie.total.count <- vector()
for (i in 1:nrow(mov)){
movie.total.count[i] <- sum(mov[i,])
}
movie.avg <- data.frame(Movie = unique(complete_ratings$movie_name), Ratings = movie.sum/movie.total.count)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.2
pl <- ggplot(movie.avg, aes(x= Movie, y= Ratings))
pl2 <- pl + geom_point(aes(color = 'blue', alpha = 0.5, size = 5)) + ggtitle("Average Ratings for the Movies")
pl3 <- pl2 + theme(axis.text.x = element_text(angle = 90, hjust = 1))
pl3
Above were some examples how databases can be imported from MySQL, and the data can be manipulated and analyzed in R.