To load the SQL data into R, the dbConnect and dbGetQuery functions are used from the RMySQL package.
A variable, con, set as the DBIConnection object, is returned by dbConnect(). The SQL query language is introduced within the dbGetQuery function.
dbname <- "videos"
dbuser <- "root"
dbpass <- "elmstreet101"
dbhost <- "localhost"
dbport <- 3306
con <- dbConnect(RMySQL::MySQL(), host=dbhost, port=dbport, dbname=dbname, user=dbuser, password=dbpass)
movie_rate <- dbGetQuery(con, "select critic_ID, critic_name, 1_WW as WonderWoman, 2_BR as BladeRunner2049, 3_PA as War4PlanetApes, 4_AM as AmericanMade, 5_LJ as LastJedi, 6_IT as IT from critic_rate")
head(movie_rate)
Numeric summary is provided here:
movie_rate2 <- movie_rate
summary(movie_rate2)
## critic_ID critic_name WonderWoman BladeRunner2049
## Min. :1 Length:5 Min. :2 Min. :2.0
## 1st Qu.:2 Class :character 1st Qu.:4 1st Qu.:3.0
## Median :3 Mode :character Median :4 Median :4.0
## Mean :3 Mean :4 Mean :3.6
## 3rd Qu.:4 3rd Qu.:5 3rd Qu.:4.0
## Max. :5 Max. :5 Max. :5.0
## War4PlanetApes AmericanMade LastJedi IT
## Min. :2 Min. :3 Min. :2.0 Min. :2.0
## 1st Qu.:2 1st Qu.:3 1st Qu.:2.0 1st Qu.:2.0
## Median :3 Median :4 Median :3.0 Median :2.0
## Mean :3 Mean :4 Mean :2.8 Mean :2.8
## 3rd Qu.:4 3rd Qu.:5 3rd Qu.:3.0 3rd Qu.:3.0
## Max. :4 Max. :5 Max. :4.0 Max. :5.0
A horizontal bar graph shows the frequency of the ratings for each movie as follows:
bwidth <- 0.5
df <- data.frame(WW = movie_rate2$WonderWoman, BR = movie_rate2$BladeRunner2049, PA = movie_rate2$War4PlanetApes, AM = movie_rate2$AmericanMade, LJ = movie_rate2$LastJedi, IT = movie_rate$IT)
ggplot(melt(df), aes(value, fill = variable)) + coord_flip() + geom_histogram(binwidth=bwidth, position = "dodge") + xlab("Movie Ratings") + ylab("Freq") + ggtitle("Movies of 2017")
## No id variables; using all as measure variables
dbDisconnect(con)
## [1] TRUE
Disconnecting from MySQL.