Assignment on RPubs
Rmd on Github
The database created for this assignment is called movies. A mysqldump of the database was taken from a MariaDB Server 5.5.64. However, the dump would be compatible with a MySQL server. To recreate the database, one would only need to create a new database CREATE DATABASE my_new_db (where my_new_db is the name of your desired database) with an account that has privileges to create databases and import the database scheme and data by executing the command mysql -u my_username -p my_new_db < movie.sql (where my_username is a user account that has privileges for my_new_db). This will recreate the database in my_new_db.
The following code calls the RMySQL library and creates a connection to a locally hosted MySQL database. After the connection is created, it loads the data into an R dataframe. If a moviegoer has not seen a movie, I stored the value as NULL on MySQL. When the data is imported into R, the NULL is listed as an NA.
library(RMySQL)
## Loading required package: DBI
db_user <- 'sslee'
db_password <- 'password#2020'
db_name <- 'movies'
db_host <- '192.168.148.128'
db_port <- 3306
mydb <- dbConnect(MySQL(), user = db_user, password = db_password, dbname = db_name, host = db_host, port = db_port)
ratingsData <- dbGetQuery(mydb,
"SELECT A.userId, A.review, B.movieTitle
FROM Movie AS B RIGHT JOIN Review AS A
ON A.movieId = B.movieId"
)
ratingsData
library(ggplot2)
qplot(review, movieTitle, data = ratingsData, facets = ~userId)
## Warning: Removed 13 rows containing missing values (geom_point).
ggplot(ratingsData, aes(x=movieTitle, y=review)) +
geom_bar(stat = "identity") +
coord_flip()
## Warning: Removed 13 rows containing missing values (position_stack).
The NA data did throw warnings in functions when analyzing the movie review dataframe. Overall, the individuals I surveyed rarely had a chance or demonstrated interest in watching all of the movies listed. I can only conjecture that the population surveyed did not watch all of the movies because of lack of time, lack of interest, or were not interested in watching movies in theaters.