I will use a straightforward MySQL Connection to create a database called MoviesDatabase. Prior to that, I will remove the database MoviesDatabase if it already exists so that I have a clean database. Upon successful creation of MoviesDatabase, I will create three tables: Viewers, Movie, and Rating. I will use insert commands to load the necessary data onto the tables. Assuming that I have the file, RomerlElizes_Assignment2.sql at the same directory, the file contents are:
# display file contents to screen.
# ref: R Markdown: How do I show file contents. Website: https://stackoverflow.com/questions/29257985/r-markdown-how-do-i-show-file-contents
cat (readLines('RomerlElizes_Assignment2.sql'), sep = '\n')
## -- Romerl Elizes
## -- DATA 607
## -- Week 2 - Part 1 SQL Programming Assignment
##
## -- Part 1 - create and populate database based on given information.
## -- a. - drop and create database MoviesDatabase
## drop database if exists MoviesDatabase;
## create database MoviesDatabase;
## use MoviesDatabase;
##
## -- b. create Viewers table and insert values into table.
## create table Viewers (
## viewerid mediumint,
## name varchar(80),
## primary key (viewerid)
## );
## insert into Viewers (viewerid, name) VALUES (1,'Charles');
## insert into Viewers (viewerid, name) VALUES (2,'Liam');
## insert into Viewers (viewerid, name) VALUES (3,'Emily');
## insert into Viewers (viewerid, name) VALUES (4,'Eric');
## insert into Viewers (viewerid, name) VALUES (5,'Evelyn');
## insert into Viewers (viewerid, name) VALUES (6,'Steven');
##
## -- c. create Movie table and insert values into table.
## create table Movie (
## movieid varchar(2),
## name varchar(40),
## primary key (movieid)
## );
## insert into Movie (movieid, name) VALUES ('SP','Spaceballs');
## insert into Movie (movieid, name) VALUES ('BE','Better Off Dead');
## insert into Movie (movieid, name) VALUES ('LB','Legally Blonde');
## insert into Movie (movieid, name) VALUES ('PA','Police Academy');
## insert into Movie (movieid, name) VALUES ('GB','Ghostbusters');
## insert into Movie (movieid, name) VALUES ('GF','Goodfellas');
##
##
## -- d. create Rating association table and insert values into table.
## create table Rating (
## viewerid mediumint,
## movieid varchar(2),
## rating mediumint
## );
##
## -- d1. - ratings by Chalres
## insert into Rating (viewerid, movieid, rating) VALUES (1,'SP',4);
## insert into Rating (viewerid, movieid, rating) VALUES (1,'BE',3);
## insert into Rating (viewerid, movieid, rating) VALUES (1,'LB',2);
## insert into Rating (viewerid, movieid, rating) VALUES (1,'PA',2);
## insert into Rating (viewerid, movieid, rating) VALUES (1,'GB',1);
## insert into Rating (viewerid, movieid, rating) VALUES (1,'GF',1);
##
## -- d2. - ratings by Liam
## insert into Rating (viewerid, movieid, rating) VALUES (2,'SP',5);
## insert into Rating (viewerid, movieid, rating) VALUES (2,'BE',4);
## insert into Rating (viewerid, movieid, rating) VALUES (2,'LB',3);
## insert into Rating (viewerid, movieid, rating) VALUES (2,'PA',4);
## insert into Rating (viewerid, movieid, rating) VALUES (2,'GB',1);
## insert into Rating (viewerid, movieid, rating) VALUES (2,'GF',2);
##
## -- d3. - ratings by Emily
## insert into Rating (viewerid, movieid, rating) VALUES (3,'SP',2);
## insert into Rating (viewerid, movieid, rating) VALUES (3,'BE',4);
## insert into Rating (viewerid, movieid, rating) VALUES (3,'LB',5);
## insert into Rating (viewerid, movieid, rating) VALUES (3,'PA',3);
## insert into Rating (viewerid, movieid, rating) VALUES (3,'GB',4);
## insert into Rating (viewerid, movieid, rating) VALUES (3,'GF',1);
##
## -- d4. - ratings by Eric
## insert into Rating (viewerid, movieid, rating) VALUES (4,'SP',1);
## insert into Rating (viewerid, movieid, rating) VALUES (4,'BE',4);
## insert into Rating (viewerid, movieid, rating) VALUES (4,'LB',2);
## insert into Rating (viewerid, movieid, rating) VALUES (4,'PA',1);
## insert into Rating (viewerid, movieid, rating) VALUES (4,'GB',5);
## insert into Rating (viewerid, movieid, rating) VALUES (4,'GF',3);
##
## -- d5. - ratings by Evelyn
## insert into Rating (viewerid, movieid, rating) VALUES (5,'SP',3);
## insert into Rating (viewerid, movieid, rating) VALUES (5,'BE',2);
## insert into Rating (viewerid, movieid, rating) VALUES (5,'LB',4);
## insert into Rating (viewerid, movieid, rating) VALUES (5,'PA',1);
## insert into Rating (viewerid, movieid, rating) VALUES (5,'GB',5);
## insert into Rating (viewerid, movieid, rating) VALUES (5,'GF',1);
##
## -- d6. - ratings by Steven
## insert into Rating (viewerid, movieid, rating) VALUES (6,'SP',2);
## insert into Rating (viewerid, movieid, rating) VALUES (6,'BE',5);
## insert into Rating (viewerid, movieid, rating) VALUES (6,'LB',1);
## insert into Rating (viewerid, movieid, rating) VALUES (6,'PA',3);
## insert into Rating (viewerid, movieid, rating) VALUES (6,'GB',4);
## insert into Rating (viewerid, movieid, rating) VALUES (6,'GF',5);
When connecting to the database, I created a new database called MoviesDatabase with a user called school with no password. If you wish to optionally add a username or password, modify the dbConnect method to read: con <- dbConnect(RMySQL::MySQL(), dbname = “MoviesDatabase”, username = “
I used the library RMySQL which was stated in the textbook and to verify that the database connection was working, I used the summary function for the connection which gives me useful information about the table.
# loadgin package
library(RMySQL)
## Warning: package 'RMySQL' was built under R version 3.4.4
## Loading required package: DBI
con <- dbConnect(RMySQL::MySQL(), dbname = "MoviesDatabase", username = "school")
summary(con)
## <MySQLConnection:0,0>
## User: school
## Host: localhost
## Dbname: MoviesDatabase
## Connection type: localhost via TCP/IP
##
## Results:
This section will focus on doing basic queries on each of the tables: Viewers, Movie, and Rating. Each output will be inputted into a data frame and displayed onto the screen.
sql <- "SELECT * FROM Viewers"
res <- dbGetQuery(con, sql)
res
## viewerid name
## 1 1 Charles
## 2 2 Liam
## 3 3 Emily
## 4 4 Eric
## 5 5 Evelyn
## 6 6 Steven
sql <- "SELECT * FROM Movie"
res <- dbGetQuery(con, sql)
res
## movieid name
## 1 BE Better Off Dead
## 2 GB Ghostbusters
## 3 GF Goodfellas
## 4 LB Legally Blonde
## 5 PA Police Academy
## 6 SP Spaceballs
sql <- "SELECT * FROM Rating"
res <- dbGetQuery(con, sql)
res
## viewerid movieid rating
## 1 1 SP 4
## 2 1 BE 3
## 3 1 LB 2
## 4 1 PA 2
## 5 1 GB 1
## 6 1 GF 1
## 7 2 SP 5
## 8 2 BE 4
## 9 2 LB 3
## 10 2 PA 4
## 11 2 GB 1
## 12 2 GF 2
## 13 3 SP 2
## 14 3 BE 4
## 15 3 LB 5
## 16 3 PA 3
## 17 3 GB 4
## 18 3 GF 1
## 19 4 SP 1
## 20 4 BE 4
## 21 4 LB 2
## 22 4 PA 1
## 23 4 GB 5
## 24 4 GF 3
## 25 5 SP 3
## 26 5 BE 2
## 27 5 LB 4
## 28 5 PA 1
## 29 5 GB 5
## 30 5 GF 1
## 31 6 SP 2
## 32 6 BE 5
## 33 6 LB 1
## 34 6 PA 3
## 35 6 GB 4
## 36 6 GF 5
This section will focus on executing more complex queries on the tables: Viewers, Movie, and Rating. Each output will be inputted into a data frame and displayed onto the screen.
I will be using the paste method and vector storage to build the queries. The vector will be used to see the complex queries more clearly.
ref: How can two strings be concatenated? Website: https://stackoverflow.com/questions/7201341/how-can-two-strings-be-concatenated
The query will join two tables, viewer and rating and display the highest average rating for each viewer. I had to experiment with group by and order by in order to get the proper results displayed. Based on the results, it shows that Steven has the highest average movie rating of 3.33 and Charles has the lowest average movie rating of 2.17. It appears Steven generally likes the listed movies while Charles is more critical of the listed movies.
sqlvector <- c ("select v.name as viewer, avg(r.rating) as averagerating",
"from viewers v",
"join rating r on v.viewerid = r.viewerid",
"group by viewer",
"order by averagerating desc" )
sql <- paste(sqlvector, collapse=" ")
res <- dbGetQuery(con, sql)
## Warning in .local(conn, statement, ...): Decimal MySQL column 1 imported as
## numeric
res
## viewer averagerating
## 1 Steven 3.3333
## 2 Liam 3.1667
## 3 Emily 3.1667
## 4 Eric 2.6667
## 5 Evelyn 2.6667
## 6 Charles 2.1667
The query will join two tables, movie and rating and display the highest average rating for each movie. Based on the results, it shows that the movie, Better Off Dead has the highest average movie rating of 3.67 and Goodfellas has the lowest average movie rating of 2.17. It appears that our listed viewers prefer a comedy like Better Off Dead than a Gangster movie like Goodfellas.
sqlvector <- c ("select m.name as movie, avg(r.rating) as averagerating",
"from movie m",
"join rating r on m.movieid = r.movieid",
"group by movie",
"order by averagerating desc" )
sql <- paste(sqlvector, collapse=" ")
res <- dbGetQuery(con, sql)
## Warning in .local(conn, statement, ...): Decimal MySQL column 1 imported as
## numeric
res
## movie averagerating
## 1 Better Off Dead 3.6667
## 2 Ghostbusters 3.3333
## 3 Legally Blonde 2.8333
## 4 Spaceballs 2.8333
## 5 Police Academy 2.3333
## 6 Goodfellas 2.1667
The query will join three tables, viewers, movie, and rating and display each viewer in alphabetical order and the list of movies the watched and their assigned ratings sorted by highest rating. Based on the results, it shows that Charles, Emily, Eric, and Evelyn liked the movies Spaceballs, Legally Blonde, Ghostbusters, and Ghostbusters respectively. Moreover, Liam and Steven did not like the movies Ghostbusters and Legally Blonde respectively.
sqlvector <- c ("select v.name as viewer, m.name as movie, r.rating as rating",
"from viewers v",
"join rating r on v.viewerid = r.viewerid",
"join movie m on m.movieid = r.movieid",
"order by viewer, rating desc" )
sql <- paste(sqlvector, collapse=" ")
res <- dbGetQuery(con, sql)
res
## viewer movie rating
## 1 Charles Spaceballs 4
## 2 Charles Better Off Dead 3
## 3 Charles Legally Blonde 2
## 4 Charles Police Academy 2
## 5 Charles Ghostbusters 1
## 6 Charles Goodfellas 1
## 7 Emily Legally Blonde 5
## 8 Emily Better Off Dead 4
## 9 Emily Ghostbusters 4
## 10 Emily Police Academy 3
## 11 Emily Spaceballs 2
## 12 Emily Goodfellas 1
## 13 Eric Ghostbusters 5
## 14 Eric Better Off Dead 4
## 15 Eric Goodfellas 3
## 16 Eric Legally Blonde 2
## 17 Eric Spaceballs 1
## 18 Eric Police Academy 1
## 19 Evelyn Ghostbusters 5
## 20 Evelyn Legally Blonde 4
## 21 Evelyn Spaceballs 3
## 22 Evelyn Better Off Dead 2
## 23 Evelyn Goodfellas 1
## 24 Evelyn Police Academy 1
## 25 Liam Spaceballs 5
## 26 Liam Police Academy 4
## 27 Liam Better Off Dead 4
## 28 Liam Legally Blonde 3
## 29 Liam Goodfellas 2
## 30 Liam Ghostbusters 1
## 31 Steven Better Off Dead 5
## 32 Steven Goodfellas 5
## 33 Steven Ghostbusters 4
## 34 Steven Police Academy 3
## 35 Steven Spaceballs 2
## 36 Steven Legally Blonde 1
To disconnect from the database connection, the call to dbDisconnect function will be executed.
dbDisconnect(con)
## [1] TRUE