Section 1: Process Steps

Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends) to rate each of these movie that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database. Load the information into an R dataframe.

  1. I setup 4 .csv files with data.
  2. I asked 5 people in my family and then used the FilmRatings.com website to identify some MPAA rating information and movie title names and distributor information and created my tables for the rating data. For some fictional people, I repeated rows in excel and used the Random function on the rating and had few more fictional names in my dataset get a rating value from 1 through 5
  3. I then imported into MYSQL through the Excel Data Export Tool and used the options to create the tables. I created 4 tables.
  4. Below listed tables and the column names
  5. Then I queried from the tables from R after opening the connection and then tried to do some plotting as best as I could wanted to write my dataframe back to mysql database but I was able to create a tablename but was not able to write back the dataframe observations into my table using dbWriteTable as there was an error.

##personrating SELECT personrating.personrating_id, personrating.Title, personrating.RaterName, personrating.Rating FROM dbmovierating.personrating;

##movieinfo SELECT movieinfo.movieinfo_id, movieinfo.Title, movieinfo.MPAARating, movieinfo.Rating, movieinfo.RATINGLONGDESC, movieinfo.REASON, movieinfo.DISTRIBUTOR FROM dbmovierating.movieinfo;

##ratingdef SELECT ratingdef.ratingdef_id, ratingdef.Ratingdef FROM dbmovierating.ratingdef;

##personinfo SELECT personinfo.personinfo_id, personinfo.FirstName, personinfo.Age, personinfo.Sex FROM dbmovierating.personinfo;

con <- DBI::dbConnect(RMySQL::MySQL(), 
                      host = "localhost",
                      username = "cunyuser",
                      dbname="dbmovierating",
                      password = "bar"
                      )
dbListTables(con)
## [1] "movieinfo"    "personinfo"   "personrating" "ratingdef"   
## [5] "table2"       "testtable"
dbDisconnect(con)
## [1] TRUE

Display my database tables and their columns

con <- DBI::dbConnect(RMySQL::MySQL(), 
                      host = "localhost",
                      username = "cunyuser",
                      dbname="dbmovierating",
                      password = "bar"
)
movieinfo_df = dbSendQuery(con, "select * from movieinfo")
dfmovieinfo = fetch(movieinfo_df, n=-1)
head(dfmovieinfo)
##   movieinfo_id                               Title MPAARating
## 1            1                      Aquaman (2018)      PG-13
## 2            2          Alita: Battle Angel (2019)      PG-13
## 3            3                 Wonder Woman (2017)      PG-13
## 4            4                 Wonder Woman (2008)      PG-13
## 5            5     Wonder Woman: Bloodlines (2019)      PG-13
## 6            6 Wonder Woman: Director's Cut (2017)          R
##                        Rating
## 1 Parental Guidance Suggested
## 2 Parental Strongly Cautioned
## 3           General Audiences
## 4 Parental Strongly Cautioned
## 5           General Audiences
## 6                  Restricted
##                                             RATINGLONGDESC
## 1 Some Material may be inappropriate for children under 13
## 2 Some Material may be inappropriate for children under 13
## 3 Some Material may be inappropriate for children under 13
## 4 Some Material may be inappropriate for children under 13
## 5 Some Material may be inappropriate for children under 13
## 6 Under 17 Requires Accompanying Parent or Adult Guardian 
##                                                                              REASON
## 1   Rated PG-13 for sequences of sci-fi violence and action, and for some language.
## 2   Rated PG-13 for sequences of sci-fi violence and action, and for some language.
## 3    Rated PG-13 for sequences of violence and action, and some suggestive content.
## 4                 Rated PG-13 for violence throughout and some suggestive material.
## 5 Rated PG-13 for sequences of fantasy action and violence, and some bloody images.
## 6                                                        Rated R for some violence.
##                        DISTRIBUTOR
## 1            Warner Bros. Pictures
## 2 Twentieth Century Fox Film Corp.
## 3            Warner Bros. Pictures
## 4                Warner Home Video
## 5  Warner Bros. Home Entertainment
## 6  Warner Home Entertainment, Inc.
summary(dfmovieinfo)
##   movieinfo_id      Title            MPAARating           Rating         
##  Min.   : 1.00   Length:26          Length:26          Length:26         
##  1st Qu.: 7.25   Class :character   Class :character   Class :character  
##  Median :13.50   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :13.50                                                           
##  3rd Qu.:19.75                                                           
##  Max.   :26.00                                                           
##  RATINGLONGDESC        REASON          DISTRIBUTOR       
##  Length:26          Length:26          Length:26         
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
## 
prsnrating_df = dbSendQuery(con, "select * from personrating")
dfprsnrating = fetch(prsnrating_df, n=-1)
head(dfprsnrating)
##   personrating_id                         Title RaterName Rating
## 1               1                Aquaman (2018)     Golan      4
## 2               2    Alita: Battle Angel (2019)     Golan      0
## 3               3           Wonder Woman (2017)     Golan      4
## 4               4      Avengers: Endgame (2019)     Golan      0
## 5               5 Avengers: Infinity War (2018)     Golan      3
## 6               6            Toy Story 4 (2019)     Golan      5
summary(dfprsnrating)
##  personrating_id    Title            RaterName             Rating     
##  Min.   : 1.00   Length:98          Length:98          Min.   :0.000  
##  1st Qu.:25.25   Class :character   Class :character   1st Qu.:1.250  
##  Median :49.50   Mode  :character   Mode  :character   Median :4.000  
##  Mean   :49.50                                         Mean   :3.143  
##  3rd Qu.:73.75                                         3rd Qu.:5.000  
##  Max.   :98.00                                         Max.   :5.000
personinfo_df = dbSendQuery(con, "select * from personinfo")
dfpersoninfo = fetch(personinfo_df, n=-1)
head(dfpersoninfo)
##   personinfo_id FirstName Age Sex
## 1             1     Peter   7   M
## 2             2      Liam   4   M
## 3             3     Golan  42   M
## 4             4 Nathaniel  62   M
## 5             5     Simba   3   M
## 6             6       Max  17   M
summary(dfpersoninfo)
##  personinfo_id   FirstName              Age            Sex           
##  Min.   : 1.0   Length:15          Min.   : 3.00   Length:15         
##  1st Qu.: 4.5   Class :character   1st Qu.: 5.00   Class :character  
##  Median : 8.0   Mode  :character   Median :17.00   Mode  :character  
##  Mean   : 8.0                      Mean   :23.47                     
##  3rd Qu.:11.5                      3rd Qu.:34.50                     
##  Max.   :15.0                      Max.   :75.00
prsnrating_df = dbSendQuery(con, "select * from personrating")
dfprsnrating = fetch(prsnrating_df, n=-1)
head(dfprsnrating)
##   personrating_id                         Title RaterName Rating
## 1               1                Aquaman (2018)     Golan      4
## 2               2    Alita: Battle Angel (2019)     Golan      0
## 3               3           Wonder Woman (2017)     Golan      4
## 4               4      Avengers: Endgame (2019)     Golan      0
## 5               5 Avengers: Infinity War (2018)     Golan      3
## 6               6            Toy Story 4 (2019)     Golan      5
summary(dfprsnrating)
##  personrating_id    Title            RaterName             Rating     
##  Min.   : 1.00   Length:98          Length:98          Min.   :0.000  
##  1st Qu.:25.25   Class :character   Class :character   1st Qu.:1.250  
##  Median :49.50   Mode  :character   Mode  :character   Median :4.000  
##  Mean   :49.50                                         Mean   :3.143  
##  3rd Qu.:73.75                                         3rd Qu.:5.000  
##  Max.   :98.00                                         Max.   :5.000
dbDisconnect(con)
## Warning: Closing open result sets
## [1] TRUE

I am using merge to merge data into one table between 3 tables and merge has created some NA values

names(dfpersoninfo)[2]<-"RaterName"
names(dfpersoninfo)
## [1] "personinfo_id" "RaterName"     "Age"           "Sex"
table1 <- merge(x = dfpersoninfo , y = dfprsnrating , by = "RaterName", all = TRUE)
Table2 <- merge(x = table1 , y = dfmovieinfo[,c("Title","MPAARating","DISTRIBUTOR")], by = "Title", all = TRUE)
##head(Table2)
summary(Table2)
##     Title            RaterName         personinfo_id      Age       
##  Length:118         Length:118         Min.   : 1    Min.   : 3.00  
##  Class :character   Class :character   1st Qu.: 4    1st Qu.: 4.00  
##  Mode  :character   Mode  :character   Median : 8    Median :17.00  
##                                        Mean   : 8    Mean   :22.77  
##                                        3rd Qu.:12    3rd Qu.:34.00  
##                                        Max.   :15    Max.   :75.00  
##                                        NA's   :19    NA's   :19     
##      Sex            personrating_id     Rating       MPAARating       
##  Length:118         Min.   : 1.00   Min.   :0.000   Length:118        
##  Class :character   1st Qu.:25.25   1st Qu.:1.250   Class :character  
##  Mode  :character   Median :49.50   Median :4.000   Mode  :character  
##                     Mean   :49.50   Mean   :3.143                     
##                     3rd Qu.:73.75   3rd Qu.:5.000                     
##                     Max.   :98.00   Max.   :5.000                     
##                     NA's   :20      NA's   :20                        
##  DISTRIBUTOR       
##  Length:118        
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
library(knitr)
kable(Table2[1:56, ])
Title RaterName personinfo_id Age Sex personrating_id Rating MPAARating DISTRIBUTOR
Alita: Battle Angel (2019) Gamora 7 21 F 30 5 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Jackson 12 4 M 44 3 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Liam 2 4 M 16 0 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Golan 3 42 M 2 0 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Marina 14 6 F 79 1 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Sumita 11 34 F 37 5 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Togo 15 6 F 65 4 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) John 9 75 F 51 3 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Olivia 13 4 F 72 5 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Nathaniel 4 62 M 23 0 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Max 6 17 M 86 5 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Simba 5 3 M 93 4 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Peter 1 7 M 9 0 PG-13 Twentieth Century Fox Film Corp.
Alita: Battle Angel (2019) Yong Lee 10 32 F 58 3 PG-13 Twentieth Century Fox Film Corp.
Aquaman (2018) Liam 2 4 M 15 4 PG-13 Warner Bros. Pictures
Aquaman (2018) Marina 14 6 F 78 2 PG-13 Warner Bros. Pictures
Aquaman (2018) Simba 5 3 M 92 1 PG-13 Warner Bros. Pictures
Aquaman (2018) Olivia 13 4 F 71 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Jackson 12 4 M 43 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Nathaniel 4 62 M 22 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Togo 15 6 F 64 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Yong Lee 10 32 F 57 3 PG-13 Warner Bros. Pictures
Aquaman (2018) John 9 75 F 50 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Peter 1 7 M 8 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Sumita 11 34 F 36 1 PG-13 Warner Bros. Pictures
Aquaman (2018) Gamora 7 21 F 29 5 PG-13 Warner Bros. Pictures
Aquaman (2018) Max 6 17 M 85 3 PG-13 Warner Bros. Pictures
Aquaman (2018) Golan 3 42 M 1 4 PG-13 Warner Bros. Pictures
Avengers Confidential: Black Widow & Punisher (2014) NA NA NA NA NA NA PG-13 Sony Pictures Worldwide Acquisitions Inc.
Avengers, The (1998) NA NA NA NA NA NA PG-13 Warner Bros., A Division Of Time Warner Entertainment Company
Avengers: Age Of Ultron (2015) NA NA NA NA NA NA PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Nathaniel 4 62 M 25 0 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Golan 3 42 M 4 0 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Yong Lee 10 32 F 60 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Gamora 7 21 F 32 4 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Jackson 12 4 M 46 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Sumita 11 34 F 39 1 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) John 9 75 F 53 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Peter 1 7 M 11 0 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Togo 15 6 F 67 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Olivia 13 4 F 74 3 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Max 6 17 M 88 2 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Simba 5 3 M 95 2 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Marina 14 6 F 81 3 PG-13 Walt Disney Studios Motion Pictures
Avengers: Endgame (2019) Liam 2 4 M 18 0 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Marina 14 6 F 82 4 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Max 6 17 M 89 3 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Olivia 13 4 F 75 4 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Golan 3 42 M 5 3 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) John 9 75 F 54 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Jackson 12 4 M 47 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Togo 15 6 F 68 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Nathaniel 4 62 M 26 4 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Sumita 11 34 F 40 1 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Simba 5 3 M 96 5 PG-13 Walt Disney Studios Motion Pictures
Avengers: Infinity War (2018) Liam 2 4 M 19 0 PG-13 Walt Disney Studios Motion Pictures

Plot Group Mean

library(plyr)
mu <- ddply(Table2, "Age", summarise, grp.mean=mean(Rating))
head(mu)
##   Age grp.mean
## 1   3 3.428571
## 2   4 2.619048
## 3   6 3.428571
## 4   7 2.428571
## 5  17 2.571429
## 6  21 4.571429
plot(mu)

Plot Geom Point

ggplot(data = Table2) + 
  geom_point(mapping = aes(x = Title, y = RaterName)) + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Plot Histogram, Age, RaterName

ggplot(data = Table2) + 
  geom_bar(mapping = aes(x = Age, fill = RaterName))
## Warning: Removed 19 rows containing non-finite values (stat_count).

Plot Histogram, Title, by Sex, Age, RaterName

ggplot(Table2, aes(x=Rating, fill = Title)) + geom_histogram() +
  facet_wrap(Sex ~ Age ~ RaterName )
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 20 rows containing non-finite values (stat_bin).

Plot Histogram, mpaarating,by Sex, Age, RaterName - Few kids have watched R rated movie

ggplot(data = Table2) + 
  geom_bar(mapping = aes(x = Rating, fill = MPAARating), position = "fill") +  facet_wrap(Sex ~ Age ~ RaterName )
## Warning: Removed 20 rows containing non-finite values (stat_count).
## Warning: position_stack requires non-overlapping x intervals

## Warning: position_stack requires non-overlapping x intervals

## Warning: position_stack requires non-overlapping x intervals

##Showing come code that I tried to create to write back to MYSQL from my dataframme but this is not working for me by using dbWriteTable from DBI library. But showing just my work to try to figure this out

library(DBI)

Table3 = Table2[,!(names(Table2) %in% c("personrating_id"))]
con <- DBI::dbConnect(RMySQL::MySQL(), 
                      host = "localhost",
                      username = "cunyuser",
                      dbname="dbmovierating",
                      password = "bar"
)
dbListTables(con)
## [1] "movieinfo"    "personinfo"   "personrating" "ratingdef"   
## [5] "table2"       "testtable"
dbSendQuery(con, "drop table if exists dbmovierating.table2")
## <MySQLResult:NA,2,1>
dbListTables(con)
## [1] "movieinfo"    "personinfo"   "personrating" "ratingdef"   
## [5] "testtable"
dbSendQuery(con,"CREATE TABLE `table2` (
  `personrating_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `Title` varchar(255) DEFAULT NULL,
  `RaterName` varchar(255) DEFAULT NULL,
  `Age` bigint(20) DEFAULT NULL,
  `Sex` varchar(255) DEFAULT NULL,
  `Rating` bigint(20) DEFAULT NULL,
  `MPAARating` varchar(255) DEFAULT NULL,
  `DISTRIBUTOR`varchar(255) DEFAULT NULL,
  PRIMARY KEY (`personrating_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;")
## <MySQLResult:NA,2,3>
dbListTables(con)
## [1] "movieinfo"    "personinfo"   "personrating" "ratingdef"   
## [5] "table2"       "testtable"
# dbWriteTable(con, "dbmovierating.table2", Table3[1:5,], row.name=FALSE, append=TRUE)
# #dbWriteTable(con, "dbmovierating.table2", value=Table2(), append=TRUE, row.names=FALSE)
# testdf = dbSendQuery(con, "select * from dbmovierating.table2")
# testdf = fetch(testdf, n=-1)
# head(testdf)
dbDisconnect(con)
## [1] TRUE