R Markdown

For this assignment, six recent movies were chosen and rated. The ratings were stored in SQL Workbench and then loaded into RStudio.

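# Save the current warning level so it can be restored later (the option is named "warn").
defaultW <- getOption("warn")

# Suppress warnings while the data is loaded.
options(warn = -1)

# Restore the original warning level.
options(warn = defaultW)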
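library(DBI)
# `conn` is assumed to be an open DBI connection to the database; its setup is not shown here.
SQLMovieData <- dbGetQuery(conn, "SELECT * FROM assignment_2_607.`movie rating 2`;")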
SQLMovieData
##     Name Movie 1 Promising Young woman Movie 2 Little Things
## 1   Fahi                             4                     3
## 2    Maz                             3                     4
## 3   Saif                             4                    NA
## 4    Adi                             5                    NA
## 5 Presly                             3                     3
## 6   Noah                            NA                     2
##   Movie 3Wonder woman 1984 Movie 4 Soul Movie 5 Bliss Movie 6 Tenet
## 1                        4            5             3             4
## 2                        2            5            NA             4
## 3                        2            5             2             3
## 4                        3            4             2             4
## 5                        4            4             2             4
## 6                        3            5             2             2
# Replace the missing rating for Movie 1 with the mean of its observed ratings.
movie1 <- "Movie 1 Promising Young woman"
SQLMovieData[[movie1]][is.na(SQLMovieData[[movie1]])] <- mean(SQLMovieData[[movie1]], na.rm = TRUE)
SQLMovieData
##     Name Movie 1 Promising Young woman Movie 2 Little Things
## 1   Fahi                           4.0                     3
## 2    Maz                           3.0                     4
## 3   Saif                           4.0                    NA
## 4    Adi                           5.0                    NA
## 5 Presly                           3.0                     3
## 6   Noah                           3.8                     2
##   Movie 3Wonder woman 1984 Movie 4 Soul Movie 5 Bliss Movie 6 Tenet
## 1                        4            5             3             4
## 2                        2            5            NA             4
## 3                        2            5             2             3
## 4                        3            4             2             4
## 5                        4            4             2             4
## 6                        3            5             2             2

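The missing rating for Movie 1 has been replaced with the mean of that movie's observed ratings (3.8). The missing ratings for Movie 2 and Movie 5 are still NA in the output above.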
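
The same mean imputation could be applied to every movie column in one pass. The following is a minimal sketch, not the code used in this report: it rebuilds the table by hand from the printed output instead of querying the database, shortens the column names, and uses a hypothetical helper called `impute_mean`.

```r
# Ratings copied from the printed table above; column names are shortened (an assumption).
ratings <- data.frame(
  Name                = c("Fahi", "Maz", "Saif", "Adi", "Presly", "Noah"),
  PromisingYoungWoman = c(4, 3, 4, 5, 3, NA),
  LittleThings        = c(3, 4, NA, NA, 3, 2),
  WonderWoman1984     = c(4, 2, 2, 3, 4, 3),
  Soul                = c(5, 5, 5, 4, 4, 5),
  Bliss               = c(3, NA, 2, 2, 2, 2),
  Tenet               = c(4, 4, 3, 4, 4, 2)
)

# Hypothetical helper: replace NA values with the mean of the observed values.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Apply the helper to every column except Name.
movie_cols <- setdiff(names(ratings), "Name")
ratings[movie_cols] <- lapply(ratings[movie_cols], impute_mean)
ratings
```

Mean imputation keeps each column's average unchanged but shrinks its spread, so with only six raters the filled-in values should be read with some caution.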

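library(ggplot2)
# ArrangedData is the long-format version of the ratings, with one row per
# person and movie and the columns Name, Movie and Rating (its construction is not shown).
ggplot(data = ArrangedData, aes(x = Rating, y = Movie, fill = Rating, label = Rating)) +
    geom_bar(stat = "identity") +
    facet_wrap(~Name) +
    ggtitle("Movie Ratings by Family and Friends") +
    theme(axis.text.x = element_blank(), plot.title = element_text(hjust = 0.5), legend.position = "right")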
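
The plot relies on `ArrangedData`, whose construction is not shown in this report. One way such a long-format table could be built from the wide ratings table is sketched below in base R. The shortened column names and the hand-built `ratings_wide` data frame are assumptions made for illustration; the original may well have used a different method, such as `tidyr`.

```r
# Wide table copied from the printed output above; column names are shortened (an assumption).
ratings_wide <- data.frame(
  Name                = c("Fahi", "Maz", "Saif", "Adi", "Presly", "Noah"),
  PromisingYoungWoman = c(4, 3, 4, 5, 3, NA),
  LittleThings        = c(3, 4, NA, NA, 3, 2),
  WonderWoman1984     = c(4, 2, 2, 3, 4, 3),
  Soul                = c(5, 5, 5, 4, 4, 5),
  Bliss               = c(3, NA, 2, 2, 2, 2),
  Tenet               = c(4, 4, 3, 4, 4, 2)
)

movie_cols <- setdiff(names(ratings_wide), "Name")

# Stack the movie columns: unlist() concatenates them column by column,
# so names repeat once per movie and each movie label repeats once per person.
ArrangedData <- data.frame(
  Name   = rep(ratings_wide$Name, times = length(movie_cols)),
  Movie  = rep(movie_cols, each = nrow(ratings_wide)),
  Rating = unlist(ratings_wide[movie_cols], use.names = FALSE)
)

head(ArrangedData)
```

The result has one row per person and movie (36 rows here), which is the shape that `facet_wrap(~Name)` and the `Movie`/`Rating` aesthetics expect.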

Standardization is not needed here because all the ratings are on the same scale. If the ratings had been recorded on different scales, standardization would have been necessary to make them comparable.
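
To illustrate the case where standardization would matter, the sketch below uses made-up ratings (not data from this assignment) for the same five movies on a 5-point and a 100-point scale. The `standardize` helper is hypothetical; it computes ordinary z-scores.

```r
# Hypothetical ratings of the same five movies on two different scales.
five_point    <- c(4, 3, 4, 5, 3)
hundred_point <- c(80, 60, 80, 100, 60)

# Z-score: subtract the mean, then divide by the sample standard deviation.
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

z_five    <- standardize(five_point)
z_hundred <- standardize(hundred_point)

# After standardization both sets of ratings are on a common, unitless scale.
data.frame(five_point, hundred_point, z_five, z_hundred)
```

Because the 100-point ratings in this toy example are the 5-point ratings multiplied by 20, the two z-score columns come out identical, which shows that standardization removes the difference in scale.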