R Markdown
For this assignment, six recent movies were chosen and rated. The ratings were stored in SQL Workbench and then loaded into RStudio.
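library(DBI)                   # provides dbGetQuery(); `conn` is assumed to be an open DBI connection
defaultW <- getOption("warn")  # save the current warning level
options(warn = -1)             # suppress warnings while the query runs
SQLMovieData <- dbGetQuery(conn, "SELECT * FROM assignment_2_607.`movie rating 2`;")
options(warn = defaultW)       # restore the original warning level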
SQLMovieData
##     Name Movie 1 Promising Young woman Movie 2 Little Things
## 1   Fahi                             4                     3
## 2    Maz                             3                     4
## 3   Saif                             4                    NA
## 4    Adi                             5                    NA
## 5 Presly                             3                     3
## 6   Noah                            NA                     2
##   Movie 3Wonder woman 1984 Movie 4 Soul Movie 5 Bliss Movie 6 Tenet
## 1                        4            5             3             4
## 2                        2            5            NA             4
## 3                        2            5             2             3
## 4                        3            4             2             4
## 5                        4            4             2             4
## 6                        3            5             2             2
movie1 <- "Movie 1 Promising Young woman"  # column whose missing rating is imputed
SQLMovieData[[movie1]][is.na(SQLMovieData[[movie1]])] <- mean(SQLMovieData[[movie1]], na.rm = TRUE)
SQLMovieData
##     Name Movie 1 Promising Young woman Movie 2 Little Things
## 1   Fahi                           4.0                     3
## 2    Maz                           3.0                     4
## 3   Saif                           4.0                    NA
## 4    Adi                           5.0                    NA
## 5 Presly                           3.0                     3
## 6   Noah                           3.8                     2
##   Movie 3Wonder woman 1984 Movie 4 Soul Movie 5 Bliss Movie 6 Tenet
## 1                        4            5             3             4
## 2                        2            5            NA             4
## 3                        2            5             2             3
## 4                        3            4             2             4
## 5                        4            4             2             4
## 6                        3            5             2             2
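The missing rating in the Movie 1 column has been replaced with the mean of that movie's observed ratings (3.8). The missing values in the Movie 2 and Movie 5 columns are still NA at this point.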
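The same mean replacement could be applied to several rating columns in one step. The following is only a minimal sketch: `ratings` and `rating_cols` are hypothetical names, and the small data frame simply copies the Movie 2 and Movie 5 values from the output above rather than querying the database.

```r
# Hypothetical stand-in for two columns of SQLMovieData (values copied from the output above)
ratings <- data.frame(
  Name = c("Fahi", "Maz", "Saif", "Adi", "Presly", "Noah"),
  `Movie 2 Little Things` = c(3, 4, NA, NA, 3, 2),
  `Movie 5 Bliss` = c(3, NA, 2, 2, 2, 2),
  check.names = FALSE  # keep the spaces in the column names
)

# Every column except Name is treated as a rating column
rating_cols <- setdiff(names(ratings), "Name")

# Replace each NA with the mean of the observed ratings in the same column
ratings[rating_cols] <- lapply(ratings[rating_cols], function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})

print(ratings)
```

Each column is imputed with its own mean, so a movie's average rating is unchanged by the replacement.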
library(ggplot2)
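# ArrangedData is assumed to hold one row per rater and movie (columns Name, Movie, Rating)
ggplot(data = ArrangedData, aes(x = Rating, y = Movie, fill = Rating, label = Rating)) +
  geom_bar(stat = "identity") +  # bar length is the rating itself
  facet_wrap(~Name) +            # one panel per rater
  ggtitle("Movie Ratings by Family and Friends") +
  theme(axis.text.x = element_blank(), plot.title = element_text(hjust = 0.5), legend.position = "right")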
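The plot reads from `ArrangedData`, whose construction is not shown in this report. One plausible way to build a table with `Name`, `Movie`, and `Rating` columns from a wide data frame is sketched below in base R. This is an assumption about the missing step, not the original code, and `wide` is a hypothetical stand-in holding two raters and two movies copied from the output above.

```r
# Hypothetical stand-in for part of SQLMovieData (values copied from the output above)
wide <- data.frame(
  Name = c("Fahi", "Maz"),
  `Movie 1 Promising Young woman` = c(4, 3),
  `Movie 4 Soul` = c(5, 5),
  check.names = FALSE  # keep the spaces in the column names
)

movie_cols <- setdiff(names(wide), "Name")

# Stack the movie columns so each row is one (rater, movie, rating) triple
ArrangedData <- data.frame(
  Name   = rep(wide$Name, times = length(movie_cols)),
  Movie  = rep(movie_cols, each = nrow(wide)),
  Rating = unlist(wide[movie_cols], use.names = FALSE)
)

print(ArrangedData)
```

With the data in this long layout, `facet_wrap(~Name)` can draw one panel per rater and `Movie` can serve as the category axis.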

Standardization is not needed here because all of the ratings are on the same scale. If the ratings had been recorded on different scales, they would have to be standardized before they could be compared.
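To illustrate the case where standardization would matter, the sketch below converts two sets of ratings to z-scores. `five_point` reuses the observed Movie 1 ratings from above, while `hundred_point` is an invented 0 to 100 scale used purely for illustration. The helper `z` is a hypothetical name.

```r
# Observed Movie 1 ratings from the output above
five_point <- c(4, 3, 4, 5, 3)

# Invented ratings on a 0 to 100 scale (illustration only, not from the data)
hundred_point <- c(80, 55, 70, 95, 60)

# z-score: subtract the mean, then divide by the sample standard deviation
z <- function(x) (x - mean(x)) / sd(x)

z_five <- z(five_point)
z_hundred <- z(hundred_point)

# After standardization both vectors have mean 0 and standard deviation 1
print(round(z_five, 2))
print(round(z_hundred, 2))
```

After this transformation the two sets of scores are expressed in the same units (standard deviations from their own mean), so they can be compared directly.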