Reading the movie rating data from Survey into Mysql db and then into R
Overview of the solution
Mysql database has two tables Movies and Survey
Movies
-movie id (primary key)
-movie name
Survey
-name of the participant
-age group
-gender
-movie id (Reference key)
-rating
Survey table has entry for each rating the participant voted for.
In order to handle the missing data, we left join movies table with survey so that we have all entries. We apply omit so that we donโt include null values.
We plot the result on a graph with movie and mean of rating. Highest mean movie is the favorite among the participants and lowest being less favourite.
library(RMariaDB)
library(ggplot2)
survey <- dbConnect(RMariaDB::MariaDB(), user='root', password='My$ql@123', dbname='survey', host='localhost')
## Left join movies table in order to get the missing data
survey <- dbGetQuery(survey, "select s.participant,s.age_grp,s.age_grp,s.rating, m.name Movie from movies m
left join survey s
on m.id=s.movie_id
order by 5")
## Handling of missing data in the movie rating by using omit
survey <- na.omit(survey)
## View the data having the complete rating data
View(survey)
## Plot the data on a graph with movies and mean(rating)
ggplot(survey) + geom_bar(aes(Movie, rating, fill=Movie), stat="summary", fun="mean") + labs(x="Movies", y="Avg.Ratings", title="") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Conclusion
print("Based on the rating from my friends for the top 5 movies released in 2019, PARASITE movie has the highest rating and Marriage story has the lowest rating")
## [1] "Based on the rating from my friends for the top 5 movies released in 2019, PARASITE movie has the highest rating and Marriage story has the lowest rating"