Introduction

Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends if necessary) to rate each of these movies that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database of your choosing. Load the information from the SQL database into an R dataframe.

install and import libary we need

library(RMySQL)
## Loading required package: DBI
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(dbplyr)
## 
## Attaching package: 'dbplyr'
## The following objects are masked from 'package:dplyr':
## 
##     ident, sql

Connect to MySQL database running on the cloud

con <- DBI::dbConnect(RMySQL::MySQL(), dbname = "bk6qen7oifdtyy7vpeye", user="ulf4pfcs31h79txp", password="FcoDCm1FgFK1krECMfaN",host="bk6qen7oifdtyy7vpeye-mysql.services.clever-cloud.com")

#get info on the database
src_dbi(con)
## src:  mysql 8.0.15-5 [ulf4pfcs31h79txp@bk6qen7oifdtyy7vpeye-mysql.services.clever-cloud.com:/bk6qen7oifdtyy7vpeye]
## tbls: movie_ratings, movies, person

Three table created for normalization

#movies table
movies <- tbl(con, sql("SELECT movie_id, movie_title, movie_year FROM movies"))
movies
## # Source:   SQL [?? x 3]
## # Database: mysql 8.0.15-5
## #   [ulf4pfcs31h79txp@bk6qen7oifdtyy7vpeye-mysql.services.clever-cloud.com:/bk6qen7oifdtyy7vpeye]
##   movie_id movie_title               movie_year
##      <int> <chr>                          <int>
## 1        1 Spider-Man: Far From Home       2019
## 2        2 Avengers: Infinity War          2018
## 3        3 Avengers: Infinity War          2019
## 4        4 Ant-Man and the Wasp            2018
## 5        5 Captain Marvel                  2019
## 6        6 Black Panther                   2018
#ratings table
ratings <- tbl(con, sql("SELECT * FROM movie_ratings"))
person  <- tbl(con, sql("SELECT * FROM person"))
person
## # Source:   SQL [?? x 3]
## # Database: mysql 8.0.15-5
## #   [ulf4pfcs31h79txp@bk6qen7oifdtyy7vpeye-mysql.services.clever-cloud.com:/bk6qen7oifdtyy7vpeye]
##   person_id person_name   person_role
##       <int> <chr>         <chr>      
## 1         1 Jacob HU      Friend     
## 2         2 Haoying Zhang Classmate  
## 3         3 Linda         Cousin     
## 4         4 Minqin Chen   Friend     
## 5         5 Feiyu Chen    Collegue
movies_ratings <- tbl(con, sql("select movie_title,person_name,rating from movie_ratings,movies,person where movie_ratings.movie_id= movies.movie_id
and movie_ratings.person_id=person.person_id order by movie_ratings.person_id,movie_ratings.movie_id"))
movies_ratings_df <- as.data.frame(movies_ratings)
movies_ratings_df
##                  movie_title   person_name rating
## 1  Spider-Man: Far From Home      Jacob HU      5
## 2     Avengers: Infinity War      Jacob HU      5
## 3     Avengers: Infinity War      Jacob HU      5
## 4       Ant-Man and the Wasp      Jacob HU      3
## 5             Captain Marvel      Jacob HU      2
## 6              Black Panther      Jacob HU      1
## 7  Spider-Man: Far From Home Haoying Zhang      4
## 8     Avengers: Infinity War Haoying Zhang      5
## 9     Avengers: Infinity War Haoying Zhang      4
## 10      Ant-Man and the Wasp Haoying Zhang      3
## 11            Captain Marvel Haoying Zhang      3
## 12             Black Panther Haoying Zhang      2
## 13 Spider-Man: Far From Home         Linda      4
## 14    Avengers: Infinity War         Linda      4
## 15    Avengers: Infinity War         Linda      5
## 16      Ant-Man and the Wasp         Linda      3
## 17            Captain Marvel         Linda      2
## 18             Black Panther         Linda      3
## 19 Spider-Man: Far From Home   Minqin Chen      4
## 20    Avengers: Infinity War   Minqin Chen      5
## 21    Avengers: Infinity War   Minqin Chen      4
## 22      Ant-Man and the Wasp   Minqin Chen      0
## 23            Captain Marvel   Minqin Chen      2
## 24             Black Panther   Minqin Chen      0
## 25 Spider-Man: Far From Home    Feiyu Chen      5
## 26    Avengers: Infinity War    Feiyu Chen      4
## 27    Avengers: Infinity War    Feiyu Chen      0
## 28      Ant-Man and the Wasp    Feiyu Chen      3
## 29            Captain Marvel    Feiyu Chen      0
## 30             Black Panther    Feiyu Chen      2

Find the average rating by filter out those rating of zero

movies_avg <-tbl(con, sql("select movie_title as Movie,avg(rating) as Average from movie_ratings,movies,person where movie_ratings.movie_id= movies.movie_id
and movie_ratings.person_id=person.person_id and movie_ratings.rating >0 group by movie_title"))
## Warning in .local(conn, statement, ...): Decimal MySQL column 1 imported as
## numeric
movies_avg_df<-as.data.frame(movies_avg)
## Warning in .local(conn, statement, ...): Decimal MySQL column 1 imported as
## numeric
movies_avg_df
##                       Movie Average
## 1 Spider-Man: Far From Home  4.4000
## 2    Avengers: Infinity War  4.5556
## 3      Ant-Man and the Wasp  3.0000
## 4            Captain Marvel  2.2500
## 5             Black Panther  2.0000

draw a bar graph to visualize

library(ggplot2)
# Basic barplot
p<-ggplot(data=movies_avg_df, aes(x=Movie, y=Average)) +
 geom_bar(stat="identity")
p

Inconclusion

As show above, Avengers: Infinity War is most popular marvel films