Week_2_Assignment

Movies Ratings

Collecting Data

To collect the ratings of 6 of the recent popular movies, I created a Google Form and I sent the link to REAL friends. I also rated the movies myself; I watched none of the movies I included in the survey, yet I picked them because I am interested in watching them when I have time. Here is the link to the Google form: https://forms.gle/9PSCwsAJcQDfvAGPA

Connecting pgadmin4 with Rstudio:

Loading Packages:

library(RPostgres)
drv <- dbDriver("Postgres")
con <-dbCanConnect(drv = RPostgres::Postgres(),
                   dbname = "Movies Ratings",
                   port = 5432,
                   user = "postgres",
                   password = "MOHAMEDChawki2018")
con
[1] TRUE
con_1 <-dbConnect(drv = RPostgres::Postgres(),
                   dbname = "Movies Ratings",
                   port = 5432,
                   user = "postgres",
                   password = "MOHAMEDChawki2018")
con_1
<PqConnection> Movies Ratings@/tmp:5432
dbListTables(con_1)
[1] "moviesratings"     "IMDBmoviesRatings"
dbGetQuery(con_1, "SELECT * FROM moviesratings")
        Name 1. Avengers: Endgame: 2. The Little Mermaid 3. Booksmart
1       Hiba                    NA                     5           NA
2        Ali                    NA                    NA           NA
3       Omar                    NA                    NA           NA
4       Tony                    NA                    NA           NA
5    Youssef                     4                    NA            2
6       Adil                     3                    NA            5
7        Jor                     3                    NA           NA
8      Sarah                    NA                    NA           NA
9      Lilly                     4                    NA            5
10    Lauren                     5                    NA           NA
11   Jeannie                     5                    NA           NA
12  Jonathan                     5                    NA           NA
13    Saloua                    NA                    NA           NA
14      Sara                    NA                    NA           NA
15 Christine                     5                     4           NA
16     Nancy                     4                    NA           NA
17   Stephen                     2                     1            3
18    Howard                     4                    NA            3
   4. Talk to Me 5. Interstellar 6. The Irishman
1             NA              NA              NA
2              4               5              NA
3             NA              NA               5
4             NA               3               5
5             NA              NA               5
6             NA               5              NA
7             NA               3              NA
8              3              NA               4
9             NA               2              NA
10            NA               3               3
11            NA               5               3
12            NA               5              NA
13            NA              NA              NA
14            NA              NA              NA
15            NA               5              NA
16            NA              NA              NA
17            NA              NA              NA
18             4              NA              NA
ratings_df <- dbGetQuery(con_1, "SELECT * FROM moviesratings")
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggthemes)
library(gt)
library(ggplot2)
echo = FALSE
gt(ratings_df)
Name 1. Avengers: Endgame: 2. The Little Mermaid 3. Booksmart 4. Talk to Me 5. Interstellar 6. The Irishman
Hiba NA 5 NA NA NA NA
Ali NA NA NA 4 5 NA
Omar NA NA NA NA NA 5
Tony NA NA NA NA 3 5
Youssef 4 NA 2 NA NA 5
Adil 3 NA 5 NA 5 NA
Jor 3 NA NA NA 3 NA
Sarah NA NA NA 3 NA 4
Lilly 4 NA 5 NA 2 NA
Lauren 5 NA NA NA 3 3
Jeannie 5 NA NA NA 5 3
Jonathan 5 NA NA NA 5 NA
Saloua NA NA NA NA NA NA
Sara NA NA NA NA NA NA
Christine 5 4 NA NA 5 NA
Nancy 4 NA NA NA NA NA
Stephen 2 1 3 NA NA NA
Howard 4 NA 3 4 NA NA
ratings_df %>%
  dplyr::select(1:7) %>%
  dplyr::slice(1:18) %>% 
  gt(rowname_col = "Names of Respondents") %>%
  tab_spanner(label = "Movies Rated", columns = matches("1.|2.|3.|4.|5.|6.")) %>%
  tab_header(
    title = md("*Six Popular Movies Ratings*")
  ) %>%
  opt_stylize(style = 6) 
Six Popular Movies Ratings
Name Movies Rated
1. Avengers: Endgame: 2. The Little Mermaid 3. Booksmart 4. Talk to Me 5. Interstellar 6. The Irishman
Hiba NA 5 NA NA NA NA
Ali NA NA NA 4 5 NA
Omar NA NA NA NA NA 5
Tony NA NA NA NA 3 5
Youssef 4 NA 2 NA NA 5
Adil 3 NA 5 NA 5 NA
Jor 3 NA NA NA 3 NA
Sarah NA NA NA 3 NA 4
Lilly 4 NA 5 NA 2 NA
Lauren 5 NA NA NA 3 3
Jeannie 5 NA NA NA 5 3
Jonathan 5 NA NA NA 5 NA
Saloua NA NA NA NA NA NA
Sara NA NA NA NA NA NA
Christine 5 4 NA NA 5 NA
Nancy 4 NA NA NA NA NA
Stephen 2 1 3 NA NA NA
Howard 4 NA 3 4 NA NA
mean1 <- round(mean(ratings_df$`1. Avengers: Endgame:`, na.rm = TRUE),2)
mean2 <- round(mean(ratings_df$`2. The Little Mermaid`, na.rm = TRUE),2)
mean3 <- round(mean(ratings_df$`3. Booksmart`, na.rm = TRUE),2)
mean4 <-round(mean(ratings_df$`4. Talk to Me`, na.rm=TRUE),2)
mean5 <- round(mean(ratings_df$`5. Interstellar`, na.rm=TRUE),2)
mean6 <- round(mean(ratings_df$`6. The Irishman`, na.rm = TRUE),2)
Average_ratings = data.frame(Movie_Name=c('Avengers: Endgame','The Little Mermaid', 'Booksmart', 'Talk To Me', 'Interstellar', 'The Irishman'),
  Avg_ratings=c(mean1, mean2, mean3, mean4, mean5,mean6))
print(Average_ratings)
          Movie_Name Avg_ratings
1  Avengers: Endgame        4.00
2 The Little Mermaid        3.33
3          Booksmart        3.60
4         Talk To Me        3.67
5       Interstellar        4.00
6       The Irishman        4.17

Visualizing the data:

ggplot(data = Average_ratings, ) + geom_bar(mapping = aes(x = Movie_Name, y = Avg_ratings ), stat = "identity")

ratings_df2 <- data.frame(t(ratings_df))
print(ratings_df2)
                        X1   X2   X3   X4      X5   X6   X7    X8    X9    X10
Name                  Hiba  Ali Omar Tony Youssef Adil  Jor Sarah Lilly Lauren
1. Avengers: Endgame: <NA> <NA> <NA> <NA>       4    3    3  <NA>     4      5
2. The Little Mermaid    5 <NA> <NA> <NA>    <NA> <NA> <NA>  <NA>  <NA>   <NA>
3. Booksmart          <NA> <NA> <NA> <NA>       2    5 <NA>  <NA>     5   <NA>
4. Talk to Me         <NA>    4 <NA> <NA>    <NA> <NA> <NA>     3  <NA>   <NA>
5. Interstellar       <NA>    5 <NA>    3    <NA>    5    3  <NA>     2      3
6. The Irishman       <NA> <NA>    5    5       5 <NA> <NA>     4  <NA>      3
                          X11      X12    X13  X14       X15   X16     X17
Name                  Jeannie Jonathan Saloua Sara Christine Nancy Stephen
1. Avengers: Endgame:       5        5   <NA> <NA>         5     4       2
2. The Little Mermaid    <NA>     <NA>   <NA> <NA>         4  <NA>       1
3. Booksmart             <NA>     <NA>   <NA> <NA>      <NA>  <NA>       3
4. Talk to Me            <NA>     <NA>   <NA> <NA>      <NA>  <NA>    <NA>
5. Interstellar             5        5   <NA> <NA>         5  <NA>    <NA>
6. The Irishman             3     <NA>   <NA> <NA>      <NA>  <NA>    <NA>
                         X18
Name                  Howard
1. Avengers: Endgame:      4
2. The Little Mermaid   <NA>
3. Booksmart               3
4. Talk to Me              4
5. Interstellar         <NA>
6. The Irishman         <NA>

Conclusion:

After calculating the Global Baseline Estimate using Excel, I found it was 3.89. Considering 3.89 as an average threshold, the most recommended movies are “Avengers: Endgame”, “Interstellar”, and ” The Irishman”. The movies that were rated have different Genres which makes it though to recommend any movie to any of the respondents, especially to Saloua and Sara. They both happened to not watch any of these movies. Based on the average ratings of each movie, the Irishman got the greatest average of all “4.17”

References: