Purpose: By doing a regression analysis, we want to: 1) Find out which of the 27 given variables are critical in predicting the IMDB rating of a movie. 2) Check whether there is any correlation between genre and IMDB rating, face number in poster and IMDB rating, director name and IMDB rating, and duration and IMDB rating. 3) Predict the IMDB score using our model.

m<- read.csv('movie_metadata.csv')

Step 1: Data Collection

This data set was found on Kaggle. The author scraped 5000+ movies from the IMDB website using a Python library called “scrapy” and obtained all 28 variables for 5043 movies and 4906 posters (998MB), spanning 100 years and 66 countries. There are 2399 unique director names and thousands of actors/actresses. Below are the 28 variables: “movie_title” “color” “num_critic_for_reviews” “movie_facebook_likes” “duration” “director_name” “director_facebook_likes” “actor_3_name” “actor_3_facebook_likes” “actor_2_name” “actor_2_facebook_likes” “actor_1_name” “actor_1_facebook_likes” “gross” “genres” “num_voted_users” “cast_total_facebook_likes” “facenumber_in_poster” “plot_keywords” “movie_imdb_link” “num_user_for_reviews” “language” “country” “content_rating” “budget” “title_year” “imdb_score” “aspect_ratio”

This dataset is a proof of concept. It can be used for experimental and learning purposes. For comprehensive movie analysis and accurate movie rating prediction, 28 attributes from 5000 movies might not be enough. A decent dataset could contain hundreds of attributes from 50K or more movies and would require a lot of feature engineering.

Step 2: Data Cleaning and Exploration

Assign the first word of genres as the genre of each movie (genres had already been split into words in Excel; the leftover columns X to X.8 are removed here):

# remove columns X-X.8
which(colnames(m)=='genres')
[1] 10
which(colnames(m)=='X.8')
[1] 19
m<-m[,-c(11:19)]
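
The Excel step could also be done directly in R. The following is a minimal sketch, assuming the raw genres column holds pipe-delimited strings such as "Action|Adventure|Fantasy" (the delimiter is an assumption about the raw file, and the vector below is toy data, not the real column):

```r
# Toy genre strings; the "|" delimiter is an assumption about the raw CSV.
genres_raw <- c("Action|Adventure|Fantasy", "Comedy", "Drama|Romance")

# Drop everything from the first "|" onward, keeping only the first genre.
first_genre <- sub("\\|.*$", "", genres_raw)
print(first_genre)
```

Doing this in R keeps the whole cleaning step reproducible from the raw CSV and avoids creating helper columns that have to be removed afterwards.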

Only keep movie data for the USA, because the “budget” variable was not all converted to US dollars, which might cause a problem in later analysis. If we wanted to convert all budgets into US dollars, we would also have to take inflation into consideration, which would make the problem more complicated. Therefore, for practice purposes, we decided to study only movies from the USA.

movie.usa<-m[which(m[,'country']=='USA'),]

Double check:

movie.usa$country
   [1] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
  [23] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
  [45] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
  [67] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
  [89] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [111] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [133] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [155] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [177] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [199] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [221] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [243] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [265] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [287] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [309] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [331] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [353] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [375] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [397] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [419] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [441] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [463] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [485] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [507] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [529] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [551] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [573] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [595] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [617] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [639] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [661] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [683] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [705] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [727] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [749] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [771] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [793] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [815] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [837] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [859] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [881] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [903] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [925] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [947] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [969] USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA USA
 [991] USA USA USA USA USA USA USA USA USA USA
 [ reached getOption("max.print") -- omitted 2807 entries ]
66 Levels:  Afghanistan Argentina Aruba Australia Bahamas Belgium Brazil Bulgaria ... West Germany
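
Printing the whole column is hard to scan. A more compact check, sketched here on a small hypothetical data frame named toy (not the real data), is to test all values at once and tabulate what remains:

```r
# Hypothetical stand-in for the movie data.
toy <- data.frame(
  country = factor(c("USA", "UK", "USA", "France", "USA")),
  imdb_score = c(7.9, 7.1, 8.5, 6.6, 6.2)
)

# Keep only USA rows, mirroring the filtering step used in this report.
toy.usa <- toy[which(toy$country == "USA"), ]

# TRUE only if every remaining row is a USA movie.
all_usa <- all(toy.usa$country == "USA")

# Counts per country after dropping the now-unused factor levels.
country_counts <- table(droplevels(toy.usa$country))
print(all_usa)
print(country_counts)
```

Note that the unused country levels stay attached to the factor after subsetting, which is why the output above still reports 66 levels; droplevels() removes them.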

Remove ‘language’: after removing all countries except the USA, 3779 of the 3807 movies are in English and no other language appears more than 7 times, so this variable is not meaningful for our prediction.

summary(movie.usa$language)
           Aboriginal     Arabic    Aramaic    Bosnian  Cantonese    Chinese      Czech 
        10          0          0          1          1          1          0          0 
    Danish       Dari      Dutch   Dzongkha    English   Filipino     French     German 
         0          1          0          0       3779          1          0          0 
     Greek     Hebrew      Hindi  Hungarian  Icelandic Indonesian    Italian   Japanese 
         0          1          1          0          0          0          0          1 
   Kannada     Kazakh     Korean   Mandarin       Maya  Mongolian       None  Norwegian 
         0          0          0          0          1          0          1          0 
   Panjabi    Persian     Polish Portuguese   Romanian    Russian  Slovenian    Spanish 
         0          0          0          0          0          0          0          7 
   Swahili    Swedish      Tamil     Telugu       Thai       Urdu Vietnamese       Zulu 
         0          0          0          0          0          0          1          0 
movie.usa<-movie.usa[, -which(names(movie.usa)=='language')]

Remove the ‘movie_imdb_link’ column since it’s not useful for our analysis, and store the rest of the data as ‘mm’.

movie.df= data.frame(movie.usa)
mm<-movie.df[, -which(names(movie.df)=='movie_imdb_link')] 
str(mm)
'data.frame':   3807 obs. of  26 variables:
 $ color                    : Factor w/ 3 levels ""," Black and White",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ director_name            : Factor w/ 2399 levels "","\xcc\xe4mile Gaudreault",..: 926 799 379 106 2030 1652 1225 2394 284 799 ...
 $ num_critic_for_reviews   : int  723 302 813 462 392 324 635 673 434 313 ...
 $ duration                 : int  178 169 164 132 156 100 141 183 169 151 ...
 $ director_facebook_likes  : int  0 563 22000 475 0 15 0 0 0 563 ...
 $ actor_3_facebook_likes   : int  855 1000 23000 530 4000 284 19000 2000 903 1000 ...
 $ actor_2_name             : Factor w/ 3033 levels "","50 Cent","A. Michael Baldwin",..: 1408 2218 534 2549 1228 801 2440 1704 1911 2218 ...
 $ actor_1_facebook_likes   : int  1000 40000 27000 640 24000 799 26000 15000 18000 40000 ...
 $ gross                    : int  760505847 309404152 448130642 73058679 336530303 200807262 458991599 330249062 200069408 423032628 ...
 $ genres                   : Factor w/ 21 levels "Action","Adventure",..: 1 1 1 1 1 2 1 1 1 1 ...
 $ actor_1_name             : Factor w/ 2098 levels "","\xcc\xd2lafur Darri \xcc\xd2lafsson",..: 303 982 1968 441 786 221 337 740 1104 982 ...
 $ movie_title              : Factor w/ 4917 levels "[Rec] 2\xe5\xca",..: 397 2731 3707 1960 3289 3459 398 460 3416 2732 ...
 $ num_voted_users          : int  886204 471220 1144337 212204 383056 294810 462669 371639 240396 522040 ...
 $ cast_total_facebook_likes: int  4834 48350 106759 1873 46055 2036 92000 24450 29991 48486 ...
 $ actor_3_name             : Factor w/ 3522 levels "","\xcc\xd2scar Jaenada",..: 3442 1393 1769 2714 1969 2162 3018 57 1134 1393 ...
 $ facenumber_in_poster     : int  0 0 0 1 0 1 4 0 0 2 ...
 $ plot_keywords            : Factor w/ 4761 levels "","10 year old|dog|florida|girl|supermarket",..: 1320 4283 3484 651 4745 29 1142 1564 3312 2188 ...
 $ num_user_for_reviews     : int  3054 1238 2701 738 1902 387 1117 3018 2367 1832 ...
 $ country                  : Factor w/ 66 levels "","Afghanistan",..: 65 65 65 65 65 65 65 65 65 65 ...
 $ content_rating           : Factor w/ 19 levels "","Approved",..: 10 10 10 10 10 9 10 10 10 10 ...
 $ budget                   : num  2.37e+08 3.00e+08 2.50e+08 2.64e+08 2.58e+08 ...
 $ title_year               : int  2009 2007 2012 2012 2007 2010 2015 2016 2006 2006 ...
 $ actor_2_facebook_likes   : int  936 5000 23000 632 11000 553 21000 4000 10000 5000 ...
 $ imdb_score               : num  7.9 7.1 8.5 6.6 6.2 7.8 7.5 6.9 6.1 7.3 ...
 $ aspect_ratio             : num  1.78 2.35 2.35 2.35 2.35 1.85 2.35 2.35 2.35 2.35 ...
 $ movie_facebook_likes     : int  33000 0 164000 24000 0 29000 118000 197000 0 5000 ...

Check for missing values:

library(Amelia)
missmap(mm, main = "Missing values vs observed")

sapply(mm,function(x) sum(is.na(x))) # number of missing values for each variable 
                    color             director_name    num_critic_for_reviews 
                        0                         0                        39 
                 duration   director_facebook_likes    actor_3_facebook_likes 
                        6                        74                        13 
             actor_2_name    actor_1_facebook_likes                     gross 
                        0                         4                       572 
                   genres              actor_1_name               movie_title 
                        0                         0                         0 
          num_voted_users cast_total_facebook_likes              actor_3_name 
                        0                         0                         0 
     facenumber_in_poster             plot_keywords      num_user_for_reviews 
                       12                         0                        13 
                  country            content_rating                    budget 
                        0                         0                       298 
               title_year    actor_2_facebook_likes                imdb_score 
                       74                         7                         0 
             aspect_ratio      movie_facebook_likes 
                      222                         0 

We noticed that there are many missing values for gross (572), budget (298) and aspect ratio (222).
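
Before dropping rows, it can help to see how many would be lost. The sketch below uses a small made-up data frame (toy and its columns are hypothetical) to count missing values per column and the rows that listwise deletion with na.omit() would keep:

```r
# Made-up data with a few missing values.
toy <- data.frame(
  gross  = c(100, NA, 300, NA, 500),
  budget = c(10, 20, NA, 40, 50),
  score  = c(7.1, 6.5, 8.0, 5.9, 6.8)
)

# Missing values per column.
na_per_col <- colSums(is.na(toy))

# Rows with no missing value in any column.
n_complete <- sum(complete.cases(toy))

# na.omit() keeps exactly those complete rows.
toy_clean <- na.omit(toy)

print(na_per_col)
print(c(before = nrow(toy), after = nrow(toy_clean)))
```

A row is dropped if any of its columns is missing, so the number of rows lost can be larger than the largest single-column count.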

Omit missing values:

movie<-na.omit(mm)
sapply(movie,function(x) sum(is.na(x))) # double check for missing values
                    color             director_name    num_critic_for_reviews 
                        0                         0                         0 
                 duration   director_facebook_likes    actor_3_facebook_likes 
                        0                         0                         0 
             actor_2_name    actor_1_facebook_likes                     gross 
                        0                         0                         0 
                   genres              actor_1_name               movie_title 
                        0                         0                         0 
          num_voted_users cast_total_facebook_likes              actor_3_name 
                        0                         0                         0 
     facenumber_in_poster             plot_keywords      num_user_for_reviews 
                        0                         0                         0 
                  country            content_rating                    budget 
                        0                         0                         0 
               title_year    actor_2_facebook_likes                imdb_score 
                        0                         0                         0 
             aspect_ratio      movie_facebook_likes 
                        0                         0 
library(psych)
library(car)
library(RColorBrewer) 
library(corrplot)
library(ggplot2)

Explore title_year predictor:

range(movie$title_year) # check movie title year
[1] 1920 2016
sum(with(movie,title_year=='2009')) # 145
[1] 145
sum(with(movie,title_year=='2014')) # 121
[1] 121

Visualization of title Year vs. Score:

scatterplot(x=movie$title_year,y=movie$imdb_score)

There are many outliers for title year. The majority of data points are from around the year 2000 and later, which makes sense since there are fewer movies from the early years. Also, an interesting observation is that movies from the early years tend to have higher scores.

Visualization of IMDB Score:

max(movie$imdb_score) # 9.3
[1] 9.3
ggplot(movie, aes(x = imdb_score)) +
        geom_histogram(aes(fill = ..count..), binwidth =0.5) +
        scale_x_continuous(name = "IMDB Score",
                           breaks = seq(0,10),
                           limits=c(1, 10)) +
        ggtitle("Histogram of Movie IMDB Score") +
        scale_fill_gradient("Count", low = "blue", high = "red")

sum(with(movie,imdb_score>=8))
[1] 148
# 148 movies with IMDB score greater or equal to 8.

The IMDB score distribution looks roughly normal. The highest score is 9.3 on a 10-point scale. We consider movies with a score greater than or equal to 8 to be great movies from many perspectives.

Exploring correlation:

pairs.panels(movie[c('director_name','duration','facenumber_in_poster','imdb_score','genres')])

From the plot, only duration shows a notable correlation with IMDB score. Face number in poster has a weak negative correlation with IMDB score, and genre has little correlation with score. Interestingly, director name shows no correlation with IMDB score.

pairs.panels(movie[c('color','actor_1_name','title_year','imdb_score','aspect_ratio','gross')])

Color and title year have a highly positive correlation. Color has smaller positive correlations with aspect ratio and gross. Actor 1 name has a very small positive correlation with gross, suggesting that who plays in the movie has little impact on gross. Title year, aspect ratio and color are all positively correlated with one another. IMDB score has a very small positive correlation with actor 1 name, which suggests that the choice of actor 1 does not by itself lead to a higher score. Interestingly, IMDB score has a negative correlation with title year, which means older movies seem to have higher scores; this agrees with our observation from the scatter plot. IMDB score and aspect ratio have a small positive correlation. IMDB score has a moderate positive correlation with gross.
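
One caveat: pairs.panels() appears to handle factor columns such as director_name or actor_1_name through their integer level codes, so the reported correlations depend on the alphabetical order of the names. One alternative for a categorical predictor is the share of score variance explained by the groups in a one-way linear model. This is a sketch on made-up data with hypothetical variable names, not the method used in this analysis:

```r
# Made-up scores for three hypothetical genres, four movies each.
toy <- data.frame(
  genre = factor(rep(c("A", "B", "C"), each = 4)),
  score = c(6.0, 6.2, 5.8, 6.0,
            7.0, 7.2, 6.8, 7.0,
            8.0, 8.2, 7.8, 8.0)
)

# One-way model: score explained by group membership only.
fit <- lm(score ~ genre, data = toy)

# For a single factor, R-squared equals eta-squared,
# the share of total variance that lies between groups.
eta_sq <- summary(fit)$r.squared
print(round(eta_sq, 3))
```

A value near 0 would indicate that the groups barely differ in mean score, while a value near 1 means group membership explains most of the variation. With thousands of levels, as for director_name, this measure is inflated by small groups and should be read with care.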

Correlation plot for all numerical variables:

nums<- sapply(movie,is.numeric) # select numeric columns
movie.num<- movie[,nums]
corrplot(cor(movie.num),method='ellipse') 

Note: corrplot() expects a correlation matrix rather than a data frame, so cor() is applied first to produce that matrix.

From the correlation plot, we can tell that: Face number in poster has only weak correlations with all other variables, several of them negative. Cast total Facebook likes and actor 1 Facebook likes have a very strong positive correlation. Budget and gross have a strong correlation, which is not surprising. Interestingly, IMDB score has a positive correlation with the number of critic reviews, meaning movies reviewed by more critics tend to score higher. Duration and number of voted users also have clear positive correlations with IMDB score.
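
To read the relationships with the response off as numbers rather than ellipses, one option is to take the imdb_score column of the correlation matrix and sort it. The sketch below uses simulated data (toy.num and its columns are hypothetical), so the values are illustrative only:

```r
set.seed(1)
n <- 200
votes <- rnorm(n)

# Simulated numeric data: the score is built to depend on votes,
# while faces is unrelated noise.
toy.num <- data.frame(
  imdb_score = votes + rnorm(n),
  votes = votes,
  faces = rnorm(n)
)

# Correlation of every numeric column with the response.
score_cor <- cor(toy.num)[, "imdb_score"]

# Drop the trivial self-correlation and rank the rest.
score_cor <- score_cor[names(score_cor) != "imdb_score"]
ranked <- sort(score_cor, decreasing = TRUE)
print(round(ranked, 2))
```

Applied to the real numeric columns, the same pattern would list the predictors in order of their linear association with IMDB score.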

Find the pairs of correlations

corr.test(movie.num,y=NULL,use='pairwise',method='pearson',adjust='holm',alpha=0.05) # x must be numeric
Call:corr.test(x = movie.num, y = NULL, use = "pairwise", method = "pearson", 
    adjust = "holm", alpha = 0.05)
Correlation matrix 
                          num_critic_for_reviews duration director_facebook_likes
num_critic_for_reviews                      1.00     0.26                    0.19
duration                                    0.26     1.00                    0.21
director_facebook_likes                     0.19     0.21                    1.00
actor_3_facebook_likes                      0.28     0.14                    0.12
actor_1_facebook_likes                      0.17     0.09                    0.09
gross                                       0.48     0.28                    0.14
num_voted_users                             0.60     0.37                    0.32
cast_total_facebook_likes                   0.25     0.13                    0.12
facenumber_in_poster                       -0.03     0.01                   -0.05
num_user_for_reviews                        0.57     0.36                    0.24
budget                                      0.49     0.30                    0.09
title_year                                  0.42    -0.11                   -0.06
actor_2_facebook_likes                      0.28     0.15                    0.12
imdb_score                                  0.36     0.38                    0.22
aspect_ratio                                0.18     0.16                    0.05
movie_facebook_likes                        0.71     0.25                    0.17
                          actor_3_facebook_likes actor_1_facebook_likes gross num_voted_users
num_critic_for_reviews                      0.28                   0.17  0.48            0.60
duration                                    0.14                   0.09  0.28            0.37
director_facebook_likes                     0.12                   0.09  0.14            0.32
actor_3_facebook_likes                      1.00                   0.25  0.30            0.28
actor_1_facebook_likes                      0.25                   1.00  0.13            0.17
gross                                       0.30                   0.13  1.00            0.64
num_voted_users                             0.28                   0.17  0.64            1.00
cast_total_facebook_likes                   0.48                   0.95  0.22            0.25
facenumber_in_poster                        0.10                   0.05 -0.04           -0.04
num_user_for_reviews                        0.22                   0.12  0.55            0.78
budget                                      0.27                   0.15  0.64            0.40
title_year                                  0.13                   0.09  0.06            0.03
actor_2_facebook_likes                      0.55                   0.38  0.25            0.25
imdb_score                                  0.09                   0.12  0.27            0.51
aspect_ratio                                0.05                   0.05  0.07            0.09
movie_facebook_likes                        0.31                   0.12  0.38            0.52
                          cast_total_facebook_likes facenumber_in_poster num_user_for_reviews
num_critic_for_reviews                         0.25                -0.03                 0.57
duration                                       0.13                 0.01                 0.36
director_facebook_likes                        0.12                -0.05                 0.24
actor_3_facebook_likes                         0.48                 0.10                 0.22
actor_1_facebook_likes                         0.95                 0.05                 0.12
gross                                          0.22                -0.04                 0.55
num_voted_users                                0.25                -0.04                 0.78
cast_total_facebook_likes                      1.00                 0.07                 0.18
facenumber_in_poster                           0.07                 1.00                -0.09
num_user_for_reviews                           0.18                -0.09                 1.00
budget                                         0.23                -0.03                 0.40
title_year                                     0.13                 0.08                 0.03
actor_2_facebook_likes                         0.63                 0.07                 0.20
imdb_score                                     0.14                -0.07                 0.35
aspect_ratio                                   0.07                 0.01                 0.10
movie_facebook_likes                           0.21                 0.01                 0.39
                          budget title_year actor_2_facebook_likes imdb_score aspect_ratio
num_critic_for_reviews      0.49       0.42                   0.28       0.36         0.18
duration                    0.30      -0.11                   0.15       0.38         0.16
director_facebook_likes     0.09      -0.06                   0.12       0.22         0.05
actor_3_facebook_likes      0.27       0.13                   0.55       0.09         0.05
actor_1_facebook_likes      0.15       0.09                   0.38       0.12         0.05
gross                       0.64       0.06                   0.25       0.27         0.07
num_voted_users             0.40       0.03                   0.25       0.51         0.09
cast_total_facebook_likes   0.23       0.13                   0.63       0.14         0.07
facenumber_in_poster       -0.03       0.08                   0.07      -0.07         0.01
num_user_for_reviews        0.40       0.03                   0.20       0.35         0.10
budget                      1.00       0.25                   0.25       0.07         0.18
title_year                  0.25       1.00                   0.13      -0.14         0.22
actor_2_facebook_likes      0.25       0.13                   1.00       0.13         0.07
imdb_score                  0.07      -0.14                   0.13       1.00         0.04
aspect_ratio                0.18       0.22                   0.07       0.04         1.00
movie_facebook_likes        0.33       0.31                   0.25       0.29         0.11
                          movie_facebook_likes
num_critic_for_reviews                    0.71
duration                                  0.25
director_facebook_likes                   0.17
actor_3_facebook_likes                    0.31
actor_1_facebook_likes                    0.12
gross                                     0.38
num_voted_users                           0.52
cast_total_facebook_likes                 0.21
facenumber_in_poster                      0.01
num_user_for_reviews                      0.39
budget                                    0.33
title_year                                0.31
actor_2_facebook_likes                    0.25
imdb_score                                0.29
aspect_ratio                              0.11
movie_facebook_likes                      1.00
Sample Size 
[1] 3005
Probability values (Entries above the diagonal are adjusted for multiple tests.) 
                          num_critic_for_reviews duration director_facebook_likes
num_critic_for_reviews                      0.00     0.00                    0.00
duration                                    0.00     0.00                    0.00
director_facebook_likes                     0.00     0.00                    0.00
actor_3_facebook_likes                      0.00     0.00                    0.00
actor_1_facebook_likes                      0.00     0.00                    0.00
gross                                       0.00     0.00                    0.00
num_voted_users                             0.00     0.00                    0.00
cast_total_facebook_likes                   0.00     0.00                    0.00
facenumber_in_poster                        0.09     0.66                    0.00
num_user_for_reviews                        0.00     0.00                    0.00
budget                                      0.00     0.00                    0.00
title_year                                  0.00     0.00                    0.00
actor_2_facebook_likes                      0.00     0.00                    0.00
imdb_score                                  0.00     0.00                    0.00
aspect_ratio                                0.00     0.00                    0.01
movie_facebook_likes                        0.00     0.00                    0.00
                          actor_3_facebook_likes actor_1_facebook_likes gross num_voted_users
num_critic_for_reviews                      0.00                   0.00  0.00            0.00
duration                                    0.00                   0.00  0.00            0.00
director_facebook_likes                     0.00                   0.00  0.00            0.00
actor_3_facebook_likes                      0.00                   0.00  0.00            0.00
actor_1_facebook_likes                      0.00                   0.00  0.00            0.00
gross                                       0.00                   0.00  0.00            0.00
num_voted_users                             0.00                   0.00  0.00            0.00
cast_total_facebook_likes                   0.00                   0.00  0.00            0.00
facenumber_in_poster                        0.00                   0.01  0.05            0.02
num_user_for_reviews                        0.00                   0.00  0.00            0.00
budget                                      0.00                   0.00  0.00            0.00
title_year                                  0.00                   0.00  0.00            0.10
actor_2_facebook_likes                      0.00                   0.00  0.00            0.00
imdb_score                                  0.00                   0.00  0.00            0.00
aspect_ratio                                0.01                   0.00  0.00            0.00
movie_facebook_likes                        0.00                   0.00  0.00            0.00
                          cast_total_facebook_likes facenumber_in_poster num_user_for_reviews
num_critic_for_reviews                            0                 0.65                 0.00
duration                                          0                 1.00                 0.00
director_facebook_likes                           0                 0.06                 0.00
actor_3_facebook_likes                            0                 0.00                 0.00
actor_1_facebook_likes                            0                 0.07                 0.00
gross                                             0                 0.37                 0.00
num_voted_users                                   0                 0.17                 0.00
cast_total_facebook_likes                         0                 0.00                 0.00
facenumber_in_poster                              0                 0.00                 0.00
num_user_for_reviews                              0                 0.00                 0.00
budget                                            0                 0.14                 0.00
title_year                                        0                 0.00                 0.12
actor_2_facebook_likes                            0                 0.00                 0.00
imdb_score                                        0                 0.00                 0.00
aspect_ratio                                      0                 0.55                 0.00
movie_facebook_likes                              0                 0.50                 0.00
                          budget title_year actor_2_facebook_likes imdb_score aspect_ratio
num_critic_for_reviews      0.00       0.00                   0.00       0.00         0.00
duration                    0.00       0.00                   0.00       0.00         0.00
director_facebook_likes     0.00       0.04                   0.00       0.00         0.10
actor_3_facebook_likes      0.00       0.00                   0.00       0.00         0.07
actor_1_facebook_likes      0.00       0.00                   0.00       0.00         0.05
gross                       0.00       0.04                   0.00       0.00         0.00
num_voted_users             0.00       0.65                   0.00       0.00         0.00
cast_total_facebook_likes   0.00       0.00                   0.00       0.00         0.00
facenumber_in_poster        0.65       0.00                   0.01       0.00         1.00
num_user_for_reviews        0.00       0.65                   0.00       0.00         0.00
budget                      0.00       0.00                   0.00       0.00         0.00
title_year                  0.00       0.00                   0.00       0.00         0.00
actor_2_facebook_likes      0.00       0.00                   0.00       0.00         0.00
imdb_score                  0.00       0.00                   0.00       0.00         0.34
aspect_ratio                0.00       0.00                   0.00       0.04         0.00
movie_facebook_likes        0.00       0.00                   0.00       0.00         0.00
                          movie_facebook_likes
num_critic_for_reviews                       0
duration                                     0
director_facebook_likes                      0
actor_3_facebook_likes                       0
actor_1_facebook_likes                       0
gross                                        0
num_voted_users                              0
cast_total_facebook_likes                    0
facenumber_in_poster                         1
num_user_for_reviews                         0
budget                                       0
title_year                                   0
actor_2_facebook_likes                       0
imdb_score                                   0
aspect_ratio                                 0
movie_facebook_likes                         0

 To see confidence intervals of the correlations, print with the short=FALSE option
# Boxplots for significant categorical predictors
Boxplot(movie$imdb_score,movie$color)
 [1] "2110" "1763" "2467" "2216" "2391" "2541" "270"  "1708" "2477" "423"  "1530" "2444"

Black-and-white movies seem to have a higher median rating and slightly higher scores overall. Color movies have many outliers.
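To back up what the boxplot suggests, the group medians can be printed next to the group sizes. The sketch below is only an illustration: `toy` is a made-up stand-in for `movie`, and its values are simulated, not taken from the real data.

```r
# Simulated stand-in for the movie data (values are invented for illustration)
set.seed(1)
toy <- data.frame(
  color = factor(c(rep("Black and White", 20), rep("Color", 200))),
  imdb_score = c(rnorm(20, mean = 7.0, sd = 0.8),
                 rnorm(200, mean = 6.4, sd = 1.0))
)

# Median score and number of movies per group (both ordered by factor level)
medians <- tapply(toy$imdb_score, toy$color, median)
counts  <- table(toy$color)

color_summary <- data.frame(
  median = as.numeric(medians),
  n      = as.integer(counts),
  row.names = names(medians)
)
print(color_summary)
```

On the real data the same two calls would be applied to `movie$imdb_score` and `movie$color`.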

Boxplot for genre:

fill <- "Blue"
line <- "Red"
ggplot(movie, aes(x = genres, y =imdb_score)) +
        geom_boxplot(fill = fill, colour = line) +
        scale_y_continuous(name = "IMDB Score",
                           breaks = seq(0, 11, 0.5),
                           limits=c(0, 11)) +
        scale_x_discrete(name = "Genres") +
        ggtitle("Boxplot of IMDB Score and Genres")

From the boxplot of genres, “Documentary” has the highest median score, and Thriller movies have the lowest median. However, the Thriller result comes from only 1 observation in our data set.

summary(movie$genres)
     Action   Adventure   Animation   Biography      Comedy       Crime Documentary 
        751         291          36         137         853         204          25 
      Drama      Family     Fantasy   Film-Noir   Game-Show     History      Horror 
        506           3          31           0           0           0         138 
      Music     Musical     Mystery     Romance      Sci-Fi    Thriller     Western 
          0           2          16           2           7           1           2 
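Since several genres have very few movies, it helps to list the sparse and empty levels before reading too much into their medians. The sketch below re-enters the counts from the summary above by hand; the cutoff `min_n` is a hypothetical choice, not something fixed by the analysis.

```r
# Genre counts copied from the summary(movie$genres) output above
genre_counts <- c(
  Action = 751, Adventure = 291, Animation = 36, Biography = 137,
  Comedy = 853, Crime = 204, Documentary = 25, Drama = 506,
  Family = 3, Fantasy = 31, "Film-Noir" = 0, "Game-Show" = 0,
  History = 0, Horror = 138, Music = 0, Musical = 2,
  Mystery = 16, Romance = 2, "Sci-Fi" = 7, Thriller = 1, Western = 2
)

min_n <- 10  # hypothetical cutoff for "too few movies to trust the median"

# Genres that are present but sparse, and factor levels with no movies at all
sparse_genres <- genre_counts[genre_counts > 0 & genre_counts < min_n]
empty_genres  <- names(genre_counts)[genre_counts == 0]

print(sort(sparse_genres))
print(empty_genres)
```

The empty levels could be removed from the factor with `droplevels()` so that they do not show up on the x-axis of the boxplot.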

Boxplot for “title_year”:

library(ggplot2)
fill <- "Blue"
line <- "Red"
ggplot(movie, aes(x = as.factor(title_year), y =imdb_score)) +
        geom_boxplot(fill = fill, colour = line) +
        scale_y_continuous(name = "IMDB Score",
                           breaks = seq(1.5, 10, 0.5),
                           limits=c(1.5, 10)) +
        scale_x_discrete(name = "title_year") +
        ggtitle("Boxplot of IMDB Score and Title Year")

The median IMDB score seems to differ from year to year, so let’s try treating title_year as categorical.
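One way to check whether the categorical coding is worth its extra parameters is to fit title_year both ways and compare the two fits. This is a minimal sketch on simulated data; `toy` and its coefficients are assumptions made for illustration, not results from the movie data.

```r
# Simulated data: ten release years with a mild downward trend in score
set.seed(2)
n <- 300
toy <- data.frame(title_year = sample(2000:2009, n, replace = TRUE))
toy$imdb_score <- 6.5 - 0.03 * (toy$title_year - 2000) + rnorm(n, sd = 0.8)

# title_year as a single numeric slope versus one level per year
fit_numeric <- lm(imdb_score ~ title_year, data = toy)
fit_factor  <- lm(imdb_score ~ factor(title_year), data = toy)

# Lower AIC is better; the F-test asks whether the per-year levels
# explain significantly more than the straight-line trend
print(AIC(fit_numeric, fit_factor))
print(anova(fit_numeric, fit_factor))
```

The numeric model is nested in the factor model, so the F-test from `anova()` is valid here. With about 100 years of movies, the factor version costs many degrees of freedom, which is why the comparison matters.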

# Scatter plot matrix for the numerical variables significantly correlated with imdb_score
scatterplotMatrix(~movie$imdb_score+movie$num_voted_users+movie$num_critic_for_reviews+movie$num_user_for_reviews+movie$duration+movie$facenumber_in_poster+movie$gross+movie$movie_facebook_likes+movie$director_facebook_likes+movie$cast_total_facebook_likes+movie$budget)

Step 3: Fitting regression model

movie.sig<-movie[,c('imdb_score','num_voted_users','num_critic_for_reviews','num_user_for_reviews','duration','facenumber_in_poster','gross','movie_facebook_likes','director_facebook_likes','cast_total_facebook_likes','budget','title_year','genres')]

Use the step function to select variables by the AIC criterion:

null=lm(movie.sig$imdb_score~1) # set null model
summary(null)

Call:
lm(formula = movie.sig$imdb_score ~ 1)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.7873 -0.5873  0.1127  0.7127  2.9127 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.3873     0.0192   332.6   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.053 on 3004 degrees of freedom
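For linear models, step() takes its AIC from extractAIC(), which computes n·log(RSS/n) + 2·edf, where edf is the number of estimated coefficients. This is not the same number that AIC() returns, because additive constants are dropped. For the null model here, n = 3005 and the RSS reported by step() is 3329.1, so 3005·log(3329.1/3005) + 2 is about 309.8, in line with the starting AIC of 309.81 in the step() trace. The sketch below checks the formula on simulated data; `toy` is an assumption made for illustration.

```r
# Simulated data for a simple linear model
set.seed(3)
toy <- data.frame(x = rnorm(100))
toy$y <- 1 + 2 * toy$x + rnorm(100)

fit <- lm(y ~ x, data = toy)

n   <- nrow(toy)
rss <- sum(residuals(fit)^2)   # residual sum of squares
edf <- length(coef(fit))       # number of estimated coefficients

# AIC on the scale used by step(): n * log(RSS / n) + 2 * edf
aic_by_hand <- n * log(rss / n) + 2 * edf
aic_step    <- extractAIC(fit)[2]

print(c(by_hand = aic_by_hand, extractAIC = aic_step))
```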
  1. The full model is a linear additive model:
full1=lm(movie.sig$imdb_score~movie.sig$num_voted_users+movie.sig$num_critic_for_reviews+movie.sig$num_user_for_reviews+movie.sig$duration+movie.sig$facenumber_in_poster+movie.sig$gross+movie.sig$movie_facebook_likes+movie.sig$director_facebook_likes+movie.sig$cast_total_facebook_likes+movie.sig$budget+movie.sig$title_year+factor(movie.sig$genres))
summary(full1)

Call:
lm(formula = movie.sig$imdb_score ~ movie.sig$num_voted_users + 
    movie.sig$num_critic_for_reviews + movie.sig$num_user_for_reviews + 
    movie.sig$duration + movie.sig$facenumber_in_poster + movie.sig$gross + 
    movie.sig$movie_facebook_likes + movie.sig$director_facebook_likes + 
    movie.sig$cast_total_facebook_likes + movie.sig$budget + 
    movie.sig$title_year + factor(movie.sig$genres))

Residuals:
    Min      1Q  Median      3Q     Max 
-4.9157 -0.3693  0.0835  0.4993  2.0350 

Coefficients:
                                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)                          5.413e+01  3.604e+00  15.019  < 2e-16 ***
movie.sig$num_voted_users            3.158e-06  1.757e-07  17.969  < 2e-16 ***
movie.sig$num_critic_for_reviews     3.333e-03  2.119e-04  15.727  < 2e-16 ***
movie.sig$num_user_for_reviews      -4.887e-04  5.976e-05  -8.177 4.26e-16 ***
movie.sig$duration                   8.491e-03  7.848e-04  10.820  < 2e-16 ***
movie.sig$facenumber_in_poster      -1.750e-02  6.947e-03  -2.519  0.01182 *  
movie.sig$gross                      2.247e-10  3.096e-10   0.726  0.46808    
movie.sig$movie_facebook_likes      -4.007e-06  9.702e-07  -4.131 3.72e-05 ***
movie.sig$director_facebook_likes    2.832e-07  4.562e-06   0.062  0.95051    
movie.sig$cast_total_facebook_likes  1.110e-06  7.323e-07   1.516  0.12975    
movie.sig$budget                    -4.486e-09  5.125e-10  -8.753  < 2e-16 ***
movie.sig$title_year                -2.467e-02  1.797e-03 -13.727  < 2e-16 ***
factor(movie.sig$genres)Adventure    3.458e-01  5.448e-02   6.347 2.53e-10 ***
factor(movie.sig$genres)Animation    6.621e-01  1.345e-01   4.924 8.93e-07 ***
factor(movie.sig$genres)Biography    6.557e-01  7.661e-02   8.558  < 2e-16 ***
factor(movie.sig$genres)Comedy       1.532e-01  4.361e-02   3.513  0.00045 ***
factor(movie.sig$genres)Crime        4.551e-01  6.464e-02   7.040 2.37e-12 ***
factor(movie.sig$genres)Documentary  9.270e-01  1.608e-01   5.765 8.98e-09 ***
factor(movie.sig$genres)Drama        5.326e-01  4.904e-02  10.861  < 2e-16 ***
factor(movie.sig$genres)Family       2.201e-01  4.521e-01   0.487  0.62639    
factor(movie.sig$genres)Fantasy     -1.629e-01  1.448e-01  -1.125  0.26068    
factor(movie.sig$genres)Horror      -3.858e-01  7.777e-02  -4.961 7.41e-07 ***
factor(movie.sig$genres)Musical     -4.133e-01  5.573e-01  -0.742  0.45839    
factor(movie.sig$genres)Mystery      1.968e-01  1.979e-01   0.995  0.32005    
factor(movie.sig$genres)Romance      5.466e-01  5.506e-01   0.993  0.32095    
factor(movie.sig$genres)Sci-Fi       2.551e-01  2.960e-01   0.862  0.38870    
factor(movie.sig$genres)Thriller    -4.301e-01  7.786e-01  -0.552  0.58077    
factor(movie.sig$genres)Western     -1.037e-01  5.521e-01  -0.188  0.85101    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7768 on 2977 degrees of freedom
Multiple R-squared:  0.4604,    Adjusted R-squared:  0.4555 
F-statistic: 94.07 on 27 and 2977 DF,  p-value: < 2.2e-16
step(null,scope = list(lower=null,upper=full1),direction = 'forward')
Start:  AIC=309.81
movie.sig$imdb_score ~ 1

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_voted_users            1    871.90 2457.2 -600.74
+ movie.sig$duration                   1    491.13 2838.0 -167.82
+ movie.sig$num_critic_for_reviews     1    428.38 2900.8 -102.10
+ movie.sig$num_user_for_reviews       1    407.62 2921.5  -80.68
+ factor(movie.sig$genres)            16    331.02 2998.1   27.10
+ movie.sig$movie_facebook_likes       1    282.82 3046.3   45.02
+ movie.sig$gross                      1    242.62 3086.5   84.42
+ movie.sig$director_facebook_likes    1    166.17 3163.0  157.95
+ movie.sig$title_year                 1     69.27 3259.9  248.63
+ movie.sig$cast_total_facebook_likes  1     64.28 3264.8  253.22
+ movie.sig$budget                     1     16.26 3312.9  297.09
+ movie.sig$facenumber_in_poster       1     15.14 3314.0  298.11
<none>                                             3329.1  309.81

Step:  AIC=-600.74
movie.sig$imdb_score ~ movie.sig$num_voted_users

                                      Df Sum of Sq    RSS     AIC
+ factor(movie.sig$genres)            16   311.531 2145.7 -976.12
+ movie.sig$duration                   1   147.786 2309.4 -785.13
+ movie.sig$title_year                 1    84.649 2372.6 -704.08
+ movie.sig$budget                     1    73.211 2384.0 -689.63
+ movie.sig$num_user_for_reviews       1    21.297 2435.9 -624.90
+ movie.sig$gross                      1    16.929 2440.3 -619.51
+ movie.sig$num_critic_for_reviews     1    14.632 2442.6 -616.69
+ movie.sig$director_facebook_likes    1    13.657 2443.6 -615.49
+ movie.sig$facenumber_in_poster       1     6.789 2450.4 -607.05
+ movie.sig$movie_facebook_likes       1     2.627 2454.6 -601.95
<none>                                             2457.2 -600.74
+ movie.sig$cast_total_facebook_likes  1     0.524 2456.7 -599.38

Step:  AIC=-976.12
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres)

                                      Df Sum of Sq    RSS      AIC
+ movie.sig$title_year                 1    79.623 2066.1 -1087.75
+ movie.sig$duration                   1    74.584 2071.1 -1080.44
+ movie.sig$budget                     1    28.689 2117.0 -1014.57
+ movie.sig$num_critic_for_reviews     1    23.116 2122.6 -1006.67
+ movie.sig$num_user_for_reviews       1    12.251 2133.4  -991.33
+ movie.sig$director_facebook_likes    1     3.707 2142.0  -979.32
+ movie.sig$facenumber_in_poster       1     3.274 2142.4  -978.71
+ movie.sig$movie_facebook_likes       1     1.686 2144.0  -976.49
<none>                                             2145.7  -976.12
+ movie.sig$gross                      1     1.391 2144.3  -976.07
+ movie.sig$cast_total_facebook_likes  1     0.362 2145.3  -974.63

Step:  AIC=-1087.75
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_critic_for_reviews     1   125.091 1941.0 -1273.4
+ movie.sig$duration                   1    55.857 2010.2 -1168.1
+ movie.sig$movie_facebook_likes       1    21.746 2044.3 -1117.5
+ movie.sig$num_user_for_reviews       1    11.741 2054.3 -1102.9
+ movie.sig$budget                     1     9.196 2056.9 -1099.2
+ movie.sig$cast_total_facebook_likes  1     2.923 2063.2 -1090.0
+ movie.sig$director_facebook_likes    1     1.740 2064.3 -1088.3
<none>                                             2066.1 -1087.8
+ movie.sig$facenumber_in_poster       1     1.084 2065.0 -1087.3
+ movie.sig$gross                      1     0.638 2065.4 -1086.7

Step:  AIC=-1273.43
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$budget                     1    36.627 1904.4 -1328.7
+ movie.sig$num_user_for_reviews       1    35.326 1905.7 -1326.6
+ movie.sig$duration                   1    34.873 1906.1 -1325.9
+ movie.sig$gross                      1     7.359 1933.6 -1282.8
+ movie.sig$movie_facebook_likes       1     1.397 1939.6 -1273.6
<none>                                             1941.0 -1273.4
+ movie.sig$facenumber_in_poster       1     0.926 1940.1 -1272.9
+ movie.sig$director_facebook_likes    1     0.644 1940.3 -1272.4
+ movie.sig$cast_total_facebook_likes  1     0.572 1940.4 -1272.3

Step:  AIC=-1328.68
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$duration                   1    58.373 1846.0 -1420.2
+ movie.sig$num_user_for_reviews       1    27.052 1877.3 -1369.7
+ movie.sig$movie_facebook_likes       1     2.576 1901.8 -1330.8
+ movie.sig$cast_total_facebook_likes  1     2.005 1902.3 -1329.8
<none>                                             1904.4 -1328.7
+ movie.sig$facenumber_in_poster       1     1.071 1903.3 -1328.4
+ movie.sig$director_facebook_likes    1     0.557 1903.8 -1327.6
+ movie.sig$gross                      1     0.074 1904.3 -1326.8

Step:  AIC=-1420.23
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_user_for_reviews       1    33.825 1812.2 -1473.8
+ movie.sig$movie_facebook_likes       1     4.702 1841.3 -1425.9
+ movie.sig$facenumber_in_poster       1     2.488 1843.5 -1422.3
+ movie.sig$cast_total_facebook_likes  1     1.601 1844.4 -1420.8
<none>                                             1846.0 -1420.2
+ movie.sig$gross                      1     0.196 1845.8 -1418.5
+ movie.sig$director_facebook_likes    1     0.043 1845.9 -1418.3

Step:  AIC=-1473.81
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$movie_facebook_likes       1   10.4792 1801.7 -1489.2
+ movie.sig$facenumber_in_poster       1    3.7911 1808.4 -1478.1
<none>                                             1812.2 -1473.8
+ movie.sig$cast_total_facebook_likes  1    0.9926 1811.2 -1473.5
+ movie.sig$gross                      1    0.3569 1811.8 -1472.4
+ movie.sig$director_facebook_likes    1    0.0128 1812.2 -1471.8

Step:  AIC=-1489.23
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$movie_facebook_likes

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$facenumber_in_poster       1    3.5218 1798.2 -1493.1
<none>                                             1801.7 -1489.2
+ movie.sig$cast_total_facebook_likes  1    1.0918 1800.6 -1489.0
+ movie.sig$gross                      1    0.3413 1801.3 -1487.8
+ movie.sig$director_facebook_likes    1    0.0167 1801.7 -1487.3

Step:  AIC=-1493.11
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$movie_facebook_likes + movie.sig$facenumber_in_poster

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$cast_total_facebook_likes  1   1.41883 1796.7 -1493.5
<none>                                             1798.2 -1493.1
+ movie.sig$gross                      1   0.33944 1797.8 -1491.7
+ movie.sig$director_facebook_likes    1   0.00320 1798.2 -1491.1

Step:  AIC=-1493.48
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$movie_facebook_likes + movie.sig$facenumber_in_poster + 
    movie.sig$cast_total_facebook_likes

                                    Df Sum of Sq    RSS     AIC
<none>                                           1796.7 -1493.5
+ movie.sig$gross                    1   0.31546 1796.4 -1492.0
+ movie.sig$director_facebook_likes  1   0.00000 1796.7 -1491.5

Call:
lm(formula = movie.sig$imdb_score ~ movie.sig$num_voted_users + 
    factor(movie.sig$genres) + movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$movie_facebook_likes + movie.sig$facenumber_in_poster + 
    movie.sig$cast_total_facebook_likes)

Coefficients:
                        (Intercept)            movie.sig$num_voted_users  
                          5.446e+01                            3.203e-06  
  factor(movie.sig$genres)Adventure    factor(movie.sig$genres)Animation  
                          3.495e-01                            6.687e-01  
  factor(movie.sig$genres)Biography       factor(movie.sig$genres)Comedy  
                          6.564e-01                            1.558e-01  
      factor(movie.sig$genres)Crime  factor(movie.sig$genres)Documentary  
                          4.522e-01                            9.302e-01  
      factor(movie.sig$genres)Drama       factor(movie.sig$genres)Family  
                          5.326e-01                            2.466e-01  
    factor(movie.sig$genres)Fantasy       factor(movie.sig$genres)Horror  
                         -1.616e-01                           -3.839e-01  
    factor(movie.sig$genres)Musical      factor(movie.sig$genres)Mystery  
                         -4.044e-01                            1.950e-01  
    factor(movie.sig$genres)Romance       factor(movie.sig$genres)Sci-Fi  
                          5.455e-01                            2.483e-01  
   factor(movie.sig$genres)Thriller      factor(movie.sig$genres)Western  
                         -4.271e-01                           -9.845e-02  
               movie.sig$title_year     movie.sig$num_critic_for_reviews  
                         -2.483e-02                            3.339e-03  
                   movie.sig$budget                   movie.sig$duration  
                         -4.311e-09                            8.481e-03  
     movie.sig$num_user_for_reviews       movie.sig$movie_facebook_likes  
                         -4.876e-04                           -4.010e-06  
     movie.sig$facenumber_in_poster  movie.sig$cast_total_facebook_likes  
                         -1.753e-02                            1.121e-06  
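Since one of our goals is to predict the IMDB score, it may be easier to write the formulas with bare column names and a `data =` argument instead of `movie.sig$...`. Models fitted with `movie.sig$...` inside the formula cannot be applied cleanly to new movies, because predict() looks up the variables by the names used in the formula. The sketch below uses simulated data and only two predictors; `toy` and `new_movie` are hypothetical.

```r
# Simulated stand-in with two of the predictors used above
set.seed(4)
n <- 200
toy <- data.frame(
  num_voted_users = round(runif(n, 1e3, 5e5)),
  duration        = round(runif(n, 80, 180))
)
toy$imdb_score <- 5 + 3e-6 * toy$num_voted_users +
  0.008 * toy$duration + rnorm(n, sd = 0.7)

# Forward selection with formulas that refer to columns of `toy`
null_toy <- lm(imdb_score ~ 1, data = toy)
selected <- step(null_toy,
                 scope = list(lower = ~ 1,
                              upper = ~ num_voted_users + duration),
                 direction = "forward", trace = 0)

# Predict the score of a new, hypothetical movie with a prediction interval
new_movie <- data.frame(num_voted_users = 100000, duration = 120)
pred <- predict(lm(imdb_score ~ num_voted_users + duration, data = toy),
                newdata = new_movie, interval = "prediction")
print(formula(selected))
print(pred)
```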
  2. The full model is a polynomial regression model with interaction terms:
full2=lm(movie.sig$imdb_score~poly(movie.sig$num_voted_users,2)+poly(movie.sig$num_critic_for_reviews,2)+poly(movie.sig$num_user_for_reviews,2)+poly(movie.sig$duration,2)+movie.sig$facenumber_in_poster+poly(movie.sig$gross,2)+poly(movie.sig$movie_facebook_likes,2)+movie.sig$director_facebook_likes+movie.sig$cast_total_facebook_likes+movie.sig$budget+movie.sig$title_year+movie.sig$genres+movie.sig$facenumber_in_poster*movie.sig$num_critic_for_reviews+movie.sig$num_user_for_reviews*movie.sig$num_voted_users+movie.sig$num_voted_users*movie.sig$gross+movie.sig$gross*movie.sig$budget)
summary(full2)

Call:
lm(formula = movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 
    2) + poly(movie.sig$num_critic_for_reviews, 2) + poly(movie.sig$num_user_for_reviews, 
    2) + poly(movie.sig$duration, 2) + movie.sig$facenumber_in_poster + 
    poly(movie.sig$gross, 2) + poly(movie.sig$movie_facebook_likes, 
    2) + movie.sig$director_facebook_likes + movie.sig$cast_total_facebook_likes + 
    movie.sig$budget + movie.sig$title_year + movie.sig$genres + 
    movie.sig$facenumber_in_poster * movie.sig$num_critic_for_reviews + 
    movie.sig$num_user_for_reviews * movie.sig$num_voted_users + 
    movie.sig$num_voted_users * movie.sig$gross + movie.sig$gross * 
    movie.sig$budget)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.3608 -0.3549  0.0642  0.4619  2.1792 

Coefficients: (4 not defined because of singularities)
                                                                  Estimate Std. Error t value
(Intercept)                                                      5.948e+01  3.617e+00  16.446
poly(movie.sig$num_voted_users, 2)1                              2.305e+01  3.426e+00   6.727
poly(movie.sig$num_voted_users, 2)2                             -1.873e+01  2.200e+00  -8.514
poly(movie.sig$num_critic_for_reviews, 2)1                       1.393e+01  1.661e+00   8.388
poly(movie.sig$num_critic_for_reviews, 2)2                      -9.490e+00  1.004e+00  -9.452
poly(movie.sig$num_user_for_reviews, 2)1                        -1.760e+01  2.325e+00  -7.568
poly(movie.sig$num_user_for_reviews, 2)2                         4.166e+00  1.593e+00   2.615
poly(movie.sig$duration, 2)1                                     1.087e+01  9.246e-01  11.755
poly(movie.sig$duration, 2)2                                    -3.883e+00  7.809e-01  -4.973
movie.sig$facenumber_in_poster                                  -2.093e-02  1.106e-02  -1.892
poly(movie.sig$gross, 2)1                                       -1.454e+01  2.418e+00  -6.012
poly(movie.sig$gross, 2)2                                       -5.285e+00  1.483e+00  -3.565
poly(movie.sig$movie_facebook_likes, 2)1                         2.580e+00  1.322e+00   1.952
poly(movie.sig$movie_facebook_likes, 2)2                         2.283e-01  8.238e-01   0.277
movie.sig$director_facebook_likes                                4.608e-06  4.409e-06   1.045
movie.sig$cast_total_facebook_likes                              2.533e-07  7.013e-07   0.361
movie.sig$budget                                                -7.852e-09  7.213e-10 -10.886
movie.sig$title_year                                            -2.656e-02  1.809e-03 -14.680
movie.sig$genresAdventure                                        3.727e-01  5.268e-02   7.075
movie.sig$genresAnimation                                        7.564e-01  1.298e-01   5.828
movie.sig$genresBiography                                        6.264e-01  7.351e-02   8.522
movie.sig$genresComedy                                           1.576e-01  4.205e-02   3.747
movie.sig$genresCrime                                            4.558e-01  6.236e-02   7.309
movie.sig$genresDocumentary                                      9.738e-01  1.542e-01   6.317
movie.sig$genresDrama                                            5.230e-01  4.726e-02  11.067
movie.sig$genresFamily                                           5.958e-01  4.362e-01   1.366
movie.sig$genresFantasy                                         -1.891e-01  1.387e-01  -1.364
movie.sig$genresHorror                                          -3.533e-01  7.597e-02  -4.650
movie.sig$genresMusical                                         -4.744e-01  5.328e-01  -0.890
movie.sig$genresMystery                                          1.947e-01  1.891e-01   1.029
movie.sig$genresRomance                                          6.094e-01  5.254e-01   1.160
movie.sig$genresSci-Fi                                           1.471e-01  2.827e-01   0.520
movie.sig$genresThriller                                        -3.085e-01  7.433e-01  -0.415
movie.sig$genresWestern                                         -4.204e-02  5.272e-01  -0.080
movie.sig$num_critic_for_reviews                                        NA         NA      NA
movie.sig$num_user_for_reviews                                          NA         NA      NA
movie.sig$num_voted_users                                               NA         NA      NA
movie.sig$gross                                                         NA         NA      NA
movie.sig$facenumber_in_poster:movie.sig$num_critic_for_reviews -1.291e-06  4.260e-05  -0.030
movie.sig$num_user_for_reviews:movie.sig$num_voted_users         7.966e-10  2.817e-10   2.828
movie.sig$num_voted_users:movie.sig$gross                        1.498e-15  1.105e-15   1.355
movie.sig$budget:movie.sig$gross                                 2.946e-17  4.104e-18   7.180
                                                                Pr(>|t|)    
(Intercept)                                                      < 2e-16 ***
poly(movie.sig$num_voted_users, 2)1                             2.07e-11 ***
poly(movie.sig$num_voted_users, 2)2                              < 2e-16 ***
poly(movie.sig$num_critic_for_reviews, 2)1                       < 2e-16 ***
poly(movie.sig$num_critic_for_reviews, 2)2                       < 2e-16 ***
poly(movie.sig$num_user_for_reviews, 2)1                        5.01e-14 ***
poly(movie.sig$num_user_for_reviews, 2)2                        0.008973 ** 
poly(movie.sig$duration, 2)1                                     < 2e-16 ***
poly(movie.sig$duration, 2)2                                    6.98e-07 ***
movie.sig$facenumber_in_poster                                  0.058589 .  
poly(movie.sig$gross, 2)1                                       2.05e-09 ***
poly(movie.sig$gross, 2)2                                       0.000370 ***
poly(movie.sig$movie_facebook_likes, 2)1                        0.051079 .  
poly(movie.sig$movie_facebook_likes, 2)2                        0.781673    
movie.sig$director_facebook_likes                               0.296005    
movie.sig$cast_total_facebook_likes                             0.717999    
movie.sig$budget                                                 < 2e-16 ***
movie.sig$title_year                                             < 2e-16 ***
movie.sig$genresAdventure                                       1.86e-12 ***
movie.sig$genresAnimation                                       6.23e-09 ***
movie.sig$genresBiography                                        < 2e-16 ***
movie.sig$genresComedy                                          0.000183 ***
movie.sig$genresCrime                                           3.45e-13 ***
movie.sig$genresDocumentary                                     3.06e-10 ***
movie.sig$genresDrama                                            < 2e-16 ***
movie.sig$genresFamily                                          0.172120    
movie.sig$genresFantasy                                         0.172721    
movie.sig$genresHorror                                          3.46e-06 ***
movie.sig$genresMusical                                         0.373334    
movie.sig$genresMystery                                         0.303477    
movie.sig$genresRomance                                         0.246252    
movie.sig$genresSci-Fi                                          0.602769    
movie.sig$genresThriller                                        0.678129    
movie.sig$genresWestern                                         0.936448    
movie.sig$num_critic_for_reviews                                      NA    
movie.sig$num_user_for_reviews                                        NA    
movie.sig$num_voted_users                                             NA    
movie.sig$gross                                                       NA    
movie.sig$facenumber_in_poster:movie.sig$num_critic_for_reviews 0.975820    
movie.sig$num_user_for_reviews:movie.sig$num_voted_users        0.004714 ** 
movie.sig$num_voted_users:movie.sig$gross                       0.175492    
movie.sig$budget:movie.sig$gross                                8.80e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.741 on 2967 degrees of freedom
Multiple R-squared:  0.5107,    Adjusted R-squared:  0.5046 
F-statistic: 83.69 on 37 and 2967 DF,  p-value: < 2.2e-16
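The 4 coefficients reported as NA come from the way the formula is written. In an R formula, `a*b` expands to `a + b + a:b`, so every variable that already appears inside poly(., 2) is added a second time as a raw linear term. That raw term is a linear combination of the intercept and the first poly() column, so it cannot be estimated. The sketch below reproduces the effect on simulated data and shows one possible way around it, using `:` for the interaction only; `toy`, `x` and `z` are hypothetical.

```r
# Simulated data with a curved effect of x and an x-by-z interaction
set.seed(5)
n <- 200
toy <- data.frame(x = runif(n, 0, 10), z = runif(n, 0, 5))
toy$y <- 1 + 0.5 * toy$x - 0.04 * toy$x^2 + 0.3 * toy$z +
  0.02 * toy$x * toy$z + rnorm(n, sd = 0.5)

# `x * z` adds a raw `x` term that duplicates poly(x, 2) -> NA coefficient
fit_star <- lm(y ~ poly(x, 2) + x * z, data = toy)

# `x:z` adds only the product term, so nothing is duplicated
fit_colon <- lm(y ~ poly(x, 2) + z + x:z, data = toy)

print(names(coef(fit_star))[is.na(coef(fit_star))])  # the aliased term
print(anyNA(coef(fit_colon)))
```

Both versions span the same set of predictors, so the fitted values are the same; the second form only removes the redundant columns from the coefficient table.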
step(null,scope=list(lower=null,upper=full2),direction='forward')
Start:  AIC=309.81
movie.sig$imdb_score ~ 1

                                            Df Sum of Sq    RSS     AIC
+ poly(movie.sig$num_voted_users, 2)         2    976.96 2352.2 -730.05
+ movie.sig$num_voted_users                  1    871.90 2457.2 -600.74
+ poly(movie.sig$duration, 2)                2    536.11 2793.0 -213.83
+ poly(movie.sig$num_user_for_reviews, 2)    2    483.99 2845.1 -158.27
+ poly(movie.sig$num_critic_for_reviews, 2)  2    436.49 2892.6 -108.52
+ movie.sig$num_critic_for_reviews           1    428.38 2900.8 -102.10
+ movie.sig$num_user_for_reviews             1    407.62 2921.5  -80.68
+ poly(movie.sig$movie_facebook_likes, 2)    2    317.80 3011.3   12.32
+ movie.sig$genres                          16    331.02 2998.1   27.10
+ poly(movie.sig$gross, 2)                   2    251.27 3077.9   77.99
+ movie.sig$gross                            1    242.62 3086.5   84.42
+ movie.sig$director_facebook_likes          1    166.17 3163.0  157.95
+ movie.sig$title_year                       1     69.27 3259.9  248.63
+ movie.sig$cast_total_facebook_likes        1     64.28 3264.8  253.22
+ movie.sig$budget                           1     16.26 3312.9  297.09
+ movie.sig$facenumber_in_poster             1     15.14 3314.0  298.11
<none>                                                   3329.1  309.81

Step:  AIC=-730.05
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2)

                                            Df Sum of Sq    RSS      AIC
+ movie.sig$genres                          16    337.58 2014.6 -1163.60
+ poly(movie.sig$duration, 2)                2    137.87 2214.3  -907.55
+ movie.sig$budget                           1    133.09 2219.1  -903.07
+ movie.sig$title_year                       1    101.46 2250.7  -860.55
+ poly(movie.sig$gross, 2)                   2     58.78 2293.4  -802.09
+ movie.sig$gross                            1     54.53 2297.6  -798.53
+ poly(movie.sig$num_user_for_reviews, 2)    2     29.12 2323.1  -763.48
+ movie.sig$num_user_for_reviews             1     25.39 2326.8  -760.66
+ movie.sig$director_facebook_likes          1     17.94 2334.2  -751.05
+ movie.sig$facenumber_in_poster             1      6.62 2345.5  -736.52
+ poly(movie.sig$num_critic_for_reviews, 2)  2      5.36 2346.8  -732.90
<none>                                                   2352.2  -730.05
+ movie.sig$num_critic_for_reviews           1      0.18 2352.0  -728.28
+ movie.sig$cast_total_facebook_likes        1      0.15 2352.0  -728.23
+ poly(movie.sig$movie_facebook_likes, 2)    2      1.29 2350.9  -727.70

Step:  AIC=-1163.6
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres

                                            Df Sum of Sq    RSS     AIC
+ movie.sig$title_year                       1    97.775 1916.8 -1311.1
+ movie.sig$budget                           1    65.238 1949.3 -1260.5
+ poly(movie.sig$duration, 2)                2    65.750 1948.8 -1259.3
+ movie.sig$gross                            1    19.722 1994.9 -1191.2
+ poly(movie.sig$gross, 2)                   2    20.698 1993.9 -1190.6
+ poly(movie.sig$num_user_for_reviews, 2)    2    20.024 1994.6 -1189.6
+ movie.sig$num_user_for_reviews             1    14.834 1999.8 -1183.8
+ poly(movie.sig$num_critic_for_reviews, 2)  2     9.375 2005.2 -1173.6
+ movie.sig$director_facebook_likes          1     6.114 2008.5 -1170.7
+ movie.sig$facenumber_in_poster             1     3.792 2010.8 -1167.3
<none>                                                   2014.6 -1163.6
+ movie.sig$cast_total_facebook_likes        1     0.355 2014.2 -1162.1
+ movie.sig$num_critic_for_reviews           1     0.042 2014.5 -1161.7
+ poly(movie.sig$movie_facebook_likes, 2)    2     0.813 2013.8 -1160.8

Step:  AIC=-1311.1
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year

                                            Df Sum of Sq    RSS     AIC
+ poly(movie.sig$num_critic_for_reviews, 2)  2    73.976 1842.8 -1425.4
+ poly(movie.sig$duration, 2)                2    49.885 1866.9 -1386.3
+ movie.sig$num_critic_for_reviews           1    43.723 1873.1 -1378.4
+ movie.sig$budget                           1    32.246 1884.6 -1360.1
+ poly(movie.sig$num_user_for_reviews, 2)    2    21.755 1895.0 -1341.4
+ poly(movie.sig$gross, 2)                   2    19.623 1897.2 -1338.0
+ movie.sig$gross                            1    17.879 1898.9 -1337.3
+ poly(movie.sig$movie_facebook_likes, 2)    2    18.788 1898.0 -1336.7
+ movie.sig$num_user_for_reviews             1    14.396 1902.4 -1331.8
+ movie.sig$director_facebook_likes          1     3.373 1913.4 -1314.4
<none>                                                   1916.8 -1311.1
+ movie.sig$facenumber_in_poster             1     1.216 1915.6 -1311.0
+ movie.sig$cast_total_facebook_likes        1     0.300 1916.5 -1309.6

Step:  AIC=-1425.37
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2)

                                          Df Sum of Sq    RSS     AIC
+ poly(movie.sig$num_user_for_reviews, 2)  2    54.189 1788.6 -1511.1
+ movie.sig$budget                         1    46.017 1796.8 -1499.4
+ poly(movie.sig$duration, 2)              2    38.533 1804.3 -1484.9
+ movie.sig$num_user_for_reviews           1    33.751 1809.1 -1478.9
+ poly(movie.sig$gross, 2)                 2    20.602 1822.2 -1455.2
+ movie.sig$gross                          1    16.630 1826.2 -1450.6
+ poly(movie.sig$movie_facebook_likes, 2)  2     8.227 1834.6 -1434.8
+ movie.sig$director_facebook_likes        1     2.296 1840.5 -1427.1
<none>                                                 1842.8 -1425.4
+ movie.sig$facenumber_in_poster           1     0.831 1842.0 -1424.7
+ movie.sig$cast_total_facebook_likes      1     0.104 1842.7 -1423.5

Step:  AIC=-1511.06
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2)

                                          Df Sum of Sq    RSS     AIC
+ poly(movie.sig$duration, 2)              2    51.219 1737.4 -1594.4
+ movie.sig$budget                         1    34.907 1753.7 -1568.3
+ poly(movie.sig$gross, 2)                 2    20.882 1767.8 -1542.3
+ movie.sig$gross                          1    16.727 1771.9 -1537.3
+ poly(movie.sig$movie_facebook_likes, 2)  2     3.910 1784.7 -1513.6
+ movie.sig$director_facebook_likes        1     2.540 1786.1 -1513.3
+ movie.sig$facenumber_in_poster           1     1.970 1786.7 -1512.4
<none>                                                 1788.6 -1511.1
+ movie.sig$cast_total_facebook_likes      1     0.022 1788.6 -1509.1

Step:  AIC=-1594.36
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2) + poly(movie.sig$duration, 
    2)

                                          Df Sum of Sq    RSS     AIC
+ movie.sig$budget                         1    62.211 1675.2 -1701.9
+ poly(movie.sig$gross, 2)                 2    30.406 1707.0 -1643.4
+ movie.sig$gross                          1    23.936 1713.5 -1634.0
+ movie.sig$facenumber_in_poster           1     4.139 1733.3 -1599.5
<none>                                                 1737.4 -1594.4
+ movie.sig$director_facebook_likes        1     0.946 1736.5 -1594.0
+ poly(movie.sig$movie_facebook_likes, 2)  2     1.928 1735.5 -1593.7
+ movie.sig$cast_total_facebook_likes      1     0.064 1737.4 -1592.5

Step:  AIC=-1701.94
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2) + poly(movie.sig$duration, 
    2) + movie.sig$budget

                                          Df Sum of Sq    RSS     AIC
+ movie.sig$facenumber_in_poster           1    5.0599 1670.2 -1709.0
+ poly(movie.sig$gross, 2)                 2    4.5359 1670.7 -1706.1
+ movie.sig$gross                          1    1.8995 1673.3 -1703.3
<none>                                                 1675.2 -1701.9
+ movie.sig$director_facebook_likes        1    0.6239 1674.6 -1701.1
+ movie.sig$cast_total_facebook_likes      1    0.2471 1675.0 -1700.4
+ poly(movie.sig$movie_facebook_likes, 2)  2    0.8695 1674.3 -1699.5

Step:  AIC=-1709.03
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2) + poly(movie.sig$duration, 
    2) + movie.sig$budget + movie.sig$facenumber_in_poster

                                          Df Sum of Sq    RSS     AIC
+ poly(movie.sig$gross, 2)                 2    4.6247 1665.5 -1713.4
+ movie.sig$gross                          1    1.9720 1668.2 -1710.6
<none>                                                 1670.2 -1709.0
+ movie.sig$director_facebook_likes        1    0.4874 1669.7 -1707.9
+ movie.sig$cast_total_facebook_likes      1    0.4443 1669.7 -1707.8
+ poly(movie.sig$movie_facebook_likes, 2)  2    0.8414 1669.3 -1706.5

Step:  AIC=-1713.36
movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 2) + movie.sig$genres + 
    movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2) + poly(movie.sig$duration, 
    2) + movie.sig$budget + movie.sig$facenumber_in_poster + 
    poly(movie.sig$gross, 2)

                                          Df Sum of Sq    RSS     AIC
<none>                                                 1665.5 -1713.4
+ movie.sig$director_facebook_likes        1   0.49076 1665.0 -1712.2
+ movie.sig$cast_total_facebook_likes      1   0.41310 1665.1 -1712.1
+ poly(movie.sig$movie_facebook_likes, 2)  2   1.10614 1664.4 -1711.4

Call:
lm(formula = movie.sig$imdb_score ~ poly(movie.sig$num_voted_users, 
    2) + movie.sig$genres + movie.sig$title_year + poly(movie.sig$num_critic_for_reviews, 
    2) + poly(movie.sig$num_user_for_reviews, 2) + poly(movie.sig$duration, 
    2) + movie.sig$budget + movie.sig$facenumber_in_poster + 
    poly(movie.sig$gross, 2))

Coefficients:
                               (Intercept)         poly(movie.sig$num_voted_users, 2)1  
                                 5.851e+01                                   3.249e+01  
       poly(movie.sig$num_voted_users, 2)2                   movie.sig$genresAdventure  
                                -1.320e+01                                   3.770e-01  
                 movie.sig$genresAnimation                   movie.sig$genresBiography  
                                 7.306e-01                                   6.559e-01  
                    movie.sig$genresComedy                       movie.sig$genresCrime  
                                 1.875e-01                                   4.845e-01  
               movie.sig$genresDocumentary                       movie.sig$genresDrama  
                                 1.037e+00                                   5.524e-01  
                    movie.sig$genresFamily                     movie.sig$genresFantasy  
                                 2.093e-01                                  -1.231e-01  
                    movie.sig$genresHorror                     movie.sig$genresMusical  
                                -2.986e-01                                  -4.597e-01  
                   movie.sig$genresMystery                     movie.sig$genresRomance  
                                 2.304e-01                                   6.151e-01  
                    movie.sig$genresSci-Fi                    movie.sig$genresThriller  
                                 1.706e-01                                  -2.631e-01  
                   movie.sig$genresWestern                        movie.sig$title_year  
                                 5.056e-02                                  -2.605e-02  
poly(movie.sig$num_critic_for_reviews, 2)1  poly(movie.sig$num_critic_for_reviews, 2)2  
                                 1.634e+01                                  -6.906e+00  
  poly(movie.sig$num_user_for_reviews, 2)1    poly(movie.sig$num_user_for_reviews, 2)2  
                                -1.209e+01                                   7.641e+00  
              poly(movie.sig$duration, 2)1                poly(movie.sig$duration, 2)2  
                                 1.072e+01                                  -3.800e+00  
                          movie.sig$budget              movie.sig$facenumber_in_poster  
                                -4.048e-09                                  -2.026e-02  
                 poly(movie.sig$gross, 2)1                   poly(movie.sig$gross, 2)2  
                                -2.770e+00                                   1.851e+00  
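The traces above are produced by forward selection with `step()`. As a minimal sketch of the same procedure, the example below runs it on R's built-in `mtcars` data rather than the movie data; the model names `null_fit` and `full_fit` and the chosen predictors are assumptions made only for illustration. Variables are referenced by column name with `data =` instead of a `movie.sig$` prefix, which keeps the printed term labels short.

```r
# Illustration on the built-in mtcars data (not the movie data).
null_fit <- lm(mpg ~ 1, data = mtcars)                               # intercept-only model
full_fit <- lm(mpg ~ wt + hp + qsec + poly(disp, 2), data = mtcars)  # largest model considered

# Forward selection: at each step add the candidate term that lowers AIC
# the most, and stop when no remaining term lowers it.
fwd_fit <- step(null_fit,
                scope = list(lower = formula(null_fit),
                             upper = formula(full_fit)),
                direction = "forward",
                trace = 0)   # set trace = 1 to print a step-by-step table

formula(fwd_fit)                   # terms that were selected
AIC(fwd_fit) < AIC(null_fit)       # the selected model improves on the null model
```

In each printed table, the `<none>` row marks the current model: candidates listed above it would lower AIC, and those below it would not.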
  1. full3: additive model with interaction
full3 = lm(movie.sig$imdb_score ~ movie.sig$num_voted_users +
    movie.sig$num_critic_for_reviews + movie.sig$num_user_for_reviews +
    movie.sig$duration + movie.sig$facenumber_in_poster + movie.sig$gross +
    movie.sig$movie_facebook_likes + movie.sig$director_facebook_likes +
    movie.sig$cast_total_facebook_likes + movie.sig$budget +
    movie.sig$title_year + factor(movie.sig$genres) +
    movie.sig$duration*movie.sig$num_voted_users +
    movie.sig$num_voted_users*movie.sig$num_user_for_reviews +
    movie.sig$gross*movie.sig$budget, data=movie.sig)
summary(full3)

Call:
lm(formula = movie.sig$imdb_score ~ movie.sig$num_voted_users + 
    movie.sig$num_critic_for_reviews + movie.sig$num_user_for_reviews + 
    movie.sig$duration + movie.sig$facenumber_in_poster + movie.sig$gross + 
    movie.sig$movie_facebook_likes + movie.sig$director_facebook_likes + 
    movie.sig$cast_total_facebook_likes + movie.sig$budget + 
    movie.sig$title_year + factor(movie.sig$genres) + movie.sig$duration * 
    movie.sig$num_voted_users + movie.sig$num_voted_users * movie.sig$num_user_for_reviews + 
    movie.sig$gross * movie.sig$budget, data = movie.sig)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.0519 -0.3700  0.0863  0.4828  2.0996 

Coefficients:
                                                           Estimate Std. Error t value
(Intercept)                                               4.748e+01  3.592e+00  13.218
movie.sig$num_voted_users                                 7.890e-06  4.790e-07  16.472
movie.sig$num_critic_for_reviews                          2.427e-03  2.275e-04  10.669
movie.sig$num_user_for_reviews                           -3.039e-04  6.998e-05  -4.343
movie.sig$duration                                        1.277e-02  9.200e-04  13.882
movie.sig$facenumber_in_poster                           -1.858e-02  6.806e-03  -2.730
movie.sig$gross                                          -1.469e-09  4.191e-10  -3.505
movie.sig$movie_facebook_likes                           -2.370e-06  9.659e-07  -2.454
movie.sig$director_facebook_likes                         3.969e-06  4.482e-06   0.885
movie.sig$cast_total_facebook_likes                       7.641e-07  7.181e-07   1.064
movie.sig$budget                                         -5.900e-09  5.917e-10  -9.971
movie.sig$title_year                                     -2.154e-02  1.790e-03 -12.032
factor(movie.sig$genres)Adventure                         3.308e-01  5.338e-02   6.196
factor(movie.sig$genres)Animation                         7.426e-01  1.319e-01   5.629
factor(movie.sig$genres)Biography                         6.551e-01  7.512e-02   8.720
factor(movie.sig$genres)Comedy                            1.515e-01  4.284e-02   3.537
factor(movie.sig$genres)Crime                             4.496e-01  6.353e-02   7.077
factor(movie.sig$genres)Documentary                       8.960e-01  1.579e-01   5.676
factor(movie.sig$genres)Drama                             4.965e-01  4.835e-02  10.269
factor(movie.sig$genres)Family                            3.329e-01  4.432e-01   0.751
factor(movie.sig$genres)Fantasy                          -1.544e-01  1.419e-01  -1.089
factor(movie.sig$genres)Horror                           -3.577e-01  7.638e-02  -4.683
factor(movie.sig$genres)Musical                          -2.616e-01  5.459e-01  -0.479
factor(movie.sig$genres)Mystery                           1.263e-01  1.939e-01   0.652
factor(movie.sig$genres)Romance                           5.476e-01  5.392e-01   1.016
factor(movie.sig$genres)Sci-Fi                            1.673e-01  2.900e-01   0.577
factor(movie.sig$genres)Thriller                         -4.858e-01  7.627e-01  -0.637
factor(movie.sig$genres)Western                          -1.277e-01  5.408e-01  -0.236
movie.sig$num_voted_users:movie.sig$duration             -3.052e-08  3.447e-09  -8.852
movie.sig$num_voted_users:movie.sig$num_user_for_reviews -3.752e-10  9.851e-11  -3.809
movie.sig$gross:movie.sig$budget                          1.411e-17  2.887e-18   4.886
                                                         Pr(>|t|)    
(Intercept)                                               < 2e-16 ***
movie.sig$num_voted_users                                 < 2e-16 ***
movie.sig$num_critic_for_reviews                          < 2e-16 ***
movie.sig$num_user_for_reviews                           1.46e-05 ***
movie.sig$duration                                        < 2e-16 ***
movie.sig$facenumber_in_poster                           0.006371 ** 
movie.sig$gross                                          0.000463 ***
movie.sig$movie_facebook_likes                           0.014175 *  
movie.sig$director_facebook_likes                        0.376035    
movie.sig$cast_total_facebook_likes                      0.287447    
movie.sig$budget                                          < 2e-16 ***
movie.sig$title_year                                      < 2e-16 ***
factor(movie.sig$genres)Adventure                        6.60e-10 ***
factor(movie.sig$genres)Animation                        1.98e-08 ***
factor(movie.sig$genres)Biography                         < 2e-16 ***
factor(movie.sig$genres)Comedy                           0.000411 ***
factor(movie.sig$genres)Crime                            1.83e-12 ***
factor(movie.sig$genres)Documentary                      1.51e-08 ***
factor(movie.sig$genres)Drama                             < 2e-16 ***
factor(movie.sig$genres)Family                           0.452648    
factor(movie.sig$genres)Fantasy                          0.276414    
factor(movie.sig$genres)Horror                           2.95e-06 ***
factor(movie.sig$genres)Musical                          0.631791    
factor(movie.sig$genres)Mystery                          0.514773    
factor(movie.sig$genres)Romance                          0.309947    
factor(movie.sig$genres)Sci-Fi                           0.563982    
factor(movie.sig$genres)Thriller                         0.524230    
factor(movie.sig$genres)Western                          0.813336    
movie.sig$num_voted_users:movie.sig$duration              < 2e-16 ***
movie.sig$num_voted_users:movie.sig$num_user_for_reviews 0.000143 ***
movie.sig$gross:movie.sig$budget                         1.08e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7607 on 2974 degrees of freedom
Multiple R-squared:  0.483, Adjusted R-squared:  0.4778 
F-statistic: 92.63 on 30 and 2974 DF,  p-value: < 2.2e-16
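In an R formula, `a * b` expands to `a + b + a:b`, so the `full3` formula lists each main effect and also implies it through the `*` terms; R drops the duplicates, which is why the summary shows each main effect once plus one `:` row per product. A small sketch of this equivalence on the built-in `mtcars` data (the variables are chosen only for illustration):

```r
# Illustration on the built-in mtcars data (not the movie data).
fit_star  <- lm(mpg ~ wt * hp, data = mtcars)          # shorthand
fit_colon <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)  # explicit form

# Both formulas expand to the same three terms: wt, hp, wt:hp
attr(terms(fit_star), "term.labels")

# The fitted coefficients are therefore identical
all.equal(coef(fit_star), coef(fit_colon))
```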
step(null,scope=list(lower=null,upper=full3),direction='forward')
Start:  AIC=309.81
movie.sig$imdb_score ~ 1

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_voted_users            1    871.90 2457.2 -600.74
+ movie.sig$duration                   1    491.13 2838.0 -167.82
+ movie.sig$num_critic_for_reviews     1    428.38 2900.8 -102.10
+ movie.sig$num_user_for_reviews       1    407.62 2921.5  -80.68
+ factor(movie.sig$genres)            16    331.02 2998.1   27.10
+ movie.sig$movie_facebook_likes       1    282.82 3046.3   45.02
+ movie.sig$gross                      1    242.62 3086.5   84.42
+ movie.sig$director_facebook_likes    1    166.17 3163.0  157.95
+ movie.sig$title_year                 1     69.27 3259.9  248.63
+ movie.sig$cast_total_facebook_likes  1     64.28 3264.8  253.22
+ movie.sig$budget                     1     16.26 3312.9  297.09
+ movie.sig$facenumber_in_poster       1     15.14 3314.0  298.11
<none>                                             3329.1  309.81

Step:  AIC=-600.74
movie.sig$imdb_score ~ movie.sig$num_voted_users

                                      Df Sum of Sq    RSS     AIC
+ factor(movie.sig$genres)            16   311.531 2145.7 -976.12
+ movie.sig$duration                   1   147.786 2309.4 -785.13
+ movie.sig$title_year                 1    84.649 2372.6 -704.08
+ movie.sig$budget                     1    73.211 2384.0 -689.63
+ movie.sig$num_user_for_reviews       1    21.297 2435.9 -624.90
+ movie.sig$gross                      1    16.929 2440.3 -619.51
+ movie.sig$num_critic_for_reviews     1    14.632 2442.6 -616.69
+ movie.sig$director_facebook_likes    1    13.657 2443.6 -615.49
+ movie.sig$facenumber_in_poster       1     6.789 2450.4 -607.05
+ movie.sig$movie_facebook_likes       1     2.627 2454.6 -601.95
<none>                                             2457.2 -600.74
+ movie.sig$cast_total_facebook_likes  1     0.524 2456.7 -599.38

Step:  AIC=-976.12
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres)

                                      Df Sum of Sq    RSS      AIC
+ movie.sig$title_year                 1    79.623 2066.1 -1087.75
+ movie.sig$duration                   1    74.584 2071.1 -1080.44
+ movie.sig$budget                     1    28.689 2117.0 -1014.57
+ movie.sig$num_critic_for_reviews     1    23.116 2122.6 -1006.67
+ movie.sig$num_user_for_reviews       1    12.251 2133.4  -991.33
+ movie.sig$director_facebook_likes    1     3.707 2142.0  -979.32
+ movie.sig$facenumber_in_poster       1     3.274 2142.4  -978.71
+ movie.sig$movie_facebook_likes       1     1.686 2144.0  -976.49
<none>                                             2145.7  -976.12
+ movie.sig$gross                      1     1.391 2144.3  -976.07
+ movie.sig$cast_total_facebook_likes  1     0.362 2145.3  -974.63

Step:  AIC=-1087.75
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_critic_for_reviews     1   125.091 1941.0 -1273.4
+ movie.sig$duration                   1    55.857 2010.2 -1168.1
+ movie.sig$movie_facebook_likes       1    21.746 2044.3 -1117.5
+ movie.sig$num_user_for_reviews       1    11.741 2054.3 -1102.9
+ movie.sig$budget                     1     9.196 2056.9 -1099.2
+ movie.sig$cast_total_facebook_likes  1     2.923 2063.2 -1090.0
+ movie.sig$director_facebook_likes    1     1.740 2064.3 -1088.3
<none>                                             2066.1 -1087.8
+ movie.sig$facenumber_in_poster       1     1.084 2065.0 -1087.3
+ movie.sig$gross                      1     0.638 2065.4 -1086.7

Step:  AIC=-1273.43
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$budget                     1    36.627 1904.4 -1328.7
+ movie.sig$num_user_for_reviews       1    35.326 1905.7 -1326.6
+ movie.sig$duration                   1    34.873 1906.1 -1325.9
+ movie.sig$gross                      1     7.359 1933.6 -1282.8
+ movie.sig$movie_facebook_likes       1     1.397 1939.6 -1273.6
<none>                                             1941.0 -1273.4
+ movie.sig$facenumber_in_poster       1     0.926 1940.1 -1272.9
+ movie.sig$director_facebook_likes    1     0.644 1940.3 -1272.4
+ movie.sig$cast_total_facebook_likes  1     0.572 1940.4 -1272.3

Step:  AIC=-1328.68
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$duration                   1    58.373 1846.0 -1420.2
+ movie.sig$num_user_for_reviews       1    27.052 1877.3 -1369.7
+ movie.sig$movie_facebook_likes       1     2.576 1901.8 -1330.8
+ movie.sig$cast_total_facebook_likes  1     2.005 1902.3 -1329.8
<none>                                             1904.4 -1328.7
+ movie.sig$facenumber_in_poster       1     1.071 1903.3 -1328.4
+ movie.sig$director_facebook_likes    1     0.557 1903.8 -1327.6
+ movie.sig$gross                      1     0.074 1904.3 -1326.8

Step:  AIC=-1420.23
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration

                                               Df Sum of Sq    RSS     AIC
+ movie.sig$num_voted_users:movie.sig$duration  1    70.848 1775.1 -1535.8
+ movie.sig$num_user_for_reviews                1    33.825 1812.2 -1473.8
+ movie.sig$movie_facebook_likes                1     4.702 1841.3 -1425.9
+ movie.sig$facenumber_in_poster                1     2.488 1843.5 -1422.3
+ movie.sig$cast_total_facebook_likes           1     1.601 1844.4 -1420.8
<none>                                                      1846.0 -1420.2
+ movie.sig$gross                               1     0.196 1845.8 -1418.5
+ movie.sig$director_facebook_likes             1     0.043 1845.9 -1418.3

Step:  AIC=-1535.83
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_voted_users:movie.sig$duration

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$num_user_for_reviews       1   26.4426 1748.7 -1578.9
+ movie.sig$facenumber_in_poster       1    2.9576 1772.2 -1538.8
+ movie.sig$cast_total_facebook_likes  1    1.1823 1774.0 -1535.8
<none>                                             1775.1 -1535.8
+ movie.sig$movie_facebook_likes       1    0.9446 1774.2 -1535.4
+ movie.sig$director_facebook_likes    1    0.3854 1774.8 -1534.5
+ movie.sig$gross                      1    0.0191 1775.1 -1533.9

Step:  AIC=-1578.93
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$num_voted_users:movie.sig$duration

                                                           Df Sum of Sq    RSS     AIC
+ movie.sig$num_voted_users:movie.sig$num_user_for_reviews  1    5.4845 1743.2 -1586.4
+ movie.sig$facenumber_in_poster                            1    4.1664 1744.5 -1584.1
+ movie.sig$movie_facebook_likes                            1    3.9301 1744.8 -1583.7
<none>                                                                  1748.7 -1578.9
+ movie.sig$cast_total_facebook_likes                       1    0.7354 1748.0 -1578.2
+ movie.sig$director_facebook_likes                         1    0.2660 1748.4 -1577.4
+ movie.sig$gross                                           1    0.0008 1748.7 -1576.9

Step:  AIC=-1586.37
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$num_voted_users:movie.sig$duration + movie.sig$num_voted_users:movie.sig$num_user_for_reviews

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$facenumber_in_poster       1    4.0181 1739.2 -1591.3
+ movie.sig$movie_facebook_likes       1    3.2754 1739.9 -1590.0
<none>                                             1743.2 -1586.4
+ movie.sig$cast_total_facebook_likes  1    0.6359 1742.6 -1585.5
+ movie.sig$director_facebook_likes    1    0.3798 1742.8 -1585.0
+ movie.sig$gross                      1    0.0475 1743.2 -1584.5

Step:  AIC=-1591.31
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$facenumber_in_poster + movie.sig$num_voted_users:movie.sig$duration + 
    movie.sig$num_voted_users:movie.sig$num_user_for_reviews

                                      Df Sum of Sq    RSS     AIC
+ movie.sig$movie_facebook_likes       1   3.11243 1736.1 -1594.7
<none>                                             1739.2 -1591.3
+ movie.sig$cast_total_facebook_likes  1   0.90996 1738.3 -1590.9
+ movie.sig$director_facebook_likes    1   0.29041 1738.9 -1589.8
+ movie.sig$gross                      1   0.04757 1739.1 -1589.4

Step:  AIC=-1594.69
movie.sig$imdb_score ~ movie.sig$num_voted_users + factor(movie.sig$genres) + 
    movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$facenumber_in_poster + movie.sig$movie_facebook_likes + 
    movie.sig$num_voted_users:movie.sig$duration + movie.sig$num_voted_users:movie.sig$num_user_for_reviews

                                      Df Sum of Sq    RSS     AIC
<none>                                             1736.1 -1594.7
+ movie.sig$cast_total_facebook_likes  1   0.97305 1735.1 -1594.4
+ movie.sig$director_facebook_likes    1   0.27990 1735.8 -1593.2
+ movie.sig$gross                      1   0.03634 1736.0 -1592.8

Call:
lm(formula = movie.sig$imdb_score ~ movie.sig$num_voted_users + 
    factor(movie.sig$genres) + movie.sig$title_year + movie.sig$num_critic_for_reviews + 
    movie.sig$budget + movie.sig$duration + movie.sig$num_user_for_reviews + 
    movie.sig$facenumber_in_poster + movie.sig$movie_facebook_likes + 
    movie.sig$num_voted_users:movie.sig$duration + movie.sig$num_voted_users:movie.sig$num_user_for_reviews)

Coefficients:
                                             (Intercept)  
                                               4.817e+01  
                               movie.sig$num_voted_users  
                                               7.152e-06  
                       factor(movie.sig$genres)Adventure  
                                               3.300e-01  
                       factor(movie.sig$genres)Animation  
                                               7.097e-01  
                       factor(movie.sig$genres)Biography  
                                               6.794e-01  
                          factor(movie.sig$genres)Comedy  
                                               1.675e-01  
                           factor(movie.sig$genres)Crime  
                                               4.784e-01  
                     factor(movie.sig$genres)Documentary  
                                               9.449e-01  
                           factor(movie.sig$genres)Drama  
                                               5.252e-01  
                          factor(movie.sig$genres)Family  
                                               2.260e-01  
                         factor(movie.sig$genres)Fantasy  
                                              -1.422e-01  
                          factor(movie.sig$genres)Horror  
                                              -3.440e-01  
                         factor(movie.sig$genres)Musical  
                                              -3.165e-01  
                         factor(movie.sig$genres)Mystery  
                                               1.499e-01  
                         factor(movie.sig$genres)Romance  
                                               5.682e-01  
                          factor(movie.sig$genres)Sci-Fi  
                                               1.953e-01  
                        factor(movie.sig$genres)Thriller  
                                              -4.097e-01  
                         factor(movie.sig$genres)Western  
                                              -4.521e-02  
                                    movie.sig$title_year  
                                              -2.189e-02  
                        movie.sig$num_critic_for_reviews  
                                               2.566e-03  
                                        movie.sig$budget  
                                              -4.370e-09  
                                      movie.sig$duration  
                                               1.206e-02  
                          movie.sig$num_user_for_reviews  
                                              -3.210e-04  
                          movie.sig$facenumber_in_poster  
                                              -1.750e-02  
                          movie.sig$movie_facebook_likes  
                                              -2.239e-06  
            movie.sig$num_voted_users:movie.sig$duration  
                                              -2.661e-08  
movie.sig$num_voted_users:movie.sig$num_user_for_reviews  
                                              -2.729e-10  

To make the results easier to interpret, I will start with full3 (the additive model with interaction terms). After checking the residuals, we can decide whether to add higher-order terms.
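One way such a residual check might look is sketched below, again on the built-in `mtcars` data rather than the movie data; the model and the added quadratic term are assumptions for illustration only. Curvature in the residuals-versus-fitted plot would suggest a missing higher-order term, and a nested-model F test can then compare the fit with and without that term.

```r
# Illustration on the built-in mtcars data (not the movie data).
fit <- lm(mpg ~ wt * hp, data = mtcars)

res         <- resid(fit)
fitted_vals <- fitted(fit)

# Residuals vs fitted values: look for curvature or a funnel shape
plot(fitted_vals, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: look for departures from the reference line
qqnorm(res)
qqline(res)

# If curvature shows up, add a higher-order term and compare the nested models
fit2 <- lm(mpg ~ wt * hp + I(wt^2), data = mtcars)
anova(fit, fit2)
```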

Split data into Test and Train:

indx = sample(1:nrow(movie.sig), as.integer(0.9*nrow(movie.sig)))
indx # random sample of 90% of the row indices
   [1] 1459  687  622  680 1895 1987  720 2522 1606 2667 2526 1304 1473  170 1039 2805  344
  [18] 2297  466 2457 2183 2781 1441  803 2206 1406  956 2236 2130 2220  879 1190 2757  324
  [35] 2321 2602 1743 2312 1840 1980 1930  473 1014 1510 2384 1195  520 2442  177    8  475
  [52]  478 1058 2654 2966  258 1132 1455 1867 1385 1947 1207 1550  526 1670 2487 2825  346
  [69] 2953  373 1082  187 2090  716 1729  756 2434 1458 1931 1942  566 1241 1488 2907 1837
  [86]  511 2764  866  701 2879 2732 2137 1408 2878 2808 1126  663 2722 2096  154 2156  683
 [103] 1909 2952 2333 1033 2897 1365 2012 2122  203  839 1129 1658  970 2671 2370 2664 1652
 [120] 2898 1872 2231 2596 2438 2655  638 2738 2749 2293 2557  456  654 1167 1245 1416 1559
 [137] 1929 1047 2858 2358  754 2705  943  724  439  325 1189 2138  125 2046  890 1893 2766
 [154] 2177 1656  805   32  118  166 2098  528  405  129 2067 2748 1740  845  283 1998  316
 [171] 1125  580 2322 1140 1388 2449 1228 2892  745 2784 2511  568  145  567  294 2540 2224
 [188] 2880  577 2842 1653   49 2502  801 1092 1276 2594 2209  718 2372 1732  261   14 1067
 [205]  864 1394 2943 2223 1174  448 2288 1130 2478 2396 2876 2977 2736 1297 2834 2488  278
 [222] 1041  614 1420 1215 1200 1075 1182 2944 1617 2520 2399 1321 1166 2796 1518 1763 2920
 [239] 1730  337 2750 2300 2049  550 1376 1769 2426   82 2350  953 1798 1592  352 1091 1512
 [256] 2648 2486 2239 2633 2988 1700 1948 1019  470  454 2428  376   75 2865 2378 1862 2051
 [273] 2266  476  780 1484  656 2308  894  617  398  913 2184 2946 2999 1615 2002  789  436
 [290] 1211   64 2192 2140  922  250 1979 1920 2505 2393  495 2121 1149 2809 1985   74  194
 [307] 2277 1721  686 1119  161 1806 2506  148 2128  886 1111  856 2037  912  134  882 1274
 [324] 1582  952 1504 1235 2491   68 2429 2937 1866 2219 2367  380 2200  366 1432 2727 1822
 [341] 2916 1679 1471 1257  164 2803   73  869 1439  521 2160 1733 2843  779  700 2101  684
 [358] 2674  901 2886 2261  807  988 2762 1278  106  232 1252 1203  576 2310 1961 2807 2411
 [375] 1050 1158  655  109  180  812 2084 1568 1774 2395  875 1939 1834 2066  791 1004 2436
 [392]  107 2271 1699 2303 2249  447   71 2592  849 2338 1009 1312  794 1976  165 2402 2143
 [409]  340 2039 2117 1879 1090  898 1279 2702 3000 2070 2289  169 1030 1693  282 1904 1894
 [426]  823 2696 2621 1614 1752  605 1311 2846 2230 2600 2719  648 2650 2362 2643 2877  666
 [443]  112 1650 1786 2334 2790 2631 2383 2024 2448 1248 1431  360 2100 1736 2485 2454 1480
 [460] 1906 2814  793 1669 1833 1026 2139 1065  618 2881    5 2088 1694  480 2829 1450  243
 [477] 2683 1773 2695 1803  897 2382  111  989 1294 2662 1148 1269  623  105 1397  357 1444
 [494] 1181 1509 1505 1221  911  263  190 2769 1263 2893 2961 1513 1854  916  332 3001  133
 [511] 2328 1982 1242 1169 1162  564  861 1638 2060 1779 2119 1059  149 1819 2458  546 2133
 [528] 2777 1478  596 1845 2198  650 2213 1555 2347 1392 1856 2295 1573  948 1794  670 2237
 [545] 2001 1981  840  982 2968  752 2565 2590 2132  160  738  298 2175 2578  847 1426 2154
 [562] 1151  222 1310 1830 2099 2246 2682 1565 2135 2496  764 1369  972  459  290  992 1630
 [579] 1358  865 1110 2863 1564  287  624 2913  978   13  319  240 2938 1688  512   63 2519
 [596] 2811 1675 2637 2948 1001  602 2305 1002 1071 1061 2058  438 2555   87   95 1583  608
 [613]  994  534 2896  429  556  917 1628  174 1303  171  558 1433 2989  296 1179 2343   70
 [630] 2055 1453  765 1475  522 1449 2553 2385 2844 2681 2006 1481 2348  626 2222 2279   22
 [647] 2780 2167   45 2452 2389  214 1413 1624 1500 2831  414  652  966  518 1899 2765  225
 [664] 1745 2391 1283  510 1016 1536  707 1623 1844 2653   11 2172 1407 2450 1777 2207 1120
 [681] 1270  770 1187 2409 1896  998 2755  559 2582  404 1463 1299  492  539 1135 1003 2118
 [698] 1134 1145  843 2315 2353  268 2290 1411 1277 1508 1994 1968 2415 2883 2480 1863 2958
 [715] 2240 1106  712 1422 1996 2441 1386 1051 2641 2123 2437 2004 2972   39 2813 2280  585
 [732] 2847 1919 1057 2984   89 2884 2950 1831 1771 2447 1418 2693 2659 2285  678 2606  751
 [749] 1762 1916 2574 1199  104 1680 2775  334 2284 1813 2327 1851 1391 1622 1847  772 2795
 [766] 1801 1191 1755 2020 1558  674 2112 1335  204  137 2013 2235 1527  467  728 1836 2895
 [783]  792   80 1626 1155 2109  693 1309  382  163  906 2439 2691 1805 1264 1428 2142  692
 [800] 1477   60  775 2991  393 1170 2394 1870 1829  984 1494 1250 1322  743 1775 2497 2355
 [817] 2741 1659 2922 2188   10 2782  273  842 2023  369  777  867 1457  547 1052 2234 1186
 [834] 2194 1519 1711  859 2421 1913 1910 1604 1206   36 2361  402  312 2116   57 1908 1240
 [851]  810 2745 2369 2752 1479  372 1613   44 1112   86 1883 2268 1681 2332  482 1541 1561
 [868]  788  152  941  349  395  237 1371 1006 1557  443  390 1495 2927 2083  959 2646 1231
 [885] 2152 2461  961    3 1724 1156 2743 2832 2489  272 2687  536  834 1799 2164 1725  651
 [902]  672 2204  588 2767 1672 1881  990  140 1402 2559 1983  681 2162 2500  433  387  335
 [919] 1959 2178  837 2901 2583  841  315 2826  314 1074 1702 2311  757 1902 1122  257 2597
 [936] 1222  570 2957 2357 1668 2306 2566  183  175 1064  491 1005 1340 1054 1757  833 1609
 [953]  130  938 2414 1667 2558 2349 2131  560 2015 2607 1826  442  926 2304 2345  783   50
 [970]   18 1472  474 2619 2509 1717  158 1103 2773 2982 2424 2605 1213  735 1238  950 2267
 [987]  432   54 2660 2874 1522  410 2157 2074 2010 1701 2900 1419 1034 2314
 [ reached getOption("max.print") -- omitted 1704 entries ]
movie_train = movie.sig[indx,]
movie_test = movie.sig[-indx,]
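The split above depends on the random draw stored in indx, so the printed indices and every number downstream change from run to run unless the seed is fixed. A minimal sketch of a reproducible split on a made-up data frame; the seed value, the 80/20 ratio and the name toy are assumptions for illustration, not the settings used for movie.sig:

```r
# Toy data standing in for movie.sig (hypothetical values).
toy <- data.frame(id = 1:100, score = seq(1, 10, length.out = 100))

set.seed(42)                                  # assumed seed; any fixed value works
n_train <- floor(0.8 * nrow(toy))             # assumed 80/20 split
idx <- sample(seq_len(nrow(toy)), n_train)    # row numbers drawn without replacement

toy_train <- toy[idx, ]
toy_test  <- toy[-idx, ]

# The two parts are disjoint and together cover every row.
stopifnot(nrow(toy_train) + nrow(toy_test) == nrow(toy),
          length(intersect(toy_train$id, toy_test$id)) == 0)
```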
# lm.fit1: linear model with interaction terms, dropping the insignificant predictors.
# Insignificant terms from summary(full3): 'director_facebook_likes', 'movie_facebook_likes' and 'cast_total_facebook_likes'.
# Note: this choice is independent of the step function used for full3.
lm.fit1<-lm(movie_train$imdb_score~movie_train$num_voted_users+movie_train$num_critic_for_reviews+movie_train$num_user_for_reviews+movie_train$duration+movie_train$facenumber_in_poster+movie_train$gross+movie_train$budget+movie_train$title_year+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews+movie_train$gross*movie_train$budget)
summary(lm.fit1)

Call:
lm(formula = movie_train$imdb_score ~ movie_train$num_voted_users + 
    movie_train$num_critic_for_reviews + movie_train$num_user_for_reviews + 
    movie_train$duration + movie_train$facenumber_in_poster + 
    movie_train$gross + movie_train$budget + movie_train$title_year + 
    factor(movie_train$genres) + movie_train$duration * movie_train$num_voted_users + 
    movie_train$num_voted_users * movie_train$num_user_for_reviews + 
    movie_train$gross * movie_train$budget)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.1959 -0.3517  0.0879  0.4768  2.0449 

Coefficients:
                                                               Estimate Std. Error t value
(Intercept)                                                   4.580e+01  3.742e+00  12.241
movie_train$num_voted_users                                   8.024e-06  5.020e-07  15.984
movie_train$num_critic_for_reviews                            2.074e-03  2.033e-04  10.205
movie_train$num_user_for_reviews                             -2.258e-04  7.116e-05  -3.173
movie_train$duration                                          1.257e-02  9.638e-04  13.040
movie_train$facenumber_in_poster                             -1.522e-02  7.082e-03  -2.150
movie_train$gross                                            -1.469e-09  4.415e-10  -3.327
movie_train$budget                                           -6.183e-09  6.179e-10 -10.006
movie_train$title_year                                       -2.067e-02  1.864e-03 -11.086
factor(movie_train$genres)Adventure                           3.175e-01  5.634e-02   5.636
factor(movie_train$genres)Animation                           7.754e-01  1.370e-01   5.659
factor(movie_train$genres)Biography                           6.939e-01  7.815e-02   8.880
factor(movie_train$genres)Comedy                              1.399e-01  4.485e-02   3.120
factor(movie_train$genres)Crime                               4.577e-01  6.648e-02   6.885
factor(movie_train$genres)Documentary                         8.612e-01  1.571e-01   5.482
factor(movie_train$genres)Drama                               4.855e-01  5.013e-02   9.684
factor(movie_train$genres)Family                              3.124e-01  4.397e-01   0.711
factor(movie_train$genres)Fantasy                            -1.592e-01  1.457e-01  -1.093
factor(movie_train$genres)Horror                             -3.674e-01  8.077e-02  -4.548
factor(movie_train$genres)Musical                             2.891e-01  7.584e-01   0.381
factor(movie_train$genres)Mystery                             1.888e-01  2.123e-01   0.889
factor(movie_train$genres)Romance                             5.237e-01  5.351e-01   0.979
factor(movie_train$genres)Sci-Fi                              3.399e-01  3.401e-01   0.999
factor(movie_train$genres)Thriller                           -5.425e-01  7.569e-01  -0.717
factor(movie_train$genres)Western                            -1.221e-01  5.355e-01  -0.228
movie_train$num_voted_users:movie_train$duration             -3.153e-08  3.582e-09  -8.802
movie_train$num_voted_users:movie_train$num_user_for_reviews -4.288e-10  1.021e-10  -4.200
movie_train$gross:movie_train$budget                          1.442e-17  2.967e-18   4.862
                                                             Pr(>|t|)    
(Intercept)                                                   < 2e-16 ***
movie_train$num_voted_users                                   < 2e-16 ***
movie_train$num_critic_for_reviews                            < 2e-16 ***
movie_train$num_user_for_reviews                             0.001525 ** 
movie_train$duration                                          < 2e-16 ***
movie_train$facenumber_in_poster                             0.031655 *  
movie_train$gross                                            0.000889 ***
movie_train$budget                                            < 2e-16 ***
movie_train$title_year                                        < 2e-16 ***
factor(movie_train$genres)Adventure                          1.92e-08 ***
factor(movie_train$genres)Animation                          1.68e-08 ***
factor(movie_train$genres)Biography                           < 2e-16 ***
factor(movie_train$genres)Comedy                             0.001826 ** 
factor(movie_train$genres)Crime                              7.17e-12 ***
factor(movie_train$genres)Documentary                        4.59e-08 ***
factor(movie_train$genres)Drama                               < 2e-16 ***
factor(movie_train$genres)Family                             0.477401    
factor(movie_train$genres)Fantasy                            0.274661    
factor(movie_train$genres)Horror                             5.65e-06 ***
factor(movie_train$genres)Musical                            0.703066    
factor(movie_train$genres)Mystery                            0.373894    
factor(movie_train$genres)Romance                            0.327889    
factor(movie_train$genres)Sci-Fi                             0.317717    
factor(movie_train$genres)Thriller                           0.473572    
factor(movie_train$genres)Western                            0.819678    
movie_train$num_voted_users:movie_train$duration              < 2e-16 ***
movie_train$num_voted_users:movie_train$num_user_for_reviews 2.76e-05 ***
movie_train$gross:movie_train$budget                         1.23e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7548 on 2676 degrees of freedom
Multiple R-squared:  0.4806,    Adjusted R-squared:  0.4753 
F-statistic:  91.7 on 27 and 2676 DF,  p-value: < 2.2e-16

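The p-value of the overall F-test is very small (< 2.2e-16), so the model as a whole is significant. All numeric predictors and interaction terms are significant at the 5% level, with face number in poster the least significant (p = 0.032); several genre levels (Family, Fantasy, Musical, Mystery, Romance, Sci-Fi, Thriller, Western) are not significantly different from the baseline genre. Adjusted R^2 is 0.4753, which means about 47.5% of the variability in IMDB score is explained by this model.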
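Writing every term as movie_train$... ties the formula to the training vectors, so predict() with newdata would still look up the training columns rather than the test rows. A hedged sketch of the alternative with bare column names and the data= argument, on simulated data; the variable names and coefficients below are invented for illustration:

```r
set.seed(1)
# Simulated stand-in for the movie data (all values are made up).
sim <- data.frame(duration = runif(200, 80, 180),
                  votes    = rexp(200, 1 / 50000))
sim$score <- 5 + 0.01 * sim$duration + 1e-5 * sim$votes + rnorm(200, sd = 0.5)

train <- sim[1:150, ]
test  <- sim[151:200, ]

# Bare column names plus data= keep the formula independent of one data frame.
fit <- lm(score ~ duration + votes, data = train)

# predict() can now evaluate the same formula on unseen rows.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$score - pred)^2))   # out-of-sample root mean squared error
rmse
```

The same pattern would apply to lm.fit1 by writing the formula with bare column names and passing data = movie_train.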

Perform a partial F-test (lack-of-fit test) to check whether removing these predictors hurts the fit:

#lm.full: full linear model with interaction terms on train dataset.
lm.full<-lm(movie_train$imdb_score~movie_train$num_voted_users+movie_train$num_critic_for_reviews+movie_train$num_user_for_reviews+movie_train$duration+movie_train$facenumber_in_poster+movie_train$gross+movie_train$movie_facebook_likes+movie_train$director_facebook_likes+movie_train$cast_total_facebook_likes+movie_train$budget+movie_train$title_year+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews+movie_train$gross*movie_train$budget)
anova(lm.full,lm.fit1) # H0: the reduced model is adequate (the dropped coefficients are all 0)
Analysis of Variance Table

Model 1: movie_train$imdb_score ~ movie_train$num_voted_users + movie_train$num_critic_for_reviews + 
    movie_train$num_user_for_reviews + movie_train$duration + 
    movie_train$facenumber_in_poster + movie_train$gross + movie_train$movie_facebook_likes + 
    movie_train$director_facebook_likes + movie_train$cast_total_facebook_likes + 
    movie_train$budget + movie_train$title_year + factor(movie_train$genres) + 
    movie_train$duration * movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews + movie_train$gross * movie_train$budget
Model 2: movie_train$imdb_score ~ movie_train$num_voted_users + movie_train$num_critic_for_reviews + 
    movie_train$num_user_for_reviews + movie_train$duration + 
    movie_train$facenumber_in_poster + movie_train$gross + movie_train$budget + 
    movie_train$title_year + factor(movie_train$genres) + movie_train$duration * 
    movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews + movie_train$gross * movie_train$budget
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   2673 1521.0                           
2   2676 1524.5 -3   -3.4834 2.0406 0.1061

The p-value of the partial F-test is 0.1061, so we fail to reject H0 at the 5% level: dropping 'director_facebook_likes', 'movie_facebook_likes' and 'cast_total_facebook_likes' does not significantly worsen the fit, and the simpler model lm.fit1 is preferred.
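The F statistic in the table can be checked by hand: it is the extra residual sum of squares per dropped coefficient, divided by the residual mean square of the full model. A small sketch using only the numbers printed in the anova table (the variable names are made up for this example):

```r
# Values copied from the anova(lm.full, lm.fit1) table above.
rss_full  <- 1521.0    # residual sum of squares, full model
df_full   <- 2673      # residual degrees of freedom, full model
extra_ss  <- 3.4834    # increase in RSS after dropping the three predictors
n_dropped <- 3         # number of coefficients removed

f_stat  <- (extra_ss / n_dropped) / (rss_full / df_full)
p_value <- pf(f_stat, n_dropped, df_full, lower.tail = FALSE)

round(f_stat, 4)
round(p_value, 4)
```

Up to rounding of the printed sums of squares, these should agree with the F and Pr(>F) columns of the table.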

Diagnostics:

plot(lm.fit1)
not plotting observations with leverage one:
  1943

not plotting observations with leverage one:
  1943

NaNs produced
NaNs produced

# The residuals-vs-fitted plot suggests a higher-order term may be missing; the normal Q-Q plot also looks poor.
library(car)
package ‘car’ was built under R version 3.3.2
residualPlots(lm.fit1)

                                   Test stat Pr(>|t|)
movie_train$num_voted_users           -8.392    0.000
movie_train$num_critic_for_reviews    -8.103    0.000
movie_train$num_user_for_reviews       3.422    0.001
movie_train$duration                  -4.590    0.000
movie_train$facenumber_in_poster       0.222    0.824
movie_train$gross                     -4.093    0.000
movie_train$budget                     5.060    0.000
movie_train$title_year                -3.571    0.000
factor(movie_train$genres)                NA       NA
Tukey test                           -14.631    0.000

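Most of the residual-vs-predictor plots show curvature: the curvature test is significant for every numeric predictor except face number in poster (p = 0.824), and the Tukey test on the fitted values is also significant. This indicates that the current model does not fit well and that higher-order terms should be included.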
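As far as the car documentation describes it, the per-predictor statistic reported by residualPlots() is a curvature test: the model is refitted with the square of that predictor added, and the t-test for the squared term is reported. A base-R sketch of that idea on simulated data (names and values are made up; this mimics the test rather than calling car):

```r
set.seed(2)
x <- runif(300, 0, 10)
y <- 1 + 0.5 * x - 0.08 * x^2 + rnorm(300, sd = 0.3)   # truly curved relationship

fit_linear <- lm(y ~ x)             # straight-line fit, misses the curvature
fit_curved <- lm(y ~ x + I(x^2))    # same model with the squared term added

# t-test for the added squared term: a small p-value signals curvature.
curv <- summary(fit_curved)$coefficients["I(x^2)", c("t value", "Pr(>|t|)")]
curv
```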

Fit a model with higher-order terms:

# lm.fit2: model based on lm.fit1, adding second-order terms for all variables except 'facenumber_in_poster' and 'title_year'.
lm.fit2<-lm(movie_train$imdb_score~poly(movie_train$num_voted_users,2)+poly(movie_train$num_critic_for_reviews,2)+poly(movie_train$num_user_for_reviews,2)+poly(movie_train$duration,2)+movie_train$facenumber_in_poster+poly(movie_train$gross,2)+poly(movie_train$budget,2)+movie_train$title_year+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews+movie_train$gross*movie_train$budget)
summary(lm.fit2)

Call:
lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 
    2) + poly(movie_train$num_critic_for_reviews, 2) + poly(movie_train$num_user_for_reviews, 
    2) + poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    poly(movie_train$gross, 2) + poly(movie_train$budget, 2) + 
    movie_train$title_year + factor(movie_train$genres) + movie_train$duration * 
    movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews + movie_train$gross * movie_train$budget)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.2968 -0.3438  0.0658  0.4506  2.1797 

Coefficients: (5 not defined because of singularities)
                                                               Estimate Std. Error t value
(Intercept)                                                   5.135e+01  3.743e+00  13.718
poly(movie_train$num_voted_users, 2)1                         4.155e+01  4.974e+00   8.353
poly(movie_train$num_voted_users, 2)2                        -1.676e+01  2.294e+00  -7.305
poly(movie_train$num_critic_for_reviews, 2)1                  1.286e+01  1.322e+00   9.726
poly(movie_train$num_critic_for_reviews, 2)2                 -7.240e+00  8.612e-01  -8.406
poly(movie_train$num_user_for_reviews, 2)1                   -1.726e+01  2.285e+00  -7.555
poly(movie_train$num_user_for_reviews, 2)2                    2.394e+00  1.545e+00   1.550
poly(movie_train$duration, 2)1                                1.390e+01  1.090e+00  12.761
poly(movie_train$duration, 2)2                               -3.545e+00  7.736e-01  -4.583
movie_train$facenumber_in_poster                             -1.772e-02  6.853e-03  -2.586
poly(movie_train$gross, 2)1                                  -7.344e+00  2.222e+00  -3.305
poly(movie_train$gross, 2)2                                  -2.567e+00  1.249e+00  -2.054
poly(movie_train$budget, 2)1                                 -1.507e+01  1.982e+00  -7.604
poly(movie_train$budget, 2)2                                  5.687e+00  1.123e+00   5.064
movie_train$title_year                                       -2.249e-02  1.873e-03 -12.007
factor(movie_train$genres)Adventure                           3.607e-01  5.507e-02   6.551
factor(movie_train$genres)Animation                           8.269e-01  1.336e-01   6.189
factor(movie_train$genres)Biography                           6.398e-01  7.582e-02   8.439
factor(movie_train$genres)Comedy                              1.206e-01  4.368e-02   2.760
factor(movie_train$genres)Crime                               4.265e-01  6.442e-02   6.620
factor(movie_train$genres)Documentary                         8.692e-01  1.525e-01   5.698
factor(movie_train$genres)Drama                               4.738e-01  4.877e-02   9.715
factor(movie_train$genres)Family                              3.901e-01  4.307e-01   0.906
factor(movie_train$genres)Fantasy                            -1.906e-01  1.411e-01  -1.350
factor(movie_train$genres)Horror                             -3.979e-01  7.997e-02  -4.976
factor(movie_train$genres)Musical                            -5.804e-04  7.333e-01  -0.001
factor(movie_train$genres)Mystery                             1.861e-01  2.052e-01   0.907
factor(movie_train$genres)Romance                             5.716e-01  5.168e-01   1.106
factor(movie_train$genres)Sci-Fi                              2.334e-01  3.287e-01   0.710
factor(movie_train$genres)Thriller                           -4.679e-01  7.312e-01  -0.640
factor(movie_train$genres)Western                            -8.372e-02  5.174e-01  -0.162
movie_train$duration                                                 NA         NA      NA
movie_train$num_voted_users                                          NA         NA      NA
movie_train$num_user_for_reviews                                     NA         NA      NA
movie_train$gross                                                    NA         NA      NA
movie_train$budget                                                   NA         NA      NA
movie_train$duration:movie_train$num_voted_users             -1.902e-08  3.604e-09  -5.278
movie_train$num_voted_users:movie_train$num_user_for_reviews  1.098e-09  3.055e-10   3.595
movie_train$gross:movie_train$budget                          1.447e-17  5.637e-18   2.566
                                                             Pr(>|t|)    
(Intercept)                                                   < 2e-16 ***
poly(movie_train$num_voted_users, 2)1                         < 2e-16 ***
poly(movie_train$num_voted_users, 2)2                        3.64e-13 ***
poly(movie_train$num_critic_for_reviews, 2)1                  < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)2                  < 2e-16 ***
poly(movie_train$num_user_for_reviews, 2)1                   5.71e-14 ***
poly(movie_train$num_user_for_reviews, 2)2                   0.121257    
poly(movie_train$duration, 2)1                                < 2e-16 ***
poly(movie_train$duration, 2)2                               4.80e-06 ***
movie_train$facenumber_in_poster                             0.009763 ** 
poly(movie_train$gross, 2)1                                  0.000962 ***
poly(movie_train$gross, 2)2                                  0.040030 *  
poly(movie_train$budget, 2)1                                 3.96e-14 ***
poly(movie_train$budget, 2)2                                 4.39e-07 ***
movie_train$title_year                                        < 2e-16 ***
factor(movie_train$genres)Adventure                          6.85e-11 ***
factor(movie_train$genres)Animation                          6.99e-10 ***
factor(movie_train$genres)Biography                           < 2e-16 ***
factor(movie_train$genres)Comedy                             0.005817 ** 
factor(movie_train$genres)Crime                              4.32e-11 ***
factor(movie_train$genres)Documentary                        1.34e-08 ***
factor(movie_train$genres)Drama                               < 2e-16 ***
factor(movie_train$genres)Family                             0.365151    
factor(movie_train$genres)Fantasy                            0.177040    
factor(movie_train$genres)Horror                             6.92e-07 ***
factor(movie_train$genres)Musical                            0.999369    
factor(movie_train$genres)Mystery                            0.364726    
factor(movie_train$genres)Romance                            0.268800    
factor(movie_train$genres)Sci-Fi                             0.477651    
factor(movie_train$genres)Thriller                           0.522324    
factor(movie_train$genres)Western                            0.871459    
movie_train$duration                                               NA    
movie_train$num_voted_users                                        NA    
movie_train$num_user_for_reviews                                   NA    
movie_train$gross                                                  NA    
movie_train$budget                                                 NA    
movie_train$duration:movie_train$num_voted_users             1.41e-07 ***
movie_train$num_voted_users:movie_train$num_user_for_reviews 0.000331 ***
movie_train$gross:movie_train$budget                         0.010334 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7286 on 2670 degrees of freedom
Multiple R-squared:  0.5171,    Adjusted R-squared:  0.5111 
F-statistic: 86.64 on 33 and 2670 DF,  p-value: < 2.2e-16

The second-order term for 'num_user_for_reviews' is not significant (p = 0.121) and can be dropped. The second-order term for 'gross' is only marginally significant (p = 0.040) and can be dropped as well. The interaction between 'gross' and 'budget' is also only weakly significant (p = 0.010) and could be dropped.
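The "not defined because of singularities" rows in the summary arise because a*b expands to a + b + a:b, and the linear term a is already spanned by poly(a, 2); R then reports NA for the duplicate. A small simulated sketch (invented variables a, b and y) showing the effect and one possible way around it, using a:b for the interaction only:

```r
set.seed(3)
d <- data.frame(a = rnorm(200), b = rnorm(200))
d$y <- 1 + d$a - 0.5 * d$a^2 + d$b + 0.3 * d$a * d$b + rnorm(200, sd = 0.2)

# a * b re-adds the main effects a and b, which poly() already covers.
fit_star  <- lm(y ~ poly(a, 2) + poly(b, 2) + a * b, data = d)
# a:b adds only the product term.
fit_colon <- lm(y ~ poly(a, 2) + poly(b, 2) + a:b, data = d)

aliased <- names(coef(fit_star))[is.na(coef(fit_star))]
aliased                      # terms R could not estimate
anyNA(coef(fit_colon))       # no aliased terms in the a:b version
```

Both versions span the same column space, so the fitted values are unchanged; only the redundant NA rows disappear.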

# lm.fit3: based on lm.fit2, dropping the second-order terms for 'num_user_for_reviews' and 'gross', and the gross*budget interaction
lm.fit3<-lm(movie_train$imdb_score~poly(movie_train$num_voted_users,2)+poly(movie_train$num_critic_for_reviews,2)+movie_train$num_user_for_reviews+poly(movie_train$duration,2)+movie_train$facenumber_in_poster+movie_train$gross+poly(movie_train$budget,2)+movie_train$title_year+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews)
summary(lm.fit3)

Call:
lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 
    2) + poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + movie_train$title_year + 
    factor(movie_train$genres) + movie_train$duration * movie_train$num_voted_users + 
    movie_train$num_voted_users * movie_train$num_user_for_reviews)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.3148 -0.3437  0.0642  0.4562  2.1678 

Coefficients: (2 not defined because of singularities)
                                                               Estimate Std. Error t value
(Intercept)                                                   5.026e+01  3.705e+00  13.566
poly(movie_train$num_voted_users, 2)1                         3.777e+01  4.488e+00   8.417
poly(movie_train$num_voted_users, 2)2                        -1.839e+01  2.155e+00  -8.536
poly(movie_train$num_critic_for_reviews, 2)1                  1.239e+01  1.304e+00   9.500
poly(movie_train$num_critic_for_reviews, 2)2                 -6.883e+00  8.398e-01  -8.196
movie_train$num_user_for_reviews                             -8.981e-04  9.504e-05  -9.450
poly(movie_train$duration, 2)1                                1.370e+01  1.087e+00  12.606
poly(movie_train$duration, 2)2                               -3.474e+00  7.732e-01  -4.493
movie_train$facenumber_in_poster                             -1.710e-02  6.854e-03  -2.495
movie_train$gross                                            -6.501e-10  3.199e-10  -2.032
poly(movie_train$budget, 2)1                                 -1.111e+01  1.172e+00  -9.483
poly(movie_train$budget, 2)2                                  7.544e+00  8.093e-01   9.322
movie_train$title_year                                       -2.176e-02  1.851e-03 -11.756
factor(movie_train$genres)Adventure                           3.666e-01  5.500e-02   6.664
factor(movie_train$genres)Animation                           8.306e-01  1.336e-01   6.218
factor(movie_train$genres)Biography                           6.505e-01  7.581e-02   8.581
factor(movie_train$genres)Comedy                              1.246e-01  4.363e-02   2.855
factor(movie_train$genres)Crime                               4.374e-01  6.435e-02   6.797
factor(movie_train$genres)Documentary                         8.714e-01  1.526e-01   5.709
factor(movie_train$genres)Drama                               4.776e-01  4.877e-02   9.792
factor(movie_train$genres)Family                              1.873e-01  4.243e-01   0.441
factor(movie_train$genres)Fantasy                            -1.894e-01  1.411e-01  -1.343
factor(movie_train$genres)Horror                             -3.967e-01  7.985e-02  -4.969
factor(movie_train$genres)Musical                            -9.698e-02  7.326e-01  -0.132
factor(movie_train$genres)Mystery                             2.004e-01  2.054e-01   0.976
factor(movie_train$genres)Romance                             5.801e-01  5.173e-01   1.121
factor(movie_train$genres)Sci-Fi                              2.208e-01  3.288e-01   0.672
factor(movie_train$genres)Thriller                           -4.749e-01  7.320e-01  -0.649
factor(movie_train$genres)Western                            -8.251e-02  5.179e-01  -0.159
movie_train$duration                                                 NA         NA      NA
movie_train$num_voted_users                                          NA         NA      NA
movie_train$duration:movie_train$num_voted_users             -1.881e-08  3.566e-09  -5.276
movie_train$num_user_for_reviews:movie_train$num_voted_users  1.460e-09  2.193e-10   6.657
                                                             Pr(>|t|)    
(Intercept)                                                   < 2e-16 ***
poly(movie_train$num_voted_users, 2)1                         < 2e-16 ***
poly(movie_train$num_voted_users, 2)2                         < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)1                  < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)2                 3.82e-16 ***
movie_train$num_user_for_reviews                              < 2e-16 ***
poly(movie_train$duration, 2)1                                < 2e-16 ***
poly(movie_train$duration, 2)2                               7.34e-06 ***
movie_train$facenumber_in_poster                              0.01265 *  
movie_train$gross                                             0.04222 *  
poly(movie_train$budget, 2)1                                  < 2e-16 ***
poly(movie_train$budget, 2)2                                  < 2e-16 ***
movie_train$title_year                                        < 2e-16 ***
factor(movie_train$genres)Adventure                          3.22e-11 ***
factor(movie_train$genres)Animation                          5.82e-10 ***
factor(movie_train$genres)Biography                           < 2e-16 ***
factor(movie_train$genres)Comedy                              0.00434 ** 
factor(movie_train$genres)Crime                              1.31e-11 ***
factor(movie_train$genres)Documentary                        1.26e-08 ***
factor(movie_train$genres)Drama                               < 2e-16 ***
factor(movie_train$genres)Family                              0.65895    
factor(movie_train$genres)Fantasy                             0.17950    
factor(movie_train$genres)Horror                             7.17e-07 ***
factor(movie_train$genres)Musical                             0.89470    
factor(movie_train$genres)Mystery                             0.32923    
factor(movie_train$genres)Romance                             0.26223    
factor(movie_train$genres)Sci-Fi                              0.50193    
factor(movie_train$genres)Thriller                            0.51656    
factor(movie_train$genres)Western                             0.87343    
movie_train$duration                                               NA    
movie_train$num_voted_users                                        NA    
movie_train$duration:movie_train$num_voted_users             1.43e-07 ***
movie_train$num_user_for_reviews:movie_train$num_voted_users 3.37e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7294 on 2673 degrees of freedom
Multiple R-squared:  0.5155,    Adjusted R-squared:  0.5101 
F-statistic:  94.8 on 30 and 2673 DF,  p-value: < 2.2e-16
anova(lm.fit2,lm.fit3) 
Analysis of Variance Table

Model 1: movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
    poly(movie_train$num_critic_for_reviews, 2) + poly(movie_train$num_user_for_reviews, 
    2) + poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    poly(movie_train$gross, 2) + poly(movie_train$budget, 2) + 
    movie_train$title_year + factor(movie_train$genres) + movie_train$duration * 
    movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews + movie_train$gross * movie_train$budget
Model 2: movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
    poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + movie_train$title_year + 
    factor(movie_train$genres) + movie_train$duration * movie_train$num_voted_users + 
    movie_train$num_voted_users * movie_train$num_user_for_reviews
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
1   2670 1417.3                              
2   2673 1422.0 -3   -4.6802 2.9388 0.03202 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

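The p-value of the partial F-test is 0.032, so at the 5% level the three dropped terms are jointly significant and lm.fit3 fits the training data slightly worse than lm.fit2. The loss is small, however: adjusted R^2 falls only from 0.5111 to 0.5101, so we keep the simpler lm.fit3, which explains about 51% of the variation in IMDB score.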

Diagnostics for lm.fit3:

plot(lm.fit3)
not plotting observations with leverage one:
  696, 2019

not plotting observations with leverage one:
  696, 2019

library(car)
package ‘car’ was built under R version 3.3.2
residualPlots(lm.fit3)

                                            Test stat Pr(>|t|)
poly(movie_train$num_voted_users, 2)               NA       NA
poly(movie_train$num_critic_for_reviews, 2)        NA       NA
movie_train$num_user_for_reviews                1.452    0.147
poly(movie_train$duration, 2)                      NA       NA
movie_train$facenumber_in_poster                0.545    0.586
movie_train$gross                              -0.351    0.725
poly(movie_train$budget, 2)                        NA       NA
movie_train$title_year                         -5.751    0.000
factor(movie_train$genres)                         NA       NA
movie_train$duration                            0.309    0.757
movie_train$num_voted_users                    -0.816    0.415
Tukey test                                    -12.501    0.000

The plots are much better than for lm.fit2. All the residual-vs-predictor plots are roughly straight lines except for title year (curvature test statistic -5.751, p < 0.001). So let's try adding a second-order term for title year.
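One thing to keep in mind when reading the next summary: poly(x, 2) builds orthogonal polynomial columns by default, so its coefficients are not the slopes of x and x^2 on the original scale, even though the fitted curve is the same as for a raw quadratic. A sketch on simulated data (the variable yr, "years since 1980", and all values are invented) comparing the two parameterisations:

```r
set.seed(4)
yr    <- sample(-30:36, 250, replace = TRUE)   # hypothetical years since 1980
score <- 6.5 - 0.02 * yr - 0.0004 * yr^2 + rnorm(250, sd = 0.4)

fit_orth <- lm(score ~ poly(yr, 2))               # orthogonal basis (default)
fit_raw  <- lm(score ~ poly(yr, 2, raw = TRUE))   # plain yr and yr^2

# Same fitted curve, different coefficient scales.
same_fit <- all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw)))
same_fit
coef(fit_orth)
coef(fit_raw)
```

In this one-predictor sketch the orthogonal columns are centred, so the intercept of fit_orth equals the mean of score; with other raw terms in the model, as in lm.fit4, the intercept only shifts in that direction.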

# lm.fit4: based on lm.fit3, adding a second-order term for title year.
lm.fit4<-lm(movie_train$imdb_score~poly(movie_train$num_voted_users,2)+poly(movie_train$num_critic_for_reviews,2)+movie_train$num_user_for_reviews+poly(movie_train$duration,2)+movie_train$facenumber_in_poster+movie_train$gross+poly(movie_train$budget,2)+poly(movie_train$title_year,2)+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews)
summary(lm.fit4)

Call:
lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 
    2) + poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + poly(movie_train$title_year, 
    2) + factor(movie_train$genres) + movie_train$duration * 
    movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.2522 -0.3390  0.0530  0.4512  2.1654 

Coefficients: (2 not defined because of singularities)
                                                               Estimate Std. Error t value
(Intercept)                                                   6.671e+00  6.245e-02 106.824
poly(movie_train$num_voted_users, 2)1                         3.453e+01  4.496e+00   7.679
poly(movie_train$num_voted_users, 2)2                        -1.862e+01  2.142e+00  -8.693
poly(movie_train$num_critic_for_reviews, 2)1                  1.630e+01  1.464e+00  11.135
poly(movie_train$num_critic_for_reviews, 2)2                 -7.726e+00  8.476e-01  -9.115
movie_train$num_user_for_reviews                             -1.012e-03  9.652e-05 -10.483
poly(movie_train$duration, 2)1                                1.350e+01  1.081e+00  12.485
poly(movie_train$duration, 2)2                               -3.353e+00  7.689e-01  -4.361
movie_train$facenumber_in_poster                             -1.385e-02  6.836e-03  -2.026
movie_train$gross                                            -5.902e-10  3.182e-10  -1.855
poly(movie_train$budget, 2)1                                 -1.121e+01  1.165e+00  -9.623
poly(movie_train$budget, 2)2                                  7.985e+00  8.081e-01   9.881
poly(movie_train$title_year, 2)1                             -1.287e+01  9.894e-01 -13.003
poly(movie_train$title_year, 2)2                             -4.846e+00  8.428e-01  -5.751
factor(movie_train$genres)Adventure                           3.733e-01  5.469e-02   6.826
factor(movie_train$genres)Animation                           8.806e-01  1.331e-01   6.618
factor(movie_train$genres)Biography                           6.519e-01  7.536e-02   8.651
factor(movie_train$genres)Comedy                              1.284e-01  4.338e-02   2.960
factor(movie_train$genres)Crime                               4.424e-01  6.398e-02   6.916
factor(movie_train$genres)Documentary                         9.256e-01  1.520e-01   6.089
factor(movie_train$genres)Drama                               4.851e-01  4.850e-02  10.004
factor(movie_train$genres)Family                              1.515e-01  4.218e-01   0.359
factor(movie_train$genres)Fantasy                            -2.432e-01  1.406e-01  -1.731
factor(movie_train$genres)Horror                             -4.202e-01  7.948e-02  -5.287
factor(movie_train$genres)Musical                            -1.489e-01  7.283e-01  -0.204
factor(movie_train$genres)Mystery                             2.139e-01  2.042e-01   1.048
factor(movie_train$genres)Romance                             6.070e-01  5.143e-01   1.180
factor(movie_train$genres)Sci-Fi                              2.309e-01  3.268e-01   0.707
factor(movie_train$genres)Thriller                           -2.927e-01  7.283e-01  -0.402
factor(movie_train$genres)Western                            -6.708e-02  5.148e-01  -0.130
movie_train$duration                                                 NA         NA      NA
movie_train$num_voted_users                                          NA         NA      NA
movie_train$duration:movie_train$num_voted_users             -1.767e-08  3.550e-09  -4.978
movie_train$num_user_for_reviews:movie_train$num_voted_users  1.584e-09  2.191e-10   7.229
                                                             Pr(>|t|)    
(Intercept)                                                   < 2e-16 ***
poly(movie_train$num_voted_users, 2)1                        2.23e-14 ***
poly(movie_train$num_voted_users, 2)2                         < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)1                  < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)2                  < 2e-16 ***
movie_train$num_user_for_reviews                              < 2e-16 ***
poly(movie_train$duration, 2)1                                < 2e-16 ***
poly(movie_train$duration, 2)2                               1.34e-05 ***
movie_train$facenumber_in_poster                               0.0428 *  
movie_train$gross                                              0.0637 .  
poly(movie_train$budget, 2)1                                  < 2e-16 ***
poly(movie_train$budget, 2)2                                  < 2e-16 ***
poly(movie_train$title_year, 2)1                              < 2e-16 ***
poly(movie_train$title_year, 2)2                             9.90e-09 ***
factor(movie_train$genres)Adventure                          1.08e-11 ***
factor(movie_train$genres)Animation                          4.39e-11 ***
factor(movie_train$genres)Biography                           < 2e-16 ***
factor(movie_train$genres)Comedy                               0.0031 ** 
factor(movie_train$genres)Crime                              5.81e-12 ***
factor(movie_train$genres)Documentary                        1.30e-09 ***
factor(movie_train$genres)Drama                               < 2e-16 ***
factor(movie_train$genres)Family                               0.7194    
factor(movie_train$genres)Fantasy                              0.0836 .  
factor(movie_train$genres)Horror                             1.34e-07 ***
factor(movie_train$genres)Musical                              0.8380    
factor(movie_train$genres)Mystery                              0.2948    
factor(movie_train$genres)Romance                              0.2379    
factor(movie_train$genres)Sci-Fi                               0.4798    
factor(movie_train$genres)Thriller                             0.6878    
factor(movie_train$genres)Western                              0.8963    
movie_train$duration                                               NA    
movie_train$num_voted_users                                        NA    
movie_train$duration:movie_train$num_voted_users             6.85e-07 ***
movie_train$num_user_for_reviews:movie_train$num_voted_users 6.30e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.725 on 2672 degrees of freedom
Multiple R-squared:  0.5214,    Adjusted R-squared:  0.5159 
F-statistic: 93.91 on 31 and 2672 DF,  p-value: < 2.2e-16
anova(lm.fit4,lm.fit3)
Analysis of Variance Table

Model 1: movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
    poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + poly(movie_train$title_year, 
    2) + factor(movie_train$genres) + movie_train$duration * 
    movie_train$num_voted_users + movie_train$num_voted_users * 
    movie_train$num_user_for_reviews
Model 2: movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
    poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + movie_train$title_year + 
    factor(movie_train$genres) + movie_train$duration * movie_train$num_voted_users + 
    movie_train$num_voted_users * movie_train$num_user_for_reviews
  Res.Df    RSS Df Sum of Sq     F  Pr(>F)    
1   2672 1404.6                               
2   2673 1422.0 -1   -17.384 33.07 9.9e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

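The p-value (9.9e-09) is very small, so we reject the null hypothesis that the coefficient of the second-order term is zero: adding the second-order term for title year does improve the fit significantly. The improvement is small in absolute terms, though (RSS drops from 1422.0 to 1404.6, about 1.2%), and the rest of the analysis continues with lm.fit3.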
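The F statistic in the anova() table can be reproduced by hand from the printed sums of squares, which makes the direction of the test explicit. This is a minimal sketch that uses only the numbers shown above; the variable names are ours.

```r
# Values copied from the anova() table above
rss_full <- 1404.6   # lm.fit4 (with the second-order title_year term)
ss_diff  <- 17.384   # reduction in RSS relative to lm.fit3
df_full  <- 2672     # residual degrees of freedom of lm.fit4
df_diff  <- 1        # one extra coefficient in lm.fit4

# F = (drop in RSS per extra parameter) / (residual variance of the full model)
f_stat  <- (ss_diff / df_diff) / (rss_full / df_full)
p_value <- pf(f_stat, df_diff, df_full, lower.tail = FALSE)

# H0: the extra coefficient is zero (the smaller model is adequate).
# A small p-value therefore favours the larger model.
print(round(f_stat, 2))
print(p_value)
```

With a single extra coefficient, this F statistic is the square of the t value reported for poly(movie_train$title_year, 2)2 in summary(lm.fit4).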

Marginal Model plot:

marginalModelPlots(lm.fit3)
Splines and/or polynomials replaced by a fitted linear combination
residualPlots(lm.fit3)

The marginal model plots show the response against each individual predictor, that is, the conditional distribution of the response given that predictor, ignoring the other predictors. In our plots, the curve fitted to the data and the curve implied by the model nearly overlap for each predictor, which indicates that the model reproduces the marginal relationships well.

Check for residual outliers:

library(car)
qqPlot(lm.fit3$residuals,id.n = 10)
 702 1013  113 2411 1129 1165  526 1256 2591  617 
   1    2    3    4    5    6    7    8    9   10 

outlierTest(lm.fit3) # H0: residual is not an outlier
      rstudent unadjusted p-value Bonferonni p
1017 -7.518666         7.5090e-14   2.0289e-10
665  -5.454427         5.3633e-08   1.4492e-04
237  -5.370977         8.5054e-08   2.2982e-04
438  -5.336489         1.0271e-07   2.7754e-04
1638 -4.976677         6.8794e-07   1.8588e-03
1490 -4.821383         1.5056e-06   4.0680e-03
1885 -4.751121         2.1299e-06   5.7551e-03
2004 -4.621856         3.9838e-06   1.0764e-02
230  -4.600509         4.4111e-06   1.1919e-02
1688 -4.463399         8.4012e-06   2.2700e-02

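All 10 of these observations have significant Bonferroni-adjusted p-values (all below 0.05), so we reject H0 for each of them: they are flagged as outliers and are candidates for removal.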
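The Bonferroni column can be checked by hand: it is the unadjusted p-value multiplied by the number of residuals tested. A minimal sketch follows; the count of 2702 is an assumption inferred from the ratio of the two printed columns, not a figure stated in the text.

```r
# Unadjusted p-values copied from the outlierTest() output above
p_unadjusted <- c("1017" = 7.5090e-14, "665" = 5.3633e-08, "1688" = 8.4012e-06)

# Assumption: 2702 studentized residuals were tested. This is the ratio of the
# printed Bonferroni p to the printed unadjusted p, not a value from the text.
n_tested <- 2702

# Bonferroni adjustment: multiply by the number of tests, cap at 1
p_bonferroni <- pmin(p_unadjusted * n_tested, 1)
print(signif(p_bonferroni, 5))

# Observations still significant at the 5% level after adjustment
print(names(p_bonferroni)[p_bonferroni < 0.05])
```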

Before we drop them, let’s run some diagnostics to double-check which ones to drop.

library(car)
influencePlot(lm.fit3, id.n=10)

From the influence plot, we decided to drop observations 2572, 1423, 860, 1520, 509, 682, 1017, 848, 361 and 237.
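The three quantities that an influence plot combines (studentized residuals, leverage, and Cook's distance) can also be computed directly with base R. The sketch below uses the built-in mtcars data as a stand-in, not the movie data, and the choice of keeping the top three observations is arbitrary.

```r
# Toy example on the built-in mtcars data (not the movie data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

influence_df <- data.frame(
  rstudent = rstudent(fit),        # studentized residuals
  hat      = hatvalues(fit),       # leverage
  cook     = cooks.distance(fit)   # Cook's distance
)

# Rank observations by Cook's distance and keep the three most influential
top <- head(influence_df[order(-influence_df$cook), ], 3)
print(top)

# Refit without them, dropping rows by name rather than by position
keep <- !rownames(mtcars) %in% rownames(top)
fit_trimmed <- lm(mpg ~ wt + hp, data = mtcars[keep, ])
print(nobs(fit_trimmed))
```

Dropping by row name avoids removing the wrong rows when the row names of a data frame no longer match its row positions, for example after sampling a training set.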

# lm.fit5: model based on lm.fit3 removing 10 outliers.
movie_train<-movie_train[-c(2572,1423,860,1520,509,682,1017,848,361,237),]
lm.fit5<-lm(movie_train$imdb_score~poly(movie_train$num_voted_users,2)+poly(movie_train$num_critic_for_reviews,2)+movie_train$num_user_for_reviews+poly(movie_train$duration,2)+movie_train$facenumber_in_poster+movie_train$gross+poly(movie_train$budget,2)+movie_train$title_year+factor(movie_train$genres)+movie_train$duration*movie_train$num_voted_users+movie_train$num_voted_users*movie_train$num_user_for_reviews)
summary(lm.fit5)

Call:
lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 
    2) + poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
    poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + 
    movie_train$gross + poly(movie_train$budget, 2) + movie_train$title_year + 
    factor(movie_train$genres) + movie_train$duration * movie_train$num_voted_users + 
    movie_train$num_voted_users * movie_train$num_user_for_reviews)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0199 -0.3417  0.0631  0.4482  2.1921 

Coefficients: (2 not defined because of singularities)
                                                               Estimate Std. Error t value
(Intercept)                                                   4.939e+01  3.662e+00  13.489
poly(movie_train$num_voted_users, 2)1                         3.607e+01  4.258e+00   8.470
poly(movie_train$num_voted_users, 2)2                        -1.726e+01  1.912e+00  -9.028
poly(movie_train$num_critic_for_reviews, 2)1                  1.179e+01  1.292e+00   9.124
poly(movie_train$num_critic_for_reviews, 2)2                 -6.890e+00  8.185e-01  -8.418
movie_train$num_user_for_reviews                             -8.997e-04  9.718e-05  -9.258
poly(movie_train$duration, 2)1                                1.310e+01  1.087e+00  12.055
poly(movie_train$duration, 2)2                               -3.713e+00  7.669e-01  -4.841
movie_train$facenumber_in_poster                             -1.770e-02  6.739e-03  -2.627
movie_train$gross                                            -6.864e-10  3.180e-10  -2.158
poly(movie_train$budget, 2)1                                 -1.122e+01  1.151e+00  -9.749
poly(movie_train$budget, 2)2                                  7.581e+00  7.955e-01   9.530
movie_train$title_year                                       -2.135e-02  1.829e-03 -11.669
factor(movie_train$genres)Adventure                           3.775e-01  5.422e-02   6.963
factor(movie_train$genres)Animation                           8.417e-01  1.314e-01   6.405
factor(movie_train$genres)Biography                           6.552e-01  7.454e-02   8.791
factor(movie_train$genres)Comedy                              1.319e-01  4.292e-02   3.074
factor(movie_train$genres)Crime                               4.393e-01  6.342e-02   6.927
factor(movie_train$genres)Documentary                         1.102e+00  1.530e-01   7.203
factor(movie_train$genres)Drama                               4.833e-01  4.797e-02  10.075
factor(movie_train$genres)Family                              2.445e-01  5.087e-01   0.481
factor(movie_train$genres)Fantasy                            -1.847e-01  1.387e-01  -1.332
factor(movie_train$genres)Horror                             -3.908e-01  7.853e-02  -4.976
factor(movie_train$genres)Musical                            -1.044e-01  7.202e-01  -0.145
factor(movie_train$genres)Mystery                             2.057e-01  2.019e-01   1.019
factor(movie_train$genres)Sci-Fi                              2.135e-01  3.232e-01   0.660
factor(movie_train$genres)Thriller                           -4.767e-01  7.196e-01  -0.662
movie_train$duration                                                 NA         NA      NA
movie_train$num_voted_users                                          NA         NA      NA
movie_train$duration:movie_train$num_voted_users             -1.610e-08  3.652e-09  -4.408
movie_train$num_user_for_reviews:movie_train$num_voted_users  1.481e-09  2.349e-10   6.306
                                                             Pr(>|t|)    
(Intercept)                                                   < 2e-16 ***
poly(movie_train$num_voted_users, 2)1                         < 2e-16 ***
poly(movie_train$num_voted_users, 2)2                         < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)1                  < 2e-16 ***
poly(movie_train$num_critic_for_reviews, 2)2                  < 2e-16 ***
movie_train$num_user_for_reviews                              < 2e-16 ***
poly(movie_train$duration, 2)1                                < 2e-16 ***
poly(movie_train$duration, 2)2                               1.36e-06 ***
movie_train$facenumber_in_poster                              0.00867 ** 
movie_train$gross                                             0.03099 *  
poly(movie_train$budget, 2)1                                  < 2e-16 ***
poly(movie_train$budget, 2)2                                  < 2e-16 ***
movie_train$title_year                                        < 2e-16 ***
factor(movie_train$genres)Adventure                          4.19e-12 ***
factor(movie_train$genres)Animation                          1.77e-10 ***
factor(movie_train$genres)Biography                           < 2e-16 ***
factor(movie_train$genres)Comedy                              0.00213 ** 
factor(movie_train$genres)Crime                              5.37e-12 ***
factor(movie_train$genres)Documentary                        7.61e-13 ***
factor(movie_train$genres)Drama                               < 2e-16 ***
factor(movie_train$genres)Family                              0.63080    
factor(movie_train$genres)Fantasy                             0.18300    
factor(movie_train$genres)Horror                             6.89e-07 ***
factor(movie_train$genres)Musical                             0.88477    
factor(movie_train$genres)Mystery                             0.30847    
factor(movie_train$genres)Sci-Fi                              0.50900    
factor(movie_train$genres)Thriller                            0.50773    
movie_train$duration                                               NA    
movie_train$num_voted_users                                        NA    
movie_train$duration:movie_train$num_voted_users             1.09e-05 ***
movie_train$num_user_for_reviews:movie_train$num_voted_users 3.35e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.717 on 2665 degrees of freedom
Multiple R-squared:  0.5223,    Adjusted R-squared:  0.5173 
F-statistic: 104.1 on 28 and 2665 DF,  p-value: < 2.2e-16
compareCoefs(lm.fit3, lm.fit5)

Call:
1: lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
  poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
  poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + movie_train$gross + 
  poly(movie_train$budget, 2) + movie_train$title_year + factor(movie_train$genres) + 
  movie_train$duration * movie_train$num_voted_users + movie_train$num_voted_users * 
  movie_train$num_user_for_reviews)
2: lm(formula = movie_train$imdb_score ~ poly(movie_train$num_voted_users, 2) + 
  poly(movie_train$num_critic_for_reviews, 2) + movie_train$num_user_for_reviews + 
  poly(movie_train$duration, 2) + movie_train$facenumber_in_poster + movie_train$gross + 
  poly(movie_train$budget, 2) + movie_train$title_year + factor(movie_train$genres) + 
  movie_train$duration * movie_train$num_voted_users + movie_train$num_voted_users * 
  movie_train$num_user_for_reviews)
                                                                Est. 1      SE 1    Est. 2
(Intercept)                                                   5.03e+01  3.70e+00  4.94e+01
poly(movie_train$num_voted_users, 2)1                         3.78e+01  4.49e+00  3.61e+01
poly(movie_train$num_voted_users, 2)2                        -1.84e+01  2.15e+00 -1.73e+01
poly(movie_train$num_critic_for_reviews, 2)1                  1.24e+01  1.30e+00  1.18e+01
poly(movie_train$num_critic_for_reviews, 2)2                 -6.88e+00  8.40e-01 -6.89e+00
movie_train$num_user_for_reviews                             -8.98e-04  9.50e-05 -9.00e-04
poly(movie_train$duration, 2)1                                1.37e+01  1.09e+00  1.31e+01
poly(movie_train$duration, 2)2                               -3.47e+00  7.73e-01 -3.71e+00
movie_train$facenumber_in_poster                             -1.71e-02  6.85e-03 -1.77e-02
movie_train$gross                                            -6.50e-10  3.20e-10 -6.86e-10
poly(movie_train$budget, 2)1                                 -1.11e+01  1.17e+00 -1.12e+01
poly(movie_train$budget, 2)2                                  7.54e+00  8.09e-01  7.58e+00
movie_train$title_year                                       -2.18e-02  1.85e-03 -2.13e-02
factor(movie_train$genres)Adventure                           3.67e-01  5.50e-02  3.78e-01
factor(movie_train$genres)Animation                           8.31e-01  1.34e-01  8.42e-01
factor(movie_train$genres)Biography                           6.50e-01  7.58e-02  6.55e-01
factor(movie_train$genres)Comedy                              1.25e-01  4.36e-02  1.32e-01
factor(movie_train$genres)Crime                               4.37e-01  6.44e-02  4.39e-01
factor(movie_train$genres)Documentary                         8.71e-01  1.53e-01  1.10e+00
factor(movie_train$genres)Drama                               4.78e-01  4.88e-02  4.83e-01
factor(movie_train$genres)Family                              1.87e-01  4.24e-01  2.45e-01
factor(movie_train$genres)Fantasy                            -1.89e-01  1.41e-01 -1.85e-01
factor(movie_train$genres)Horror                             -3.97e-01  7.99e-02 -3.91e-01
factor(movie_train$genres)Musical                            -9.70e-02  7.33e-01 -1.04e-01
factor(movie_train$genres)Mystery                             2.00e-01  2.05e-01  2.06e-01
factor(movie_train$genres)Romance                             5.80e-01  5.17e-01          
factor(movie_train$genres)Sci-Fi                              2.21e-01  3.29e-01  2.13e-01
factor(movie_train$genres)Thriller                           -4.75e-01  7.32e-01 -4.77e-01
factor(movie_train$genres)Western                            -8.25e-02  5.18e-01          
movie_train$duration                                                                      
movie_train$num_voted_users                                                               
movie_train$duration:movie_train$num_voted_users             -1.88e-08  3.57e-09 -1.61e-08
movie_train$num_user_for_reviews:movie_train$num_voted_users  1.46e-09  2.19e-10  1.48e-09
                                                                  SE 2
(Intercept)                                                   3.66e+00
poly(movie_train$num_voted_users, 2)1                         4.26e+00
poly(movie_train$num_voted_users, 2)2                         1.91e+00
poly(movie_train$num_critic_for_reviews, 2)1                  1.29e+00
poly(movie_train$num_critic_for_reviews, 2)2                  8.18e-01
movie_train$num_user_for_reviews                              9.72e-05
poly(movie_train$duration, 2)1                                1.09e+00
poly(movie_train$duration, 2)2                                7.67e-01
movie_train$facenumber_in_poster                              6.74e-03
movie_train$gross                                             3.18e-10
poly(movie_train$budget, 2)1                                  1.15e+00
poly(movie_train$budget, 2)2                                  7.95e-01
movie_train$title_year                                        1.83e-03
factor(movie_train$genres)Adventure                           5.42e-02
factor(movie_train$genres)Animation                           1.31e-01
factor(movie_train$genres)Biography                           7.45e-02
factor(movie_train$genres)Comedy                              4.29e-02
factor(movie_train$genres)Crime                               6.34e-02
factor(movie_train$genres)Documentary                         1.53e-01
factor(movie_train$genres)Drama                               4.80e-02
factor(movie_train$genres)Family                              5.09e-01
factor(movie_train$genres)Fantasy                             1.39e-01
factor(movie_train$genres)Horror                              7.85e-02
factor(movie_train$genres)Musical                             7.20e-01
factor(movie_train$genres)Mystery                             2.02e-01
factor(movie_train$genres)Romance                                     
factor(movie_train$genres)Sci-Fi                              3.23e-01
factor(movie_train$genres)Thriller                            7.20e-01
factor(movie_train$genres)Western                                     
movie_train$duration                                                  
movie_train$num_voted_users                                           
movie_train$duration:movie_train$num_voted_users              3.65e-09
movie_train$num_user_for_reviews:movie_train$num_voted_users  2.35e-10

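Removing the outliers did not change the results much: most coefficients move by less than one standard error, and the largest shift is in the Documentary coefficient (0.871 to 1.10). The Romance and Western levels no longer appear in lm.fit5, presumably because no movies of those genres remain in the training data after the removal.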

Diagnostics for lm.fit5:

library(car)
residualPlots(lm.fit5)

                                            Test stat Pr(>|t|)
poly(movie_train$num_voted_users, 2)               NA       NA
poly(movie_train$num_critic_for_reviews, 2)        NA       NA
movie_train$num_user_for_reviews                1.401    0.161
poly(movie_train$duration, 2)                      NA       NA
movie_train$facenumber_in_poster                0.607    0.544
movie_train$gross                              -0.212    0.832
poly(movie_train$budget, 2)                        NA       NA
movie_train$title_year                         -5.570    0.000
factor(movie_train$genres)                         NA       NA
movie_train$duration                           -0.528    0.598
movie_train$num_voted_users                    -0.942    0.346
Tukey test                                    -12.189    0.000

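Looks good, except that the residuals vs. fitted values plot shows some curvature (Tukey test statistic -12.189, p < 0.001), as does the plot against title year (test statistic -5.570, p < 0.001).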
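The idea behind the Tukey test row can be illustrated with base R: add the squared fitted values to the model and check whether their coefficient is significant. This is a sketch on simulated data (all names are ours), and car's own implementation may differ in details.

```r
set.seed(1)
# Simulated data with a quadratic trend that a straight-line fit misses
d <- data.frame(x = runif(200, 0, 10))
d$y <- 1 + 2 * d$x + 0.5 * d$x^2 + rnorm(200)

fit <- lm(y ~ x, data = d)

# Add the squared fitted values as an extra regressor; a significant
# coefficient means the residuals curve against the fitted values
d$fit_sq <- fitted(fit)^2
fit_aug  <- lm(y ~ x + fit_sq, data = d)
t_stat   <- coef(summary(fit_aug))["fit_sq", "t value"]

# Compare the statistic against the standard normal distribution
p_value <- 2 * pnorm(-abs(t_stat))
print(c(t_stat = t_stat, p_value = p_value))
```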

plot(lm.fit5)
not plotting observations with leverage one:
  69, 1934

Now, let’s check the model assumptions for both lm.fit3 and lm.fit5:

# normality
shapiro.test(lm.fit3$residuals)

    Shapiro-Wilk normality test

data:  lm.fit3$residuals
W = 0.93923, p-value < 2.2e-16
shapiro.test(lm.fit5$residuals)

    Shapiro-Wilk normality test

data:  lm.fit5$residuals
W = 0.94636, p-value < 2.2e-16

Both models failed the normality assumption. I think this is due to the many outliers in the data set.

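# equal variance: H0: variance is constant
ncvTest(lm.fit3)
Non-constant Variance Score Test 
Variance formula: ~ fitted.values 
Chisquare = 145.0484    Df = 1     p = 2.095931e-33 
ncvTest(lm.fit5)
Non-constant Variance Score Test 
Variance formula: ~ fitted.values 
Chisquare = 145.0484    Df = 1     p = 2.095931e-33 

Both p-values are far below 0.05, so we reject H0 of constant variance: both models fail the equal variance assumption. The two outputs are identical, most likely because both formulas refer to movie_train$... directly, so the test re-evaluates lm.fit3 on the data with the outliers already removed.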
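To make the direction of this test concrete, here is a hand-rolled score test for non-constant variance on simulated data. It is a sketch of the standard Breusch-Pagan style calculation under our own variable names, on data that is heteroscedastic by construction; it is not the movie model.

```r
set.seed(42)
# Simulated data whose error spread grows with x (heteroscedastic by design)
x <- runif(500, 1, 10)
y <- 1 + 2 * x + rnorm(500, sd = x)
fit <- lm(y ~ x)

# Scale squared residuals by their mean, then regress them on the fitted values
e2_scaled <- residuals(fit)^2 / mean(residuals(fit)^2)
aux <- lm(e2_scaled ~ fitted(fit))

# Score statistic: half the regression sum of squares of the auxiliary fit
reg_ss  <- sum((fitted(aux) - mean(e2_scaled))^2)
chi_sq  <- reg_ss / 2
p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

# H0: constant variance. A small p-value is evidence AGAINST equal variance.
print(c(chi_sq = chi_sq, p_value = p_value))
```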

This is just to explore some more interesting facts. Plots of the data with fitted regression lines:

library(ggplot2)
package ‘ggplot2’ was built under R version 3.3.2
ggplot(data=movie_train,aes(x=duration,y=imdb_score,colour=factor(genres)))+stat_smooth(method=lm,fullrange = FALSE)+geom_point()

library(ggplot2)
ggplot(data=movie_train,aes(x=num_voted_users,y=imdb_score,colour=factor(genres)))+stat_smooth(method=lm,fullrange = FALSE)+geom_point()

library(ggplot2)
ggplot(data=movie_train,aes(x=facenumber_in_poster,y=imdb_score,colour=factor(genres)))+stat_smooth(method=lm,fullrange = FALSE)+geom_point()

library(ggplot2)
ggplot(data=movie_train,aes(x=gross,y=imdb_score,colour=factor(genres)))+stat_smooth(method=lm,fullrange = FALSE)+geom_point()

library(ggplot2)
ggplot(data=movie_train,aes(x=budget,y=imdb_score,colour=factor(genres)))+stat_smooth(method=lm,fullrange = FALSE)+geom_point()

Step 4: Making predictions on the test dataset

Rewrite model lm.fit5 using a formula with bare column names and a data argument. Note: if the model is written as lm(train$score~train$x1+train$x2….), predict() ignores the new data and returns as many values as there are rows in the training data set.
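The difference between the two notations is easy to reproduce on a small example. The sketch below uses the built-in cars data as a stand-in for the movie data, with a hypothetical 40/10 split.

```r
# Toy split of the built-in cars data (not the movie data)
train <- cars[1:40, ]
test  <- cars[41:50, ]

# Formula written with `$`: the variables are literally named train$speed etc.,
# so predict() cannot find them in newdata and falls back to the training vectors
fit_dollar  <- lm(train$dist ~ train$speed)
pred_dollar <- suppressWarnings(predict(fit_dollar, newdata = test))

# Formula with bare column names plus a data argument: newdata is used as expected
fit_data  <- lm(dist ~ speed, data = train)
pred_data <- predict(fit_data, newdata = test)

print(length(pred_dollar))  # one value per training row
print(length(pred_data))    # one value per test row
```

Without suppressWarnings(), the first predict() call warns that newdata has a different number of rows than the variables it found.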

# lm.fit6 = lm.fit5, written with bare column names and a data argument
lm.fit6<-lm(imdb_score~poly(num_voted_users,2)+poly(num_critic_for_reviews,2)+num_user_for_reviews+poly(duration,2)+facenumber_in_poster+gross+poly(budget,2)+title_year+genres+duration*num_voted_users+num_voted_users*num_user_for_reviews,data=data.frame(movie_train))
summary(lm.fit6)

Call:
lm(formula = imdb_score ~ poly(num_voted_users, 2) + poly(num_critic_for_reviews, 
    2) + num_user_for_reviews + poly(duration, 2) + facenumber_in_poster + 
    gross + poly(budget, 2) + title_year + genres + duration * 
    num_voted_users + num_voted_users * num_user_for_reviews, 
    data = data.frame(movie_train))

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0199 -0.3417  0.0631  0.4482  2.1921 

Coefficients: (2 not defined because of singularities)
                                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)                           4.939e+01  3.662e+00  13.489  < 2e-16 ***
poly(num_voted_users, 2)1             3.607e+01  4.258e+00   8.470  < 2e-16 ***
poly(num_voted_users, 2)2            -1.726e+01  1.912e+00  -9.028  < 2e-16 ***
poly(num_critic_for_reviews, 2)1      1.179e+01  1.292e+00   9.124  < 2e-16 ***
poly(num_critic_for_reviews, 2)2     -6.890e+00  8.185e-01  -8.418  < 2e-16 ***
num_user_for_reviews                 -8.997e-04  9.718e-05  -9.258  < 2e-16 ***
poly(duration, 2)1                    1.310e+01  1.087e+00  12.055  < 2e-16 ***
poly(duration, 2)2                   -3.713e+00  7.669e-01  -4.841 1.36e-06 ***
facenumber_in_poster                 -1.770e-02  6.739e-03  -2.627  0.00867 ** 
gross                                -6.864e-10  3.180e-10  -2.158  0.03099 *  
poly(budget, 2)1                     -1.122e+01  1.151e+00  -9.749  < 2e-16 ***
poly(budget, 2)2                      7.581e+00  7.955e-01   9.530  < 2e-16 ***
title_year                           -2.135e-02  1.829e-03 -11.669  < 2e-16 ***
genresAdventure                       3.775e-01  5.422e-02   6.963 4.19e-12 ***
genresAnimation                       8.417e-01  1.314e-01   6.405 1.77e-10 ***
genresBiography                       6.552e-01  7.454e-02   8.791  < 2e-16 ***
genresComedy                          1.319e-01  4.292e-02   3.074  0.00213 ** 
genresCrime                           4.393e-01  6.342e-02   6.927 5.37e-12 ***
genresDocumentary                     1.102e+00  1.530e-01   7.203 7.61e-13 ***
genresDrama                           4.833e-01  4.797e-02  10.075  < 2e-16 ***
genresFamily                          2.445e-01  5.087e-01   0.481  0.63080    
genresFantasy                        -1.847e-01  1.387e-01  -1.332  0.18300    
genresHorror                         -3.908e-01  7.853e-02  -4.976 6.89e-07 ***
genresMusical                        -1.044e-01  7.202e-01  -0.145  0.88477    
genresMystery                         2.057e-01  2.019e-01   1.019  0.30847    
genresSci-Fi                          2.135e-01  3.232e-01   0.660  0.50900    
genresThriller                       -4.767e-01  7.196e-01  -0.662  0.50773    
duration                                     NA         NA      NA       NA    
num_voted_users                              NA         NA      NA       NA    
duration:num_voted_users             -1.610e-08  3.652e-09  -4.408 1.09e-05 ***
num_user_for_reviews:num_voted_users  1.481e-09  2.349e-10   6.306 3.35e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.717 on 2665 degrees of freedom
Multiple R-squared:  0.5223,    Adjusted R-squared:  0.5173 
F-statistic: 104.1 on 28 and 2665 DF,  p-value: < 2.2e-16
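The two NA rows (duration and num_voted_users) probably arise because `a*b` in a formula expands to `a + b + a:b`, and those main effects are already spanned by the corresponding poly() terms. The following sketch reproduces the effect on simulated data with hypothetical variables x and z, and shows that using `:` for the interaction avoids it without changing the fit.

```r
set.seed(7)
d <- data.frame(x = rnorm(100), z = rnorm(100))
d$y <- 1 + d$x - 0.5 * d$x^2 + 0.3 * d$z + 0.2 * d$x * d$z + rnorm(100)

# `x * z` expands to x + z + x:z, but x is already spanned by poly(x, 2),
# so lm() cannot estimate a separate coefficient for it and reports NA
fit_star <- lm(y ~ poly(x, 2) + x * z, data = d)
print(names(which(is.na(coef(fit_star)))))

# Using `:` adds only the interaction column, leaving a full-rank design
fit_colon <- lm(y ~ poly(x, 2) + z + x:z, data = d)
print(anyNA(coef(fit_colon)))

# Both parameterisations span the same columns, so the fitted values agree
print(all.equal(fitted(fit_star), fitted(fit_colon)))
```

A full-rank formula of this kind would also avoid the "rank-deficient fit" warning that predict() raises for a model with NA coefficients.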
pr<-predict.lm(lm.fit6,newdata = data.frame(movie_test),interval = 'confidence')
prediction from a rank-deficient fit may be misleading
pr
          fit      lwr      upr
6    7.283794 6.961082 7.606506
58   5.601212 5.455020 5.747404
75   5.921525 5.769292 6.073758
91   7.076609 6.945594 7.207625
96   7.300318 7.055997 7.544639
110  6.182307 6.065289 6.299324
126  7.308791 7.139406 7.478176
133  5.039100 4.921396 5.156804
140  6.138358 6.020223 6.256493
147  5.545048 5.420151 5.669946
148  7.438595 7.195529 7.681661
187  7.713198 7.516667 7.909729
206  8.013502 7.817922 8.209082
228  6.721862 6.307709 7.136016
229  5.212996 5.075929 5.350063
234  5.604565 5.490636 5.718494
293  6.235727 6.155310 6.316144
295  6.462250 6.385914 6.538586
300  6.630399 6.537012 6.723785
307  5.977920 5.900503 6.055336
332  5.838179 5.725647 5.950712
341  7.880073 7.551832 8.208314
342  5.393556 5.302269 5.484843
347  6.197072 6.090796 6.303348
361  5.824857 5.736939 5.912774
364  7.034808 6.949784 7.119832
369  5.453146 5.361252 5.545041
374  6.595132 6.373851 6.816412
390  5.527691 5.419353 5.636030
399  7.052506 6.957343 7.147669
401  7.585817 7.431986 7.739649
410  6.377673 6.275528 6.479819
413  5.611617 5.500098 5.723136
416  5.764839 5.693482 5.836196
422  5.591627 5.518604 5.664651
427  5.693757 5.607101 5.780412
433  6.431746 6.327910 6.535582
448  5.918329 5.771181 6.065476
482  5.492688 5.404294 5.581083
495  5.266105 5.152707 5.379503
500  6.156436 6.020799 6.292072
510  8.159068 7.815930 8.502207
549  6.533162 6.447668 6.618657
557  5.412561 5.327596 5.497527
573  5.355342 5.253819 5.456865
580  6.219821 6.016168 6.423475
591  5.850221 5.770026 5.930415
593  6.045939 5.977434 6.114445
640  7.409309 7.264856 7.553763
655  9.332587 8.723534 9.941640
656  7.436584 7.309997 7.563171
673  5.577126 5.475291 5.678962
675  6.257584 6.179447 6.335721
676  6.301542 6.231706 6.371379
696  5.892425 5.773144 6.011705
707  7.552570 7.319117 7.786023
715  5.504569 5.369260 5.639877
746  6.109287 5.467713 6.750860
748  5.423282 5.347300 5.499264
754  6.131325 6.055557 6.207094
821  5.830406 5.719803 5.941010
826  5.639553 5.552655 5.726452
828  7.560593 7.444525 7.676661
837  6.987393 6.290385 7.684401
856  8.765150 8.520057 9.010242
867  7.174240 7.077694 7.270785
876  5.929131 5.856383 6.001879
889  6.504289 6.423404 6.585173
893  6.541391 6.474521 6.608261
894  7.279606 7.186910 7.372303
905  5.698541 5.569067 5.828014
916  6.186823 6.036703 6.336944
930  6.738695 6.622053 6.855337
937  7.162173 6.751125 7.573220
972  5.903300 5.802902 6.003698
978  6.044203 5.971660 6.116746
988  6.294255 6.178753 6.409758
990  5.149862 5.073185 5.226540
992  5.907266 5.832287 5.982246
999  6.290384 6.175418 6.405350
1009 5.888397 5.798136 5.978657
1066 5.742642 5.660825 5.824460
1074 5.672596 5.578611 5.766580
1076 5.821598 5.711399 5.931797
1099 6.212266 6.115181 6.309351
1112 5.739309 5.638777 5.839840
1118 5.305699 5.222783 5.388615
1122 6.093028 5.996862 6.189193
1131 5.702184 5.635930 5.768439
1133 6.506325 6.413742 6.598907
1139 7.050003 6.942645 7.157360
1174 6.892636 6.784939 7.000333
1180 6.429248 6.342962 6.515534
1182 8.839734 8.591493 9.087976
1185 7.115881 7.010670 7.221091
1192 5.301158 5.204124 5.398191
1193 6.922383 6.788407 7.056359
1198 8.528362 8.327797 8.728927
1201 5.720273 5.569488 5.871058
1217 6.250213 6.184586 6.315839
1241 6.480293 6.088190 6.872396
1245 5.917872 5.858279 5.977466
1246 6.797700 6.653906 6.941495
1247 6.691126 6.572223 6.810029
1270 6.819103 6.697303 6.940904
1273 5.902787 5.831914 5.973661
1287 6.137365 6.041581 6.233149
1321 6.253898 6.190694 6.317102
1336 6.231014 6.094288 6.367740
1341 6.554074 6.477910 6.630238
1364 5.891956 5.786519 5.997393
1373 5.951321 5.851278 6.051364
1388 6.117664 6.039334 6.195995
1397 5.446647 5.377593 5.515701
1400 6.378499 6.258656 6.498342
1401 7.315684 7.184701 7.446667
1408 7.290262 7.141059 7.439464
1410 7.133591 6.992800 7.274381
1411 6.558030 6.460226 6.655834
1418 6.177232 6.079913 6.274552
1422 5.717356 5.579012 5.855700
1439 7.121375 6.848753 7.393997
1446 5.650848 5.547147 5.754549
1548 5.520363 5.385977 5.654748
1549 7.085216 6.952624 7.217809
1558 5.900668 5.821554 5.979782
1582 6.711916 6.597672 6.826161
1599 6.211153 6.145247 6.277060
1611 6.479858 6.382857 6.576859
1646 7.338276 7.246799 7.429753
1651 6.026793 5.947140 6.106446
1655 5.757383 5.678811 5.835956
1671 5.487285 5.345969 5.628602
1672 5.590823 5.518364 5.663283
1682 5.695418 5.632832 5.758005
1694 6.024141 5.942383 6.105898
1699 6.581304 6.485454 6.677153
1700 7.120848 6.967389 7.274306
1713 5.859345 5.790517 5.928174
1714 7.003628 6.915034 7.092222
1745 5.822302 5.751955 5.892650
1749 7.673570 7.500163 7.846977
1758 6.158614 6.007205 6.310022
1770 6.746408 6.662181 6.830635
1797 7.597270 7.495751 7.698789
1807 6.568264 6.456260 6.680269
1823 6.091849 5.980337 6.203362
1846 5.798714 5.662622 5.934806
1850 6.211956 6.151219 6.272693
1852 6.285271 6.182976 6.387566
1853 6.862732 6.782164 6.943300
1854 5.468466 5.387634 5.549298
1862 6.041243 5.977489 6.104997
1873 6.334997 6.245959 6.424035
1904 9.255506 9.021379 9.489633
1924 6.197991 6.140431 6.255552
1927 6.395225 6.320564 6.469885
1946 6.704440 6.614532 6.794349
1965 5.714761 5.642714 5.786809
1969 5.388521 5.249491 5.527552
1985 5.715677 5.576753 5.854601
1999 5.466626 5.375041 5.558210
2009 5.592615 5.474914 5.710315
2028 6.745599 6.657779 6.833419
2068 6.163883 6.080445 6.247321
2077 6.731823 6.643679 6.819968
2091 6.088708 6.010355 6.167061
2093 6.707359 6.523812 6.890906
2102 5.879199 5.822701 5.935698
2129 5.964289 5.857058 6.071519
2149 5.635891 5.561805 5.709978
2153 8.794187 8.589088 8.999285
2175 8.583610 8.365017 8.802203
2176 6.037542 5.959823 6.115260
2193 5.852414 5.670277 6.034552
2194 7.592880 7.459677 7.726084
2198 6.014088 5.928804 6.099371
2202 6.763022 6.636603 6.889440
2204 5.507692 5.410111 5.605274
2214 5.595364 5.505218 5.685510
2241 5.921043 5.842751 5.999335
2265 5.845115 5.790224 5.900005
2295 6.401580 6.297395 6.505766
2305 5.243665 5.162959 5.324371
2312 7.397805 7.135635 7.659976
2323 6.640331 6.546219 6.734444
2342 5.931703 5.813450 6.049957
2373 6.120492 5.982827 6.258156
2378 6.794626 6.667515 6.921738
2402 5.697400 5.626317 5.768483
2420 5.438677 5.343548 5.533807
2431 6.031952 5.966231 6.097672
2512 5.529612 5.458150 5.601073
2513 5.578756 5.460516 5.696996
2516 5.790487 5.598594 5.982379
2536 7.075620 6.991078 7.160162
2542 6.811703 6.724004 6.899402
2551 5.279088 5.193439 5.364738
2572 6.043564 5.960296 6.126833
2603 6.090629 5.977271 6.203988
2633 6.440978 6.343831 6.538125
2652 7.989268 7.847937 8.130599
2654 7.894060 7.598932 8.189187
2670 5.819352 5.711579 5.927125
2688 5.648052 5.564605 5.731500
2690 7.109840 7.023055 7.196625
2693 6.312941 6.200445 6.425436
2715 6.046727 5.983661 6.109794
2716 5.645893 5.581399 5.710386
2726 6.205706 6.124288 6.287123
2732 6.844379 6.725492 6.963266
2757 6.308467 6.217634 6.399301
2774 6.269018 6.187851 6.350185
2798 5.667449 5.541309 5.793589
2801 5.541879 5.465789 5.617970
2817 7.068054 6.901567 7.234540
2819 5.942133 5.867150 6.017115
2851 5.822851 5.717472 5.928229
2852 5.719791 5.600572 5.839010
2917 7.818827 7.633040 8.004613
2919 7.317026 7.189647 7.444406
2926 6.927297 6.291808 7.562786
2929 6.650686 6.533494 6.767879
2932 6.869399 6.757915 6.980882
2961 5.764994 5.687460 5.842527
2979 5.480510 5.399615 5.561405
2982 6.334079 6.260312 6.407845
3027 8.079233 7.879171 8.279295
3035 6.401612 6.331081 6.472142
3045 6.119467 5.997652 6.241281
3051 5.637182 5.569202 5.705163
3056 7.131470 6.996185 7.266755
3057 6.765685 6.662927 6.868443
3092 6.398534 6.255107 6.541960
3103 6.026813 5.920733 6.132893
3134 5.334536 5.201535 5.467536
3172 5.932191 5.865909 5.998472
3173 6.870753 6.750227 6.991279
3177 5.723944 5.626716 5.821172
3196 5.723279 5.624227 5.822331
3200 5.370359 5.230384 5.510335
3245 5.401082 5.309903 5.492261
3268 7.403859 7.196112 7.611605
3295 5.501005 5.369876 5.632135
3333 7.122691 7.027480 7.217902
3340 6.055678 5.981494 6.129863
3365 6.060397 5.974922 6.145872
3366 7.485740 7.335148 7.636332
3370 5.397073 5.261441 5.532704
3382 5.965782 5.901493 6.030071
3390 7.544392 7.431831 7.656952
3399 6.951968 6.685069 7.218867
3414 5.993159 5.928012 6.058306
3531 6.498444 6.357787 6.639101
3592 5.898927 5.820691 5.977162
3598 5.899420 5.782248 6.016592
3628 6.535066 6.457807 6.612325
3665 5.421177 5.309760 5.532594
3675 6.559451 6.477981 6.640921
3713 5.949436 5.679015 6.219857
3717 9.345175 8.874453 9.815897
3722 5.891308 5.813311 5.969306
3724 6.185142 5.929349 6.440936
3766 5.889138 5.822060 5.956216
3789 6.178332 6.059866 6.296798
3849 6.838529 6.726385 6.950673
3850 8.985470 8.743098 9.227843
3866 6.569541 6.491553 6.647528
3894 6.472803 6.376085 6.569521
3903 6.223353 6.084295 6.362410
3919 6.299085 6.218012 6.380159
3986 7.174568 7.065273 7.283863
4004 6.110685 5.865103 6.356266
4009 5.714965 5.641467 5.788463
4026 7.293088 7.123176 7.462999
4068 6.233893 6.120107 6.347679
4073 6.094358 6.022999 6.165717
4092 6.404772 6.286919 6.522625
4158 9.310683 9.052868 9.568497
4169 6.956346 6.831286 7.081406
4362 6.622683 6.362205 6.883161
4403 6.426709 6.347768 6.505649
4424 5.628411 5.492258 5.764564
4436 5.946022 5.852596 6.039448
4447 7.014620 6.924030 7.105210
4465 6.226858 6.139830 6.313886
4502 5.623731 5.542839 5.704623
4537 6.110264 5.957699 6.262829
4546 5.428826 5.287890 5.569761
4656 5.945255 5.853189 6.037320
4697 7.098388 7.009629 7.187146
4755 6.303008 6.212814 6.393203
4813 7.058448 5.634333 8.482564
4829 5.874389 5.793694 5.955084
4853 6.844704 6.752892 6.936516
4859 5.753937 5.676739 5.831134
4864 5.837533 5.732279 5.942788
4892 6.340879 6.260317 6.421441
4985 6.040234 5.967985 6.112482
4988 6.059968 5.982885 6.137051
4998 6.286891 6.200964 6.372817
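
The rows above have the three-column shape (fit, lower bound, upper bound) that predict() returns for a linear model when an interval is requested. As a minimal sketch of how such a table can be produced and checked against held-out scores, the example below uses synthetic data; the data frame toy, its columns, and the model formula are assumptions for illustration and are not the movie data or lm.fit6.

```r
# Synthetic stand-in data; column names are illustrative only.
set.seed(1)
n <- 200
toy <- data.frame(duration = rnorm(n, 110, 20),
                  votes    = rexp(n, 1 / 1e5))
toy$score <- 4 + 0.015 * toy$duration + 5e-6 * toy$votes + rnorm(n, sd = 0.5)

# 90/10 train/test split
idx   <- sample(seq_len(n), size = floor(0.9 * n))
train <- toy[idx, ]
test  <- toy[-idx, ]

# Fit with data= and bare column names so predict() can score new rows
fit <- lm(score ~ poly(duration, 2) + votes, data = train)

# Interval for the mean response at each test row (narrow)
conf_int <- predict(fit, newdata = test, interval = "confidence")
# Interval for a single new observation at each test row (wider)
pred_int <- predict(fit, newdata = test, interval = "prediction")

# Root mean squared error of the point predictions on the held-out rows
rmse <- sqrt(mean((test$score - conf_int[, "fit"])^2))
print(head(conf_int))
print(rmse)
```

A confidence interval describes uncertainty in the average score for movies with those predictor values, while a prediction interval describes uncertainty for one individual movie, so the second is the one to compare against actual test scores.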

Check the relative importance of each predictor for the IMDB score using standardized coefficients:

library(QuantPsyc)
lm.beta(lm.fit6)
Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
longer object length is not a multiple of shorter object length
           poly(num_voted_users, 2)1            poly(num_voted_users, 2)2 
                        6.735011e-01                        -3.223345e-01 
    poly(num_critic_for_reviews, 2)1     poly(num_critic_for_reviews, 2)2 
                        4.532313e+03                        -1.286433e-01 
                num_user_for_reviews                   poly(duration, 2)1 
                       -1.860903e-03                         9.193268e+08 
                  poly(duration, 2)2                 facenumber_in_poster 
                       -6.932039e-02                        -1.693137e-01 
                               gross                     poly(budget, 2)1 
                       -1.984955e-09                        -2.365565e+02 
                    poly(budget, 2)2                           title_year 
                        1.109206e+06                        -3.986154e-04 
                     genresAdventure                      genresAnimation 
                        7.048767e-03                         3.236437e+02 
                     genresBiography                         genresComedy 
                        1.223394e-02                         2.728816e-01 
                         genresCrime                    genresDocumentary 
                        3.082938e+07                         2.057189e-02 
                         genresDrama                         genresFamily 
                        4.622652e+00                         7.070758e-01 
                       genresFantasy                         genresHorror 
                       -3.895118e+00                        -5.717494e+04 
                       genresMusical                        genresMystery 
                       -1.949084e-03                         3.840150e-03 
                        genresSci-Fi                       genresThriller 
                        8.208130e+01                        -8.900566e-03 
            duration:num_voted_users num_user_for_reviews:num_voted_users 
                       -3.329438e-08                         1.039401e-01 

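Conclusion: According to the standardized coefficients, the most important factor affecting a movie's rating is its duration: the longer the movie, the higher the rating tends to be. num_critic_for_reviews is also an important predictor. Budget is important, even though there is no strong correlation between budget and movie rating. The number of faces in the movie poster has a non-negligible effect on the rating. Note that lm.beta() raised warnings for the factor and poly() terms, so the very large coefficients above (for example 9.19e+08 for the first duration term) should be interpreted with caution.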
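
The lm.beta() call above emitted warnings for the factor and polynomial terms. One alternative way to obtain comparable effect sizes, sketched here under stated assumptions, is to z-score the response and the numeric predictors and then refit the model, leaving factors as they are. The data frame toy and its columns are synthetic stand-ins, not the movie data, and this is not necessarily how the original analysis would be repeated.

```r
# Synthetic stand-in data; column names are illustrative only.
set.seed(2)
n <- 300
toy <- data.frame(
  duration = rnorm(n, 110, 20),
  votes    = rexp(n, 1 / 1e5),
  genre    = factor(sample(c("Action", "Comedy", "Drama"), n, replace = TRUE))
)
toy$score <- 4 + 0.015 * toy$duration + 5e-6 * toy$votes +
  0.3 * (toy$genre == "Drama") + rnorm(n, sd = 0.5)

# Standardize the response and every numeric predictor; factors stay untouched
num_cols <- sapply(toy, is.numeric)
toy_std  <- toy
toy_std[num_cols] <- lapply(toy[num_cols], function(x) as.numeric(scale(x)))

# Refit on the standardized data
fit_std <- lm(score ~ duration + votes + genre, data = toy_std)

# Numeric slopes are now in standard-deviation units and can be compared
std_beta <- coef(fit_std)[c("duration", "votes")]
print(round(std_beta, 3))
```

Each numeric slope then reads as the change in score, in standard deviations, for a one standard deviation change in that predictor. Factor coefficients remain differences from the reference level and are not directly comparable to the numeric slopes.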

dHJhaW4pKQpzdW1tYXJ5KGxtLmZpdDYpCmBgYAoKCmBgYHtyfQpwcjwtcHJlZGljdC5sbShsbS5maXQ2LG5ld2RhdGEgPSBkYXRhLmZyYW1lKG1vdmllX3Rlc3QpLGludGVydmFsID0gJ2NvbmZpZGVuY2UnKQpwcgpgYGAKCkNoZWNraW5nIHRoZSBpbXBhY3Qgc2lnbmlmaWNhbmNlIG9mIHByZWRpY3RvcnMgb24gSU1EQiBzY29yZS4KYGBge3J9CiMgc3RhbnRkYXJkaXplZCByZWdyZXNzaW9uIGNvZWZmaWNpZW50cwpsaWJyYXJ5KFF1YW50UHN5YykKbG0uYmV0YShsbS5maXQ2KQpgYGAKCkNvbmNsdXNpb246ClRoZSBtb3N0IGltcG9ydGFudCBmYWN0b3IgdGhhdCBhZmZlY3RzIG1vdmllIHJhdGluZyBpcyB0aGUgZHVyYXRpb24uIFRoZSBsb25nZXIgdGhlIG1vdmllIGlzLCB0aGUgaGlnaGVyIHRoZSByYXRpbmcgd2lsbCBiZS4KbnVtX2NyaXRpY19mb3JfcmV2aWV3cyBpcyBhbHNvIGFuIGltcG9ydGFudCBwcmVkaWN0b3IuIApCdWRnZXQgaXMgaW1wb3J0YW50LCBhbHRob3VnaCB0aGVyZSBpcyBubyBzdHJvbmcgY29ycmVsYXRpb24gYmV0d2VlbiBidWRnZXQgYW5kIG1vdmllIHJhdGluZy4KVGhlIG51bWJlciBvZiBmYWNlcyBpbiBtb3ZpZSBwb3N0ZXIgaGFzIGEgbm9uLW5lZ2xlY3RhYmxlIGVmZmVjdCB0byB0aGUgbW92aWUgcmF0aW5nLgoKCgoKCgoK