Netflix Movie Duration Analysis

Yohannes Gebretsadik

Introduction

This project analyzes Netflix movie duration using a dataset of Netflix titles.

Summary Statistics

[1]  90  91 125 104 127  91
[1] 101.2454
[1] 31
[1] 312

Call:
lm(formula = duration_num ~ release_year, data = movies)

Residuals:
     Min       1Q   Median       3Q      Max 
-107.557  -13.465   -1.783   13.595  213.535 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1234.41935   68.73329   17.96   <2e-16 ***
release_year   -0.56291    0.03414  -16.49   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 25.62 on 5996 degrees of freedom
Multiple R-squared:  0.04337,   Adjusted R-squared:  0.04321 
F-statistic: 271.8 on 1 and 5996 DF,  p-value: < 2.2e-16

Histogram

  • Most movies fall between 90 and 120 minutes, with the highest concentration around 100 minutes.

Boxplot

  • This boxplot highlights the median around 100 minutes and reveals several high end outliers, extending unusually long movies.

Density Plot

  • The density plot confirms a concentration around 100 minutes and shows a slight right skew, indicating a smaller number of longer movies.

Linear Regression Plot

  • The linear regression plot shows a slight negative relationship between release years and movies duration, indicating than more recent movies tend to be slightly shorter on average. However, the trend is weak, suggesting that movie durations have remained relatively stable over time. The variation in points also indicates the factors that factors other than release year likely influence movie length.

    Conclusion

Overall, most movies are around 100 minutes long, with the majority falling between 90-120 minutes. While there is a slight decrease in duration over time, movie lenght has stayed fairly consistent overall.