Movie Database
Everyone enjoys watching movies. Today there is a movie in nearly every genre you can think of, each appealing to some audience. It is a huge business that has to adapt and change with the times, and advances in technology have made movies better and more expansive. Just like any other business, production companies need to use data to make informed decisions about which movies to fund. Every movie has a budget, and companies want the highest possible revenue in return, so data can help steer them toward the projects most likely to be profitable. This project shows how a company could use data to find insights in the market and make decisions based on machine learning models that forecast the future of movies. First, we need to conduct some exploratory data analysis to find interesting correlations. The dataset I used can be found here.
Budget vs Revenue
One important aspect of the movie business is the relationship between budget and revenue. Every movie has a budget that the studio wants to earn back, plus a large profit. Here I fit a smoothed trend line, shown in blue, to the data. The red line is the y = x line: movies below it did not make their money back. The plot shows that the majority of movies make their money back, but some do not.
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Rows: 45,466
## Columns: 24
## $ adult <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
## $ belongs_to_collection <chr> "{'id': 10194, 'name': 'Toy Story Collection', '~
## $ budget <dbl> 30000000, 65000000, 0, 16000000, 0, 60000000, 58~
## $ genres <chr> "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'n~
## $ homepage <chr> "http://toystory.disney.com/toy-story", NA, NA, ~
## $ id <dbl> 862, 8844, 15602, 31357, 11862, 949, 11860, 4532~
## $ imdb_id <chr> "tt0114709", "tt0113497", "tt0113228", "tt011488~
## $ original_language <chr> "en", "en", "en", "en", "en", "en", "en", "en", ~
## $ original_title <chr> "Toy Story", "Jumanji", "Grumpier Old Men", "Wai~
## $ overview <chr> "Led by Woody, Andy's toys live happily in his r~
## $ popularity <dbl> 21.946943, 17.015539, 11.712900, 3.859495, 8.387~
## $ poster_path <chr> "/rhIRbceoE9lR4veEXuwCC2wARtG.jpg", "/vzmL6fP7aP~
## $ production_companies <chr> "[{'name': 'Pixar Animation Studios', 'id': 3}]"~
## $ production_countries <chr> "[{'iso_3166_1': 'US', 'name': 'United States of~
## $ release_date <date> 1995-10-30, 1995-12-15, 1995-12-22, 1995-12-22,~
## $ revenue <dbl> 373554033, 262797249, 0, 81452156, 76578911, 187~
## $ runtime <dbl> 81, 104, 101, 127, 106, 170, 127, 97, 106, 130, ~
## $ spoken_languages <chr> "[{'iso_639_1': 'en', 'name': 'English'}]", "[{'~
## $ status <chr> "Released", "Released", "Released", "Released", ~
## $ tagline <chr> NA, "Roll the dice and unleash the excitement!",~
## $ title <chr> "Toy Story", "Jumanji", "Grumpier Old Men", "Wai~
## $ video <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
## $ vote_average <dbl> 7.7, 6.9, 6.5, 6.1, 5.7, 7.7, 6.2, 5.4, 5.5, 6.6~
## $ vote_count <dbl> 5415, 2413, 92, 34, 173, 1886, 141, 45, 174, 119~
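As a rough numeric check on the claim that most movies earn back their budget, the sketch below computes the share of movies on or above the break-even line. It uses a small made-up data frame (`toy_movies` is hypothetical, not the real dataset) and assumes that a `budget` or `revenue` of 0, like the zeros visible in the column preview above, means the figure was not reported and should be dropped first.

```r
# Toy stand-in for the movie data; the values are made up for illustration.
toy_movies <- data.frame(
  title   = c("A", "B", "C", "D", "E"),
  budget  = c(30e6, 65e6, 0, 16e6, 60e6),
  revenue = c(90e6, 40e6, 0, 81e6, 187e6),
  stringsAsFactors = FALSE
)

# Assumption: a budget or revenue of 0 means the figure is missing.
reported <- subset(toy_movies, budget > 0 & revenue > 0)

# A movie sits on or above the y = x line when revenue >= budget.
reported$made_money_back <- reported$revenue >= reported$budget

# Share of movies that recouped their budget.
share_recouped <- mean(reported$made_money_back)
print(share_recouped)
```

In this toy data, three of the four movies with reported figures recoup their budget.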
Correlation Matrix - Heatmap of numeric variables
Next, I selected all of the numeric variables and built a correlation heatmap that shows the Pearson correlation for each pair of variables. It gives a good view of how the variables might play into the ROI predictive model that we’ll build later on. Revenue and budget are positively correlated, so the data reflects these two moving together.
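The numbers behind a heatmap like this can be produced with base R's `cor()`. The sketch below is only an illustration on made-up values (`toy_numeric` is a hypothetical data frame that borrows a few column names from the dataset), not the code used for the plot.

```r
# Toy numeric columns; the values are made up for illustration.
toy_numeric <- data.frame(
  budget       = c(30, 65, 16, 60, 58, 35),
  revenue      = c(374, 263, 81, 187, 352, 64),
  runtime      = c(81, 104, 127, 170, 130, 106),
  vote_average = c(7.7, 6.9, 6.1, 7.7, 6.6, 5.5)
)

# Pearson correlation between every pair of numeric columns.
cor_matrix <- cor(toy_numeric, method = "pearson")
print(round(cor_matrix, 2))

# Look up a single pair, for example budget vs revenue.
budget_revenue_cor <- cor_matrix["budget", "revenue"]
print(budget_revenue_cor)
```

Each cell of the heatmap corresponds to one entry of `cor_matrix`; the diagonal is always 1 because every variable is perfectly correlated with itself.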
Top Movies
Have you ever wondered which movie earned the most for a given production company? This table shows the highest-revenue movie for each production company. Avatar has earned the most revenue of any movie in the dataset (through 2017).
| Production Company | Movie Title | Revenue |
|---|---|---|
| Ingenious Film Partners | Avatar | $2,787,965,087 |
| Lucasfilm | Star Wars: The Force Awakens | $2,068,223,624 |
| Paramount Pictures | Titanic | $1,845,034,188 |
| Universal Studios | Jurassic World | $1,513,528,810 |
| Universal Pictures | Furious 7 | $1,506,249,360 |
| Marvel Studios | Avengers: Age of Ultron | $1,405,403,694 |
| Warner Bros | Harry Potter and the Deathly Hallows: Part 2 | $1,342,000,000 |
| Walt Disney Pictures | Frozen | $1,274,219,009 |
| Studio Babelsberg | Captain America: Civil War | $1,153,304,495 |
| WingNut Films | The Lord of the Rings: The Return of the King | $1,118,888,979 |
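A table like this boils down to "keep the highest-revenue row per company". Here is a minimal base R sketch of that idea. It assumes the `production_companies` column has already been parsed so that there is one row per company and movie pair; `toy_revenue` and its values are hypothetical.

```r
# Toy data: one row per (company, movie) pair; values are made up.
toy_revenue <- data.frame(
  company = c("Studio X", "Studio X", "Studio Y", "Studio Y", "Studio Z"),
  title   = c("Film 1", "Film 2", "Film 3", "Film 4", "Film 5"),
  revenue = c(500, 1200, 900, 300, 700),
  stringsAsFactors = FALSE
)

# Sort by revenue (highest first), then keep the first row seen per company.
by_revenue <- toy_revenue[order(-toy_revenue$revenue), ]
top_per_company <- by_revenue[!duplicated(by_revenue$company), ]
rownames(top_per_company) <- NULL
print(top_per_company)
```

Because the rows are sorted before de-duplicating, the result is already ordered from the highest-earning company down.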
Timeline
This interactive graph, built with ggplot2 and plotly, shows how long each production company has been producing movies. It is a useful timeline for seeing how old some companies are and whether they are still producing movies. As you can see, Paramount Pictures is one of the oldest and is still producing movies in recent years. Marvel Studios, however, is fairly new, which makes sense because the Marvel universe started out in comic books.
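The data behind a timeline like this is just the first and last release year for each company. The sketch below shows one way to compute that in base R; `toy_releases` is a hypothetical data frame with made-up dates, not the real dataset.

```r
# Toy release dates per company; values are made up for illustration.
toy_releases <- data.frame(
  company = c("Studio X", "Studio X", "Studio X", "Studio Y", "Studio Y"),
  release_date = as.Date(c("1925-06-01", "1980-03-15", "2016-07-04",
                           "2008-05-02", "2017-11-03")),
  stringsAsFactors = FALSE
)

# Extract the release year from each date.
toy_releases$year <- as.integer(format(toy_releases$release_date, "%Y"))

# First and last release year per company.
first_year <- tapply(toy_releases$year, toy_releases$company, min)
last_year  <- tapply(toy_releases$year, toy_releases$company, max)

company_span <- data.frame(
  company    = names(first_year),
  first_year = as.integer(first_year),
  last_year  = as.integer(last_year),
  stringsAsFactors = FALSE
)
company_span$years_active <- company_span$last_year - company_span$first_year
print(company_span)
```

Each row of `company_span` corresponds to one horizontal bar on the timeline, running from `first_year` to `last_year`.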
Modeling
When you look at this data, the clear variable to try to predict is revenue. If you knew how much money a film was going to make, you could easily decide whether to make it. So here, I use a linear regression model to predict revenue based on genre, popularity, runtime, revenue, vote average, vote count, release year and release quarter. As you can see, it does an OK job of predicting revenue.
We can plot the actuals versus predictions to see how the model does.
We can also find out which variables are most important for predicting revenue.
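To make the modeling steps concrete, here is a minimal base R sketch of the same workflow: fit a linear regression, compare predictions with actuals on held-out rows, and rank the predictors. Everything in it is an assumption made for illustration: the data is synthetic, only three predictors are used, and ranking by absolute t-statistic is one common way to gauge importance in a linear model, not necessarily the method behind the plots here.

```r
set.seed(42)

# Synthetic data: revenue depends on popularity and vote_count plus noise.
n <- 200
toy_model_data <- data.frame(
  popularity = runif(n, 0, 30),
  runtime    = rnorm(n, mean = 105, sd = 15),
  vote_count = rpois(n, lambda = 500)
)
toy_model_data$revenue <- 2e6 * toy_model_data$popularity +
  1e5 * toy_model_data$vote_count + rnorm(n, sd = 1e7)

# Hold out the last 50 rows for testing.
train <- toy_model_data[1:150, ]
test  <- toy_model_data[151:200, ]

fit <- lm(revenue ~ popularity + runtime + vote_count, data = train)

# Predictions vs actuals on the held-out rows.
pred_df <- data.frame(actual = test$revenue,
                      .pred  = predict(fit, newdata = test))
rmse <- sqrt(mean((pred_df$actual - pred_df$.pred)^2))

# Rough importance ranking: absolute t-statistic of each coefficient
# (the intercept in row 1 is dropped).
t_values   <- summary(fit)$coefficients[-1, "t value"]
importance <- sort(abs(t_values), decreasing = TRUE)

print(rmse)
print(importance)
```

Plotting `pred_df$.pred` against `pred_df$actual` gives the actuals-versus-predictions view: the closer the points sit to the y = x line, the better the model.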
ROI
More important than revenue, however, is ROI, or return on investment. Here, ROI is the amount of money you make back per $1 spent, so a higher ROI means more bang for your buck: you want to spend as little as possible and make back as much as possible. The model below does a pretty bad job of predicting ROI. If we could build a really good model, we’d be rich! It’s not impossible, but we’d probably need more data to get closer.
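library(ggplot2)

# Avoid scientific notation in axis labels.
options(scipen = 10000)

# pred_df_roi holds the model's predictions (.pred) and the actual ROI.
# Points on the orange y = x line are perfect predictions.
ggplot(data = pred_df_roi,
       mapping = aes(x = .pred, y = ROI)) +
  geom_point(color = '#006EA1', alpha = 0.25) +
  geom_abline(intercept = 0, slope = 1, color = 'orange') +
  # Note: label_number_si() is deprecated in newer versions of scales.
  scale_x_continuous(labels = scales::label_number_si()) +
  labs(title = 'Predictions vs Actuals',
       x = 'Predicted ROI',
       y = 'Actual ROI')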
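For reference, ROI as defined above (money made back per $1 spent) is simply revenue divided by budget. The sketch below computes it on a hypothetical `toy_roi` data frame with made-up values, dropping rows with a zero budget first so the division is well defined.

```r
# Toy budgets and revenues; values are made up for illustration.
toy_roi <- data.frame(
  title   = c("A", "B", "C"),
  budget  = c(10e6, 50e6, 0),
  revenue = c(40e6, 25e6, 5e6),
  stringsAsFactors = FALSE
)

# Keep only rows with a positive budget to avoid dividing by zero.
toy_roi <- subset(toy_roi, budget > 0)

# ROI: dollars of revenue per $1 of budget.
toy_roi$ROI <- toy_roi$revenue / toy_roi$budget
print(toy_roi)
```

With this definition, an ROI above 1 means the movie earned more than it cost (movie A returns $4 per $1), and an ROI below 1 means it lost money (movie B returns $0.50 per $1).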
Time Series Forecasting
Now that we’ve looked at what has already happened in the movie business, let’s see if we can predict what will happen next. Here, I use a time series forecast to see what the sales of each genre might look like in the future. I can pick a genre in my code to view each one individually.
Then I compare the model against the actuals for each genre to get a better sense of how accurate the model might be. A model fit to yearly data can’t pick up on seasonal patterns, so I also try it on monthly data.
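The difference between the yearly and monthly series comes down to how the revenue is grouped before modeling. Here is a small base R sketch of both groupings on a hypothetical `toy_sales` data frame with made-up values; the real series are built per genre, which this example leaves out.

```r
# Toy release dates and revenues; values are made up for illustration.
toy_sales <- data.frame(
  release_date = as.Date(c("2015-04-03", "2015-04-24", "2015-12-18",
                           "2016-03-25", "2016-12-16")),
  revenue = c(1500, 1400, 2000, 870, 1050)
)

# Yearly totals: group by the release year.
toy_sales$year <- format(toy_sales$release_date, "%Y")
yearly <- aggregate(revenue ~ year, data = toy_sales, FUN = sum)

# Monthly totals: group by year and month.
toy_sales$month <- format(toy_sales$release_date, "%Y-%m")
monthly <- aggregate(revenue ~ month, data = toy_sales, FUN = sum)

print(yearly)
print(monthly)
```

The yearly totals hide the fact that revenue is concentrated in April and December in this toy data, while the monthly totals keep it visible. Months with no releases do not appear in `monthly` at all and would need to be filled in before fitting a regular monthly time series.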
The model descriptions in the output below use the three-letter naming described in the documentation for ets() in the forecast package:

“The first letter denotes the error type ("A", "M" or "Z"); the second letter denotes the trend type ("N", "A", "M" or "Z"); and the third letter denotes the season type ("N", "A", "M" or "Z"). In all cases, "N"=none, "A"=additive, "M"=multiplicative and "Z"=automatically selected. So, for example, "ANN" is simple exponential smoothing with additive errors, "MAM" is multiplicative Holt-Winters’ method with multiplicative errors, and so on.”
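To give a feel for the simplest of these models, "ANN" (simple exponential smoothing), here is a hand-rolled base R sketch of its point forecast. It is not how ets() works internally: ets() estimates the smoothing weight and the starting level from the data, whereas this hypothetical `ses_forecast()` helper takes a fixed `alpha` and starts the level at the first observation.

```r
# Simple exponential smoothing with a fixed smoothing weight alpha.
# The final level is the forecast for every future step.
ses_forecast <- function(y, alpha, h) {
  level <- y[1]                      # start the level at the first value
  for (t in seq_along(y)[-1]) {
    # New level: weighted average of the new observation and the old level.
    level <- alpha * y[t] + (1 - alpha) * level
  }
  rep(level, h)                      # flat forecast: no trend, no season
}

toy_series <- c(100, 120, 110, 130)
fc <- ses_forecast(toy_series, alpha = 0.5, h = 3)
print(fc)
```

With no trend ("N") and no seasonal component ("N"), the forecast is the same value at every horizon, which is why the forecast line for an A,N,N model is flat.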
## # A tibble: 21 x 17
## # Groups: primary_genre [21]
## primary_genre data data.ts fit.ets nof_values model.desc sigma logLik
## <chr> <list> <list> <list> <int> <chr> <dbl> <dbl>
## 1 Action <tibble> <ts[...]> <ets> 37 ETS(M,A,N) 3.82e-1 -817.
## 2 Adventure <tibble> <ts[...]> <ets> 37 ETS(M,A,N) 5.13e-1 -824.
## 3 Animation <tibble> <ts[...]> <ets> 37 ETS(A,N,N) 7.29e+8 -821.
## 4 character0 <tibble> <ts[...]> <ets> 37 ETS(A,N,N) 2.04e+7 -689.
## 5 Comedy <tibble> <ts[...]> <ets> 37 ETS(M,A,N) 2.33e-1 -797.
## 6 Crime <tibble> <ts[...]> <ets> 37 ETS(A,A,N) 2.38e+8 -778.
## 7 Documentary <tibble> <ts[...]> <ets> 37 ETS(A,N,N) 4.06e+7 -714.
## 8 Drama <tibble> <ts[...]> <ets> 37 ETS(M,A,N) 3.62e-1 -806.
## 9 Family <tibble> <ts[...]> <ets> 37 ETS(A,N,N) 3.79e+8 -797.
## 10 Fantasy <tibble> <ts[...]> <ets> 37 ETS(M,A,N) 7.60e-1 -793.
## # ... with 11 more rows, and 9 more variables: AIC <dbl>, BIC <dbl>, ME <dbl>,
## # RMSE <dbl>, MAE <dbl>, MPE <dbl>, MAPE <dbl>, MASE <dbl>, ACF1 <dbl>
## [1] "1980-01-01" "1981-01-01" "1982-01-01" "1983-01-01" "1984-01-01"
## ...
## [961] "2022-01-01" "2023-01-01" "2024-01-01" "2025-01-01" "2026-01-01"
| index | key | Sales | units |
|---|---|---|---|
| Action | | | |
| 2014-01-01 | actual | 7617253949 | 7617253949 |
| 2015-01-01 | actual | 12360239064 | 12360239064 |
| 2016-01-01 | actual | 10558589985 | 10558589985 |
| 2017-01-01 | forecast | 10725228928 | 10725228928 |
| 2018-01-01 | forecast | 11230890108 | 11230890108 |
| 2019-01-01 | forecast | 11736551288 | 11736551288 |
| 2020-01-01 | forecast | 12242212469 | 12242212469 |
Here, the 80% and 95% prediction intervals are shaded in purple.
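As a rough illustration of where such bands come from, the sketch below computes 80% and 95% intervals for the simplest model, ETS(A,N,N), under the usual assumption of normally distributed errors. The helper `ann_interval()` and all of its input values are hypothetical; for other model types the variance formula is different, and in practice the forecast package computes the intervals for you.

```r
# Prediction interval for an ETS(A,N,N) forecast h steps ahead,
# assuming normally distributed errors with standard deviation sigma.
ann_interval <- function(point, sigma, alpha, h, level = 0.95) {
  # Forecast variance grows with the horizon:
  # sigma^2 * (1 + alpha^2 * (h - 1)).
  se <- sigma * sqrt(1 + alpha^2 * (h - 1))
  z  <- qnorm(1 - (1 - level) / 2)
  data.frame(h = h, lower = point - z * se, upper = point + z * se)
}

int80 <- ann_interval(point = 120, sigma = 10, alpha = 0.5, h = 1:3, level = 0.80)
int95 <- ann_interval(point = 120, sigma = 10, alpha = 0.5, h = 1:3, level = 0.95)
print(int80)
print(int95)
```

The 95% band is always wider than the 80% band, and both widen as the horizon `h` increases, which matches the fan shape of the shaded regions.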
## # A tibble: 20 x 17
## # Groups: primary_genre [20]
## primary_genre data data.ts fit.ets nof_values model.desc sigma logLik
## <chr> <list> <list> <list> <int> <chr> <dbl> <dbl>
## 1 Action <tibble> <ts[...]> <ets> 204 ETS(A,A,N) 4.72e8 -4615.
## 2 Adventure <tibble> <ts[...]> <ets> 191 ETS(A,N,N) 4.51e8 -4307.
## 3 Animation <tibble> <ts[...]> <ets> 190 ETS(A,N,N) 3.09e8 -4212.
## 4 character0 <tibble> <ts[...]> <ets> 203 ETS(A,N,N) 8.55e6 -3778.
## 5 Comedy <tibble> <ts[...]> <ets> 204 ETS(A,N,N) 2.04e8 -4444.
## 6 Crime <tibble> <ts[...]> <ets> 194 ETS(A,N,N) 7.93e7 -4039.
## 7 Documentary <tibble> <ts[...]> <ets> 200 ETS(A,N,N) 1.71e7 -3859.
## 8 Drama <tibble> <ts[...]> <ets> 204 ETS(A,N,N) 2.51e8 -4487.
## 9 Family <tibble> <ts[...]> <ets> 146 ETS(A,N,N) 2.15e8 -3164.
## 10 Fantasy <tibble> <ts[...]> <ets> 156 ETS(A,N,N) 2.00e8 -3375.
## 11 Foreign <tibble> <ts[...]> <ets> 51 ETS(A,N,N) 4.07e5 -758.
## 12 History <tibble> <ts[...]> <ets> 82 ETS(A,N,N) 7.00e7 -1661.
## 13 Horror <tibble> <ts[...]> <ets> 198 ETS(A,N,N) 9.21e7 -4153.
## 14 Music <tibble> <ts[...]> <ets> 127 ETS(A,N,N) 2.62e7 -2476.
## 15 Mystery <tibble> <ts[...]> <ets> 139 ETS(A,N,N) 4.79e7 -2800.
## 16 Romance <tibble> <ts[...]> <ets> 186 ETS(A,N,N) 7.04e7 -3846.
## 17 Science Ficti~ <tibble> <ts[...]> <ets> 134 ETS(A,N,N) 2.65e8 -2926.
## 18 Thriller <tibble> <ts[...]> <ets> 192 ETS(A,N,N) 1.13e8 -4065.
## 19 War <tibble> <ts[...]> <ets> 85 ETS(A,N,N) 7.21e7 -1726.
## 20 Western <tibble> <ts[...]> <ets> 36 ETS(A,N,N) 9.40e7 -724.
## # ... with 9 more variables: AIC <dbl>, BIC <dbl>, ME <dbl>, RMSE <dbl>,
## # MAE <dbl>, MPE <dbl>, MAPE <dbl>, MASE <dbl>, ACF1 <dbl>
| index | key | Sales | units |
|---|---|---|---|
| Action | | | |
| 2015-01-01 | actual | 593683 | 593683 |
| 2015-02-01 | actual | 135189995 | 135189995 |
| 2015-03-01 | actual | 71561644 | 71561644 |
| 2015-04-01 | actual | 3086214643 | 3086214643 |
| 2015-05-01 | actual | 1136695592 | 1136695592 |
| 2015-06-01 | actual | 1605238637 | 1605238637 |
| 2015-07-01 | actual | 1411252047 | 1411252047 |
| 2015-08-01 | actual | 321556109 | 321556109 |
| 2015-09-01 | actual | 409616532 | 409616532 |
| 2015-10-01 | actual | 880674609 | 880674609 |
| 2015-11-01 | actual | 653458261 | 653458261 |
| 2015-12-01 | actual | 2648187312 | 2648187312 |
| 2016-01-01 | actual | 771005885 | 771005885 |
| 2016-02-01 | actual | 894343596 | 894343596 |
| 2016-03-01 | actual | 1079014641 | 1079014641 |
| 2016-04-01 | actual | 252360895 | 252360895 |
| 2016-05-01 | actual | 511465401 | 511465401 |
| 2016-06-01 | actual | 1416886756 | 1416886756 |
| 2016-07-01 | actual | 1270086719 | 1270086719 |
| 2016-08-01 | actual | 959988344 | 959988344 |
| 2016-09-01 | actual | 208301904 | 208301904 |
| 2016-10-01 | actual | 886868558 | 886868558 |
| 2016-11-01 | actual | 200613336 | 200613336 |
| 2016-12-01 | actual | 2107653950 | 2107653950 |
| 2017-01-01 | forecast | 1014138745 | 1014138745 |