In this project we will analyse new monthly car registrations in the
UK and then use Prophet to predict future registrations based on trends
over the years.
📌 Explore monthly car registrations dataset
📌 Understand patterns and trends of car sales over the years
📌 Use Prophet to forecast future car registrations
📌 Interpret results and graphs
The car registration dataset we will use contains monthly numbers of new car registrations from 1960 onwards. This will allow us to analyse its trend and seasonality. Then we can convert the data into the format Prophet expects: a data frame with a date column ds and a value column y.
# First we need to download our data; here we can read it directly from the URL with read.csv()
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
# Change data to acceptable format for Prophet
cars_df <- data.frame(
ds = seq(as.Date("1960-01-01"), by="month", length.out=nrow(cars)), #monthly data
y = cars$Sales)
#We can view the first few rows
head(cars)
## Month Sales
## 1 1960-01 6550
## 2 1960-02 8728
## 3 1960-03 12026
## 4 1960-04 14395
## 5 1960-05 14587
## 6 1960-06 13791
head(cars_df)
## ds y
## 1 1960-01-01 6550
## 2 1960-02-01 8728
## 3 1960-03-01 12026
## 4 1960-04-01 14395
## 5 1960-05-01 14587
## 6 1960-06-01 13791
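The `seq()` call above assumes the file has exactly one row per month, with no gaps, starting in January 1960. A slightly more defensive sketch is to build `ds` from the `Month` column itself, so each date comes from its own row. The six rows below are copied from the `head(cars)` output; the names `cars_sample` and `cars_sample_df` are only illustrative.

```r
# Sample rows copied from the head(cars) output above (illustrative subset)
cars_sample <- data.frame(
  Month = c("1960-01", "1960-02", "1960-03", "1960-04", "1960-05", "1960-06"),
  Sales = c(6550, 8728, 12026, 14395, 14587, 13791)
)

# Append a day so as.Date() can parse the "YYYY-MM" strings
cars_sample_df <- data.frame(
  ds = as.Date(paste0(cars_sample$Month, "-01")),
  y  = cars_sample$Sales
)

# Check that consecutive rows really are one month apart
expected <- seq(cars_sample_df$ds[1], by = "month",
                length.out = nrow(cars_sample_df))
all(cars_sample_df$ds == expected)
```

If the final check returns `FALSE` on the full file, a month is missing or duplicated, and the `seq()` shortcut would silently shift the dates.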
First let’s plot a graph of our data and make some observations!
Now we can zoom in to understand the spikes that occur.
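The two plots could be produced roughly as follows. This is a minimal base R sketch, not necessarily the code behind the original figures. It uses only the six rows shown in `head(cars_df)` so that it runs on its own; in the full analysis you would pass `cars_df` instead, and the zoom window chosen here is arbitrary.

```r
# First six rows from head(cars_df) above; use the full cars_df in practice
cars_df_sample <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = 6),
  y  = c(6550, 8728, 12026, 14395, 14587, 13791)
)

# Whole series as a line chart
plot(cars_df_sample$ds, cars_df_sample$y, type = "l",
     xlab = "Month", ylab = "New registrations",
     main = "Monthly car registrations")

# Zoom in by keeping only the rows inside a date window (window is illustrative)
zoom <- subset(cars_df_sample,
               ds >= as.Date("1960-03-01") & ds <= as.Date("1960-05-01"))
plot(zoom$ds, zoom$y, type = "b",
     xlab = "Month", ylab = "New registrations",
     main = "Zoomed view")
```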
From the graphs we can clearly see two things:
✔️ An upward trend over the years
✔️ Seasonal spikes
There are several possible reasons for this. The number of car registrations may be increasing over time because of developments in the UK automotive industry that encourage people to buy newer cars, so overall car ownership has been rising steadily. The second plot also shows that new registrations peak in spring and at the end of the year. The spring peak could reflect people getting ready for the summer holiday months, when they expect to travel more, while the year-end peak could follow from a more stable financial situation after bonuses at work.
We can also view another version of the original graph in which the variance of the seasonality is more stable. This is achieved with a log transformation.
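To see why a log transformation helps: when seasonal swings grow in proportion to the level of the series, they become swings of roughly constant size on the log scale. The sketch below illustrates this with made-up round numbers (5,000, 10,000, 20,000 are hypothetical, not from the dataset) and then applies the transformation to the six values from `head(cars_df)`.

```r
# Hypothetical levels: each step doubles the previous one
levels <- c(5000, 10000, 20000)
diff(levels)       # absolute steps grow with the level
diff(log(levels))  # on the log scale every doubling is the same step, log(2)

# The six values from head(cars_df) above
y <- c(6550, 8728, 12026, 14395, 14587, 13791)
log_y <- log(y)

# exp() undoes the transformation, e.g. to report forecasts in original units
all.equal(exp(log_y), y)
```

To plot the transformed series, `plot(cars_df$ds, log(cars_df$y), type = "l")` would be one option.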
If you are interested, here is an interactive graph for you to explore!💡
#Load the required libraries
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
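An interactive line chart can be built with `plot_ly()`. The following is a sketch of one way to do it rather than the exact code behind the original graph. It again uses only the six rows from `head(cars_df)` so that it is self-contained, and the titles are illustrative.

```r
library(plotly)

# First six rows from head(cars_df) above; use the full cars_df in practice
cars_df_sample <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = 6),
  y  = c(6550, 8728, 12026, 14395, 14587, 13791)
)

# Line chart; hovering over the line shows the date and value
fig <- plot_ly(cars_df_sample, x = ~ds, y = ~y,
               type = "scatter", mode = "lines")

# plotly::layout() is the function that masks graphics::layout in the
# messages above
fig <- plotly::layout(fig,
                      title = "Monthly car registrations",
                      xaxis = list(title = "Month"),
                      yaxis = list(title = "New registrations"))
fig
```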
After exploring our data and making our observations, it’s time to finally prepare our forecast! We first have to fit our model, which then lets us create plots showing the trends and patterns it forecasts.
#Load the required libraries
library(prophet)
## Loading required package: Rcpp
## Loading required package: rlang
Now we will use our already prepared data to fit the model with Prophet.
model_car <- prophet(cars_df)
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
#We create a future dataframe of 24 months
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")
#We check if the dates have been created correctly
head(future_date)
## ds
## 1 1960-01-01
## 2 1960-02-01
## 3 1960-03-01
## 4 1960-04-01
## 5 1960-05-01
## 6 1960-06-01
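Note that `head()` only shows the first historical rows; the 24 new months sit at the end of the data frame. The base R sketch below reproduces the dates a monthly future frame should contain. It assumes the history runs from January 1960 to December 1968, which is what the forecast's final date of 1970-12-01 minus 24 months implies.

```r
# Assumption: the history runs monthly from 1960-01-01 to 1968-12-01
history_dates <- seq(as.Date("1960-01-01"), as.Date("1968-12-01"), by = "month")

# 24 further month starts after the last historical date
future_only <- seq(max(history_dates), by = "month", length.out = 25)[-1]

range(future_only)                        # first and last forecast month
length(c(history_dates, future_only))     # history plus 24 future rows
```

With the fitted objects from this post, `tail(future_date, 24)` should show the same 24 dates.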
We can now visualise the future car registrations!
forecast_value <- predict(model_car, future_date)
#We view the first few rows
head(forecast_value)
## ds trend additive_terms additive_terms_lower additive_terms_upper
## 1 1960-01-01 9590.512 -3458.501 -3458.501 -3458.501
## 2 1960-02-01 9694.705 -2937.247 -2937.247 -2937.247
## 3 1960-03-01 9792.175 2952.008 2952.008 2952.008
## 4 1960-04-01 9896.368 4917.530 4917.530 4917.530
## 5 1960-05-01 9997.199 6166.221 6166.221 6166.221
## 6 1960-06-01 10101.391 2991.111 2991.111 2991.111
## yearly yearly_lower yearly_upper multiplicative_terms
## 1 -3458.501 -3458.501 -3458.501 0
## 2 -2937.247 -2937.247 -2937.247 0
## 3 2952.008 2952.008 2952.008 0
## 4 4917.530 4917.530 4917.530 0
## 5 6166.221 6166.221 6166.221 0
## 6 2991.111 2991.111 2991.111 0
## multiplicative_terms_lower multiplicative_terms_upper yhat_lower yhat_upper
## 1 0 0 4513.355 7752.653
## 2 0 0 5099.283 8352.413
## 3 0 0 11179.296 14314.760
## 4 0 0 13209.004 16315.038
## 5 0 0 14686.363 17680.759
## 6 0 0 11663.731 14688.822
## trend_lower trend_upper yhat
## 1 9590.512 9590.512 6132.011
## 2 9694.705 9694.705 6757.458
## 3 9792.175 9792.175 12744.183
## 4 9896.368 9896.368 14813.898
## 5 9997.199 9997.199 16163.420
## 6 10101.391 10101.391 13092.503
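The columns in this output fit together in a simple way. The model here is additive with only a yearly seasonal component (all the multiplicative terms are 0), so each prediction `yhat` should equal `trend + yearly`. A quick check using the six rows printed above:

```r
# Values copied from the head(forecast_value) output above
forecast_head <- data.frame(
  trend  = c(9590.512, 9694.705, 9792.175, 9896.368, 9997.199, 10101.391),
  yearly = c(-3458.501, -2937.247, 2952.008, 4917.530, 6166.221, 2991.111),
  yhat   = c(6132.011, 6757.458, 12744.183, 14813.898, 16163.420, 13092.503)
)

# Rebuild the prediction from its components
rebuilt <- forecast_head$trend + forecast_head$yearly

# Largest discrepancy; anything around 0.001 is just rounding in the printout
max(abs(rebuilt - forecast_head$yhat))
```

For January 1960, for example, a trend of 9590.512 and a seasonal effect of -3458.501 give the prediction 6132.011.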
plot(model_car, forecast_value)
To help us understand better, we can also plot the trend and the seasonality separately.
This confirms that there is a continuous long-term increase, along with a seasonal pattern that repeats in a similar way from year to year.
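Prophet provides `prophet_plot_components()` for this kind of decomposition plot. The block below is a self-contained sketch that repeats the data preparation and model fit from earlier in this post and then draws the components; it needs an internet connection to download the CSV.

```r
library(prophet)

# Same data preparation as earlier in this post
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y  = cars$Sales
)

# Fit the model and forecast 24 months ahead, as above
model_car <- prophet(cars_df)
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")
forecast_value <- predict(model_car, future_date)

# One panel per component: the trend, then the yearly seasonality
prophet_plot_components(model_car, forecast_value)
```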
✔️ As we saw above, the long-term trend of new car registrations is
increasing, and the forecast continues it.
✔️ In the plot where we visualised our forecast, the shaded region around
the line represents the uncertainty interval: actual registrations
could fall anywhere in that band, depending on circumstances the model
cannot foresee.
✔️ The seasonality we saw earlier still exists: the number of car
registrations is higher in certain months of the year.
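To make the uncertainty band concrete, we can look at the `yhat_lower` and `yhat_upper` columns. Prophet's `interval.width` argument defaults to 0.8, so, assuming that default was kept (as in the `prophet(cars_df)` call above), the band is an 80% interval and some actual values are expected to fall outside it. The sketch below uses the six rows printed earlier, together with the actual values from `head(cars_df)`.

```r
# Actual values from head(cars_df) and bounds from head(forecast_value)
forecast_head <- data.frame(
  y          = c(6550, 8728, 12026, 14395, 14587, 13791),
  yhat_lower = c(4513.355, 5099.283, 11179.296, 13209.004, 14686.363, 11663.731),
  yhat_upper = c(7752.653, 8352.413, 14314.760, 16315.038, 17680.759, 14688.822)
)

# Width of the shaded band for each month
width <- forecast_head$yhat_upper - forecast_head$yhat_lower
round(width)

# Which actual values fall inside the band?
inside <- forecast_head$y >= forecast_head$yhat_lower &
          forecast_head$y <= forecast_head$yhat_upper
sum(inside)   # 4 of these 6 months are covered
```

February and May 1960 fall just outside the band in this small sample, which is not surprising for an interval that is designed to miss some of the time.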
summary(forecast_value)
## ds trend additive_terms
## Min. :1960-01-01 00:00:00 Min. : 9591 Min. :-4892.54
## 1st Qu.:1962-09-23 12:00:00 1st Qu.:12940 1st Qu.:-2898.45
## Median :1965-06-16 00:00:00 Median :16135 Median : -698.26
## Mean :1965-06-16 08:43:38 Mean :15433 Mean : -11.27
## 3rd Qu.:1968-03-08 18:00:00 3rd Qu.:17946 3rd Qu.: 2961.78
## Max. :1970-12-01 00:00:00 Max. :19743 Max. : 6679.06
## additive_terms_lower additive_terms_upper yearly
## Min. :-4892.54 Min. :-4892.54 Min. :-4892.54
## 1st Qu.:-2898.45 1st Qu.:-2898.45 1st Qu.:-2898.45
## Median : -698.26 Median : -698.26 Median : -698.26
## Mean : -11.27 Mean : -11.27 Mean : -11.27
## 3rd Qu.: 2961.78 3rd Qu.: 2961.78 3rd Qu.: 2961.78
## Max. : 6679.06 Max. : 6679.06 Max. : 6679.06
## yearly_lower yearly_upper multiplicative_terms
## Min. :-4892.54 Min. :-4892.54 Min. :0
## 1st Qu.:-2898.45 1st Qu.:-2898.45 1st Qu.:0
## Median : -698.26 Median : -698.26 Median :0
## Mean : -11.27 Mean : -11.27 Mean :0
## 3rd Qu.: 2961.78 3rd Qu.: 2961.78 3rd Qu.:0
## Max. : 6679.06 Max. : 6679.06 Max. :0
## multiplicative_terms_lower multiplicative_terms_upper yhat_lower
## Min. :0 Min. :0 Min. : 4431
## 1st Qu.:0 1st Qu.:0 1st Qu.:10748
## Median :0 Median :0 Median :13690
## Mean :0 Mean :0 Mean :13842
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:16858
## Max. :0 Max. :0 Max. :24335
## yhat_upper trend_lower trend_upper yhat
## Min. : 7467 Min. : 9591 Min. : 9591 Min. : 5900
## 1st Qu.:13972 1st Qu.:12940 1st Qu.:12940 1st Qu.:12322
## Median :16904 Median :16135 Median :16135 Median :15277
## Mean :16998 Mean :15425 Mean :15440 Mean :15422
## 3rd Qu.:19932 3rd Qu.:17946 3rd Qu.:17946 3rd Qu.:18388
## Max. :27523 Max. :19637 Max. :19845 Max. :25860
With the help of Prophet, we were able to get loads of insight into the number of new car registrations in the UK over the years! We could spot trends and seasonality that make sense and coincide with society’s changing demands. We now have a predictive model that highlights key trends and patterns for exploring the future, and with the help of Prophet and data science we could build even more!