
Introduction🚗

In this project we will analyse monthly new car registrations in the UK and then use Prophet to forecast future registrations from the trend and seasonality in the historical data.

Aims of this project🎯

📌 Explore the monthly car registrations dataset
📌 Understand patterns and trends of car sales over the years
📌 Use Prophet to forecast future car registrations
📌 Interpret results and graphs

1.1 Exploring our dataset🔍

The car registration dataset we will use contains monthly numbers of new car registrations from 1960 onwards. This will allow us to analyse its trend and seasonality. We can then convert the data into the format Prophet expects: a data frame with a date column named ds and a numeric column named y.

# First we download our data; read.csv can read directly from a URL
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
# Prophet expects a data frame with a date column `ds` and a numeric column `y`
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),  # one date per month, starting January 1960
  y = cars$Sales
)
# We can view the first few rows of the raw data and of the converted data
head(cars)
##     Month Sales
## 1 1960-01  6550
## 2 1960-02  8728
## 3 1960-03 12026
## 4 1960-04 14395
## 5 1960-05 14587
## 6 1960-06 13791
head(cars_df)
##           ds     y
## 1 1960-01-01  6550
## 2 1960-02-01  8728
## 3 1960-03-01 12026
## 4 1960-04-01 14395
## 5 1960-05-01 14587
## 6 1960-06-01 13791
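
As a cross-check on the conversion above, the dates can also be parsed from the Month column instead of being generated with seq. This is a minimal sketch that uses only the six rows printed above as a sample; the names cars_sample, ds_parsed and ds_generated are assumptions for illustration.

```r
# Sample: the first six rows of the raw data, copied from the output above
cars_sample <- data.frame(
  Month = c("1960-01", "1960-02", "1960-03", "1960-04", "1960-05", "1960-06"),
  Sales = c(6550, 8728, 12026, 14395, 14587, 13791)
)

# Append a day so as.Date can parse the "YYYY-MM" strings
ds_parsed <- as.Date(paste0(cars_sample$Month, "-01"))

# The dates generated with seq, as in the conversion above
ds_generated <- seq(as.Date("1960-01-01"), by = "month",
                    length.out = nrow(cars_sample))

# TRUE when both approaches give the same dates
all(ds_parsed == ds_generated)
```

Parsing the Month column would also reveal a missing or repeated month, which the seq approach would silently hide.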

1.2 Observe our current data💻

First let’s plot a graph of our data and make some observations!
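
One possible way to draw this graph is with base R graphics. This is only a sketch: the axis labels and title are assumptions, and the data frame is rebuilt so that the snippet runs on its own.

```r
# Rebuild the Prophet-style data frame so this snippet is self-contained
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)

# Line plot of the full monthly series
plot(cars_df$ds, cars_df$y, type = "l",
     xlab = "Month", ylab = "New car registrations",
     main = "Monthly new car registrations")
```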

Now we can zoom in to understand the spikes that occur.
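
A zoomed-in view can be sketched by keeping only a few years of data. The three-year window below (1960 to 1962) and the name zoom_df are assumptions chosen for illustration.

```r
# Rebuild the Prophet-style data frame so this snippet is self-contained
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)

# Keep a three-year window (an assumed window, picked only as an example)
zoom_df <- subset(cars_df,
                  ds >= as.Date("1960-01-01") & ds <= as.Date("1962-12-01"))

# Points plus lines make the individual months visible
plot(zoom_df$ds, zoom_df$y, type = "b", xaxt = "n",
     xlab = "", ylab = "New car registrations",
     main = "Monthly new car registrations, 1960 to 1962")

# Label the axis every three months so the spikes can be matched to months
ticks <- seq(min(zoom_df$ds), max(zoom_df$ds), by = "3 months")
axis(1, at = as.numeric(ticks), labels = format(ticks, "%b %Y"),
     las = 2, cex.axis = 0.7)
```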

From the graphs we can clearly see two things:
✔️ An upward trend over the years
✔️ Seasonal spikes

There are several possible reasons for these patterns. The number of car registrations may be increasing over time because of growth in the UK automotive industry, which encourages people to buy newer cars, so overall car ownership has risen steadily. Our second plot also shows that new registrations peak in spring or at the end of the year. The spring peak could reflect people getting ready for the summer holiday months, when they are likely to travel more. The end-of-year peak could reflect a more stable financial situation after, for example, bonuses at work.

We can also view another version of the original graph in which the variance of the seasonality is more stable. This is achieved with a log transformation.
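
The log-transformed view can be sketched as follows. When seasonal swings grow with the level of the series, taking logs turns that multiplicative pattern into a roughly additive one, so the spikes look more even in size. The two-panel layout and labels are assumptions.

```r
# Rebuild the Prophet-style data frame so this snippet is self-contained
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)

# Stack the original and the log-transformed series for comparison
op <- par(mfrow = c(2, 1))
plot(cars_df$ds, cars_df$y, type = "l",
     xlab = "Month", ylab = "Registrations", main = "Original scale")
plot(cars_df$ds, log(cars_df$y), type = "l",
     xlab = "Month", ylab = "log(registrations)", main = "Log scale")
par(op)  # restore the previous plotting layout
```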

If you are interested, here is an interactive graph for you to explore!💡

#Load the required libraries
library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
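
One way to build such an interactive graph with plotly is sketched below. The chart type, title and axis labels are assumptions; only plot_ly and layout from the plotly package are used.

```r
library(plotly)

# Rebuild the Prophet-style data frame so this snippet is self-contained
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)

# Interactive line chart: hover to read values, drag to zoom
fig <- plot_ly(cars_df, x = ~ds, y = ~y, type = "scatter", mode = "lines")
fig <- layout(fig,
              title = "Monthly new car registrations",
              xaxis = list(title = "Month"),
              yaxis = list(title = "New car registrations"))

fig  # printing the object renders the interactive widget
```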

1.3 Let’s fit the model with Prophet🧩

After exploring our data and making our observations, it’s time to finally prepare our forecast! We first have to fit our model; we can then create plots that show the trends and patterns it forecasts.

#Load the required libraries  
library(prophet)
## Loading required package: Rcpp
## Loading required package: rlang

Now we will use our already prepared data to fit the model with Prophet.

model_car <- prophet(cars_df)
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
#We create a future dataframe of 24 months
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")
#We check if the dates have been created correctly
head(future_date)
##           ds
## 1 1960-01-01
## 2 1960-02-01
## 3 1960-03-01
## 4 1960-04-01
## 5 1960-05-01
## 6 1960-06-01
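
Note that head only shows the start of the historical dates. To check the newly created future dates, it may be more useful to look at the end of the data frame. A minimal sketch, refitting the model so that the snippet runs on its own:

```r
library(prophet)

# Rebuild the data and refit the model so this snippet is self-contained
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)
model_car <- prophet(cars_df)
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")

# The last rows are the future months that head() does not show
tail(future_date)

# Number of rows added on top of the history; should equal `periods`
nrow(future_date) - nrow(cars_df)
```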

Forecasting with Prophet🫧

We can now visualise the future car registrations!

forecast_value <- predict(model_car, future_date)
#We view the first few rows
head(forecast_value)
##           ds     trend additive_terms additive_terms_lower additive_terms_upper
## 1 1960-01-01  9590.512      -3458.501            -3458.501            -3458.501
## 2 1960-02-01  9694.705      -2937.247            -2937.247            -2937.247
## 3 1960-03-01  9792.175       2952.008             2952.008             2952.008
## 4 1960-04-01  9896.368       4917.530             4917.530             4917.530
## 5 1960-05-01  9997.199       6166.221             6166.221             6166.221
## 6 1960-06-01 10101.391       2991.111             2991.111             2991.111
##      yearly yearly_lower yearly_upper multiplicative_terms
## 1 -3458.501    -3458.501    -3458.501                    0
## 2 -2937.247    -2937.247    -2937.247                    0
## 3  2952.008     2952.008     2952.008                    0
## 4  4917.530     4917.530     4917.530                    0
## 5  6166.221     6166.221     6166.221                    0
## 6  2991.111     2991.111     2991.111                    0
##   multiplicative_terms_lower multiplicative_terms_upper yhat_lower yhat_upper
## 1                          0                          0   4513.355   7752.653
## 2                          0                          0   5099.283   8352.413
## 3                          0                          0  11179.296  14314.760
## 4                          0                          0  13209.004  16315.038
## 5                          0                          0  14686.363  17680.759
## 6                          0                          0  11663.731  14688.822
##   trend_lower trend_upper      yhat
## 1    9590.512    9590.512  6132.011
## 2    9694.705    9694.705  6757.458
## 3    9792.175    9792.175 12744.183
## 4    9896.368    9896.368 14813.898
## 5    9997.199    9997.199 16163.420
## 6   10101.391   10101.391 13092.503
#We plot the forecast together with the observed data
plot(model_car, forecast_value)

To understand the forecast better, we can plot its trend and seasonality components separately.
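
A components plot of this kind can be produced with prophet_plot_components. The sketch below refits the model so that it runs on its own; it is one way to get such a plot, not necessarily the exact code behind the original figure.

```r
library(prophet)

# Rebuild the data, refit the model and recompute the forecast
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)
model_car <- prophet(cars_df)
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")
forecast_value <- predict(model_car, future_date)

# One panel for the trend and one for the yearly seasonality
prophet_plot_components(model_car, forecast_value)
```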

This confirms a continuous long-term increase, together with a seasonal pattern that repeats in a similar way each year.

Analysis & Conclusion📊

2.1 Analysing our plots and results📈

✔️ As we saw above, the long-term trend of new car registrations is increasing, and the forecast continues it.
✔️ In the forecast plot, the shaded region around the forecast line represents the uncertainty interval: actual registrations could fall anywhere in this range depending on circumstances.
✔️ The seasonality we saw earlier persists: registrations are higher in certain months of the year.

#We summarise the forecast values
summary(forecast_value)
##        ds                          trend       additive_terms    
##  Min.   :1960-01-01 00:00:00   Min.   : 9591   Min.   :-4892.54  
##  1st Qu.:1962-09-23 12:00:00   1st Qu.:12940   1st Qu.:-2898.45  
##  Median :1965-06-16 00:00:00   Median :16135   Median : -698.26  
##  Mean   :1965-06-16 08:43:38   Mean   :15433   Mean   :  -11.27  
##  3rd Qu.:1968-03-08 18:00:00   3rd Qu.:17946   3rd Qu.: 2961.78  
##  Max.   :1970-12-01 00:00:00   Max.   :19743   Max.   : 6679.06  
##  additive_terms_lower additive_terms_upper     yearly        
##  Min.   :-4892.54     Min.   :-4892.54     Min.   :-4892.54  
##  1st Qu.:-2898.45     1st Qu.:-2898.45     1st Qu.:-2898.45  
##  Median : -698.26     Median : -698.26     Median : -698.26  
##  Mean   :  -11.27     Mean   :  -11.27     Mean   :  -11.27  
##  3rd Qu.: 2961.78     3rd Qu.: 2961.78     3rd Qu.: 2961.78  
##  Max.   : 6679.06     Max.   : 6679.06     Max.   : 6679.06  
##   yearly_lower       yearly_upper      multiplicative_terms
##  Min.   :-4892.54   Min.   :-4892.54   Min.   :0           
##  1st Qu.:-2898.45   1st Qu.:-2898.45   1st Qu.:0           
##  Median : -698.26   Median : -698.26   Median :0           
##  Mean   :  -11.27   Mean   :  -11.27   Mean   :0           
##  3rd Qu.: 2961.78   3rd Qu.: 2961.78   3rd Qu.:0           
##  Max.   : 6679.06   Max.   : 6679.06   Max.   :0           
##  multiplicative_terms_lower multiplicative_terms_upper   yhat_lower   
##  Min.   :0                  Min.   :0                  Min.   : 4431  
##  1st Qu.:0                  1st Qu.:0                  1st Qu.:10748  
##  Median :0                  Median :0                  Median :13690  
##  Mean   :0                  Mean   :0                  Mean   :13842  
##  3rd Qu.:0                  3rd Qu.:0                  3rd Qu.:16858  
##  Max.   :0                  Max.   :0                  Max.   :24335  
##    yhat_upper     trend_lower     trend_upper         yhat      
##  Min.   : 7467   Min.   : 9591   Min.   : 9591   Min.   : 5900  
##  1st Qu.:13972   1st Qu.:12940   1st Qu.:12940   1st Qu.:12322  
##  Median :16904   Median :16135   Median :16135   Median :15277  
##  Mean   :16998   Mean   :15425   Mean   :15440   Mean   :15422  
##  3rd Qu.:19932   3rd Qu.:17946   3rd Qu.:17946   3rd Qu.:18388  
##  Max.   :27523   Max.   :19637   Max.   :19845   Max.   :25860
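
The shaded uncertainty region corresponds to the yhat_lower and yhat_upper columns of the forecast. By default Prophet uses an 80% interval (the interval.width argument of prophet, default 0.8). The sketch below pulls out these columns for the 24 forecast months. The names future_only and interval_width are assumptions for illustration, and the bounds are simulated, so exact values can vary slightly between runs.

```r
library(prophet)

# Rebuild the data, refit the model and recompute the forecast
cars <- read.csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv")
cars_df <- data.frame(
  ds = seq(as.Date("1960-01-01"), by = "month", length.out = nrow(cars)),
  y = cars$Sales
)
model_car <- prophet(cars_df)  # interval.width defaults to 0.8
future_date <- make_future_dataframe(model_car, periods = 24, freq = "month")
forecast_value <- predict(model_car, future_date)

# Keep only the 24 future months and the columns behind the shaded region
future_only <- tail(
  forecast_value[, c("ds", "yhat_lower", "yhat", "yhat_upper")], 24
)
head(future_only)

# Width of the uncertainty interval for each forecast month
future_only$interval_width <- future_only$yhat_upper - future_only$yhat_lower
summary(future_only$interval_width)
```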

Conclusion📝

With the help of Prophet, we gained a lot of insight into the number of new car registrations in the UK over the years! We could spot a trend and seasonality that make sense and coincide with society’s changing demands. We now have a predictive model that highlights the key trends and patterns, and with Prophet and data science we could build on it even further!