Time Series: ARIMA Modeling

Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017)

The data come from the Central Bank of Nigeria Statistics Database.

Determining if the Time Series is Stationary

Using the tsdisplay function from the R forecast package, I generated the ACF and PACF plots. The ACF decays slowly, which suggests that the time series is not stationary.

tsdisplay(loan_to_deposit)

Determining the Differencing Order to Make Time Series Stationary

The original data contain a trend. To make the series stationary I used the R diff function. To determine the order of differencing I used the ndiffs function from the forecast package: ndiffs(loan_to_deposit) = 1

level_of_difference <- ndiffs(loan_to_deposit)
diff_data <- diff(loan_to_deposit, differences = level_of_difference)
plot(diff_data, ylab="first order difference loans-to-deposit")

tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

The ACF above shows only one autocorrelation outside the 95% limits. Let's check whether taking the log of the series before first-order differencing removes that remaining spike.

level_of_difference <- ndiffs(log(loan_to_deposit))
diff_log_data <- diff(log(loan_to_deposit), differences = level_of_difference)
plot(diff_log_data, ylab="first order difference log(loans-to-deposit)")

tsdisplay(diff_log_data, main="first order difference of log(loans-to-deposit) ratio time series")

The ACF of the first-order difference of the log-transformed series indicates that the differenced log series is white noise.

The first-order difference of the log-transformed data is used to fit the ARIMA model. The R auto.arima function selects the orders p, d and q automatically.

arima_fit <- auto.arima(diff_log_data, seasonal = FALSE)
arima_fit
Series: diff_log_data 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
      intercept
         0.0015
s.e.     0.0071

sigma^2 estimated as 0.006225:  log likelihood=137.84
AIC=-271.68   AICc=-271.58   BIC=-266.06
plot(diff_log_data, main="ARIMA(0,0,0) Modeling \n the difference time series of log of Loan-to-Deposit Ratio")
lines(fitted(arima_fit), col="red") 
legend("bottomright",c("data","ARIMA(0,0,0)"),col=c("black","red"),lty=c(1,1))

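Following the Forecasting: Principles and Practice text, when the differenced series is white noise with a non-zero mean, the model for the original series, log(loan_to_deposit), is a random walk with drift: \(y_t = 0.0015 + y_{t-1} + e_t\), where \(y_t\) is log(loan_to_deposit) in month \(t\) and \(e_t\) is white noise.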

ARIMA Model using auto.arima for the log(loan_to_deposit) Time Series

arima_fit_auto <- auto.arima(log(loan_to_deposit))
summary(arima_fit_auto)
Series: log(loan_to_deposit) 
ARIMA(0,1,0) with drift         

Coefficients:
       drift
      0.0015
s.e.  0.0071

sigma^2 estimated as 0.006225:  log likelihood=137.84
AIC=-271.68   AICc=-271.58   BIC=-266.06

Training set error measures:
                       ME       RMSE        MAE         MPE     MAPE      MASE        ACF1
Training set 3.386398e-05 0.07858044 0.04448429 -0.02037677 1.103638 0.2131183 -0.09051223
plot(log(loan_to_deposit), main="ARIMA(0,1,0) Modeling \n of the log of Loan-to-Deposit Ratio Time Series")
lines(fitted(arima_fit_auto), col="red") 
legend("bottomleft",c("data","ARIMA(0,1,0)"),col=c("black","red"),lty=c(1,1))

Simple Exponential Smoothing Model of the log(loan_to_deposit) Time Series using R ses

ses_model <- ses(log(loan_to_deposit))
summary(ses_model)

Forecast method: Simple exponential smoothing

Model Information:
ETS(A,N,N) 

Call:
 ses(x = log(loan_to_deposit)) 

  Smoothing parameters:
    alpha = 0.9087 

  Initial states:
    l = 4.1984 

  sigma:  0.0783

      AIC      AICc       BIC 
-30.07517 -29.97600 -24.43461 

Error measures:
                      ME       RMSE       MAE        MPE     MAPE      MASE         ACF1
Training set 0.001656168 0.07827393 0.0446254 0.01718334 1.107703 0.2137943 0.0002983321

Forecasts:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
May 2017       4.385022 4.284710 4.485334 4.231608 4.538436
Jun 2017       4.385022 4.249481 4.520563 4.177730 4.592314
Jul 2017       4.385022 4.221681 4.548362 4.135214 4.634830
Aug 2017       4.385022 4.197969 4.572075 4.098949 4.671095
Sep 2017       4.385022 4.176941 4.593103 4.066789 4.703254
Oct 2017       4.385022 4.157851 4.612193 4.037594 4.732450
Nov 2017       4.385022 4.140246 4.629798 4.010669 4.759375
Dec 2017       4.385022 4.123824 4.646219 3.985555 4.784489
Jan 2018       4.385022 4.108376 4.661668 3.961929 4.808115
Feb 2018       4.385022 4.093746 4.676298 3.939554 4.830490
plot(log(loan_to_deposit), main="ETS(A,N,N) Modeling \n of the log of Loan-to-Deposit Ratio Time Series")
lines(fitted(ses_model), col="red") 
legend("bottomright",c("data","ETS(A,N,N)"),col=c("black","red"),lty=c(1,1))

The SES model corresponds to ETS(A,N,N): additive errors, no trend and no seasonality.

Comparing SES and ARIMA for the log(loan_to_deposit) Time Series

Comparing the SES and ARIMA models using the AICc: the ARIMA(0,1,0) model with drift has an AICc of -271.58, while the SES model, ETS(A,N,N) with an alpha of 0.9087, has an AICc of -29.976.

Taken at face value, the AICc favours the ARIMA model. However, the two values are not directly comparable: the models belong to different classes and their likelihoods are computed in different ways (the ARIMA likelihood is also based on the differenced series). The large gap between -29.976 and -271.58 therefore says little about relative fit. The training-set RMSE values are nearly identical (0.0786 for ARIMA and 0.0783 for SES), and the residual plot shows very little difference between the two models.

Based on the residual plot below, the two models seem to perform very similarly.

Comparing the Residuals of the SES and ARIMA model

plot(residuals(arima_fit_auto), main="Comparing the Residuals of \nSES and ARIMA model", ylab="residual")
lines(residuals(ses_model), col="red")
legend("topleft",c("ARIMA(0,1,0)","ETS(A,N,N)"),col=c("black","red"),lty=c(1,1))

Plotting the log(loan_to_deposit), ARIMA(0,1,0) and ETS(A,N,N) on one plot

plot(log(loan_to_deposit), main="log(loan_to_deposit) time series, ARIMA(0,1,0) and ETS(A,N,N) models in one Plot")
lines(fitted(ses_model), col="red") 
lines(fitted(arima_fit_auto), col="blue") 
legend("bottomright",c("data","ETS(A,N,N)","ARIMA(0,1,0)"),col=c("black","red","blue"),lty=c(1,1))

Analysing the Residuals

To check whether the residuals are white noise, I plotted the ACF and PACF of the residuals of each model, ARIMA and SES. The plots show that both models produce residuals that are approximately white noise.

tsdisplay(arima_fit_auto$residuals, main="Analysis of ARIMA(0,1,0) Residual")

tsdisplay(ses_model$residuals, main="Analysis of ETS(A,N,N) Residuals")

---
title: "Applying ARIMA Modeling to Time Series"
author: "Adebayo Aderibigbe"
output:
  html_notebook: default
  html_document: default
  github_document: default
  pdf_document: default
  word_document: default
---

### Time Series: ARIMA Modeling
### Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017)
The data come from the [Central Bank of Nigeria Statistics Database](http://statistics.cbn.gov.ng/cbn-onlinestats/DataBrowser.aspx).

```{r  include=FALSE}
library("forecast")
library("fpp")

```


```{r, include=FALSE}
#data source
dmb_data <- read.csv("DMB_data.csv")

#convert data to time series
ts_data <- ts(dmb_data[,(3:5)],frequency=12,start=c(2007,1))

#loan_to_deposit
loan_to_deposit <- ts_data[,1]
```

```{r, include=FALSE}
#create fit
fit <- stl(loan_to_deposit,t.window=15, s.window="periodic", robust=TRUE)
```


```{r  echo=FALSE}
#par(cex.axis=1.5, cex.lab=1.5)
ts.plot(loan_to_deposit,xlab="Year",ylab="Loans-to-Deposit Ratio",main="Monthly Loan-to-Deposit Ratio  for Nigeria Deposit Money Banks \n Jan. 2007 - Apr. 2017")
lines(fit$time.series[,2],col="red",ylab="Trend")
legend("bottomright",c("data","trend"),col=c("black","red"),lty=c(1,1))
```

### Determining if the Time Series is Stationary

Using the **tsdisplay** function from the R forecast package, I generated the ACF and PACF plots. The ACF decays slowly, which suggests that the time series is not stationary.


```{r }
tsdisplay(loan_to_deposit)
```
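
To see why a slowly decaying ACF points to non-stationarity, the sketch below simulates a random walk and counts how many of the first 12 autocorrelations fall outside the approximate 95% white-noise bounds, before and after differencing. The series is synthetic (it is not the CBN data), the seed and length are assumptions, and the block uses a plain `r` fence so it is shown rather than evaluated when the notebook is knitted.

```r
# Synthetic illustration only: a simulated random walk, not the CBN series.
set.seed(42)
n <- 124                          # months from Jan. 2007 to Apr. 2017
random_walk <- ts(cumsum(rnorm(n)), frequency = 12, start = c(2007, 1))

# Sample autocorrelations at lags 1..12 (drop lag 0, which is always 1)
acf_level <- acf(random_walk, lag.max = 12, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(random_walk), lag.max = 12, plot = FALSE)$acf[-1]

# Approximate 95% white-noise bounds, as drawn on ACF plots
bound_level <- 1.96 / sqrt(n)
bound_diff  <- 1.96 / sqrt(n - 1)

# Count how many of the first 12 lags fall outside the bounds
outside_level <- sum(abs(acf_level) > bound_level)
outside_diff  <- sum(abs(acf_diff) > bound_diff)
print(c(level = outside_level, differenced = outside_diff))
```

For a non-stationary series most of the early lags typically sit outside the bounds, while the differenced series usually shows few or none.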


#### Determining the Differencing Order to Make Time Series Stationary


The original data contain a trend. To make the series stationary I used the R **diff** function.
To determine the order of differencing I used the **ndiffs** function from the forecast package: ndiffs(loan_to_deposit) = `r ndiffs(loan_to_deposit) `


```{r }
level_of_difference <- ndiffs(loan_to_deposit)
# pass the order by name: the second positional argument of diff() is lag
diff_data <- diff(loan_to_deposit, differences = level_of_difference)

plot(diff_data, ylab="first order difference loans-to-deposit")
tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

```
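
One detail of base R `diff` is worth keeping in mind here: its second positional argument is `lag`, not `differences`. The two coincide when the value is 1, as it is for this series, but they diverge for higher orders. The toy vector below is made up purely to show the difference.

```r
# Toy data (perfect squares), chosen so the results are easy to verify by hand
x <- c(1, 4, 9, 16, 25)

# Second positional argument is `lag`: x[t] - x[t-2]
lag_two <- diff(x, 2)

# Second-order differencing: the difference of the first differences
second_order <- diff(x, differences = 2)

print(lag_two)       # 8 12 16
print(second_order)  # 2 2 2
```

Passing the value returned by `ndiffs` through the named `differences` argument keeps the code correct if a series ever needs more than one round of differencing.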

The ACF above shows only one autocorrelation outside the 95% limits.
Let's check whether taking the log of the series before first-order differencing removes that remaining spike.


```{r }
level_of_difference <- ndiffs(log(loan_to_deposit))
diff_log_data <- diff(log(loan_to_deposit), differences = level_of_difference)
plot(diff_log_data, ylab="first order difference log(loans-to-deposit)")
tsdisplay(diff_log_data, main="first order difference of log(loans-to-deposit) ratio time series")

```

The ACF of the first-order difference of the log-transformed series indicates that the differenced log series is white noise.



The first-order difference of the log-transformed data is used to fit the ARIMA model.
The R **auto.arima** function selects the orders p, d and q automatically.

```{r }
arima_fit <- auto.arima(diff_log_data, seasonal = FALSE)
arima_fit


plot(diff_log_data, main="ARIMA(0,0,0) Modeling \n the difference time series of log of Loan-to-Deposit Ratio")
lines(fitted(arima_fit), col="red") 
legend("bottomright",c("data","ARIMA(0,0,0)"),col=c("black","red"),lty=c(1,1))
```

Following the [Forecasting: Principles and Practice text](https://www.otexts.org/fpp/8/1), when the differenced series is white noise with a non-zero mean, the model for the original series, log(loan_to_deposit), is a random walk with drift: $y_t = 0.0015 + y_{t-1} + e_t$, where $y_t$ is log(loan_to_deposit) in month $t$ and $e_t$ is white noise.
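
The random-walk-with-drift form can be checked numerically. The sketch below uses a synthetic series (the starting level, drift and noise scale are assumptions that only loosely echo the fitted output) to show that the drift estimate is the mean of the first differences, and that fitted values and forecasts follow directly from the equation.

```r
# Synthetic illustration: a random walk with drift on a log-like scale.
set.seed(1)
n     <- 124
drift <- 0.0015
y <- 4.2 + cumsum(c(0, drift + rnorm(n - 1, sd = 0.08)))

# The sample mean of the first differences estimates the drift ...
c_hat <- mean(diff(y))
# ... and telescopes to (last - first) / (n - 1)
c_check <- (y[n] - y[1]) / (n - 1)

# One-step fitted values: previous observation plus the drift
fitted_vals <- c(NA, y[-n] + c_hat)

# h-step point forecasts: last observation plus h times the drift
h <- 1:6
point_forecast <- y[n] + h * c_hat
print(round(point_forecast, 4))
```

This also explains why the mean of the ARIMA(0,0,0) fit on the differenced log series reappears as the drift of an ARIMA(0,1,0) fit on the log series itself.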

#### ARIMA Model using auto.arima for the log(loan_to_deposit) Time Series
```{r}
arima_fit_auto <- auto.arima(log(loan_to_deposit))
summary(arima_fit_auto)

plot(log(loan_to_deposit), main="ARIMA(0,1,0) Modeling \n of the log of Loan-to-Deposit Ratio Time Series")
lines(fitted(arima_fit_auto), col="red") 
legend("bottomleft",c("data","ARIMA(0,1,0)"),col=c("black","red"),lty=c(1,1))
```

#### Simple Exponential Smoothing Model of the log(loan_to_deposit) Time Series using R ses
```{r}
ses_model <- ses(log(loan_to_deposit))
summary(ses_model)

plot(log(loan_to_deposit), main="ETS(A,N,N) Modeling \n of the log of Loan-to-Deposit Ratio Time Series")
lines(fitted(ses_model), col="red") 
legend("bottomright",c("data","ETS(A,N,N)"),col=c("black","red"),lty=c(1,1))

```

The SES model corresponds to ETS(A,N,N): additive errors, no trend and no seasonality.
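
To make the role of alpha concrete, here is a hand-rolled version of the simple exponential smoothing recursion. The helper `ses_fitted` is hypothetical (it is not part of the forecast package) and the five input values are made up. Only the alpha and initial level are taken from the summary above.

```r
# Hand-rolled simple exponential smoothing, for illustration only.
# `ses_fitted` is a hypothetical helper, not a forecast package function.
ses_fitted <- function(y, alpha, l0) {
  n <- length(y)
  fitted <- numeric(n)
  level <- l0
  for (i in seq_len(n)) {
    fitted[i] <- level                           # forecast of y[i] made at i-1
    level <- alpha * y[i] + (1 - alpha) * level  # update after observing y[i]
  }
  list(fitted = fitted, final_level = level)
}

y <- c(4.20, 4.25, 4.18, 4.30, 4.35)   # made-up values on a log-like scale

# alpha and initial level as reported by summary(ses_model)
fit_high <- ses_fitted(y, alpha = 0.9087, l0 = 4.1984)

# With alpha = 1 the fitted value is simply the previous observation
fit_one <- ses_fitted(y, alpha = 1, l0 = y[1])

print(round(fit_high$fitted, 4))
print(fit_one$fitted)
```

With alpha close to 1 the fitted values stay close to the previous observation, which is also what an ARIMA(0,1,0) model predicts apart from its small drift term. That is one plausible reason the two fits look so alike.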

### Comparing SES and ARIMA for the log(loan_to_deposit) Time Series
Comparing the SES and ARIMA models using the AICc: the ARIMA(0,1,0) model with drift has an AICc of -271.58, while the SES model, ETS(A,N,N) with an alpha of 0.9087, has an AICc of -29.976.

Taken at face value, the AICc favours the ARIMA model. However, the two values are not directly comparable: the models belong to different classes and their likelihoods are computed in different ways (the ARIMA likelihood is also based on the differenced series). The large gap between -29.976 and -271.58 therefore says little about relative fit. The training-set RMSE values are nearly identical (0.0786 for ARIMA and 0.0783 for SES), and the residual plot shows very little difference between the two models.

Based on the residual plot below, the two models seem to perform very similarly.
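
Since AICc values from ETS and ARIMA fits do not rest on a comparable likelihood, one alternative is to compare forecast errors on a common hold-out period. The sketch below is only an outline under stated assumptions: the series is synthetic, the 12-month hold-out is an arbitrary choice, the random walk with drift is computed by hand, and base R `HoltWinters` (with trend and seasonality switched off) stands in for `ses`, so its alpha will not match the forecast package's estimate exactly.

```r
# Synthetic series standing in for log(loan_to_deposit); values are assumptions.
set.seed(7)
n <- 124
y <- 4.2 + cumsum(rnorm(n, mean = 0.0015, sd = 0.08))

# Hold out the last 12 months
h <- 12
train <- ts(y[1:(n - h)], frequency = 12, start = c(2007, 1))
test  <- y[(n - h + 1):n]

# Random walk with drift: last value plus h times the mean first difference
drift <- mean(diff(train))
rw_forecast <- as.numeric(train[length(train)]) + seq_len(h) * drift

# Simple exponential smoothing via base R HoltWinters (no trend, no seasonality)
ses_fit <- HoltWinters(train, beta = FALSE, gamma = FALSE)
ses_forecast <- as.numeric(predict(ses_fit, n.ahead = h))

# Root mean squared error on the hold-out period
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
scores <- c(rw_drift = rmse(test, rw_forecast),
            ses      = rmse(test, ses_forecast))
print(round(scores, 4))
```

A single hold-out split can be noisy for a series of this length, so a rolling-origin evaluation would give a steadier comparison.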

#### Comparing the Residuals of the SES and ARIMA model
```{r}
plot(residuals(arima_fit_auto), main="Comparing the Residuals of \nSES and ARIMA model", ylab="residual")
lines(residuals(ses_model), col="red")
legend("topleft",c("ARIMA(0,1,0)","ETS(A,N,N)"),col=c("black","red"),lty=c(1,1))
```

#### Plotting the log(loan_to_deposit), ARIMA(0,1,0) and ETS(A,N,N) on one plot
```{r}
plot(log(loan_to_deposit), main="log(loan_to_deposit) time series, ARIMA(0,1,0) and ETS(A,N,N) models in one Plot")
lines(fitted(ses_model), col="red") 
lines(fitted(arima_fit_auto), col="blue") 
legend("bottomright",c("data","ETS(A,N,N)","ARIMA(0,1,0)"),col=c("black","red","blue"),lty=c(1,1))

```
#### Analysing the Residuals
To check whether the residuals are white noise, I plotted the ACF and PACF of the residuals of each model, ARIMA and SES. The plots show that both models produce residuals that are approximately white noise.

```{r}
tsdisplay(arima_fit_auto$residuals, main="Analysis of ARIMA(0,1,0) Residual")

tsdisplay(ses_model$residuals, main="Analysis of ETS(A,N,N) Residuals")
```
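
A visual check of the ACF can be complemented with a Ljung-Box test, which is available in base R as `Box.test`. The sketch below runs it on synthetic data so that it is self-contained. In the notebook one would pass `residuals(arima_fit_auto)` or `residuals(ses_model)` instead. The lag of 10 is an assumption, not a value taken from the analysis above.

```r
# Synthetic residuals for illustration: one white-noise series and one
# strongly autocorrelated AR(1) series for contrast.
set.seed(123)
white_noise <- rnorm(123, sd = 0.08)
ar1_series  <- as.numeric(arima.sim(model = list(ar = 0.8), n = 123))

# Ljung-Box test: the null hypothesis is no autocorrelation up to `lag`.
# fitdf is the number of estimated ARMA coefficients (0 for ARIMA(0,1,0)).
lb_noise <- Box.test(white_noise, lag = 10, type = "Ljung-Box", fitdf = 0)
lb_ar1   <- Box.test(ar1_series,  lag = 10, type = "Ljung-Box", fitdf = 0)

print(c(white_noise = lb_noise$p.value, ar1 = lb_ar1$p.value))
```

A large p-value is consistent with white-noise residuals, while a small one signals autocorrelation that the model has left unexplained.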

