Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017)
The data is from the Central Bank of Nigeria Statistics Database.

#check for seasonality
seasonplot(loan_to_deposit,main = "Season Plot: Nigerian Deposit Money Banks loan-to-Deposit Ratio ", year.labels = TRUE,col=1:20,ylab="Percentage(%)",year.labels.left=TRUE,pch=19)

The season plot above does not show a strong seasonal pattern across the years. Apart from 2008, 2014 and 2015, the loan-to-deposit ratio dropped from January to February in every year.

#check for seasonality
plot(decompose ,main="Plot of Decomposition of Time Series - loan-to-deposit ratio")

monthplot(loan_to_deposit, ylab = "data", cex.axis = 0.8, main="Monthly plot of Data")

monthplot(decompose, choice = "seasonal", cex.axis = 0.8, main="Monthly plot of Seasonal ")

monthplot(decompose, choice = "trend", cex.axis = 0.8, main="Monthly plot of Trend")

monthplot(decompose, choice = "remainder", type = "h", cex.axis = 0.8, main="Monthly plot of Remainder")

#check for seasonality
quarter <- (cycle(loan_to_deposit)-1 ) %/% 3
monthplot(loan_to_deposit, phase = quarter, main="Quarterly Plot: Loan-to-deposit Ratio")

The quarterly plot reveals strong quarterly seasonality.

Determining if the Time Series is Stationary

Using the forecast package's tsdisplay function, I generated the time plot together with the ACF and PACF plots. The ACF decays slowly, which suggests that the time series is not stationary.

tsdisplay(loan_to_deposit)

Determining the Differencing Order to Make Time Series Stationary

The original data contains a trend. To convert the series to a stationary one I used the base R diff function. To determine the order of differencing to apply I used the forecast package function ndiffs: ndiffs(loan_to_deposit) = 1
To determine whether seasonal differencing is needed I used nsdiffs: nsdiffs(loan_to_deposit) = 0
The results show that only first-order differencing is required to make the series stationary.

level_of_difference <- ndiffs(loan_to_deposit)
#the second positional argument of diff() is lag, so differences is named explicitly
diff_data <- diff(loan_to_deposit, differences = level_of_difference)
plot(diff_data, ylab="first order difference loans-to-deposit")

tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

The ACF above shows that there is only one autocorrelation that is outside the 95% limit.

Investigating Another Time Series Dataset

The Nigerian Air total passengers travel time series dataset

The dataset is from the World Bank website

plot.ts(nigeria,ylab="Passengers (thousands)", main = "Nigeria Yearly Air Passengers 1970 - 2015")

Investigating Time Series Stationarity: Nigerian Air Travel Passengers

The yearly time series plot above does not contain seasonality, since the data is annual. The plot shows evidence of a trend, hence the series is not stationary.
Inspecting the ACF and PACF plots and the required order of differencing also confirms this.
Required order of differencing: ndiffs(nigeria) = 1
The ADF test p-value of 0.7408 is well above 0.05, so the unit-root null hypothesis is not rejected, which is consistent with a non-stationary series.

tsdisplay(nigeria)

adf.test(nigeria)

    Augmented Dickey-Fuller Test

data:  nigeria
Dickey-Fuller = -1.5803, Lag order = 3, p-value = 0.7408
alternative hypothesis: stationary

First Order Differenced Time Series

tsdisplay(diff(nigeria), main="First Order Difference Time Series ")

adf.test(diff(nigeria))

    Augmented Dickey-Fuller Test

data:  diff(nigeria)
Dickey-Fuller = -2.4792, Lag order = 3, p-value = 0.3837
alternative hypothesis: stationary

The graph above shows the first order differenced time series. The ADF test p-value of 0.3837 is still above 0.05, so this test on its own does not reject a unit root in the differenced series; the choice of a single difference rests on the ndiffs result reported above. The outlier points are still visible in the differenced time series.

Using AUTO.ARIMA to determine the best model for the Air Travel Time series

nigeria_fit <- auto.arima(nigeria)
summary(nigeria_fit)
Series: nigeria 
ARIMA(0,1,0)                    

sigma^2 estimated as 260917:  log likelihood=-344.47
AIC=690.94   AICc=691.04   BIC=692.75

Training set error measures:
                  ME     RMSE      MAE      MPE     MAPE      MASE
Training set 66.3181 505.2179 267.8018 1.934564 20.54023 0.9782746
                  ACF1
Training set 0.1964641

Using ETS(SES) to determine the best model for the Air Travel Time series

nigeria_fit_ses <- ses(nigeria, initial = "simple")
summary(nigeria_fit_ses)

Forecast method: Simple exponential smoothing

Model Information:
 

Call:
 ses(x = nigeria, initial = "simple") 

  Smoothing parameters:
    alpha = 1 

  Initial states:
    l = 173 

  sigma:  505.2179
Error measures:
                   ME     RMSE      MAE     MPE     MAPE      MASE
Training set 66.31434 505.2179 267.7981 1.93239 20.53805 0.9782609
                  ACF1
Training set 0.1964642

Forecasts:
     Point Forecast    Lo 80    Hi 80      Lo 95    Hi 95
2016        3223.46 2575.997 3870.923 2233.25090 4213.669
2017        3223.46 2307.809 4139.110 1823.09295 4623.827
2018        3223.46 2102.021 4344.898 1508.36769 4938.552
2019        3223.46 1928.534 4518.385 1243.04202 5203.878
2020        3223.46 1775.689 4671.231 1009.28542 5437.634
2021        3223.46 1637.506 4809.413  797.95329 5648.966
2022        3223.46 1510.434 4936.485  603.61334 5843.306
2023        3223.46 1392.158 5054.761  422.72613 6024.193
2024        3223.46 1281.071 5165.848  252.83315 6194.086
2025        3223.46 1176.003 5270.917   92.14437 6354.775
plot(nigeria, main= "ARIMA (0,1,0) & SES (alpha=1) Model Plot",ylab="Passengers (thousands)")
lines(fitted(nigeria_fit), col="red")
lines(fitted(nigeria_fit_ses), col="blue")
#legend colours follow the lines drawn above: ARIMA in red, SES in blue
legend("topleft",lty=1, col=c("black","red","blue"), 
       c("Data","ARIMA Model","SES Model"),cex=0.80)

In the plot above the two fitted lines coincide, so only the one drawn last (SES, in blue) is visible: SES with alpha = 1 returns the same fitted values as the ARIMA(0,1,0) model.

Comparing the ARIMA and SES Models Fit Values

#Comparing ARIMA and SES models fit values
arima_model<- window(fitted(nigeria_fit),1970,1990)
ses_model<- window(fitted(nigeria_fit_ses),1970,1990)
ts.union(arima_model,ses_model)
Time Series:
Start = 1970 
End = 1990 
Frequency = 1 
     arima_model ses_model
1970     172.827     173.0
1971     173.000     173.0
1972     227.100     227.1
1973     286.800     286.8
1974     314.100     314.1
1975     430.300     430.3
1976     590.400     590.4
1977     800.800     800.8
1978    1093.900    1093.9
1979    1441.000    1441.0
1980    1581.300    1581.3
1981    1938.500    1938.5
1982    2300.200    2300.2
1983    2138.400    2138.4
1984    2221.300    2221.3
1985    1945.900    1945.9
1986    2575.000    2575.0
1987    2134.000    2134.0
1988    1614.400    1614.4
1989     995.000     995.0
1990     848.900     848.9
#fitted values for 2000 - 2015
arima_model2<- window(fitted(nigeria_fit),2000,2015)
ses_model2<- window(fitted(nigeria_fit_ses),2000,2015)
ts.union(arima_model2,ses_model2)
Time Series:
Start = 2000 
End = 2015 
Frequency = 1 
     arima_model2 ses_model2
2000      419.700    419.700
2001      507.396    507.396
2002      519.453    519.453
2003      520.278    520.278
2004      520.263    520.263
2005      540.461    540.461
2006      747.648    747.648
2007     1307.541   1307.541
2008     1363.435   1363.435
2009     1460.900   1460.900
2010     1365.343   1365.343
2011     4197.375   4197.375
2012     4793.913   4793.913
2013     4716.148   4716.148
2014     4209.624   4209.624
2015     3857.424   3857.424

Investigating the Model Residuals and Outlier impact

nigeria_outliers <-tso(nigeria,types=c("AO","LS","TC","IO"))
plot(residuals(nigeria_fit), main="ARIMA(0,1,0) & SES (alpha = 1) residuals", col="blue", ylab="Residual: Passengers (thousands)")
lines(residuals(nigeria_fit_ses), col="red")
legend("topleft",lty=1, col=c("blue","red"), 
       c("ARIMA Residual","SES Residual"),cex=0.80)

#plotting the outliers
plot(nigeria_outliers)

Box Plot to Verify Outliers

boxplot(nigeria, main="Box Plot of Nigeria  Yearly Air Passengers 1970 - 2015  ")

Reviewing the outlier plot and the model residual plot, the models fit the outlier points poorly compared with the other data points in the time series. The SES and ARIMA models returned the same fitted values for this series.

---
title: "Investigating Time Series Stationarity"
author: "Adebayo Aderibigbe"
output:
  html_notebook: default
  html_document:
    toc: yes
  github_document: default
  pdf_document: default
  word_document: default
---




```{r  include=FALSE}
library("forecast")
library("fpp")

```


```{r, include=FALSE}
#data source
dmb_data <- read.csv("DMB_data.csv")

#convert data to time series
ts_data <- ts(dmb_data[,(3:5)],frequency=12,start=c(2007,1))

#loan_to_deposit
loan_to_deposit <- ts_data[,1]

#stl expects an odd seasonal window, so 13 is used rather than 12
decompose <- stl(loan_to_deposit, s.window=13)
```
  
    

Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017) 
The data is from the [Central Bank of Nigeria Statistics Database](http://statistics.cbn.gov.ng/cbn-onlinestats/DataBrowser.aspx)

```{r  echo=FALSE}
#par(cex.axis=1.5, cex.lab=1.5)
ts.plot(loan_to_deposit,xlab="Year",ylab="Loans-to-Deposit Ratio",main="Monthly Loan-to-Deposit Ratio  \nfor Nigeria Deposit Money Banks Jan. 2007 - Apr. 2017")
lines(decompose$time.series[,2],col="red")
legend("bottomright",c("data","trend"),col=c("black","red"),lty=c(1,1))
```

```{r, }
#check for seasonality
seasonplot(loan_to_deposit,main = "Season Plot: Nigerian Deposit Money Banks loan-to-Deposit Ratio ", year.labels = TRUE,col=1:20,ylab="Percentage(%)",year.labels.left=TRUE,pch=19)

```
The season plot above does not show a strong seasonal pattern across the years. Apart from 2008, 2014 and 2015, the loan-to-deposit ratio dropped from January to February in every year.
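
A visual check like this can be backed by a number. The sketch below computes an STL-based seasonal strength, max(0, 1 - Var(remainder) / Var(seasonal + remainder)), a measure that is not part of the original analysis. The helper name `seasonal_strength` is hypothetical, and the built-in `nottem` series is only a stand-in for the loan-to-deposit data.

```r
# Hypothetical helper: seasonal strength from an STL decomposition.
# Values near 1 suggest strong seasonality, values near 0 suggest little.
seasonal_strength <- function(x, s.window = "periodic") {
  parts <- stl(x, s.window = s.window)$time.series
  remainder <- as.numeric(parts[, "remainder"])
  seasonal  <- as.numeric(parts[, "seasonal"])
  max(0, 1 - var(remainder) / var(seasonal + remainder))
}

# nottem (monthly Nottingham temperatures) ships with R and is used here
# only as a stand-in series with an obvious yearly cycle.
strength <- seasonal_strength(nottem)
print(strength)
```

Applied to `loan_to_deposit`, a value close to zero would agree with the reading of the season plot above.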

```{r, }
#check for seasonality
plot(decompose ,main="Plot of Decomposition of Time Series - loan-to-deposit ratio")

```

```{r, }
monthplot(loan_to_deposit, ylab = "data", cex.axis = 0.8, main="Monthly plot of Data")
monthplot(decompose, choice = "seasonal", cex.axis = 0.8, main="Monthly plot of Seasonal ")
monthplot(decompose, choice = "trend", cex.axis = 0.8, main="Monthly plot of Trend")
monthplot(decompose, choice = "remainder", type = "h", cex.axis = 0.8, main="Monthly plot of Remainder")
```

```{r, }
#check for seasonality
quarter <- (cycle(loan_to_deposit)-1 ) %/% 3
monthplot(loan_to_deposit, phase = quarter, main="Quarterly Plot: Loan-to-deposit Ratio")
```
The quarterly plot reveals strong quarterly seasonality.
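
The `quarter` variable in the chunk above relies on integer division to group months into quarters. A minimal sketch of that mapping, using a plain 1 to 12 vector in place of `cycle(loan_to_deposit)` (the names `month_index` and `quarter_index` are illustrative):

```r
# Month positions 1..12, as cycle() returns for a monthly series
month_index <- 1:12

# Integer division by 3 groups the months into quarters numbered 0..3
quarter_index <- (month_index - 1) %/% 3

# Label each quarter index with its month abbreviation for readability
print(setNames(quarter_index, month.abb))
```

January to March map to 0, April to June to 1, and so on, so `monthplot` draws one panel per quarter rather than one per month.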


### Determining if the Time Series is Stationary

Using the forecast package's **tsdisplay** function, I generated the time plot together with the ACF and PACF plots. The ACF decays slowly, which suggests that the time series is not stationary.
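
To illustrate why a slowly decaying ACF points to non-stationarity, here is a small sketch on simulated data (not the loan-to-deposit series). It assumes a pure random walk as the non-stationary example; the seed and series length are arbitrary.

```r
set.seed(42)                       # arbitrary seed, only for reproducibility

random_walk <- cumsum(rnorm(200))  # non-stationary: shocks accumulate
white_noise <- diff(random_walk)   # first difference recovers the shocks

# Autocorrelations at lags 1..10 (the lag-0 value is dropped)
acf_walk  <- acf(random_walk, lag.max = 10, plot = FALSE)$acf[-1]
acf_noise <- acf(white_noise, lag.max = 10, plot = FALSE)$acf[-1]

# Slow decay for the random walk, values near zero after differencing
print(round(cbind(lag = 1:10, walk = acf_walk, differenced = acf_noise), 2))
```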


```{r }
tsdisplay(loan_to_deposit)
```


### Determining the Differencing Order to Make Time Series Stationary


The original data contains a trend. To convert the series to a stationary one I used the base R **diff** function.  
To determine the order of differencing to apply I used the forecast package function ndiffs: ndiffs(loan_to_deposit) = `r ndiffs(loan_to_deposit) `  
To determine whether seasonal differencing is needed I used nsdiffs: nsdiffs(loan_to_deposit) = `r nsdiffs(loan_to_deposit) `  
The results show that only first-order differencing is required to make the series stationary.
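
When passing the `ndiffs` result to `diff`, note that the `lag` and `differences` arguments do different things. A short sketch on a toy vector (illustrative values, not the bank data):

```r
# Toy series, purely illustrative
x <- c(2, 5, 9, 14, 20, 27)

first_diff  <- diff(x, differences = 1)  # x[t] - x[t-1]
second_diff <- diff(x, differences = 2)  # difference of the first difference
lag2_diff   <- diff(x, lag = 2)          # x[t] - x[t-2], a different operation

print(first_diff)
print(second_diff)
print(lag2_diff)
```

Because `lag` is the second positional argument of `diff`, an unnamed order of 2 would be read as a lag of 2 rather than as second-order differencing; naming `differences` avoids that.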


```{r }
level_of_difference <- ndiffs(loan_to_deposit)
#the second positional argument of diff() is lag, so differences is named explicitly
diff_data <- diff(loan_to_deposit, differences = level_of_difference)

plot(diff_data, ylab="first order difference loans-to-deposit")
tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

```

The ACF above shows that there is only one autocorrelation that is outside the 95% limit.


### Investigating Another Time Series Dataset
#### The Nigerian Air total passengers travel time series dataset  
The dataset is from the [World Bank website](http://data.worldbank.org/indicator/IS.AIR.PSGR)

```{r, include=FALSE}

air_travel <- read.csv("API_IS.AIR.PSGR_DS2_en_csv_v2/API_IS.AIR.PSGR_DS2_en_csv_v2.CSV", header = TRUE,skip = 4,comment.char="")

head(air_travel,5)

library(tidyr)
library(dplyr)
library(stringr)
library(tsoutliers)

air_travel_df <- as.data.frame(air_travel)

ghana_nigeria <- filter(air_travel_df, Country.Name =="Nigeria" | Country.Name=="Ghana")

ghana_nigeria_thin <- gather(ghana_nigeria,"year","n",5:62)

clean_data <- select(ghana_nigeria_thin,Country.Code,year,n)
#strip the leading "X" from the column names and store the year as a number
clean_data$year <- as.integer(str_sub(clean_data$year,2))

clean_data <- filter(clean_data, year>=1970 & year<2016) %>% spread(Country.Code,n)%>% 
            mutate(GHA=GHA/1000,NGA=NGA/1000 )



ts_clean_data <- ts(clean_data[,c("GHA","NGA")], end=2015, frequency=1)

nigeria <- ts_clean_data[,2]
```

```{r}
plot.ts(nigeria,ylab="Passengers (thousands)", main = "Nigeria Yearly Air Passengers 1970 - 2015")
```

### Investigating Time Series Stationarity: Nigerian Air Travel Passengers
The yearly time series plot above does not contain seasonality, since the data is annual. The plot shows evidence of a trend, hence the series is not stationary.  
Inspecting the ACF and PACF plots and the required order of differencing also confirms this.  
Required order of differencing: ndiffs(nigeria) = `r ndiffs(nigeria) `  
The ADF test p-value of 0.7408 is well above 0.05, so the unit-root null hypothesis is not rejected, which is consistent with a non-stationary series.

```{r}
tsdisplay(nigeria)
adf.test(nigeria)
```

First Order Differenced Time Series 
```{r}
tsdisplay(diff(nigeria), main="First Order Difference Time Series ")
adf.test(diff(nigeria))
```
The graph above shows the first order differenced time series. The ADF test p-value for the differenced series (0.3837 in the rendered output) is still above 0.05, so this test on its own does not reject a unit root in the differenced series; the choice of a single difference rests on the ndiffs result reported above. The outlier points are still visible in the differenced time series.

### Using AUTO.ARIMA to determine the best model for the Air Travel Time series
```{r, }
nigeria_fit <- auto.arima(nigeria)

summary(nigeria_fit)


```

### Using ETS(SES) to determine the best model for the Air Travel Time series
```{r, }
nigeria_fit_ses <- ses(nigeria, initial = "simple")

summary(nigeria_fit_ses)
```
```{r}
plot(nigeria, main= "ARIMA (0,1,0) & SES (alpha=1) Model Plot",ylab="Passengers (thousands)")
lines(fitted(nigeria_fit), col="red")
lines(fitted(nigeria_fit_ses), col="blue")
#legend colours follow the lines drawn above: ARIMA in red, SES in blue
legend("topleft",lty=1, col=c("black","red","blue"), 
       c("Data","ARIMA Model","SES Model"),cex=0.80)

```
In the plot above the two fitted lines coincide, so only the one drawn last (SES, in blue) is visible: SES with alpha = 1 returns the same fitted values as the ARIMA(0,1,0) model.
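
The reason the two models agree can be seen from the SES recursion itself. The sketch below is a hand-rolled version of that recursion (the helper `ses_fitted` is hypothetical and is not the forecast package's implementation), run on illustrative values. With alpha = 1 the level is simply the latest observation, so each one-step forecast equals the previous value, which is also what a random walk, ARIMA(0,1,0), predicts.

```r
# Hypothetical helper implementing the SES recursion
#   level[t] = alpha * y[t] + (1 - alpha) * level[t - 1]
# and returning the one-step-ahead fitted values.
ses_fitted <- function(y, alpha, l0 = y[1]) {
  fitted <- numeric(length(y))
  level <- l0                                  # "simple" start: first observation
  for (t in seq_along(y)) {
    fitted[t] <- level                         # forecast of y[t] made at t - 1
    level <- alpha * y[t] + (1 - alpha) * level
  }
  fitted
}

y <- c(10, 12, 15, 11, 18)                     # illustrative values only

ses_alpha_one <- ses_fitted(y, alpha = 1)
naive_fit     <- c(y[1], head(y, -1))          # previous observation as forecast

print(rbind(ses_alpha_one, naive_fit))
```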

#### Comparing the ARIMA and SES Models Fit Values
```{r}
#Comparing ARIMA and SES models fit values
arima_model<- window(fitted(nigeria_fit),1970,1990)
ses_model<- window(fitted(nigeria_fit_ses),1970,1990)
ts.union(arima_model,ses_model)

#fitted values for 2000 - 2015
arima_model2<- window(fitted(nigeria_fit),2000,2015)
ses_model2<- window(fitted(nigeria_fit_ses),2000,2015)
ts.union(arima_model2,ses_model2)
```

    
#### Investigating the Model Residuals and Outlier impact  

```{r, }

nigeria_outliers <-tso(nigeria,types=c("AO","LS","TC","IO"))

plot(residuals(nigeria_fit), main="ARIMA(0,1,0) & SES (alpha = 1) residuals", col="blue", ylab="Residual: Passengers (thousands)")
lines(residuals(nigeria_fit_ses), col="red")
legend("topleft",lty=1, col=c("blue","red"), 
       c("ARIMA Residual","SES Residual"),cex=0.80)

#plotting the outliers
plot(nigeria_outliers)

```

### Box Plot to Verify Outliers
```{r}
boxplot(nigeria, main="Box Plot of Nigeria  Yearly Air Passengers 1970 - 2015  ")
```

Reviewing the outlier plot and the model residual plot, the models fit the outlier points poorly compared with the other data points in the time series. The SES and ARIMA models returned the same fitted values for this series.
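
The points a box plot draws as outliers can also be listed directly. A minimal sketch using base R's `boxplot.stats`, which applies the same 1.5 x IQR rule as `boxplot`, on a toy vector (the values are illustrative, not the passenger data):

```r
# Toy data with one obviously extreme value
passengers_toy <- c(10, 12, 11, 13, 12, 50)

stats <- boxplot.stats(passengers_toy)   # default coef = 1.5, as in boxplot()

# Points beyond 1.5 times the hinge spread from the hinges
flagged <- stats$out
print(flagged)
```

Note that this rule ignores time order, so it complements rather than replaces the `tso` results, which look for outliers relative to a fitted time series model.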