Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017)
The data source is the Central Bank of Nigeria Statistics Database

#check for seasonality
seasonplot(loan_to_deposit,main = "Season Plot: Nigerian Deposit Money Banks loan-to-Deposit Ratio ", year.labels = TRUE,col=1:20,ylab="Percentage(%)",year.labels.left=TRUE,pch=19)

The season plot above does not show a strong seasonal pattern from year to year. Apart from 2008, 2014 and 2015, the loan-to-deposit ratio dropped from January to February in every year.
#check for seasonality
plot(decompose ,main="Plot of Decomposition of Time Series - loan-to-deposit ratio")

monthplot(loan_to_deposit, ylab = "data", cex.axis = 0.8, main="Monthly plot of Data")

monthplot(decompose, choice = "seasonal", cex.axis = 0.8, main="Monthly plot of Seasonal ")

monthplot(decompose, choice = "trend", cex.axis = 0.8, main="Monthly plot of Trend")

monthplot(decompose, choice = "remainder", type = "h", cex.axis = 0.8, main="Monthly plot of Remainder")

#check for seasonality
quarter <- (cycle(loan_to_deposit)-1 ) %/% 3
monthplot(loan_to_deposit, phase = quarter, main="Quarterly Plot: Loan-to-deposit Ratio")

The quarterly plot reveals a strong quarterly seasonality.
Determining if the Time Series is Stationary
Using the R tsdisplay function, I generated the ACF and PACF plots. The ACF decays gradually, which implies that the time series is not stationary.
tsdisplay(loan_to_deposit)

Determining the Differencing Order to Make Time Series Stationary
The original data contain a trend. To convert the series to a stationary one, I used the R diff function. To determine the order of differencing to apply, I used the forecast package function ndiffs: ndiffs(loan_to_deposit) = 1
To determine whether seasonal differencing is needed, I used nsdiffs: nsdiffs(loan_to_deposit) = 0
The results show that only first-order differencing is required to make the series stationary.
level_of_difference <- ndiffs(loan_to_deposit)
diff_data <- diff(loan_to_deposit, differences = level_of_difference)
plot(diff_data, ylab="first order difference loans-to-deposit")

tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

The ACF above shows that there is only one autocorrelation that is outside the 95% limit.
Investigating Another Time Series Dataset
The Nigerian Air total passengers travel time series dataset
The dataset is from the World Bank website
plot.ts(nigeria,ylab="Number of Passengers per 1000", main = "Nigeria Yearly Air Passengers 1970 - 2015 ")

Investigating Time Series Stationarity: Nigerian Air Travel Passengers
The yearly time series plot above does not contain seasonality. The plot shows evidence of a trend, so the series is not stationary.
Inspecting the ACF and PACF plots and determining the required order of differencing also confirm this.
Required order of differencing: ndiffs(nigeria) = 1
The ADF test p-value of 0.7408 is well above 0.05, so the unit-root null hypothesis cannot be rejected, which is consistent with a non-stationary series.
tsdisplay(nigeria)

adf.test(nigeria)
Augmented Dickey-Fuller Test
data: nigeria
Dickey-Fuller = -1.5803, Lag order = 3, p-value = 0.7408
alternative hypothesis: stationary
First Order Differenced Time Series
tsdisplay(diff(nigeria), main="First Order Difference Time Series ")

adf.test(diff(nigeria))
Augmented Dickey-Fuller Test
data: diff(nigeria)
Dickey-Fuller = -2.4792, Lag order = 3, p-value = 0.3837
alternative hypothesis: stationary
The graph above shows the first-order differenced time series. The adf.test p-value of 0.3837 is still above 0.05, so the test does not reject the unit-root null hypothesis even after differencing, although ndiffs(nigeria) indicates that one difference is sufficient. The outlier points also remain visible in the differenced series.
Using AUTO.ARIMA to determine the best model for the Air Travel Time series
nigeria_fit <- auto.arima(nigeria)
summary(nigeria_fit)
Series: nigeria
ARIMA(0,1,0)
sigma^2 estimated as 260917: log likelihood=-344.47
AIC=690.94 AICc=691.04 BIC=692.75
Training set error measures:
ME RMSE MAE MPE MAPE MASE
Training set 66.3181 505.2179 267.8018 1.934564 20.54023 0.9782746
ACF1
Training set 0.1964641
Using ETS(SES) to determine the best model for the Air Travel Time series
nigeria_fit_ses <- ses(nigeria, initial = "simple")
summary(nigeria_fit_ses)
Forecast method: Simple exponential smoothing
Model Information:
Call:
ses(x = nigeria, initial = "simple")
Smoothing parameters:
alpha = 1
Initial states:
l = 173
sigma: 505.2179
Error measures:
ME RMSE MAE MPE MAPE MASE
Training set 66.31434 505.2179 267.7981 1.93239 20.53805 0.9782609
ACF1
Training set 0.1964642
Forecasts:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2016 3223.46 2575.997 3870.923 2233.25090 4213.669
2017 3223.46 2307.809 4139.110 1823.09295 4623.827
2018 3223.46 2102.021 4344.898 1508.36769 4938.552
2019 3223.46 1928.534 4518.385 1243.04202 5203.878
2020 3223.46 1775.689 4671.231 1009.28542 5437.634
2021 3223.46 1637.506 4809.413 797.95329 5648.966
2022 3223.46 1510.434 4936.485 603.61334 5843.306
2023 3223.46 1392.158 5054.761 422.72613 6024.193
2024 3223.46 1281.071 5165.848 252.83315 6194.086
2025 3223.46 1176.003 5270.917 92.14437 6354.775
plot(nigeria, main= "ARIMA (0,1,0) & SES (alpha=1) Model Plot",ylab="Number of Passengers per 1000")
lines(fitted(nigeria_fit), col="red")
lines(fitted(nigeria_fit_ses), col="blue")
legend("topleft",lty=1, col=c("black","red","blue"),
c("Data","ARIMA Model","SES Model"),cex=0.80)

In the plot above the two fitted lines cannot be told apart because the SES model (alpha = 1) returns the same fitted values as the ARIMA(0,1,0) model.
Comparing the ARIMA and SES Models Fit Values
#Comparing ARIMA and SES models fit values
arima_model<- window(fitted(nigeria_fit),1970,1990)
ses_model<- window(fitted(nigeria_fit_ses),1970,1990)
ts.union(arima_model,ses_model)
Time Series:
Start = 1970
End = 1990
Frequency = 1
arima_model ses_model
1970 172.827 173.0
1971 173.000 173.0
1972 227.100 227.1
1973 286.800 286.8
1974 314.100 314.1
1975 430.300 430.3
1976 590.400 590.4
1977 800.800 800.8
1978 1093.900 1093.9
1979 1441.000 1441.0
1980 1581.300 1581.3
1981 1938.500 1938.5
1982 2300.200 2300.2
1983 2138.400 2138.4
1984 2221.300 2221.3
1985 1945.900 1945.9
1986 2575.000 2575.0
1987 2134.000 2134.0
1988 1614.400 1614.4
1989 995.000 995.0
1990 848.900 848.9
#last 16 values (2000-2015)
arima_model2<- window(fitted(nigeria_fit),2000,2015)
ses_model2<- window(fitted(nigeria_fit_ses),2000,2015)
ts.union(arima_model2,ses_model2)
Time Series:
Start = 2000
End = 2015
Frequency = 1
arima_model2 ses_model2
2000 419.700 419.700
2001 507.396 507.396
2002 519.453 519.453
2003 520.278 520.278
2004 520.263 520.263
2005 540.461 540.461
2006 747.648 747.648
2007 1307.541 1307.541
2008 1363.435 1363.435
2009 1460.900 1460.900
2010 1365.343 1365.343
2011 4197.375 4197.375
2012 4793.913 4793.913
2013 4716.148 4716.148
2014 4209.624 4209.624
2015 3857.424 3857.424
Investigating the Model Residuals and Outlier impact
nigeria_outliers <-tso(nigeria,types=c("AO","LS","TC","IO"))
plot(residuals(nigeria_fit), main="ARIMA(0,1,0) & SES (alpha = 1) residuals", col="blue", ylab="Residual: Number of Passengers per 1000")
lines(residuals(nigeria_fit_ses), col="red")
legend("topleft",lty=1, col=c("blue","red"),
c("ARIMA Residual","SES Residual"),cex=0.80)

#plotting the outliers
plot(nigeria_outliers)

Box Plot to Verify Outliers
boxplot(nigeria, main="Box Plot of Nigeria Yearly Air Passengers 1970 - 2015 ")

Reviewing the outlier plot and the model residual plot shows that the models fit the outlier points poorly compared with the other points in the series. The SES and ARIMA models returned essentially the same fitted values for this series.
---
title: "Investigating Time Series Stationarity"
author: "Adebayo Aderibigbe"
output:
  html_notebook: default
  html_document: default
  github_document: default
  pdf_document: default
  word_document: default
  toc: yes
---




```{r  include=FALSE}
library("forecast")
library("fpp")

```


```{r, include=FALSE}
#data source
dmb_data <- read.csv("DMB_data.csv")

#convert data to time series
ts_data <- ts(dmb_data[,(3:5)],frequency=12,start=c(2007,1))

#loan_to_deposit
loan_to_deposit <- ts_data[,1]

decompose <- stl(loan_to_deposit, s.window=12)
```
  
    

Nigerian Deposit Money Banks Monthly Loan to Deposit Ratio (Jan. 2007 - April 2017) 
The data source is from the [Central Bank of Nigeria Statistics  Database](http://statistics.cbn.gov.ng/cbn-onlinestats/DataBrowser.aspx)

```{r  echo=FALSE}
#par(cex.axis=1.5, cex.lab=1.5)
ts.plot(loan_to_deposit,xlab="Year",ylab="Loans-to-Deposit Ratio",main="Monthly Loan-to-Deposit Ratio  \nfor Nigeria Deposit Money Banks Jan. 2007 - Apr. 2017")
lines(decompose$time.series[,2],col="red",ylab="Trend")
legend("bottomright",c("data","trend"),col=c("black","red"),lty=c(1,1))
```

```{r, }
#check for seasonality
seasonplot(loan_to_deposit,main = "Season Plot: Nigerian Deposit Money Banks loan-to-Deposit Ratio ", year.labels = TRUE,col=1:20,ylab="Percentage(%)",year.labels.left=TRUE,pch=19)

```
The season plot above does not show a strong seasonal pattern from year to year. Apart from 2008, 2014 and 2015, the loan-to-deposit ratio dropped from January to February in every year.

```{r, }
#check for seasonality
plot(decompose ,main="Plot of Decomposition of Time Series - loan-to-deposit ratio")

```

```{r, }
monthplot(loan_to_deposit, ylab = "data", cex.axis = 0.8, main="Monthly plot of Data")
monthplot(decompose, choice = "seasonal", cex.axis = 0.8, main="Monthly plot of Seasonal ")
monthplot(decompose, choice = "trend", cex.axis = 0.8, main="Monthly plot of Trend")
monthplot(decompose, choice = "remainder", type = "h", cex.axis = 0.8, main="Monthly plot of Remainder")
```

```{r, }
#check for seasonality
quarter <- (cycle(loan_to_deposit)-1 ) %/% 3
monthplot(loan_to_deposit, phase = quarter, main="Quarterly Plot: Loan-to-deposit Ratio")
```
The quarterly plot reveals a strong quarterly seasonality.
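
The quarter index used for the plot comes from integer division of the month position. As a minimal sketch on a hypothetical one-year monthly series (the values `1:12` are placeholders, not the loan-to-deposit data), the mapping looks like this:

```r
# Hypothetical 12-month series starting January 2007 (placeholder values).
x <- ts(1:12, frequency = 12, start = c(2007, 1))

# cycle() gives the month position 1..12; shifting to 0..11 and dividing
# by 3 with integer division yields a quarter index 0..3.
quarter <- (cycle(x) - 1) %/% 3

as.integer(quarter)  # 0 0 0 1 1 1 2 2 2 3 3 3
```

Months 1 to 3 map to quarter 0, months 4 to 6 to quarter 1, and so on, so `monthplot` groups the observations by quarter rather than by month.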


### Determining if the Time Series is Stationary

Using the R **tsdisplay** function, I generated the ACF and PACF plots. The ACF decays gradually, which implies that the time series is not stationary.


```{r }
tsdisplay(loan_to_deposit)
```
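
To illustrate why a slowly decaying ACF points to non-stationarity, the sketch below compares the sample autocorrelations of two simulated series: white noise and a random walk. Both series are hypothetical stand-ins generated with `rnorm`, not the notebook's data, and the length of 124 is only chosen to match a monthly series covering Jan. 2007 to April 2017.

```r
# Simulated, hypothetical series (not the notebook's data).
set.seed(123)
n <- 124
white_noise <- ts(rnorm(n))          # stationary
random_walk <- ts(cumsum(rnorm(n)))  # non-stationary (unit root)

# Sample autocorrelations at lags 1..12; element 1 of $acf is lag 0.
rho_wn <- acf(white_noise, lag.max = 12, plot = FALSE)$acf[-1]
rho_rw <- acf(random_walk, lag.max = 12, plot = FALSE)$acf[-1]

round(rbind(white_noise = rho_wn, random_walk = rho_rw), 2)
```

The white-noise autocorrelations should hover near zero, while those of the random walk start close to one and decline slowly, which is the pattern described above.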


### Determining the Differencing Order to Make Time Series Stationary


The original data contain a trend. To convert the series to a stationary one, I used the R **diff** function.
To determine the order of differencing to apply, I used the forecast package function ndiffs: ndiffs(loan_to_deposit) = `r ndiffs(loan_to_deposit)`  
To determine whether seasonal differencing is needed, I used nsdiffs: nsdiffs(loan_to_deposit) = `r nsdiffs(loan_to_deposit)`  
The results show that only first-order differencing is required to make the series stationary.
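
When passing the order of differencing to `diff`, note that its second positional argument is `lag`, not `differences`. The two coincide only when the value is 1. A small sketch on a hypothetical vector of squares shows the difference:

```r
# Hypothetical example series: the first six square numbers.
x <- ts(c(1, 4, 9, 16, 25, 36))

# lag = 2 subtracts the value two steps back: x[t] - x[t-2].
lag2 <- diff(x, lag = 2)

# differences = 2 applies first differencing twice.
d2 <- diff(x, differences = 2)

as.numeric(lag2)  # 8 12 16 20
as.numeric(d2)    # 2 2 2 2
```

Naming the argument explicitly (`differences = ...`) avoids applying a lag-d difference by accident if `ndiffs` ever returns a value greater than 1.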


```{r }
level_of_difference <- ndiffs(loan_to_deposit)
diff_data <- diff(loan_to_deposit, differences = level_of_difference)

plot(diff_data, ylab="first order difference loans-to-deposit")
tsdisplay(diff_data, main="first order differencing of loans-to-deposit ratio time series")

```

The ACF above shows that there is only one autocorrelation that is outside the 95% limit.
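
The 95% limits on an ACF plot are approximately ±1.96/√n under a white-noise null. As a rough sketch, the lags that fall outside those limits can be counted directly. The series `x` below is a simulated, hypothetical stand-in for `diff_data`; in the notebook the same steps would be applied to `diff_data` itself.

```r
# Hypothetical stand-in for diff_data: simulated white noise, 123 monthly points.
set.seed(1)
x <- ts(rnorm(123), frequency = 12, start = c(2007, 2))

acf_out <- acf(x, lag.max = 24, plot = FALSE)
rho <- acf_out$acf[-1]  # drop lag 0, which is always 1

# Approximate 95% limits under the white-noise null: +/- 1.96 / sqrt(n).
limit <- qnorm(0.975) / sqrt(length(x))

# Indices of the lags (in observations) outside the limits.
outside <- which(abs(rho) > limit)
outside
```

With 24 lags, roughly one value outside the limits is expected by chance alone even for pure white noise, so a single exceedance is weak evidence of remaining autocorrelation.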


### Investigating Another Time Series Dataset
#### The Nigerian Air total passengers travel time series dataset  
The dataset is from the [World Bank website](http://data.worldbank.org/indicator/IS.AIR.PSGR)

```{r, include=FALSE}

air_travel <- read.csv("API_IS.AIR.PSGR_DS2_en_csv_v2/API_IS.AIR.PSGR_DS2_en_csv_v2.CSV", header = TRUE,skip = 4,comment.char="")

head(air_travel,5)

library(tidyr)
library(dplyr)
library(stringr)
library(tsoutliers)

air_travel_df <- as.data.frame(air_travel)

ghana_nigeria <- filter(air_travel_df, Country.Name =="Nigeria" | Country.Name=="Ghana")

ghana_nigeria_thin <- gather(ghana_nigeria,"year","n",5:62)

clean_data <- select(ghana_nigeria_thin,Country.Code,year,n)
clean_data$year <- str_sub(clean_data$year,2)

clean_data <- filter(clean_data, year>=1970 & year<2016) %>% spread(Country.Code,n)%>% 
            mutate(GHA=GHA/1000,NGA=NGA/1000 )



ts_clean_data <- ts(clean_data[,c("GHA","NGA")], end=2015, frequency=1)

nigeria <- ts_clean_data[,2]
```

```{r}
plot.ts(nigeria,ylab="Number of Passengers per 1000", main = "Nigeria  Yearly Air Passengers 1970 - 2015 ")
```

### Investigating Time Series Stationarity: Nigerian Air Travel Passengers
The yearly time series plot above does not contain seasonality. The plot shows evidence of a trend, so the series is not stationary.  
Inspecting the ACF and PACF plots and determining the required order of differencing also confirm this.  
Required order of differencing: ndiffs(nigeria) = `r ndiffs(nigeria)`  
The ADF test p-value of 0.7408 is well above 0.05, so the unit-root null hypothesis cannot be rejected, which is consistent with a non-stationary series.

```{r}
tsdisplay(nigeria)
adf.test(nigeria)
```
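
The ADF test takes a unit root (non-stationarity) as its null hypothesis, so only a small p-value counts as evidence of stationarity. The helper below is a hypothetical illustration of that decision rule, not a function from tseries or forecast, and the 0.05 threshold is an assumed conventional choice.

```r
# Hypothetical helper (not part of tseries or forecast): turn an ADF
# p-value into a plain-language decision at significance level alpha.
interpret_adf <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "reject unit root: evidence of stationarity"
  } else {
    "fail to reject unit root: no evidence of stationarity"
  }
}

interpret_adf(0.7408)  # p-value quoted above for the original series
interpret_adf(0.01)    # a made-up small p-value, for contrast
```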

First Order Differenced Time Series 
```{r}
tsdisplay(diff(nigeria), main="First Order Difference Time Series ")
adf.test(diff(nigeria))
```
The graph above shows the first-order differenced time series. The adf.test p-value of 0.3837 is still above 0.05, so the test does not reject the unit-root null hypothesis even after differencing, although ndiffs(nigeria) indicates that one difference is sufficient. The outlier points also remain visible in the differenced series.

### Using AUTO.ARIMA to determine the best model for the Air Travel Time series
```{r, }
nigeria_fit <- auto.arima(nigeria)

summary(nigeria_fit)


```

### Using ETS(SES) to determine the best model for the Air Travel Time series
```{r, }
nigeria_fit_ses <- ses(nigeria, initial = "simple")

summary(nigeria_fit_ses)
```
```{r}
plot(nigeria, main= "ARIMA (0,1,0) & SES (alpha=1) Model Plot",ylab="Number of Passengers per 1000")
lines(fitted(nigeria_fit), col="red")
lines(fitted(nigeria_fit_ses), col="blue")
legend("topleft",lty=1, col=c("black","red","blue"), 
       c("Data","ARIMA Model","SES Model"),cex=0.80)

```
In the plot above the two fitted lines cannot be told apart because the SES model (alpha = 1) returns the same fitted values as the ARIMA(0,1,0) model.
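
The reason the two models agree can be seen from the SES recursion, where the level is updated as alpha * y[t] + (1 - alpha) * level. With alpha = 1 the level is simply the latest observation, so each one-step fitted value equals the previous observation, which is also the forecast of a random walk, ARIMA(0,1,0). The sketch below hand-rolls the recursion in a hypothetical helper `ses_fitted` (not the forecast package's implementation) on made-up numbers, starting the level at the first observation as `initial = "simple"` does.

```r
# Hypothetical hand-rolled SES recursion; not the forecast package's code.
ses_fitted <- function(y, alpha, l0 = y[1]) {
  level <- l0
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level                           # forecast made before seeing y[t]
    level <- alpha * y[t] + (1 - alpha) * level  # update the level with y[t]
  }
  fitted
}

y <- c(10, 12, 15, 11, 18)        # made-up observations
ses_fit <- ses_fitted(y, alpha = 1)
naive_fit <- c(y[1], head(y, -1)) # previous observation, as in a random walk

ses_fit    # 10 10 12 15 11
naive_fit  # 10 10 12 15 11
```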

#### Comparing the ARIMA and SES Models Fit Values
```{r}
#Comparing ARIMA and SES models fit values
arima_model<- window(fitted(nigeria_fit),1970,1990)
ses_model<- window(fitted(nigeria_fit_ses),1970,1990)
ts.union(arima_model,ses_model)

#last 16 values (2000-2015)
arima_model2<- window(fitted(nigeria_fit),2000,2015)
ses_model2<- window(fitted(nigeria_fit_ses),2000,2015)
ts.union(arima_model2,ses_model2)
```
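
Rather than comparing the two columns by eye, the largest absolute gap between them can be computed. The sketch below uses the first four fitted values (1970 to 1973) as they appear in the rendered comparison table. The vector names are hypothetical, and in the notebook the same check could be run on `fitted(nigeria_fit)` and `fitted(nigeria_fit_ses)`.

```r
# First four fitted values (1970-1973) copied from the rendered table.
arima_fit <- c(172.827, 173.0, 227.1, 286.8)
ses_fit   <- c(173.0,   173.0, 227.1, 286.8)

gap <- abs(arima_fit - ses_fit)
max(gap)           # largest absolute difference, about 0.173
which(gap > 1e-6)  # positions where the fits differ: 1
```

Only the first value differs, which reflects how each model initialises its first fitted value. From the second observation onward the fits coincide.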

    
#### Investigating the Model Residuals and Outlier impact  

```{r, }

nigeria_outliers <-tso(nigeria,types=c("AO","LS","TC","IO"))

plot(residuals(nigeria_fit), main="ARIMA(0,1,0) & SES (alpha = 1) residuals", col="blue", ylab="Residual: Number of Passengers per 1000")
lines(residuals(nigeria_fit_ses), col="red")
legend("topleft",lty=1, col=c("blue","red"), 
       c("ARIMA Residual","SES Residual"),cex=0.80)

#plotting the outliers
plot(nigeria_outliers)

```

### Box Plot to Verify Outliers
```{r}
boxplot(nigeria, main="Box Plot of Nigeria  Yearly Air Passengers 1970 - 2015  ")
```
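
A box plot flags points lying more than 1.5 times the interquartile range beyond the hinges. The same points can be listed numerically with `boxplot.stats`. The vector below is made-up illustrative data, not the passenger series. In the notebook the call would be `boxplot.stats(nigeria)$out`.

```r
# Made-up data with one obvious extreme value.
x <- c(5, 7, 8, 9, 10, 11, 12, 40)

stats <- boxplot.stats(x)  # default coef = 1.5, the usual whisker rule
stats$stats                # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out                  # points beyond the whiskers: 40
```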

Reviewing the outlier plot and the model residual plot shows that the models fit the outlier points poorly compared with the other points in the series. The SES and ARIMA models returned essentially the same fitted values for this series.