library(tseries)
library(tidyverse)
library(forecast)
library(lubridate)

1. Univariate Time Series Analysis

1.1 Time Series Plot

The first step in any time series analysis is to plot the data. A time plot provides quick insight into the various components of a time series. In this section I load the data from the working directory where it is stored and then apply some basic transformations. The data set used for the analysis is the macroeconomic variable broad money, abbreviated M2, observed at monthly frequency from July 2008 to December 2018, giving a total of 126 observations. The series is stored as an Excel file and is read into R for analysis. I use the tidyverse, which is handier for this kind of work than base R: the tidyverse is an opinionated collection of R packages designed for data analysis, all sharing an underlying design philosophy, grammar, and data structures. I also use ggplot2 for the time plots, with either the Wall Street Journal theme or the Economist theme from the ggthemes package; you may use a theme of your own choice.

setwd("C:/Users/hp/OneDrive - Higher Education Commission/Econometric Analysis")   #My working directory
library(readxl) # Library to read xls or xlsx data 
m2<-read_xlsx("M2_monthly.xlsx")
library(ggthemes)   ## This library is used for various themes of plots
m2<-m2 %>% rename(m='M2 (Million Rupees)') # rename the long column name to m
m2<-m2 %>% mutate(date=ymd(Date)) # Date was read in as a datetime; convert it to a Date variable with lubridate's ymd()
glimpse(m2) # To get data overview
## Observations: 126
## Variables: 3
## $ Date <dttm> 2008-07-01, 2008-08-01, 2008-09-01, 2008-10-01, 2008-11-...
## $ m    <dbl> 4554634, 4592203, 4673790, 4615553, 4684700, 4791899, 473...
## $ date <date> 2008-07-01, 2008-08-01, 2008-09-01, 2008-10-01, 2008-11-...
p<-ggplot(m2)+aes(x=date,y=m)+geom_line()+labs(title = "Fig#1  Broad money in million of Rs.",
              caption = "Data source: sbp.org.pk") +theme_wsj()+theme(plot.title = element_text(size = 16, face = "bold"))
p

## Now we transform the variable to logs
m2<-m2 %>% 
  mutate(logm=log(m))
# Plot log(m) 
p1<-ggplot(m2)+aes(x=date,y=logm)+geom_line()+labs(title = "Fig#2  Broad money in million of Rs.",caption = "Data source: sbp.org.pk") +theme_wsj()+theme(plot.title = element_text(size = 16, face = "bold"))
p1

The upward-rising curve in Fig#1 indicates the possible existence of a trend in the data and requires statistical handling. To cope with the nonlinearity visible in the time plot, we start with a log transformation, shown in Fig#2. The time plot of log(Money) in Fig#2 removes the nonlinearity in the upward-rising curve (a formal comparison of linear versus nonlinear trends is not discussed here); however, a linear trend remains, pointing to the presence of trend and seasonality, which I discuss in the following sections. A strong trend indicates that the series mean is not constant over time and that the series is therefore non-stationary.
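As a quick informal check of this trend, one can regress log(M2) on a simple time index and overlay the fitted line; the sketch below does this with dplyr and ggplot2 (the time index t and the object trend_fit are illustrative names, not part of the original data):

m2 <- m2 %>% mutate(t = row_number())     # simple time index 1, 2, ..., 126
trend_fit <- lm(logm ~ t, data = m2)      # linear trend in log(M2)
summary(trend_fit)                        # the slope approximates the average monthly growth rate
ggplot(m2) + aes(x = date, y = logm) + geom_line() +
  geom_smooth(method = "lm", se = FALSE)  # overlay the fitted linear trend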

1.2 Components of a time series data

There are four main components of time series data, namely trend, seasonality, cycle, and irregularity. For forecasting purposes our main interest is in modelling trend and seasonality, since cyclical movements span many years. Forecasting is one of the main objectives of econometric modelling; other objectives are descriptive analysis and causal/policy analysis. Description of time series data is usually carried out through graphical analysis, as in the previous section. In this section we carry out univariate analysis and study trend and seasonality, the components that help us forecast the series. Causal/policy analysis is not a topic for univariate analysis, because ARIMA models are statistical models built on the series' own history rather than economic models. For the descriptive analysis we examine the correlogram through the ACF and PACF. We then move to the formal statistical test for stationarity, usually called a unit root test, which examines the evidence for a stochastic trend in the observed series. As we are working with a monthly time series, we also need to check for seasonal non-stationarity. We proceed as follows:

##library(tseries)
par(mfrow=c(1,2)) # Two plots side by side: one row, two columns
##Fig.3 
p2<-acf(m2$logm)
p3<-pacf(m2$logm)

## Note: acf() and pacf() return base-R "acf" objects, not ggplot objects,
## so ggplot themes such as theme_wsj() cannot be added to them.
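If ggplot-style correlograms are preferred (so that ggthemes themes can be applied), the forecast package, already loaded above, provides ggAcf() and ggPacf(); a minimal sketch:

ggAcf(m2$logm) + theme_wsj() + ggtitle("ACF of log(M2)")
ggPacf(m2$logm) + theme_wsj() + ggtitle("PACF of log(M2)")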
p2
## 
## Autocorrelations of series 'm2$logm', by lag
## 
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 1.000 0.976 0.953 0.932 0.908 0.885 0.864 0.838 0.815 0.791 0.767 0.745 
##    12    13    14    15    16    17    18    19    20    21 
## 0.723 0.699 0.675 0.652 0.628 0.605 0.583 0.558 0.534 0.512
p3
## 
## Partial autocorrelations of series 'm2$logm', by lag
## 
##      1      2      3      4      5      6      7      8      9     10 
##  0.976  0.008  0.019 -0.058  0.001  0.012 -0.080  0.015 -0.007 -0.037 
##     11     12     13     14     15     16     17     18     19     20 
##  0.035  0.005 -0.071 -0.005 -0.003 -0.028  0.003  0.000 -0.067  0.012 
##     21 
## -0.009

The correlogram of a series, measured through the ACF and PACF, is a useful graphical device for judging whether a series is stationary or non-stationary. The sample ACF shows that the series depends strongly on its past values: it does not decay even after 20 lags, which means the series has long memory, whereas a stationary series is short memory in the sense that its lagged values quickly lose predictive power. The sample PACF, with a single large spike at lag 1, points in the same direction. To test this statistically we can use the ADF test. From now on we work only with the log of money, i.e. the log(m) variable.
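Before the unit root test, a quick formal check that these autocorrelations are jointly non-zero can be made with the Ljung-Box test from base R; the choice of lag = 12 below (one seasonal cycle) is arbitrary and only illustrative:

Box.test(m2$logm, lag = 12, type = "Ljung-Box")  # H0: autocorrelations up to lag 12 are jointly zero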

1.3 Unit Root Test

The ADF test for a unit root is conducted on the time series used in the study. The hypotheses are H0: the series has a unit root (non-stationary) against HA: the series is stationary.

adf.test(m2$logm,alternative = "stationary") # p-value is 0.687, so the null of non-stationarity is not rejected
## 
##  Augmented Dickey-Fuller Test
## 
## data:  m2$logm
## Dickey-Fuller = -1.7342, Lag order = 4, p-value = 0.687
## alternative hypothesis: stationary
m2<-m2 %>% 
  mutate(dlogm=logm-lag(logm,1)) ## first difference of logm

acf(m2$dlogm,na.action = na.pass)

pacf(m2$dlogm,na.action = na.pass)

adf.test(na.omit(m2$dlogm))
## Warning in adf.test(na.omit(m2$dlogm)): p-value smaller than printed p-
## value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  na.omit(m2$dlogm)
## Dickey-Fuller = -12.693, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
auto.arima(m2$logm)
## Series: m2$logm 
## ARIMA(1,1,5) with drift 
## 
## Coefficients:
##           ar1     ma1      ma2     ma3      ma4      ma5   drift
##       -0.8756  0.6603  -0.3110  0.0339  -0.3929  -0.6289  0.0105
## s.e.   0.1008  0.0987   0.0945  0.1421   0.0813   0.1186  0.0003
## 
## sigma^2 estimated as 0.0001812:  log likelihood=363.93
## AIC=-711.86   AICc=-710.62   BIC=-689.23
## Monthly data, so also take the 12th (seasonal) difference of dlogm
d12dlogm<-m2 %>% select(Date,dlogm) %>% mutate(ddlogm=dlogm-lag(dlogm,12))
acf(d12dlogm$ddlogm,na.action = na.pass)

pacf(d12dlogm$ddlogm,na.action = na.pass)

adf.test(na.omit(d12dlogm$ddlogm))
## Warning in adf.test(na.omit(d12dlogm$ddlogm)): p-value smaller than printed
## p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  na.omit(d12dlogm$ddlogm)
## Dickey-Fuller = -6.5376, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
auto.arima(d12dlogm$ddlogm)
## Series: d12dlogm$ddlogm 
## ARIMA(0,0,1) with zero mean 
## 
## Coefficients:
##           ma1
##       -0.2544
## s.e.   0.0938
## 
## sigma^2 estimated as 8.205e-05:  log likelihood=371.69
## AIC=-739.38   AICc=-739.27   BIC=-733.92

The test on the level series yielded a p-value of 0.687, leading to non-rejection of H0; hence the ADF test shows that the series is non-stationary in levels. After first differencing (and, for the monthly series, seasonal differencing), the ADF test strongly rejects the unit root null. In other words, differencing the series is what renders it stationary.
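As a cross-check on the ADF results, the KPSS test from the tseries package can be applied; it reverses the hypotheses (H0: stationarity), so the two tests complement each other. A minimal sketch:

kpss.test(m2$logm)             # H0: log(M2) is level-stationary
kpss.test(na.omit(m2$dlogm))   # same test on the first difference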

1.4 Seasonal Analysis and Decomposition of the Components

library(seasonal)
library(seasonalview)
m2ts <- ts(m2$m, start=c(2008, 7), end=c(2018, 12), frequency=12) # monthly ts object: Jul 2008 - Dec 2018
##decompose(m2ts)

# Subset the time series (Aug 2014 to July 2018); created for comparison, not used further below
m2ts2 <- window(m2ts, start=c(2014, 8), end=c(2018, 7)) 
seasonplot(m2ts)

# Seasonal decomposition via STL with a periodic seasonal window
fit <- stl(m2ts, s.window="periodic")
plot(fit)
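The seasonally adjusted series can also be recovered from the STL fit using seasadj() from the forecast package; a small sketch:

plot(seasadj(fit), main = "Seasonally adjusted M2")  # trend + remainder, seasonal component removed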

par(mfrow=c(1,2))
acf(m2ts)
pacf(m2ts)

dts<-diff(m2ts,12)
acf(dts)
pacf(dts)

ddts<-diff(dts,1)
plot(ddts)
acf(ddts)

pacf(ddts)
# additional plots
monthplot(m2ts)

auto.arima(m2ts)
## Series: m2ts 
## ARIMA(0,1,1)(0,1,0)[12] 
## 
## Coefficients:
##           ma1
##       -0.3777
## s.e.   0.0861
## 
## sigma^2 estimated as 7.835e+09:  log likelihood=-1447.08
## AIC=2898.15   AICc=2898.26   BIC=2903.61
auto.arima(x=m2ts,seasonal = FALSE)
## Series:  
## ARIMA(4,2,4) 
## 
## Coefficients:
##           ar1      ar2     ar3      ar4      ma1     ma2      ma3     ma4
##       -0.1387  -0.5613  0.4034  -0.4216  -1.1186  0.4862  -1.0460  0.7674
## s.e.   0.0968   0.1002  0.1001   0.0893   0.0751  0.0742   0.0864  0.0757
## 
## sigma^2 estimated as 1.247e+10:  log likelihood=-1617.72
## AIC=3253.43   AICc=3255.01   BIC=3278.82
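With the seasonal specification ARIMA(0,1,1)(0,1,0)[12] selected above, forecasting is straightforward with the forecast package; the sketch below refits the model and produces a 12-month-ahead forecast (the horizon h = 12 is an arbitrary illustrative choice):

fit_sarima <- auto.arima(m2ts)       # refit the selected seasonal model
checkresiduals(fit_sarima)           # residual diagnostics (Ljung-Box test and plots)
fc <- forecast(fit_sarima, h = 12)   # 12-month-ahead point forecasts and intervals
autoplot(fc)                         # plot the series with forecasts and prediction intervals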
##install.packages("seasonalview")
#library(seasonalview)
#library("shiny")
#view(seas(m2ts))
#library(gets)
#isat(ddts, sis=TRUE, iis=FALSE, plot=TRUE, t.pval=0.005)