Importing the Data.

Importing the time series data for Real Gross Private Domestic Investment from the Quandl website.

library(Quandl)   # load the Quandl API client (an API key may be required)
rGDPI <- Quandl("FRED/GPDIC1", type="ts")

Introduction

Number 1

This question uses the Real Gross Private Domestic Investment time series data.

Data inspection

A quick inspection of the data:

str(rGDPI)
##  Time-Series [1:276] from 1947 to 2016: 223 206 200 238 263 ...
head(rGDPI)
## [1] 222.7 205.6 199.6 238.2 262.6 278.9
tail(rGDPI)
## [1] 2758.1 2772.5 2830.2 2864.8 2859.7 2842.0
summary(rGDPI)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   197.7   459.6   950.0  1187.0  1943.0  2865.0
#rGDPI

A quick inspection of this data shows that the Real Gross Private Domestic Investment time series is quarterly data ranging from 1947 to 2016, with 276 observations. The minimum value is 197.7, the maximum value is 2865.0, and the mean is 1187.0.

Transforming the data

lrGDPI <- log(rGDPI)               # log level
drGDPI <- diff(rGDPI)              # first difference of the level
dlrGDPI <- diff(lrGDPI)            # first difference of the log (quarterly growth rate)
dlrGDPI4 <- diff(lrGDPI, 4)        # seasonal (4-quarter) difference of the log
dlrGDPI_4 <- diff(diff(lrGDPI), 4) # regular and seasonal difference of the log

Change the plot window layout to 2 rows with 3 columns.

Plot the data

par(mfrow=c(2,3))
plot(rGDPI, xlab="", ylab="", main=expression(rGDPI))
plot(lrGDPI, xlab="", ylab="", main=expression(log(rGDPI)))
plot.new()
plot(dlrGDPI, xlab="", ylab="", main=expression(paste(Delta, "log(rGDPI)")))
plot(dlrGDPI4, xlab="", ylab="", main=expression(paste(Delta[4], "log(rGDPI)")))
plot(dlrGDPI_4, xlab="", ylab="", main=expression(paste(Delta, Delta[4], "log(rGDPI)")))

These figures plot the original time series for real Gross Private Domestic Investment (rGDPI), the log-transformed series, the first difference, the seasonal difference, and the regular-and-seasonal difference of the log series. The plots suggest that the level and log series are nonstationary and that real Gross Private Domestic Investment is difference stationary.

Testing the data for nonstationarity

Augmented Dickey-Fuller (ADF) Test

library(tseries)
## Warning: package 'tseries' was built under R version 3.2.3
adf.test(rGDPI)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  rGDPI
## Dickey-Fuller = -2.7222, Lag order = 6, p-value = 0.2718
## alternative hypothesis: stationary

We fail to reject the null hypothesis that the time series is nonstationary and conclude that it has a unit root.

adf.test(lrGDPI)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  lrGDPI
## Dickey-Fuller = -2.6039, Lag order = 6, p-value = 0.3216
## alternative hypothesis: stationary

We fail to reject the null hypothesis that the time series is nonstationary and conclude that it has a unit root.

adf.test(dlrGDPI)
## Warning in adf.test(dlrGDPI): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlrGDPI
## Dickey-Fuller = -7.3938, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that this series is nonstationary and conclude that the first-differenced log series is stationary.

adf.test(dlrGDPI4)
## Warning in adf.test(dlrGDPI4): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlrGDPI4
## Dickey-Fuller = -5.9736, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that this series is nonstationary and conclude that the seasonally differenced log series is stationary.

adf.test(dlrGDPI_4)
## Warning in adf.test(dlrGDPI_4): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlrGDPI_4
## Dickey-Fuller = -8.5579, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that this series is nonstationary and conclude that the regular- and seasonally-differenced log series is stationary.
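
As a compact cross-check of the five ADF results above, the p-values can be collected in a single table. This is a minimal sketch using the series defined earlier; adf_p is just a scratch name, and suppressWarnings() only hides the "p-value smaller than printed p-value" messages already seen above.

library(tseries)
# Gather the ADF p-values for the level, log, and differenced series in one vector
adf_p <- sapply(list(rGDPI=rGDPI, lrGDPI=lrGDPI, dlrGDPI=dlrGDPI,
                     dlrGDPI4=dlrGDPI4, dlrGDPI_4=dlrGDPI_4),
                function(x) suppressWarnings(adf.test(x)$p.value))
round(adf_p, 3)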

#install.packages('urca')
library(urca)
## Warning: package 'urca' was built under R version 3.2.3
rGDPI.urdf<-ur.df(rGDPI, type="trend", selectlags="BIC")
summary(rGDPI.urdf)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -194.368  -20.406    0.841   20.016  144.777 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.58796    5.54215  -0.467    0.641    
## z.lag.1     -0.02393    0.01039  -2.304    0.022 *  
## tt           0.26718    0.10690   2.499    0.013 *  
## z.diff.lag   0.36800    0.05656   6.506 3.73e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 42.93 on 270 degrees of freedom
## Multiple R-squared:  0.1511, Adjusted R-squared:  0.1416 
## F-statistic: 16.01 on 3 and 270 DF,  p-value: 1.306e-09
## 
## 
## Value of test-statistic is: -2.3038 3.8893 3.1517 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2  6.15  4.71  4.05
## phi3  8.34  6.30  5.36

Since the tau3 test statistic of -2.3038 is greater than the 1% critical value of -3.98 (and also the 5% and 10% values of -3.42 and -3.13), we fail to reject the null hypothesis that the time series has a unit root, i.e., that it is difference stationary.
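
The same comparison can be read programmatically from the ur.df object, whose S4 slots store the test statistics and the critical-value table. A sketch based on the rGDPI.urdf object fitted above:

# Test statistics (tau3, phi2, phi3) and critical values from the fitted object
rGDPI.urdf@teststat
rGDPI.urdf@cval
# tau3 above the 1% critical value => fail to reject the unit-root null
rGDPI.urdf@teststat[1] > rGDPI.urdf@cval["tau3", "1pct"]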

KPSS Test

library(tseries)
kpss.test(rGDPI, null="Trend")
## Warning in kpss.test(rGDPI, null = "Trend"): p-value smaller than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  rGDPI
## KPSS Trend = 1.0029, Truncation lag parameter = 3, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(lrGDPI, null="Trend")
## Warning in kpss.test(lrGDPI, null = "Trend"): p-value smaller than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  lrGDPI
## KPSS Trend = 0.44857, Truncation lag parameter = 3, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(dlrGDPI, null="Trend")
## Warning in kpss.test(dlrGDPI, null = "Trend"): p-value greater than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlrGDPI
## KPSS Trend = 0.01618, Truncation lag parameter = 3, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

kpss.test(dlrGDPI4, null="Trend")
## Warning in kpss.test(dlrGDPI4, null = "Trend"): p-value greater than
## printed p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlrGDPI4
## KPSS Trend = 0.02077, Truncation lag parameter = 3, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

kpss.test(dlrGDPI_4, null="Trend")
## Warning in kpss.test(dlrGDPI_4, null = "Trend"): p-value greater than
## printed p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlrGDPI_4
## KPSS Trend = 0.006604, Truncation lag parameter = 3, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

#install.packages('urca')
library(urca)

rGDPI.urkpss<-ur.kpss(rGDPI, type="tau", lags="long")
summary(rGDPI.urkpss)
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: tau with 15 lags. 
## 
## Value of test-statistic is: 0.3061 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.119 0.146  0.176 0.216

Since the test statistic (0.3061) is greater than the 1% critical value (0.216), we reject the null hypothesis that the time series is trend stationary at the 1% level of significance. The conclusion is that the time series is difference stationary.

Number 2

Importing the Civilian Unemployment Rate time series data from Quandl.

CUR <-Quandl("FRED/UNRATE",type="ts")
#CUR

This is monthly data.

Splitting the time series into two parts: an estimation sample and a prediction sample

CURall <- CUR
CUR1 <- window(CURall, end=c(2014,12))
CUR2 <- window(CURall, start=c(2015,1))

The first part is used to identify and estimate the model.

Construct the log, log-change, and seasonal log-change of the Civilian Unemployment Rate time series data.

Differencing is used to handle exponential growth of the series and to stabilize its variability.

lCUR1 <- log(CUR1)                 # log level
dlCUR1 <- diff(lCUR1)              # month-over-month log change
dlCUR12 <- diff(lCUR1, 12)         # seasonal (12-month) log change
dlCUR12_1 <- diff(diff(lCUR1), 12) # regular and seasonal difference of the log

Model Identification

Plotting the data

par(mfrow=c(2,3))
plot(CUR1, xlab="", ylab="", main=expression(CUR1))
plot(lCUR1, xlab="", ylab="", main=expression(log(CUR1)))
plot.new()
plot(dlCUR1, xlab="", ylab="", main=expression(paste(Delta, "log(CUR1)")))
plot(dlCUR12, xlab="", ylab="", main=expression(paste(Delta[12], "log(CUR1)")))
plot(dlCUR12_1, xlab="", ylab="", main=expression(paste(Delta, Delta[12], "log(CUR1)")))

Analysis of these plots shows that the Civilian Unemployment Rate series is nonstationary. The data shows non-constant variance, which is reduced by taking a natural log transformation. The first difference of the log is then used to remove the trend.

Plot ACF and PACF

library(zoo)
maxlag <-60
par(mfrow=c(2,4))
plot(acf(coredata(lCUR1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for log(CUR1)")))
plot(acf(coredata(dlCUR1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta,"log(CUR1)")))
plot(acf(coredata(dlCUR12), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta[12], "log(CUR1)")))
plot(acf(coredata(dlCUR12_1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta, Delta[12], "log(CUR1)")))

acf(coredata(lCUR1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for log(CUR1)")))
acf(coredata(dlCUR1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta, "log(CUR1)")))
acf(coredata(dlCUR12), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta[12], "log(CUR1)")))
acf(coredata(dlCUR12_1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta,Delta[12], "log(CUR1)")))

The ACF of the original series decays slowly, which is evidence that the series is nonstationary. The ACF of the twice-differenced data also decays slowly, with seasonal spikes at lags 12, 36, and 48. The PACF likewise has slowly decaying lags.
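
The amount of differencing suggested by these plots can be corroborated with the forecast package's helpers, which apply unit-root tests to choose the number of regular and seasonal differences. A sketch, assuming the forecast package is available (it is loaded again in the estimation section below):

library(forecast)
ndiffs(lCUR1)    # suggested number of regular differences for log(CUR1)
nsdiffs(lCUR1)   # suggested number of seasonal differences (period 12)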

Testing the data for nonstationarity

Augmented Dickey-Fuller (ADF) Test

library(tseries)
adf.test(CUR)
## Warning in adf.test(CUR): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  CUR
## Dickey-Fuller = -4.0267, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the time series is stationary.

adf.test(lCUR1)
## Warning in adf.test(lCUR1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  lCUR1
## Dickey-Fuller = -4.3165, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the time series is stationary.

adf.test(dlCUR1)
## Warning in adf.test(dlCUR1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCUR1
## Dickey-Fuller = -8.2684, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the time series is stationary.

adf.test(dlCUR12)
## Warning in adf.test(dlCUR12): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCUR12
## Dickey-Fuller = -8.7545, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the data is stationary.

adf.test(dlCUR12_1)
## Warning in adf.test(dlCUR12_1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCUR12_1
## Dickey-Fuller = -10.022, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that this series is nonstationary and conclude that the differenced log-transformed series is stationary.

#install.packages('urca')
library(urca)

cur.urdf<-ur.df(CUR, type="trend", selectlags="BIC")
summary(cur.urdf)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.64394 -0.11178 -0.00623  0.10994  1.33334 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.859e-02  2.729e-02   2.147  0.03211 *  
## z.lag.1     -1.009e-02  4.829e-03  -2.090  0.03691 *  
## tt           3.258e-06  3.369e-05   0.097  0.92299    
## z.diff.lag   1.269e-01  3.478e-02   3.649  0.00028 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2096 on 811 degrees of freedom
## Multiple R-squared:  0.021,  Adjusted R-squared:  0.01738 
## F-statistic: 5.799 on 3 and 811 DF,  p-value: 0.0006355
## 
## 
## Value of test-statistic is: -2.0902 1.6469 2.4589 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2  6.09  4.68  4.03
## phi3  8.27  6.25  5.34

Since the tau3 test statistic of -2.0902 is greater than the 1% critical value of -3.96 (and also the 5% and 10% values of -3.41 and -3.12), we fail to reject the null hypothesis that the time series has a unit root, i.e., that it is difference stationary.

KPSS Test

library(tseries)
kpss.test(CUR1, null="Trend")
## Warning in kpss.test(CUR1, null = "Trend"): p-value smaller than printed p-
## value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  CUR1
## KPSS Trend = 0.50053, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(lCUR1, null="Trend")
## Warning in kpss.test(lCUR1, null = "Trend"): p-value smaller than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  lCUR1
## KPSS Trend = 0.53791, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(dlCUR1, null="Trend")
## Warning in kpss.test(dlCUR1, null = "Trend"): p-value greater than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCUR1
## KPSS Trend = 0.027704, Truncation lag parameter = 6, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

kpss.test(dlCUR12, null="Trend")
## Warning in kpss.test(dlCUR12, null = "Trend"): p-value greater than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCUR12
## KPSS Trend = 0.039263, Truncation lag parameter = 6, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

kpss.test(dlCUR12_1, null="Trend")
## Warning in kpss.test(dlCUR12_1, null = "Trend"): p-value greater than
## printed p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCUR12_1
## KPSS Trend = 0.010696, Truncation lag parameter = 6, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

#install.packages('urca')
library(urca)

cur.urkpss<-ur.kpss(CUR, type="tau", lags="long")
summary(cur.urkpss)
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: tau with 20 lags. 
## 
## Value of test-statistic is: 0.1925 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.119 0.146  0.176 0.216

Since the test statistic (0.1925) exceeds the 5% critical value (0.146) but not the 1% critical value (0.216), we reject the null hypothesis that the time series is trend stationary at the 5% level of significance. The conclusion is that the time series is difference stationary.

Model Estimation

Estimate models on the first-differenced log series (the seasonal difference is specified inside arima)

ND1 <- arima(dlCUR1, order=c(3,0,2), seasonal=list(order=c(0,1,2), period=12))
tsdiag(ND1,gof.lag=36)

BIC(ND1)
## [1] -2913.12
AIC(ND1)
## [1] -2950.506
ND2 <- arima(dlCUR1, order=c(2,0,2), seasonal=list(order=c(0,1,2), period=12))
tsdiag(ND2,gof.lag=36)

BIC(ND2)
## [1] -2919.142
AIC(ND2)
## [1] -2951.855
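
The tsdiag() panels above can be complemented with a numeric Ljung-Box test on the residuals. A minimal sketch for ND2, where fitdf = 6 reflects the six estimated ARMA coefficients (ar1, ar2, ma1, ma2, sma1, sma2):

# Ljung-Box test of residual autocorrelation out to 36 lags for ND2
Box.test(residuals(ND2), lag=36, type="Ljung-Box", fitdf=6)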

Estimate models on the twice-differenced data

CU1 <- arima(dlCUR12_1,order=c(2,0,2),seasonal=list(order=c(2,0,2),period=12))
CU1
## 
## Call:
## arima(x = dlCUR12_1, order = c(2, 0, 2), seasonal = list(order = c(2, 0, 2), 
##     period = 12))
## 
## Coefficients:
##          ar1      ar2      ma1     ma2    sar1     sar2     sma1    sma2
##       1.4940  -0.6006  -1.4545  0.6653  0.4666  -0.1035  -1.6704  0.6783
## s.e.  0.1689   0.1576   0.1555  0.1218  0.1185   0.0537   0.1110  0.1121
##       intercept
##           0e+00
## s.e.      1e-04
## 
## sigma^2 estimated as 0.00123:  log likelihood = 1499.94,  aic = -2979.88
tsdiag(CU1,gof.lag=48)

BIC(CU1)
## [1] -2933.152
AIC(CU1)
## [1] -2979.885

BIC(CU1) = -2933.152 AIC(CU1) = -2979.885

twice differenced data

CU2 <- arima(dlCUR12_1,order=c(2,0,2),seasonal=list(order=c(2,0,1),period=12))
CU2
## 
## Call:
## arima(x = dlCUR12_1, order = c(2, 0, 2), seasonal = list(order = c(2, 0, 1), 
##     period = 12))
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     sar1     sar2     sma1
##       1.5349  -0.6376  -1.4835  0.6868  -0.1682  -0.1912  -1.0000
## s.e.  0.1258   0.1175   0.1153  0.0918   0.0396   0.0378   0.0185
##       intercept
##          -1e-04
## s.e.      1e-04
## 
## sigma^2 estimated as 0.001253:  log likelihood = 1490.65,  aic = -2963.3
tsdiag(CU2,gof.lag=36)

BIC(CU2)
## [1] -2921.239
AIC(CU2)
## [1] -2963.299

BIC(CU2) = -2921.239 AIC(CU2) = -2963.299

twice differenced data

CU3 <- arima(dlCUR12_1,order=c(2,0,3),seasonal=list(order=c(1,0,0),period=12))
CU3
## 
## Call:
## arima(x = dlCUR12_1, order = c(2, 0, 3), seasonal = list(order = c(1, 0, 0), 
##     period = 12))
## 
## Coefficients:
##           ar1     ar2     ma1      ma2     ma3     sar1  intercept
##       -0.1315  0.7282  0.2336  -0.5827  0.1837  -0.5137    -0.0007
## s.e.   0.0533  0.0512  0.0624   0.0672  0.0361   0.0323     0.0023
## 
## sigma^2 estimated as 0.00217:  log likelihood = 1299.37,  aic = -2582.75
tsdiag(CU3,gof.lag=36)

BIC(CU3)
## [1] -2545.362
AIC(CU3)
## [1] -2582.748
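
For easier comparison, the information criteria of the three candidate models for the twice-differenced series can be collected in one small table. A convenience sketch using the CU1, CU2, and CU3 objects fitted above (cu_models is just a scratch name):

# Side-by-side AIC/BIC comparison of the candidate seasonal ARMA models
cu_models <- list(CU1=CU1, CU2=CU2, CU3=CU3)
data.frame(AIC=sapply(cu_models, AIC), BIC=sapply(cu_models, BIC))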

Search for the best model using the BIC criterion

?auto.arima

library("forecast")
## Warning: package 'forecast' was built under R version 3.2.3
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.2.3
## This is forecast 6.2
CUa <- auto.arima(dlCUR12_1, ic="bic")
## Warning in auto.arima(dlCUR12_1, ic = "bic"): Unable to fit final model
## using maximum likelihood. AIC value approximated
CUa
## Series: dlCUR12_1 
## ARIMA(2,0,2)(2,0,1)[12] with zero mean     
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     sar1     sar2     sma1
##       1.2500  -0.3470  -1.2075  0.4370  -0.2287  -0.2476  -0.8405
## s.e.  0.0785   0.0751   0.0806  0.0709   0.0334   0.0333   0.0205
## 
## sigma^2 estimated as 0.001329:  log likelihood=1497.23
## AIC=-2978.45   AICc=-2978.27   BIC=-2941.07
#AIC=-2978.45   AICc=-2978.27   BIC=-2941.07 

CUb <- auto.arima(dlCUR12, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
## Warning in search.arima(x, d, D, max.p, max.q, max.P, max.Q, max.order, :
## Unable to fit final model using maximum likelihood. AIC value approximated
CUb
## Series: dlCUR12 
## ARIMA(2,0,0)(2,0,1)[12] with zero mean     
## 
## Coefficients:
##          ar1      ar2     sar1     sar2     sma1
##       1.1774  -0.1906  -0.2479  -0.2219  -0.7962
## s.e.  0.0347   0.0346   0.0331   0.0332   0.0241
## 
## sigma^2 estimated as 0.001489:  log likelihood=1453.96
## AIC=-2893.3   AICc=-2893.19   BIC=-2865.25
#AIC=-2893.3   AICc=-2893.19   BIC=-2865.25

CUd <- auto.arima(lCUR1, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
CUd
## Series: lCUR1 
## ARIMA(1,1,1)(2,0,1)[12]                    
## 
## Coefficients:
##          ar1      ma1    sar1     sar2     sma1
##       0.8886  -0.7500  0.4505  -0.0608  -0.7090
## s.e.  0.0310   0.0412  0.1056   0.0548   0.1011
## 
## sigma^2 estimated as 0.001243:  log likelihood=1545.4
## AIC=-3078.8   AICc=-3078.7   BIC=-3050.67
#AIC=-3078.8   AICc=-3078.7   BIC=-3050.67 

CUc <- auto.arima(dlCUR1, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
CUc
## Series: dlCUR1 
## ARIMA(1,0,1)(2,0,1)[12] with zero mean     
## 
## Coefficients:
##          ar1      ma1    sar1     sar2     sma1
##       0.8886  -0.7500  0.4505  -0.0608  -0.7090
## s.e.  0.0310   0.0412  0.1056   0.0548   0.1011
## 
## sigma^2 estimated as 0.001243:  log likelihood=1545.4
## AIC=-3078.8   AICc=-3078.7   BIC=-3050.67
#AIC=-3078.8   AICc=-3078.7   BIC=-3050.67

differenced data

C1 <- arima(dlCUR1,order=c(4,1,3),seasonal=list(order=c(1,1,2),period=12))
C1
## 
## Call:
## arima(x = dlCUR1, order = c(4, 1, 3), seasonal = list(order = c(1, 1, 2), period = 12))
## 
## Coefficients:
##           ar1     ar2     ar3     ar4      ma1      ma2     ma3    sar1
##       -0.1849  0.6452  0.1886  0.0258  -0.7789  -0.6872  0.4661  0.5695
## s.e.   0.2790  0.1675  0.0621  0.0544   0.2796   0.4058  0.1593  0.0900
##          sma1    sma2
##       -1.8080  0.8157
## s.e.   0.0677  0.0678
## 
## sigma^2 estimated as 0.001245:  log likelihood = 1488.26,  aic = -2954.52
tsdiag(C1,gof.lag=36)

BIC(C1)
## [1] -2903.128
AIC(C1)
## [1] -2954.521

#install.packages('forecast')

library(forecast)
ND1.fcast<-forecast(ND1,h=24)
ND2.fcast<-forecast(ND2,h=24)

CU1.fcast<-forecast(CU1,h=24)
CU2.fcast<-forecast(CU2,h=24)
CU3.fcast<-forecast(CU3,h=24)
C1.fcast<-forecast(C1,h=24)

Construct and Plot Forecasts

par(mfrow=c(2,2), cex=0.75, mar=c(2,4,3,1))
plot(CU1.fcast, xlim=c(2013,2016))
lines(diff(diff(log(CURall,12))))

plot(CU2.fcast, xlim=c(2013,2016))
lines(diff(diff(log(CURall,12))))

plot(CU3.fcast, xlim=c(2013,2016))
lines(diff(diff(log(CURall,12))))

plot(ND1.fcast, xlim=c(2013,2016))
lines(log(CURall,12))

plot(ND2.fcast, xlim=c(2013,2016))
lines(log(CURall,12))

plot(C1.fcast, xlim=c(2013,2016))
lines(diff(log(CURall,12)))

Forecast accuracy

accuracy(CU1.fcast)
##                         ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.0005142391 0.03507082 0.02540641 NaN  Inf 0.3506648
##                     ACF1
## Training set -0.01937133
accuracy(CU2.fcast)
##                         ME      RMSE        MAE MPE MAPE      MASE
## Training set -0.0004356722 0.0353918 0.02543699 NaN  Inf 0.3510869
##                     ACF1
## Training set -0.01590938
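
The accuracy() calls above report training-set errors only. Since CUR2 was held out as a prediction sample, out-of-sample accuracy can also be checked; the sketch below uses the CUd model (fitted on the log level), with exp() mapping the forecasts back to the unemployment-rate scale (h2 and CUd.fcast are just illustrative names):

library(forecast)
h2 <- length(CUR2)                   # holdout horizon: January 2015 onward
CUd.fcast <- forecast(CUd, h=h2)     # forecasts of log(CUR)
accuracy(CUd.fcast, log(CUR2))       # training- and test-set errors on the log scale
accuracy(exp(CUd.fcast$mean), CUR2)  # test-set errors on the original scale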

Number 3

Import the monthly time series data for the Consumer Price Index for All Urban Consumers from FRED/CPILFENS.

CP <-Quandl("FRED/CPILFENS",type="ts")
#CP

Splitting the time series into two parts: an estimation sample and a prediction sample

CPall <- CP
CP1 <- window(CPall, end=c(2014,12))
CP2 <- window(CPall, start=c(2015,1))

Construct the log, log-change, and seasonal log-change of the Consumer Price Index for All Urban Consumers.

lCP1<-log(CP1)
dlCP1<-diff(lCP1)
dlCP12<-diff(lCP1,12)
dlCP12_1<-diff(diff(lCP1),12)

Model Identification

Plotting the data

par(mfrow=c(2,3))
plot(CP1, xlab="", ylab="", main=expression(CP1))
plot(lCP1, xlab="", ylab="", main=expression(log(CP1)))
plot.new()
plot(dlCP1, xlab="", ylab="", main=expression(paste(Delta, "log(CP1)")))
plot(dlCP12, xlab="", ylab="", main=expression(paste(Delta[12], "log(CP1)")))
plot(dlCP12_1, xlab="", ylab="", main=expression(paste(Delta, Delta[12], "log(CP1)")))

The Consumer Price Index time series shows an increasing trend, and the trend increases at a decreasing rate. This may be an indication that the data is nonstationary.

Testing the data for nonstationarity

Augmented Dickey-Fuller (ADF) Test

library(tseries)
adf.test(CP)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  CP
## Dickey-Fuller = -3.4243, Lag order = 8, p-value = 0.04958
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the time series is stationary.

adf.test(lCP1)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  lCP1
## Dickey-Fuller = -0.3957, Lag order = 8, p-value = 0.9866
## alternative hypothesis: stationary

We fail to reject the null hypothesis that the time series has a unit root and conclude that the series is nonstationary.

adf.test(dlCP1)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCP1
## Dickey-Fuller = -3.7846, Lag order = 8, p-value = 0.01977
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the time series is stationary.

adf.test(dlCP12)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCP12
## Dickey-Fuller = -3.1898, Lag order = 8, p-value = 0.08969
## alternative hypothesis: stationary

We reject the null hypothesis that the time series has a unit root and conclude that the data is stationary; note that this statistic is significant only at the 10% level of significance (p-value = 0.08969).

adf.test(dlCP12_1)
## Warning in adf.test(dlCP12_1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  dlCP12_1
## Dickey-Fuller = -6.211, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary

We reject the null hypothesis that this series is nonstationary and conclude that the differenced log-transformed series is stationary.

#install.packages('urca')
library(urca)

cur.urdf<-ur.df(CUR, type="trend", selectlags="BIC")
summary(cur.urdf)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.64394 -0.11178 -0.00623  0.10994  1.33334 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.859e-02  2.729e-02   2.147  0.03211 *  
## z.lag.1     -1.009e-02  4.829e-03  -2.090  0.03691 *  
## tt           3.258e-06  3.369e-05   0.097  0.92299    
## z.diff.lag   1.269e-01  3.478e-02   3.649  0.00028 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2096 on 811 degrees of freedom
## Multiple R-squared:  0.021,  Adjusted R-squared:  0.01738 
## F-statistic: 5.799 on 3 and 811 DF,  p-value: 0.0006355
## 
## 
## Value of test-statistic is: -2.0902 1.6469 2.4589 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2  6.09  4.68  4.03
## phi3  8.27  6.25  5.34

Since the tau3 test statistic of -2.0902 is greater than the 1% critical value of -3.96, we fail to reject the null hypothesis that the time series has a unit root, i.e., that it is difference stationary.

KPSS Test

library(tseries)
kpss.test(CP, null="Trend")
## Warning in kpss.test(CP, null = "Trend"): p-value smaller than printed p-
## value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  CP
## KPSS Trend = 1.5657, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(lCP1, null="Trend")
## Warning in kpss.test(lCP1, null = "Trend"): p-value smaller than printed p-
## value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  lCP1
## KPSS Trend = 1.853, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(dlCP1, null="Trend")
## Warning in kpss.test(dlCP1, null = "Trend"): p-value smaller than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCP1
## KPSS Trend = 1.2391, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(dlCP12, null="Trend")
## Warning in kpss.test(dlCP12, null = "Trend"): p-value smaller than printed
## p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCP12
## KPSS Trend = 1.3584, Truncation lag parameter = 6, p-value = 0.01

We reject the null hypothesis that the time series is trend stationary and conclude that the data is difference stationary.

kpss.test(dlCP12_1, null="Trend")
## Warning in kpss.test(dlCP12_1, null = "Trend"): p-value greater than
## printed p-value
## 
##  KPSS Test for Trend Stationarity
## 
## data:  dlCP12_1
## KPSS Trend = 0.046471, Truncation lag parameter = 6, p-value = 0.1

We fail to reject the null hypothesis that the time series is trend stationary and conclude that the data is trend stationary.

#install.packages('urca')
library(urca)

cp.urkpss<-ur.kpss(CP, type="tau", lags="long")
summary(cp.urkpss)
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: tau with 19 lags. 
## 
## Value of test-statistic is: 0.5659 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.119 0.146  0.176 0.216

Since the test statistic (0.5659) is greater than the 1% critical value (0.216), we reject the null hypothesis that the time series is trend stationary at the 1% level of significance. The conclusion is that the time series is difference stationary.

Plot ACF and PACF

library(zoo)
maxlag <-60
par(mfrow=c(2,4))
plot(acf(coredata(lCP1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for log(CP1)")))
plot(acf(coredata(dlCP1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta,"log(CP1)")))
plot(acf(coredata(dlCP12), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta[12], "log(CP1)")))
plot(acf(coredata(dlCP12_1), type='correlation', lag=maxlag, plot=FALSE), ylab="", main=expression(paste("ACF for ", Delta, Delta[12], "log(CP1)")))

acf(coredata(lCP1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for log(CP1)")))
acf(coredata(dlCP1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta, "log(CP1)")))
acf(coredata(dlCP12), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta[12], "log(CP1)")))
acf(coredata(dlCP12_1), type='partial', lag=maxlag, ylab="", main=expression(paste("PACF for ", Delta,Delta[12], "log(CP1)")))

Model Estimation

Estimate models on the first-differenced log series (any additional differencing is specified inside arima)

C1 <- arima(dlCP1, order=c(2,1,3), seasonal=list(order=c(2,1,2), period=12))
tsdiag(C1,gof.lag=36)

BIC(C1)
## [1] -6616.162
AIC(C1)
## [1] -6661.412
C2 <- arima(dlCP1, order=c(2,0,3), seasonal=list(order=c(2,1,2), period=12))
tsdiag(C2,gof.lag=36)

BIC(C2)
## [1] -6634.68
AIC(C2)
## [1] -6679.945
C3 <- arima(dlCP1, order=c(2,0,2), seasonal=list(order=c(0,1,2), period=12))
tsdiag(C3,gof.lag=36)

BIC(C3)
## [1] -6651.029
AIC(C3)
## [1] -6682.715

Estimate models on the twice-differenced data

Cb1 <- arima(dlCP12_1,order=c(3,0,3),seasonal=list(order=c(1,0,1),period=12))
Cb1
## 
## Call:
## arima(x = dlCP12_1, order = c(3, 0, 3), seasonal = list(order = c(1, 0, 1), 
##     period = 12))
## 
## Coefficients:
##          ar1      ar2     ar3      ma1     ma2      ma3    sar1     sma1
##       1.3688  -0.8769  0.4702  -1.0626  0.7406  -0.4245  0.0298  -0.8335
## s.e.  0.2732   0.3776  0.1869   0.2625  0.2940   0.1334  0.0494   0.0301
##       intercept
##           0e+00
## s.e.      1e-04
## 
## sigma^2 estimated as 3.136e-06:  log likelihood = 3351.99,  aic = -6683.98
tsdiag(Cb1,gof.lag=48)

BIC(Cb1)
## [1] -6638.715
AIC(Cb1)
## [1] -6683.979

twice differenced data

Cb2 <- arima(dlCP12_1,order=c(3,0,2),seasonal=list(order=c(2,0,1),period=12))
Cb2
## 
## Call:
## arima(x = dlCP12_1, order = c(3, 0, 2), seasonal = list(order = c(2, 0, 1), 
##     period = 12))
## 
## Coefficients:
##          ar1     ar2      ar3      ma1      ma2    sar1     sar2     sma1
##       0.9414  0.1398  -0.1112  -0.6369  -0.1604  0.0361  -0.0005  -0.8330
## s.e.  0.2907  0.3185   0.0597   0.2900   0.2299  0.0519   0.0478   0.0352
##       intercept
##           0e+00
## s.e.      1e-04
## 
## sigma^2 estimated as 3.161e-06:  log likelihood = 3349.44,  aic = -6678.87
tsdiag(Cb2,gof.lag=48)

BIC(Cb2)
## [1] -6633.61
AIC(Cb2)
## [1] -6678.875

Search for the best model using the BIC criterion

?auto.arima

library("forecast")
Cpa <- auto.arima(dlCP12_1, ic="bic")
## Warning in auto.arima(dlCP12_1, ic = "bic"): Unable to fit final model
## using maximum likelihood. AIC value approximated
Cpa
## Series: dlCP12_1 
## ARIMA(2,0,2)(2,0,1)[12] with zero mean     
## 
## Coefficients:
##          ar1     ar2     ma1      ma2     sar1     sar2     sma1
##       0.2459  0.6514  0.0497  -0.4985  -0.1002  -0.0864  -0.7229
## s.e.  0.0723  0.0641  0.0808   0.0602   0.0455   0.0394   0.0376
## 
## sigma^2 estimated as 3.128e-06:  log likelihood=3359.44
## AIC=-6702.76   AICc=-6702.55   BIC=-6666.55
#AIC=-6702.76   AICc=-6702.55   BIC=-6666.55

Cpb <- auto.arima(dlCP12, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
Cpb
## Series: dlCP12 
## ARIMA(1,1,1)(2,0,1)[12]                    
## 
## Coefficients:
##          ar1      ma1    sar1     sar2     sma1
##       0.9436  -0.6940  0.0231  -0.0066  -0.8153
## s.e.  0.0209   0.0487  0.0528   0.0482   0.0367
## 
## sigma^2 estimated as 3.209e-06:  log likelihood=3344.39
## AIC=-6676.78   AICc=-6676.65   BIC=-6649.62
#AIC=-6676.78   AICc=-6676.65   BIC=-6649.62

Cpd <- auto.arima(lCP1, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
Cpd
## Series: lCP1 
## ARIMA(0,2,3)(2,0,0)[12]                    
## 
## Coefficients:
##           ma1      ma2      ma3    sar1    sar2
##       -0.6501  -0.0511  -0.1354  0.3233  0.2567
## s.e.   0.0380   0.0496   0.0432  0.0371  0.0391
## 
## sigma^2 estimated as 3.52e-06:  log likelihood=3370
## AIC=-6728.01   AICc=-6727.89   BIC=-6700.75
#AIC=-6728.01   AICc=-6727.89   BIC=-6700.75

Cpc <- auto.arima(dlCP1, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
Cpc
## Series: dlCP1 
## ARIMA(0,1,3)(2,0,0)[12]                    
## 
## Coefficients:
##           ma1      ma2      ma3    sar1    sar2
##       -0.6501  -0.0511  -0.1354  0.3233  0.2567
## s.e.   0.0380   0.0496   0.0432  0.0371  0.0391
## 
## sigma^2 estimated as 3.52e-06:  log likelihood=3370
## AIC=-6728   AICc=-6727.88   BIC=-6700.75
#AIC=-6728   AICc=-6727.88   BIC=-6700.75

seasonally differenced data

C1p <- arima(dlCP12,order=c(4,1,3),seasonal=list(order=c(1,1,2),period=12))
C1p
## 
## Call:
## arima(x = dlCP12, order = c(4, 1, 3), seasonal = list(order = c(1, 1, 2), period = 12))
## 
## Coefficients:
## Warning in sqrt(diag(x$var.coef)): NaNs produced
##           ar1     ar2     ar3      ar4     ma1      ma2      ma3    sar1
##       -0.2974  0.5807  0.7846  -0.1614  0.6047  -0.2585  -0.7062  0.0264
## s.e.      NaN  0.0532     NaN   0.0508     NaN   0.0275      NaN  0.0503
##          sma1    sma2
##       -1.8128  0.8129
## s.e.   0.0469  0.0439
## 
## sigma^2 estimated as 3.194e-06:  log likelihood = 3245.25,  aic = -6468.5
tsdiag(C1p,gof.lag=36)

BIC(C1p)
## [1] -6418.904
AIC(C1p)
## [1] -6468.5

#install.packages('forecast')

library(forecast)
C1.fcast<-forecast(C1,h=24)
C2.fcast<-forecast(C2,h=24)
C3.fcast<-forecast(C3,h=24)

Cb1.fcast<-forecast(Cb1,h=24)
Cb2.fcast<-forecast(Cb2,h=24)

C1p.fcast<-forecast(C1p,h=24)

Construct and Plot Forecasts

par(mfrow=c(2,3), cex=0.75, mar=c(2,4,3,1))

plot(Cb1.fcast, xlim=c(2013,2016))
lines(diff(diff(log(CPall,12))))

plot(Cb2.fcast, xlim=c(2013,2016))
lines(diff(diff(log(CPall,12))))


plot(C1.fcast, xlim=c(2013,2016))
lines(log(CPall,12))

plot(C2.fcast, xlim=c(2013,2016))
lines(log(CPall,12))

plot(C3.fcast, xlim=c(2013,2016))
lines(log(CPall,12))
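
As in the unemployment-rate exercise, training-set accuracy provides one more basis for comparing the CPI candidates. A brief sketch mirroring the accuracy() step used for CU1 and CU2 above:

library(forecast)
# In-sample (training-set) accuracy of the two models for the twice-differenced CPI series
accuracy(Cb1.fcast)
accuracy(Cb2.fcast)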