Q1) Consider the data given by the file named KoreanImports.txt. The data is quarterly time-series of 67 observations (starting with the first quarter of 1987) of the following variables (in the given order).
APP= US Apparel Imports from Korea. NONAPP= US NonApparel Imports from Korea. KOREAGDP=Korean GDP. USPROD= US Production Index. USLCOST= US Labor Cost Index.
KoreanImport = read.table('/Users/shuaibismail/Documents copy/809/midterm/KoreanImports.txt', header = FALSE)
head(KoreanImport)
KoreanImport$APP = KoreanImport$V1
KoreanImport$NONAPP = KoreanImport$V2
KoreanImport$KOREAGDP = KoreanImport$V3
KoreanImport$USPROD = KoreanImport$V4
KoreanImport$USLCOST = KoreanImport$V5
KoreanImport = KoreanImport[6:10]
head(KoreanImport)
Yes, there may be a multicollinearity issue between KOREAGDP and USPROD: the pairs plot and the correlation matrix (r ≈ 0.97) show a strong linear relationship between them.
pairs(KoreanImport)
cor(KoreanImport)
## APP NONAPP KOREAGDP USPROD USLCOST
## APP 1.00000000 0.52473708 0.01110722 0.1531741 -0.22466635
## NONAPP 0.52473708 1.00000000 0.52417788 0.6420864 0.09698359
## KOREAGDP 0.01110722 0.52417788 1.00000000 0.9723125 0.42414357
## USPROD 0.15317410 0.64208635 0.97231253 1.0000000 0.39132073
## USLCOST -0.22466635 0.09698359 0.42414357 0.3913207 1.00000000
Scaling becomes necessary since the predictor values range from about 86 to 30,000; if this isn't done, the coefficients sit on very different scales and are hard to compare.
KoreanImport1 = KoreanImport
KoreanImport1[, 2:5] = scale(KoreanImport[, 2:5])
m_regressor = lm(formula = APP~.,
data = KoreanImport1)
summary(m_regressor)
##
## Call:
## lm(formula = APP ~ ., data = KoreanImport1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -76.905 -25.096 -6.814 19.726 132.285
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 128.398 4.584 28.009 < 2e-16 ***
## NONAPP 19.856 7.327 2.710 0.00869 **
## KOREAGDP -82.701 23.955 -3.452 0.00101 **
## USPROD 78.819 26.479 2.977 0.00415 **
## USLCOST -8.930 5.169 -1.728 0.08906 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37.52 on 62 degrees of freedom
## Multiple R-squared: 0.4714, Adjusted R-squared: 0.4373
## F-statistic: 13.82 on 4 and 62 DF, p-value: 4.079e-08
128.398 is the intercept: the expected value of APP when all the (scaled) predictors equal zero, i.e., when each predictor is at its mean.
For NONAPP, the β1 coefficient is 19.856: holding all other predictors constant, a one standard-deviation increase in US Non-Apparel Imports from Korea is associated with a 19.856 increase in expected US Apparel Imports (APP). This is evidence of a positive linear relationship.
For KOREAGDP, the β2 coefficient is -82.701: holding all other predictors constant, a one standard-deviation increase in Korean GDP is associated with an 82.701 decrease in expected APP. This is evidence of a negative linear relationship between Korean GDP and US Apparel Imports.
For USPROD, the β3 coefficient is 78.819: holding all other predictors constant, a one standard-deviation increase in the US Production Index is associated with a 78.819 increase in expected APP. This is evidence of a positive linear relationship between the US Production Index and US Apparel Imports.
For USLCOST, the β4 coefficient is -8.930: holding all other predictors constant, a one standard-deviation increase in the US Labor Cost Index is associated with an 8.930 decrease in expected APP. This is evidence of a negative linear relationship between these two variables.
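A quick way to verify this reading of the standardized coefficients is to compare predictions at the scaled means with predictions one standard deviation higher in a single predictor; a minimal sketch using the fitted m_regressor (the small data frames below are illustrative):
base = data.frame(NONAPP = 0, KOREAGDP = 0, USPROD = 0, USLCOST = 0)  # all scaled predictors at their means
plus1 = transform(base, NONAPP = 1)                                   # one SD higher in NONAPP only
predict(m_regressor, plus1) - predict(m_regressor, base)              # equals the NONAPP coefficient, about 19.856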
confint(m_regressor, level=.90)
## 5 % 95 %
## (Intercept) 120.743133 136.0523893
## NONAPP 7.620534 32.0908346
## KOREAGDP -122.701972 -42.7009837
## USPROD 34.603811 123.0343131
## USLCOST -17.561719 -0.2982608
confint(m_regressor)
## 2.5 % 97.5 %
## (Intercept) 119.234182 137.561341
## NONAPP 5.208627 34.502741
## KOREAGDP -130.587242 -34.815714
## USPROD 25.887689 131.750435
## USLCOST -19.263286 1.403306
confint(m_regressor, level=.99)
## 0.5 % 99.5 %
## (Intercept) 116.2154883 140.580034
## NONAPP 0.3835501 39.327818
## KOREAGDP -146.3619103 -19.041045
## USPROD 8.4508812 149.187243
## USLCOST -22.6673108 4.807331
From the histogram, the residuals look fairly normally distributed. However, the Shapiro-Wilk test says otherwise: with p-value = 0.02215, we reject (at the 5% level) the null hypothesis that the residuals are normally distributed.
shapiro.test(m_regressor$residuals)
##
## Shapiro-Wilk normality test
##
## data: m_regressor$residuals
## W = 0.95746, p-value = 0.02215
hist(m_regressor$residuals,breaks=20, col="steelblue")
The White test p-value is 0.1914, so we fail to reject the null hypothesis of constant variance (no heteroskedasticity).
library(skedastic)
white_lm(m_regressor, interactions = TRUE)
Box.test(m_regressor$residuals,lag = 20, type = 'Ljung')
##
## Box-Ljung test
##
## data: m_regressor$residuals
## X-squared = 108.12, df = 20, p-value = 4.319e-14
The Ljung-Box test p-value is essentially zero (4.3e-14), so the residuals are serially correlated rather than white noise, as is common for time-series data. The VIF values for Korean GDP and the US Production Index are greater than 10, indicating multicollinearity between these two predictors.
library(car)
## Loading required package: carData
vif(m_regressor)
## NONAPP KOREAGDP USPROD USLCOST
## 2.516747 26.899961 32.867384 1.252612
m2_regressor = lm(formula = APP~.-USPROD,
data = KoreanImport1)
summary(m2_regressor)
##
## Call:
## lm(formula = APP ~ . - USPROD, data = KoreanImport1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -94.043 -23.760 -7.633 23.045 135.865
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 128.398 4.862 26.410 < 2e-16 ***
## NONAPP 34.278 5.829 5.880 1.7e-07 ***
## KOREAGDP -13.700 6.407 -2.138 0.0364 *
## USLCOST -8.752 5.482 -1.596 0.1154
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 39.79 on 63 degrees of freedom
## Multiple R-squared: 0.3959, Adjusted R-squared: 0.3671
## F-statistic: 13.76 on 3 and 63 DF, p-value: 5.265e-07
After dropping USPROD, all VIF values are well below 10, so multicollinearity is no longer a concern.
vif(m2_regressor)
## NONAPP KOREAGDP USLCOST
## 1.416269 1.710699 1.252444
Compare this model to your model from part 2 in terms of model fit. Which one provides a better fit to data based on the residual standard error, Adjusted R-squared, AIC, and BIC measures?
Based on the residual standard error (37.52 vs. 39.79), the first (full) model provides a better fit.
Based on Adjusted R-squared (0.4373 vs. 0.3671), the first model provides a better fit.
Based on AIC (682.68 vs. 689.63), the first model provides a better fit.
Based on BIC (695.91 vs. 700.66), the first model provides a better fit.
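For convenience, the four measures can be collected side by side; a small sketch using the two fitted models from above:
data.frame(model = c("full (m_regressor)", "without USPROD (m2_regressor)"),
           RSE = c(summary(m_regressor)$sigma, summary(m2_regressor)$sigma),
           adj_R2 = c(summary(m_regressor)$adj.r.squared, summary(m2_regressor)$adj.r.squared),
           AIC = c(AIC(m_regressor), AIC(m2_regressor)),
           BIC = c(BIC(m_regressor), BIC(m2_regressor)))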
summary(m_regressor)
##
## Call:
## lm(formula = APP ~ ., data = KoreanImport1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -76.905 -25.096 -6.814 19.726 132.285
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 128.398 4.584 28.009 < 2e-16 ***
## NONAPP 19.856 7.327 2.710 0.00869 **
## KOREAGDP -82.701 23.955 -3.452 0.00101 **
## USPROD 78.819 26.479 2.977 0.00415 **
## USLCOST -8.930 5.169 -1.728 0.08906 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37.52 on 62 degrees of freedom
## Multiple R-squared: 0.4714, Adjusted R-squared: 0.4373
## F-statistic: 13.82 on 4 and 62 DF, p-value: 4.079e-08
summary(m2_regressor)
##
## Call:
## lm(formula = APP ~ . - USPROD, data = KoreanImport1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -94.043 -23.760 -7.633 23.045 135.865
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 128.398 4.862 26.410 < 2e-16 ***
## NONAPP 34.278 5.829 5.880 1.7e-07 ***
## KOREAGDP -13.700 6.407 -2.138 0.0364 *
## USLCOST -8.752 5.482 -1.596 0.1154
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 39.79 on 63 degrees of freedom
## Multiple R-squared: 0.3959, Adjusted R-squared: 0.3671
## F-statistic: 13.76 on 3 and 63 DF, p-value: 5.265e-07
AIC(m_regressor)
## [1] 682.6847
AIC(m2_regressor)
## [1] 689.6343
BIC(m_regressor)
## [1] 695.9129
BIC(m2_regressor)
## [1] 700.6578
#Plotting the confidence and prediction intervals
predint = predict(m2_regressor,interval="prediction")
## Warning in predict.lm(m2_regressor, interval = "prediction"): predictions on current data refer to _future_ responses
predlower = predint[,2]
predupper = predint[,3]
plot(KoreanImport$APP, col="lightgray", ylim=c(10,400))
lines(predict(m2_regressor), col="red")
lines(predlower,col="orange")
lines(predupper,col="orange")
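The comment above also mentions confidence intervals; they can be overlaid on the same plot in the same way (a sketch using m2_regressor; the color choice is arbitrary):
confint_fit = predict(m2_regressor, interval = "confidence")
lines(confint_fit[, "lwr"], col = "darkgreen")
lines(confint_fit[, "upr"], col = "darkgreen")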
timeS<-seq(1, length(KoreanImport1$APP))
m3_regressor<-lm(formula = APP~timeS+NONAPP+KOREAGDP+USLCOST,
data = KoreanImport1)
summary(m3_regressor)
##
## Call:
## lm(formula = APP ~ timeS + NONAPP + KOREAGDP + USLCOST, data = KoreanImport1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -92.487 -22.952 -7.278 22.530 135.305
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 111.806 66.603 1.679 0.0982 .
## timeS 0.488 1.954 0.250 0.8036
## NONAPP 33.553 6.552 5.121 3.18e-06 ***
## KOREAGDP -22.422 35.508 -0.631 0.5301
## USLCOST -9.440 6.173 -1.529 0.1313
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 40.09 on 62 degrees of freedom
## Multiple R-squared: 0.3965, Adjusted R-squared: 0.3575
## F-statistic: 10.18 on 4 and 62 DF, p-value: 2.116e-06
Adding the linear time trend (timeS) does not improve the fit: the trend term is not significant (p = 0.80) and the adjusted R-squared drops slightly (0.3575 vs. 0.3671 for the model without the trend); see the quick check after the plot below.
plot.ts(KoreanImport1$APP, col="gray")
lines(predict(m3_regressor),col="red")
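A quick numerical comparison of the trend model with the previous model, using the two fitted objects from above:
c(AIC_m2 = AIC(m2_regressor), AIC_m3 = AIC(m3_regressor))
c(BIC_m2 = BIC(m2_regressor), BIC_m3 = BIC(m3_regressor))
c(adjR2_m2 = summary(m2_regressor)$adj.r.squared, adjR2_m3 = summary(m3_regressor)$adj.r.squared)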
Q2) Consider the time series data given by the file 'dataQ2.txt'. The data has 300 monthly observations for a financial series. In all your analysis below, use only the first 290 observations of data as your training set and use the last 10 observations as your test set.
dataQ2 = read.table('/Users/shuaibismail/Documents copy/809/midterm/dataQ2.txt', header = FALSE)
head(dataQ2)
Train-test Split
dataQ2["Month"]<-rep(1:12, times=25)
train_data<- dataQ2[1:290,]
test_data<-dataQ2[291:300,]
#c(nrow(dataQ2), nrow(train_data), nrow(test_data))
plot.ts(dataQ2$V1, col='steelblue')
acf(dataQ2$V1, col='steelblue')
Box plot
boxplot(dataQ2$V1~dataQ2$Month, col="steelblue")
2. Obtain a polynomial time series model. In doing so, please write down the estimated model. Check if the residuals are white noise. Obtain forecasts for the last 10 observations.
Estimating a Polynomial model
timeQ<-seq(1, length(train_data$V1))
summary(poly_model<- lm(formula = V1~poly(timeQ, 3, raw=TRUE),
data = train_data))
##
## Call:
## lm(formula = V1 ~ poly(timeQ, 3, raw = TRUE), data = train_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9132 -1.0078 -0.0357 1.1122 3.1055
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.192e-01 3.539e-01 2.315 0.02134 *
## poly(timeQ, 3, raw = TRUE)1 -2.967e-02 1.051e-02 -2.821 0.00512 **
## poly(timeQ, 3, raw = TRUE)2 2.534e-04 8.388e-05 3.021 0.00274 **
## poly(timeQ, 3, raw = TRUE)3 -5.962e-07 1.895e-07 -3.146 0.00183 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.487 on 286 degrees of freedom
## Multiple R-squared: 0.03926, Adjusted R-squared: 0.02918
## F-statistic: 3.896 on 3 and 286 DF, p-value: 0.00942
The estimated model is y_t = 0.8192 - 0.02967*t + 0.0002534*t^2 - 0.0000005962*t^3 + e_t. The fit does not look good.
plot.ts(test_data$V1, col="gray")
lines(predict(poly_model),col="red")
The residuals are not white-noise
acf(poly_model$residuals, lag = 20)
The forecast does not look good either.
plot.ts(test_data$V1, col="gray")
lines(predict(poly_model),col="blue")
3. Obtain the periodogram of the time series data (de-trend it if needed) and identify the sine-cosine pairs which have the highest amplitudes. The 7th, 13th, 14th, and 15th harmonics have the highest amplitudes.
library(TSA)
##
## Attaching package: 'TSA'
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
order(periodogram(train_data$V1)$spec)
## [1] 110 112 91 78 46 50 95 59 45 51 131 129 146 138 57 102 84 116
## [19] 96 128 111 141 143 149 69 68 121 64 105 29 67 120 74 48 145 122
## [37] 109 2 140 148 117 101 76 41 43 72 75 9 130 70 103 90 88 107
## [55] 62 134 82 135 114 106 92 93 33 100 60 124 125 133 150 115 18 26
## [73] 89 36 126 3 34 132 24 22 23 83 19 21 63 147 25 98 73 144
## [91] 49 38 77 31 81 65 142 137 30 86 99 85 113 42 5 53 97 58
## [109] 123 136 118 66 139 35 79 55 71 39 94 104 1 108 44 11 119 40
## [127] 27 16 56 4 10 87 54 47 80 52 37 127 61 6 32 28 20 12
## [145] 8 17 15 14 13 7
The periodogram ordinates are largest at the 7th, 13th, and 14th harmonics; these are the hidden periodicities, and the regression below confirms they are statistically significant. Each harmonic j corresponds to roughly j complete cycles over the 290-month training period, i.e., a period of about 290/j months, rather than a pattern that repeats every j months. The 15th and 17th harmonics turn out not to be statistically significant.
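The dominant frequencies and their implied periods can also be read off the periodogram object directly; a minimal sketch, assuming the object returned by TSA::periodogram carries $freq and $spec components:
pg = periodogram(train_data$V1, plot = FALSE)   # same periodogram, without redrawing the figure
top = order(pg$spec, decreasing = TRUE)[1:5]    # indices of the five largest ordinates
data.frame(harmonic = top, frequency = pg$freq[top], period_months = 1/pg$freq[top])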
n<-290
sin1<-sin(2*pi*timeQ*7/n)
cos1<-cos(2*pi*timeQ*7/n)
sin2<-sin(2*pi*timeQ*13/n)
cos2<-cos(2*pi*timeQ*13/n)
sin3<-sin(2*pi*timeQ*14/n)
cos3<-cos(2*pi*timeQ*14/n)
sin4<-sin(2*pi*timeQ*15/n)
cos4<-cos(2*pi*timeQ*15/n)
sin5<-sin(2*pi*timeQ*17/n)
cos5<-cos(2*pi*timeQ*17/n)
summary(trig_model<-lm(V1~timeQ+sin1+cos1+sin2+cos2+sin3+cos3+sin4+cos4+sin5+cos5,
data=train_data))
##
## Call:
## lm(formula = V1 ~ timeQ + sin1 + cos1 + sin2 + cos2 + sin3 +
## cos3 + sin4 + cos4 + sin5 + cos5, data = train_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.99477 -0.41458 0.00067 0.40287 1.66130
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1196099 0.0757419 -1.579 0.1154
## timeQ 0.0007139 0.0004526 1.577 0.1158
## sin1 1.2645589 0.0532659 23.740 < 2e-16 ***
## cos1 0.4445201 0.0529338 8.398 2.36e-15 ***
## sin2 1.3770328 0.0530280 25.968 < 2e-16 ***
## cos2 0.1295426 0.0529338 2.447 0.0150 *
## sin3 0.2718404 0.0530146 5.128 5.51e-07 ***
## cos3 0.0280577 0.0529338 0.530 0.5965
## sin4 0.0881655 0.0530038 1.663 0.0974 .
## cos4 0.0424693 0.0529338 0.802 0.4231
## sin5 0.0288706 0.0529876 0.545 0.5863
## cos5 -0.0291144 0.0529338 -0.550 0.5828
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6374 on 278 degrees of freedom
## Multiple R-squared: 0.8285, Adjusted R-squared: 0.8217
## F-statistic: 122.1 on 11 and 278 DF, p-value: < 2.2e-16
trig_model2<-lm(V1~timeQ+sin1+cos1+sin2+cos2+sin3+cos3,
data=train_data)
summary(trig_model2)
##
## Call:
## lm(formula = V1 ~ timeQ + sin1 + cos1 + sin2 + cos2 + sin3 +
## cos3, data = train_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.97443 -0.44488 0.01976 0.41648 1.74848
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1124192 0.0756010 -1.487 0.138
## timeQ 0.0006645 0.0004514 1.472 0.142
## sin1 1.2639084 0.0532649 23.729 < 2e-16 ***
## cos1 0.4445695 0.0529343 8.399 2.24e-15 ***
## sin2 1.3766842 0.0530281 25.961 < 2e-16 ***
## cos2 0.1295920 0.0529343 2.448 0.015 *
## sin3 0.2715170 0.0530148 5.122 5.62e-07 ***
## cos3 0.0281071 0.0529343 0.531 0.596
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6374 on 282 degrees of freedom
## Multiple R-squared: 0.826, Adjusted R-squared: 0.8217
## F-statistic: 191.3 on 7 and 282 DF, p-value: < 2.2e-16
acf(trig_model2$residuals, lag = 20)
Prediction plot for the test data. The model didn't do well in predicting new data.
plot.ts(test_data$V1, col="gray", ylim=c(.5,3))
lines(predict(trig_model2),col="steelblue")
pred_poly = predict(poly_model, data.frame(timeQ = c(291:300)))
# named columns (using =, not <-) so predict() picks up the new sine/cosine values
pred_trig = predict(trig_model2, data.frame(timeQ = c(291:300),
sin1 = sin(2*pi*(7/n)*c(291:300)),
cos1 = cos(2*pi*(7/n)*c(291:300)),
sin2 = sin(2*pi*(13/n)*c(291:300)),
cos2 = cos(2*pi*(13/n)*c(291:300)),
sin3 = sin(2*pi*(14/n)*c(291:300)),
cos3 = cos(2*pi*(14/n)*c(291:300))
))
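Plotting the 10 held-out observations against both sets of forecasts makes the comparison concrete (a sketch; colors and labels are arbitrary):
plot(291:300, test_data$V1, type = "b", col = "gray40", xlab = "Month", ylab = "V1",
     ylim = range(test_data$V1, pred_poly, pred_trig))
lines(291:300, pred_poly, col = "red", type = "b")    # polynomial forecasts
lines(291:300, pred_trig, col = "blue", type = "b")   # trigonometric forecasts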
The Trig model has a better MAE
mae_poly<- mean(abs(test_data$V1 - pred_poly))
mae_poly
## [1] 2.953709
mae_trig<- mean(abs(test_data$V1 - pred_trig))
mae_trig
## [1] 0.8939527
The trig model gives much better predictions based on the mean squared error.
mse_poly<- mean((test_data$V1 - pred_poly)^2)
mse_poly
## [1] 9.515094
mse_trig<- mean((test_data$V1 - pred_trig)^2)
mse_trig
## [1] 1.048849
Trig model has a better MAPE.
mape_poly<- mean(abs(test_data$V1 - pred_poly)/test_data$V1)
mape_poly
## [1] 1.957668
mape_trig<- mean(abs(test_data$V1 - pred_trig)/test_data$V1)
mape_trig
## [1] 0.9221996
Q3)
GNP = read.table('/Users/shuaibismail/Documents copy/809/midterm/GNP_Quarterly.txt', header = FALSE)
attach(GNP)
head(GNP)
plot.ts(GNP)
There's no slow decay in the ACF, so this seems to be a stationary series.
acf(GNP)
2. This is an AR(1) process, as the PACF cuts off after the first lag.
pacf(GNP)
The fitted model can be written as Y_t = 0.0048 + 0.3787*Y_{t-1} + ε_t; note that R's arima() reports the series mean (0.0077) as the "intercept", so the implied constant is 0.0077*(1 - 0.3787) ≈ 0.0048.
With a t-value of 5.425 (0.3787/0.0698), the AR(1) coefficient is statistically significant.
ar1_model = arima(GNP,
order = c(1, 0, 0))
ar1_model
##
## Call:
## arima(x = GNP, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.3787 0.0077
## s.e. 0.0698 0.0012
##
## sigma^2 estimated as 9.801e-05: log likelihood = 562.47, aic = -1120.94
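As noted above, arima() reports the series mean as "intercept"; the constant term of the AR(1) equation can be recovered from the fitted object (a quick sketch):
mu = coef(ar1_model)["intercept"]    # reported "intercept" is the process mean
phi1 = coef(ar1_model)["ar1"]
unname(mu * (1 - phi1))              # implied constant C in Y_t = C + phi1*Y_{t-1} + e_t, about 0.0048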
plot(ar1_model$aic)
acf(ar1_model$residuals, main = 'ACF Plot of Residuals', na.action = na.pass)
Box-Ljung test of the residuals: the null hypothesis is that the residuals are white noise. With p-value = 0.7104, we fail to reject the null hypothesis.
Box.test(ar1_model$residuals, lag = 20, type = 'Ljung')
##
## Box-Ljung test
##
## data: ar1_model$residuals
## X-squared = 16.099, df = 20, p-value = 0.7104
Random walk model via the first-order difference: the ACF of the first difference does not look like white noise, so this is not a random walk process.
diff_GNP = diff(GNP$V1)
acf(diff_GNP)
The random-walk (with drift) fit gives an essentially zero drift estimate.
rwModel = arima(diff_GNP,
order = c(0,0,0))
rwModel
##
## Call:
## arima(x = diff_GNP, order = c(0, 0, 0))
##
## Coefficients:
## intercept
## -1e-04
## s.e. 9e-04
##
## sigma^2 estimated as 0.0001422: log likelihood = 526.76, aic = -1051.52
acf(rwModel$residuals, main = 'ACF Plot of the Random Walk Model (First Difference)')
pred_ar<-GNP$V1-ar1_model$residuals
plot.ts(GNP$V1)
lines(pred_ar,col="blue", type="b")
#pred_rW<-GNP$V1-rwModel$residuals
#plot.ts(GNP$V1)
#lines(pred_rW,col="blue", type="b")
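The commented-out lines above do not line up because rwModel was fitted to the differenced series, which is one observation shorter than GNP$V1. A sketch of the implied random-walk one-step-ahead predictions for the original series, using only objects defined above:
drift = coef(rwModel)["intercept"]              # estimated drift of the differenced series
pred_rW = head(GNP$V1, -1) + drift              # one-step-ahead predictions of GNP$V1[2:n]
plot.ts(GNP$V1)
lines(2:length(GNP$V1), pred_rW, col = "blue", type = "b")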
Q4)
set.seed(1)
n=100
variance = 2
Y = rep(0, n)
error_term = rnorm(n, 0, sqrt(variance))
for (i in 3:n){
Y[i] = 2 + .75 * Y[i-1] -.125 * Y[i-2] + error_term[i]
}
plot.ts(Y)
ACF Plot of the AR(2) series
acf(Y)
PACF plot of the series: it is evident that the series is an AR process of the second order, since the PACF cuts off after lag 2.
pacf(Y)
ar2_model = arima(Y,
order = c(2,0,0))
ar2_model
##
## Call:
## arima(x = Y, order = c(2, 0, 0))
##
## Coefficients:
## ar1 ar2 intercept
## 0.8618 -0.2676 5.5497
## s.e. 0.0970 0.1037 0.3271
##
## sigma^2 estimated as 1.757: log likelihood = -170.45, aic = 346.89
Investigating the residuals: it is evident from the ACF plot that the residuals are white noise.
acf(ar2_model$residuals, main = 'Plot of the AR(2) Model Residuals')
The Box-Ljung test also indicates white-noise residuals: with p-value = 0.9494, we fail to reject the null hypothesis.
Box.test(ar2_model$residuals, lag = 20, type = 'Ljung')
##
## Box-Ljung test
##
## data: ar2_model$residuals
## X-squared = 10.873, df = 20, p-value = 0.9494
Q5) Consider a time series Y_t. The sample autocorrelation at lag 1 for Y_t is estimated as 0.709. Also, the estimate of the mean (level) of the Y series is μ = 40.2. It is decided that Y_t can be modeled adequately by a first-order stationary autoregressive process, that is, Y_t = C + φ1*Y_{t-1} + ε_t, where the ε_t are white-noise terms. Based on the given information above and using properties of an AR(1) process: 1. Estimate the constant term C and the coefficient φ1, and write down the estimated AR(1) model.
Taking expectations of Y_t = C + φ1*Y_{t-1} + ε_t gives μ = C + φ1*μ + 0, i.e., μ = C/(1 - φ1).
For an AR(1), φ1 equals the lag-1 autocorrelation, so φ1 = 0.709 and μ = 40.2: 40.2 = C/(1 - 0.709), hence C = 40.2*0.291 = 11.6982.
Hence C = 11.6982, φ1 = 0.709, and the estimated AR(1) model is Y_t = 11.6982 + 0.709*Y_{t-1} + ε_t.
For an AR(2), Y_t = C + φ1*Y_{t-1} + φ2*Y_{t-2} + ε_t, so μ = C/(1 - φ1 - φ2). Keeping C = 11.6982 and φ1 = 0.709: 40.2 = 11.6982/(1 - 0.709 - φ2)
40.2*(0.291 - φ2) = 11.6982
11.6982 - 40.2*φ2 = 11.6982
-40.2*φ2 = 0
φ2 = 0
Y_t = 11.6982 + 0.709*Y_{t-1} + 0*Y_{t-2} + ε_t
Similarly, for an AR(3): 40.2 = 11.6982/(1 - 0.709 - 0 - φ3)
40.2*(0.291 - φ3) = 11.6982
-40.2*φ3 = 0
φ3 = 0
Y_t = 11.6982 + 0.709*Y_{t-1} + 0*Y_{t-2} + 0*Y_{t-3} + ε_t
The PACF at lag k measures the correlation between Y_t and Y_{t-k} after removing the effect of the intermediate lags. For an AR(1), the partial autocorrelation at lag 1 is φ1 = 0.709 and the PACF cuts off after lag 1, which is why the higher-order coefficients φ2 and φ3 come out as zero.
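This can be checked against the theoretical PACF of an AR(1) with φ1 = 0.709 (a sketch using stats::ARMAacf): it equals 0.709 at lag 1 and is zero at all higher lags.
ARMAacf(ar = 0.709, lag.max = 3, pacf = TRUE)   # theoretical PACF of an AR(1) with phi1 = 0.709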