\(x_t+1.6x_{t-1}+0.64x_{t-2}=w_t\)
\(x_1\) is the ACF of the above AR(2) series solved by using the initial conditions. \(\rho_1\) is the ACF solved by using R code ARMAacf()
h = c(0:10)
par(mfcol=c(1,2))
x1 <- (-1.25)^(-h)*(1+.2915*h)
plot(x1, type='h', xlab='h', main='ACF')
abline(h=0)
rho1 <- ARMAacf(ar=c(-1.6,-.64), ma = 0, lag.max = 10)
plot(rho1, type='h', xlab='h', main='ACF')
abline(h=0)
\(x_t-0.4x_{t-1}-0.45x_{t-2}=w_t\)
\(x_2\) is the ACF of the above AR(2) series solved by using the initial conditions. \(\rho_2\) is the ACF solved by using R code ARMAacf()
par(mfcol=c(1,2))
x2 <- 0.8766*(10/9)^(-h)+0.1234*(-2)^(-h)
plot(x2, type='h')
rho2 <- ARMAacf(ar=c(.4,.45), ma=0, 10)
plot(rho2,type='h')
\(x_t-1.2x_{t-1}+0.85x_{t-2}=w_t\)
\(x_3\) is the ACF of the above AR(2) series solved by using the initial conditions. \(\rho_3\) is the ACF solved by using R code ARMAacf()
polyroot(c(1, -1.2, 0.85))
## [1] 0.7058824+0.8235294i 0.7058824-0.8235294i
par(mfcol=c(1,2))
x3 <- ((1.08465)^(-h))*1.00241*cos(0.86217*h-0.06939)
plot(x3, type='h')
abline(h=0)
rho3 <- ARMAacf(ar=c(1.2,-0.85), ma=0, 10)
plot(rho3, type='h')
abline(h=0)
We plot the ACF and PACF of series a, while a represents the cardiovascular mortality series.
library(astsa)
a <- ts(cmort) #Convert to time series data
par(mfcol=c(1,1))
acf(a, 48, main="ACF of series a")
From the original data’s ACF, we can find that there’s a periodicity in the data, repeats almost every 20 lags.
par(mfcol=c(1,1))
acf(a, lag=48, type='partial', main="PACF of series a")
According to the PACF, we can tell the \(x_t\) almost has nothing to do with \(x_{t-3}\) and the data points that are even further, as a consequence, we possibly should build an AR model with lag equals to 2.
We use the R code ar.ols to fit an AR(2) to \(x_t\)
The fitted AR(2) model is \(x_t=\hat{\beta_0}+\hat{\beta_1x_{t-1}}+\hat{\beta_2x_{t-2}}+\epsilon_t\), \(\epsilon_t \sim(0,\sigma^2)\)
(regr = ar.ols(a, order=2, demean=F, intercept=T, na.action=na.pass)) #ignore the missing values
##
## Call:
## ar.ols(x = a, order.max = 2, na.action = na.pass, demean = F, intercept = T)
##
## Coefficients:
## 1 2
## 0.4286 0.4418
##
## Intercept: 11.45 (2.394)
##
## Order selected 2 sigma^2 estimated as 32.32
The estimated parameters are the following:
| \(\hat{\beta_0}\) | \(\widehat{\beta_1}\) | \(\widehat{\beta_2}\) |
|---|---|---|
| 11.45 | 0.4286 | 0.4418 |
The followings are the asymptotic-theory standard errors of the coeffifients estimates:
regr$asy.se.coef #standard errors
## $x.mean
## [1] 2.393673
##
## $ar
## [1] 0.03979433 0.03976163
To test if the model is fitted properly, we firstly plot the residuals then calculate the ACF.
plot(regr$resid)
acf(regr$resid, na.action=na.pass, main="ACF of residuals")
From the ACF, we know that \(x_t\) only relates to its value at the current time(lag equls to 0). In other words, the AR(2) model is well-fitted.
Second, we use Student’s t-test to verify the coefficients of the model are ststistically significant.
In our model \(x_t=\hat{\beta_0}+\hat{\beta_1}x_{t-1}+\hat{\beta_2}x_{t-2}+\epsilon_t\), we want to test the null hypothesis that whether \(\hat{\beta_1}\) and \(\hat{\beta_2}\) are equal to 0, which means these \(x_{t-1}\) and \(x_{t-2}\) and \(x_t\) are independent.
Then
\(t_{score}=\frac{\hat{\beta_i}-0}{SE_\hat{\beta_i}} \sim\tau_{n-2}\) , \(i=1,2\)
We calculate the \(t_{score}\) of these two coefficients.
regr$ar/regr$asy.se.coef$ar #sample>500 d.f. is regared as infinity
## , , 1
##
## [,1]
## [1,] 10.77014
## [2,] 11.11090
| \(t_{score}\text{ of }\hat{\beta_1}\) | \(t_{score}\text{ of }\hat{\beta_2}\) |
|---|---|
| 10.77014 | 11.11090 |
Both of them are larger than the t-statistic 1.645, which means the coefficients are statistically significant at 95% confidence level.
First of all, we import the csv file Data into R and named data. Then we transfer it to another data type “number.”
data <- read.csv("D:/Data.csv", header=T, sep=",")
X0 <- data[1:314];
X <- as.numeric(X0)
Plot original time series data of X.
par(mfcol=c(1,1))
plot(X, type="l", xlab="week", main='Time Series data of X')
Taking log of the original series, we name it Y. By doing so, it’s easier for us to compare values that cover a larger range which the original series X has. Besides, the qqplot and the histogram show a nearly normally distributed pattern of Y.
Y <- log(X)
plot(Y, type='l', xlab='week')
qqnorm(Y)
hist(Y)
Throughplot(Y), we find an obvious trend of series Y. As a result, we’re going to detrend our data for doing analysis subsequently.
First of all, we use cosine and sine functions as the independent variables in the regression fitted model:
\(Y=\hat{\alpha_1}cos(\frac{2\pi*t}{52})+\hat{\alpha_2}sin(\frac{2\pi*t}{52})+\epsilon_t\)
t <- 1:314
x1 <- cos(2*pi*t/52)
x2 <- sin(2*pi*t/52)
fit = lm(Y ~ x1 + x2)
summary(fit)
##
## Call:
## lm(formula = Y ~ x1 + x2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.28964 -0.44569 -0.03159 0.45123 1.31822
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.87178 0.03327 266.687 < 2e-16 ***
## x1 -0.70689 0.04691 -15.070 < 2e-16 ***
## x2 -0.29527 0.04718 -6.258 1.29e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5895 on 311 degrees of freedom
## Multiple R-squared: 0.4616, Adjusted R-squared: 0.4582
## F-statistic: 133.3 on 2 and 311 DF, p-value: < 2.2e-16
After taking regression fitting, we take the residuals of the model and name it Z1. Then plot Z1, Z1’s sample ACF and Z1’s sample PACF.
Z <- fit$residuals
plot(Z, type='l')
par(mfcol=c(1,1))
acf(Z,na.action = na.pass, lag=100) #sampled ACF
par(mfcol=c(1,1))
acf(Z,type="partial") #sampled PACF
According to the sample ACF of Z, we decide to construct an AR(5) model using Yule-Walker equations.
Z.yw = ar.yw(Z)
Z.yw$x.mean
## [1] 3.716949e-17
Z.yw$ar
## [1] 0.87777455 0.20798037 -0.06145854 0.13197222 -0.22672403
The 5 estimated coefficients are listed as below:
| \(\widehat{\phi_1}\) | \(\widehat{\phi_2}\) | \(\widehat{\phi_3}\) | \(\widehat{\phi_4}\) | \(\widehat{\phi_5}\) |
|---|---|---|---|---|
| 0.87777455 | 0.20798037 | -0.06145854 | 0.13197222 | -0.22672403 |
After constructing the AR(5) model, we then solve it and find its 5 roots. It turns out the polynimial has one real root and two complex conjugate roots.
z <- c(1,-Z.yw$ar) #coefficients of the polynomial
polyroot(z) #solve the polynomial
## [1] 1.143154-0.120283i -1.387205+0.000000i -0.158510-1.543143i
## [4] 1.143154+0.120283i -0.158510+1.543143i
Then we can find the pseudo period of the data.
b <- polyroot(z)
1/Mod(b) #r, the decreasing speed of the function
## [1] 0.8699702 0.7208739 0.6446363 0.8699702 0.6446363
Arg(b) #theta
## [1] -0.1048341 3.1415927 -1.6731562 0.1048341 1.6731562
(2*pi)/Arg(b) #pseudo period
## [1] -59.934530 2.000000 -3.755289 59.934530 3.755289
From the results above, the 2 pseudo periods of the fitted model are 59.934530 (weeks) and 3.755289 (weeks).
To test how well our AR(5) model fit the data, we firstly conduct Ljung-Box test to see if the residuals are normally distribured.
z <- Z.yw$resid
Box.test(z, type="Ljung-Box", lag=Z.yw$order)
##
## Box-Ljung test
##
## data: z
## X-squared = 10.543, df = 5, p-value = 0.06122
The p-value turns out to be 0.06122, which seems that our model successfully fits the data.
Then we calculate the ACF of the residuals.
plot(Z.yw$resid, type='l')
acf(Z.yw$resid, na.action=na.pass, lag=52, main="ACF of residuals")
We can find that the residuals are independent to each others at different time.
The red line is the model fitted ACF. We compare the sample ACF with our model fitted ACF, and find that the model fitted ACF perfectly matches the sample ACF until lags around 25. This means that our model can only explain the relation of the former 20 lags.
par(mfcol=c(1,1))
ar5_acf <- ARMAacf(ar=Z.yw$ar, ma=0, 52)
acf(Z, 52, main="Sample ACF & Model Fitted ACF")
lines(0:52, ar5_acf, col=2, lwd=2) #compare sample ACF with model ACF
The red dots represent model fitted PACF. We compare the sample PACF with our model fitted PACF, and find that the latter perfectly matches the former at lag 5. However, after lag 5 these two PACF seem to go in the different way.
par(mfcol=c(1,1))
ar5_pacf <- ARMAacf(Z.yw$ar, ma=0, 52, pacf=TRUE)
acf(Z, 52, type="partial", main="Sample PACF & Model Fitted PACF")
points(1:52, ar5_pacf, col=2, pch=16) #compare sample PACF with model PACF