Today in class we covered polynomial trends and autocorrelation. A time series can be written as \(Y_t=TR_t+\epsilon_t\): the value at time t equals the trend at time t plus the error term at time t. The trend can be absent (a constant level), linear, quadratic, or a \(p^{th}\) order polynomial. A linear trend, \(TR_t=\beta_0+\beta_1t\), is either positive or negative, depending on the sign of \(\beta_1\). A quadratic trend, \(TR_t=\beta_0+\beta_1t+\beta_2t^2\), can show growth at an increasing rate, growth at a decreasing rate, decline at an increasing rate, or decline at a decreasing rate. The \(p^{th}\) order polynomial trend looks like: \(TR_t=\beta_0+\beta_1t+\beta_2t^2+\dots+\beta_pt^p\). In general, we use a \(p^{th}\) order polynomial with \(p\geq3\) if we see a reversal in curvature.
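To make the four quadratic shapes concrete, here is a small sketch in base R. The coefficients, the time index `tt`, and the helper `describe_trend` are all made up for illustration; none of them come from class or from the aatemp data. The idea is that growth or decline is the sign of the period-to-period change, and the rate is increasing when the size of that change gets bigger over time.

```r
# Time index and coefficients are made-up values for illustration only
tt <- 1:50
quad_trend <- function(b0, b1, b2) b0 + b1 * tt + b2 * tt^2

# Hypothetical helper: label a trend using its first differences
describe_trend <- function(tr) {
  slope <- diff(tr)  # period-to-period change in the trend
  direction <- if (all(slope > 0)) "growth" else if (all(slope < 0)) "decline" else "mixed"
  # the "rate" is the size of the change, so compare |slope| over time
  rate <- if (all(diff(abs(slope)) > 0)) "increasing" else "decreasing"
  paste(direction, "at", rate, "rate")
}

shapes <- c(
  describe_trend(quad_trend(10, 0.5, 0.02)),     # beta_1 > 0, beta_2 > 0
  describe_trend(quad_trend(10, 2.0, -0.015)),   # beta_1 > 0, beta_2 < 0
  describe_trend(quad_trend(100, -0.5, -0.02)),  # beta_1 < 0, beta_2 < 0
  describe_trend(quad_trend(100, -2.0, 0.015))   # beta_1 < 0, beta_2 > 0
)
print(shapes)
```

The coefficients were picked so that each trend keeps a single direction over the 50 periods; a quadratic that turns around inside the window would come back as "mixed", and the rate label would not mean much for it.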
When fitting these models, if the \(yr^2\) term is significant, then we need to keep the lower-order term yr as well, even if yr itself is no longer significant. We can check out polynomial trends using the aatemp data.
library(faraway)
data(aatemp)
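# scatter plot of temperature against year; attach() is not needed because each call passes data = aatemp
plot(temp ~ year, data = aatemp)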
tempmod<-lm(temp~year, data=aatemp)
plot(tempmod)
summary(tempmod)
##
## Call:
## lm(formula = temp ~ year, data = aatemp)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.9843 -0.9113 -0.0820  0.9946  3.5343
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.005510   7.310781   3.284  0.00136 **
## year         0.012237   0.003768   3.247  0.00153 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.466 on 113 degrees of freedom
## Multiple R-squared: 0.08536, Adjusted R-squared: 0.07727
## F-statistic: 10.55 on 1 and 113 DF, p-value: 0.001533
To start, we fit a linear model. The coefficient on year is significant, so there is at least a linear relationship, although the model explains only about 8.5% of the variation in temperature (multiple R-squared of 0.08536). The residuals vs fitted plot has a wave shape, which suggests a possible cubic relationship. To see if this is true, we can fit a cubic model and check out the p-values.
tempmod3<-lm(temp~ year + I(year^2)+I(year^3), data=aatemp)
summary(tempmod3)
##
## Call:
## lm(formula = temp ~ year + I(year^2) + I(year^3), data = aatemp)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.8557 -0.9646 -0.1552  1.0485  4.1538
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.959e+04  1.734e+04   2.283   0.0243 *
## year        -6.159e+01  2.694e+01  -2.286   0.0241 *
## I(year^2)    3.197e-02  1.395e-02   2.291   0.0238 *
## I(year^3)   -5.527e-06  2.407e-06  -2.296   0.0236 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.443 on 111 degrees of freedom
## Multiple R-squared: 0.1298, Adjusted R-squared: 0.1063
## F-statistic: 5.518 on 3 and 111 DF, p-value: 0.001436
plot(tempmod3)
Looking first at the summary, we see that each term is significant at the 0.05 level, and the adjusted R-squared rises from 0.07727 for the linear model to 0.1063, so the cubic model fits better than the linear one. To check our model assumptions we look at the diagnostic plots of the model. The residuals look nearly homoscedastic, which is what we want, and each subsequent plot is also pretty good. So we will use the cubic model for our data.
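Besides reading the individual p-values, one way to compare a linear and a cubic fit is a partial F-test with `anova()`, which tests whether the squared and cubed terms together improve on the linear model. The sketch below uses simulated data on a rescaled time index, not aatemp, and the coefficients and noise level are assumptions chosen only to make the example run.

```r
set.seed(7)
# Simulated data (not aatemp): a cubic trend plus noise, values are made up
tt <- seq(-1, 1, length.out = 100)
y <- 1 + 2 * tt - 3 * tt^2 + 4 * tt^3 + rnorm(100, sd = 0.5)

fit1 <- lm(y ~ tt)                        # linear trend
fit3 <- lm(y ~ tt + I(tt^2) + I(tt^3))    # cubic trend

# Partial F-test: H0 says the tt^2 and tt^3 coefficients are both zero
cmp <- anova(fit1, fit3)
print(cmp)
p_value <- cmp[["Pr(>F)"]][2]
```

A small `p_value` here says the higher-order terms are worth keeping as a group. The same call pattern should carry over to two nested models fit to the same data.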
If we want to forecast a value for 2002, we can do that using the coef command and matrix multiplication.
coef(tempmod3)%*%c(1,2002,2002^2,2002^3)
##          [,1]
## [1,] 47.50484
This tells us that in 2002 we expect to see a mean temperature of 47.505 degrees F.
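The matrix multiplication is the same calculation that `predict()` does for us. Here is a minimal sketch showing the two agree, using simulated data rather than aatemp; the variable names and coefficients are made up for the example.

```r
set.seed(1)
# Simulated data (not aatemp): a small cubic trend plus noise
x <- 1:30
y <- 2 + 0.5 * x - 0.03 * x^2 + 0.001 * x^3 + rnorm(30, sd = 0.5)
fit <- lm(y ~ x + I(x^2) + I(x^3))

x_new <- 32  # a time point just past the end of the data

# Forecast by hand: coefficients times (1, x, x^2, x^3)
by_hand <- drop(coef(fit) %*% c(1, x_new, x_new^2, x_new^3))

# Same forecast from predict(); newdata needs a column named like the predictor
by_predict <- unname(predict(fit, newdata = data.frame(x = x_new)))

same <- isTRUE(all.equal(by_hand, by_predict))
print(same)
```

One reason to prefer `predict()` is that adding `interval = "prediction"` also returns lower and upper bounds around the forecast, which the matrix multiplication alone does not give.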
Secondly, we covered autocorrelation. If the residuals through time are correlated with each other, you have autocorrelation. Positive autocorrelation is when positive residuals tend to follow other positive residuals and negative residuals tend to follow other negative residuals. Negative autocorrelation is when you see positive residuals following negative residuals and vice versa. Autocorrelation is bad because we assume the residuals are independent.
To detect autocorrelation we can either inspect the residuals visually or do a Durbin-Watson test. For our purposes we are only worried about first-order autocorrelation, which is how \(\epsilon_t\) is related to the error just before it, \(\epsilon_{t-1}\). Under the Durbin-Watson test, \(H_0:\) no autocorrelation. Our alternative can be that there is autocorrelation, that there is positive autocorrelation, or that there is negative autocorrelation. The test statistic d lies between 0 and 4, and a value near 2 is consistent with no autocorrelation. If d is small, there is evidence of positive autocorrelation; if 4-d is small, there is evidence of negative autocorrelation.
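To see where d comes from, it can be computed by hand from the residuals as \(d=\sum_{t=2}^{n}(e_t-e_{t-1})^2/\sum_{t=1}^{n}e_t^2\), which is roughly \(2(1-r)\), where r is the lag-1 correlation of the residuals. The sketch below checks this on simulated data with positively autocorrelated errors; the AR(1) coefficient of 0.8 and the trend are assumptions made up for the example, not anything from aatemp.

```r
set.seed(42)
n <- 200
rho <- 0.8  # assumed first-order autocorrelation, for illustration only

# Simulate AR(1) errors: each error carries over part of the previous one
eps <- numeric(n)
eps[1] <- rnorm(1)
for (i in 2:n) eps[i] <- rho * eps[i - 1] + rnorm(1)

tt <- 1:n
y <- 10 + 0.05 * tt + eps   # linear trend plus autocorrelated errors
fit <- lm(y ~ tt)

e <- resid(fit)
d <- sum(diff(e)^2) / sum(e^2)        # Durbin-Watson statistic by hand
r1 <- sum(e[-1] * e[-n]) / sum(e^2)   # lag-1 autocorrelation of the residuals
print(c(d = d, approx = 2 * (1 - r1)))
```

With strongly positive autocorrelation like this, d should land well below 2. The by-hand value only gives the statistic; the p-value still has to come from a test function.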
library(lmtest)
dwtest(tempmod3, alternative = "greater")
##
## Durbin-Watson test
##
## data: tempmod3
## DW = 1.7171, p-value = 0.03464
## alternative hypothesis: true autocorrelation is greater than 0
The p-value of 0.03464 is below 0.05, so at the 5% level we reject \(H_0\) and conclude there is evidence of positive autocorrelation. If we change the alternative to "less", we get a very large p-value, which means there is no evidence of negative autocorrelation.