{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) So my R markdown started to not allow me to use it so this is my learning log 16 other than the fact the code is showing not the output.

Faraway

To understand these trends we looked at the package faraway and the dataset airpass to determine which trend it was.

install.packages("faraway")
library("faraway")
data(airpass)
attach(airpass)
head(airpass)
plot(pass~year,type="l")
mod<-lm(pass~year)
summary(mod)
res<-residuals(mod)
plot(res)
abline(0,0)
plot(mod)
mod2<-lm(pass~year+I(year^2))
summary(mod2)
plot(mod2)
coef(mod) %*% c(1,62)
coef(mod) %*% c(1,62,62^2)

We determined that airpass has a quadratic trend and that low p value means that there is a relationship. The year^2 causes year to decrease significates because it accounts for some of the significates in year and you must keep both.

Next we looked at the dataset aatemp to see that the trend was for that package.

install.packages("faraway")
library("faraway")
data(aatemp)
attach(aatemp)
head(aatemp)
plot(temp~year,type="l")
mod<-lm(temp~year)
summary(mod)
res<-residuals(mod)
plot(res)
abline(0,0)
plot(mod)
mod2<-lm(temp~year+I(year^2))
summary(mod2)
plot(mod2)
coef(mod) %*% c(1,2002)
coef(mod2) %*% c(1,2002,2002^2)
newdata1<-data.frame(year=2002)
predict(mod, newdata=newdata1, interval = "confidence")
predict(mod2, newdata=newdata1, interval="confidence")
plot(temp~year, type="l")
lines(mod$fitted.values~year,col="red")
lines(mod2$fitted.values~year,col="blue")

We determined that this was a linear trend not a quadratic.

Next we looked at divusa data.

data(divusa)
attach(divusa)
head(divusa)
plot(birth~year,type="l")
mod<-lm(birth~year)
summary(mod)
res<-residuals(mod)
plot(res)
abline(0,0)
plot(mod)
mod2<-lm(birth~year+I(year^2))
summary(mod2)
plot(mod2)
mod3<-lm(birth~year+poly(year,3),data=divusa)
plot(mod3)
summary(mod3)
mod4<-lm(birth~year+poly(year,4),data=divusa)
plot(mod4)
summary(mod4)
mod5<-lm(birth~year+poly(year,5),data=divusa)
plot(mod5)
summary(mod5)
mod6<-lm(sqrt(birth)~year+poly(year,6),data=divusa)
plot(mod6)
summary(mod6)
mod8<-lm(sqrt(birth)~year+poly(year,8),data=divusa)
plot(mod8)
summary(mod8)
coef(mod) %*% c(1,2000)
coef(mod2) %*% c(1,2000,2000^2)
newdata1<-data.frame(year=2000)
predict(mod, newdata=newdata1, interval = "confidence")
predict(mod8, newdata=newdata1, interval="confidence")
plot(birth~year, type="l")
lines(mod$fitted.values~year,col="red")

It took add multiple time^ but eventurally we got to a good mod with poly(year,8). This means it was a poly trend.

The class than looked at auto correlation.

Auto Correlation

If your resids through time are correlated they have auto correlation.

Positive auto corr a pos. resid is likely to be followed by another pos. a neg. resid is likely to be followed by another neg.

Negative auto corr a pos. resid is likely to be followed by a neg. a neg. resid is likely to be followed by a pos.

Detecting. Plot resids through time Durbin Watson statistic (pvalue)

Durbin Watson Test

$ H_0$: no autocorr. $ H_a$: there is autocorr or there is neg autocorr reject if 4-d is small or there is pos autocorr reject if d is small

DW test stat

d= Sum of pairs in time/ sum of errors in time squared

d=(Sum(E_t-E_(t-1))2)/(sum(E_t2))

The following code downloads the package:

install.packages("lmtest")
library("lmtest")
dwtest((mod))

Now lets show this in the linear trend dataset that we showed above.

data(aatemp)
attach(aatemp)
head(aatemp)
plot(temp~year,type="l")
mod<-lm(temp~year)
summary(mod)
res<-residuals(mod)
plot(res)
abline(0,0)
plot(mod)
mod2<-lm(temp~year+I(year^2))
summary(mod2)
dwtest(mod)
dwtest(mod2)

As shown by the summary the Pvalue is small so you would reject the null and say that there is autocorr.