The file austres.csv contains a time series of quarterly estimates of Australian resident population size for the period of 1975 to 1993. Code to read this in and convert it to a time series object is given below Plot the time series. The series has a marked increasing trend with time.

setwd("C:/doug/classes/geog6000")
austres <- read.csv("austres.csv", na.strings="")
library(forecast)
## Warning: package 'forecast' was built under R version 3.5.3

I used your suggested code rather than following the instructions. so, my start date is 1971 rather than 1975, though the end date is still 1995

austres.ts <- ts(austres$Residents, start=c(1971,2), freq=4)
plot(austres.ts)

Plot the first order difference of the time series to remove this trend.

austres.diff.1 <- diff(austres.ts)
plot(austres.diff.1, main="Australian Resident Count First Order Difference", lwd=2)

Has this adequately removed any trend in the 12 data? #I dont know, it looks like there is still a bit of upward trend from 82 to 89 or so, therefore I’ll second diff to be sure #

austres.diff.2 <- diff(austres.ts, differences = 2)
plot(austres.diff.2, main="Australian Resident Count Second Order Difference", lwd=2)

Make and plot the autocorrelation and partial autocorrelation functions of the final differenced time series.

Acf(austres.diff.2) 

Pacf(austres.diff.2)

# alternatively, captured in one screen: #

tsdisplay(austres.diff.2)

Describe these plots and any autocorrelation in the series

there is significant negative one period autocorrelation, and presumably 10th period and then positively in the 12th period in the data. However, I can think of no reason to speculate on a 10 quarter lag effect while a 12 quarter lag could be some sort of political based visa issue. The partial ACF appears to show some residual negative autocorrelation at one period. I’ll leave it at worrying only about the one period autocorr

Use the auto.arima() function from the forecast package to estimate the orders of an ARIMA model for this data, and report the results of this process

austres.arima <- auto.arima(austres.ts, max.p = 3, max.q = 3, max.d = 2, trace = TRUE, seasonal = TRUE) 
## 
##  ARIMA(2,2,2)(1,0,1)[4]                    : Inf
##  ARIMA(0,2,0)                              : 672.5046
##  ARIMA(1,2,0)(1,0,0)[4]                    : 662.0899
##  ARIMA(0,2,1)(0,0,1)[4]                    : 652.3751
##  ARIMA(0,2,1)                              : 653.134
##  ARIMA(0,2,1)(1,0,1)[4]                    : Inf
##  ARIMA(0,2,1)(0,0,2)[4]                    : 654.5584
##  ARIMA(0,2,1)(1,0,0)[4]                    : 652.1537
##  ARIMA(0,2,1)(2,0,0)[4]                    : 653.9808
##  ARIMA(0,2,1)(2,0,1)[4]                    : Inf
##  ARIMA(0,2,0)(1,0,0)[4]                    : 672.1348
##  ARIMA(1,2,1)(1,0,0)[4]                    : 653.4438
##  ARIMA(0,2,2)(1,0,0)[4]                    : 653.2852
##  ARIMA(1,2,2)(1,0,0)[4]                    : 654.795
## 
##  Best model: ARIMA(0,2,1)(1,0,0)[4]
austres.arima
## Series: austres.ts 
## ARIMA(0,2,1)(1,0,0)[4] 
## 
## Coefficients:
##           ma1    sar1
##       -0.6051  0.1921
## s.e.   0.0974  0.1075
## 
## sigma^2 estimated as 103.7:  log likelihood=-322.93
## AIC=651.86   AICc=652.15   BIC=659.26

Having obtained these orders, now build an ARIMA model using the function Arima()

austres.arima2 <- arima(austres.ts, order = c(0,2,1))

I excluded the seasonality for two reasons: population count seems to be season independent to me (the graph seems to confirm that) AND the autoarima process gave me seasonality of (1,0,0) which I believe is the same as no seasonality, correct?

tsdiag(austres.arima2)

Report the model coefficients and the model AIC A good ARIMA model will remove all autocorrelation in the model residuals. Plot these using a combination of the Acf() and residuals() functions. Use the Ljung-Box test to test for autocorrelation (Box.test()). Is there any remaining autocorrelation in the residuals?

Box.test(residuals(austres.arima2), type = 'Ljung')
## 
##  Box-Ljung test
## 
## data:  residuals(austres.arima2)
## X-squared = 0.051041, df = 1, p-value = 0.8213

Use the forecast() function to estimate the resident population for the next two years. Plot out these estimates and give the expected population size by the start (Q1) of 1995 and the 95% confidence interval of this estimate

austres.forecast <- forecast(austres.arima2, level = c(95), h = 7, fan  = FALSE)
austres.forecast
##         Point Forecast    Lo 95    Hi 95
## 1993 Q3       17704.73 17685.01 17724.44
## 1993 Q4       17747.96 17713.91 17782.01
## 1994 Q1       17791.19 17741.77 17840.60
## 1994 Q2       17834.42 17768.35 17900.48
## 1994 Q3       17877.64 17793.63 17961.66
## 1994 Q4       17920.87 17817.66 18024.08
## 1995 Q1       17964.10 17840.51 18087.70
plot(austres.forecast)