Introduction to data

The data set consists of quarterly (that is, four times yearly) observations of the earnings of Johnson & Johnson corporation from 1960 to 1980. The goal of this analysis is to forecast the earnings in the second quarter of 1981. The data is plotted below.

plot(jj, ylab = 'Earnings', main = 'Quarterly Earnings of Johnson & Johnson')

Transformation of response

Notice the variance of the data is not constant, so a log transformation will be applied.

plot(log(jj), ylab = 'Log of Earnings', main = 'Log Transformed Quarterly Earnings of Johnson & Johnson') 

Classical decomposition of series

The next step is to decompose the series into trend, seasonal, and random components.

plot(decompose(log(jj)))

mu <- decompose(log(jj))$trend
seas <- decompose(log(jj))$seasonal
x <- decompose(log(jj))$random    
head(x)
##             Qtr1        Qtr2        Qtr3        Qtr4
## 1960          NA          NA  0.19208032 -0.19734182
## 1961 -0.04101776  0.00384848

Notice that because the trend is estimated with a moving average, estimates of trend do not exist for the first and last two quarters in the data. To continue analysis, we will remove the missing values from the two components that will be forecasted: the trend and noise

x <- x[!is.na(x)]      
mu <- mu[!is.na(mu)]   

Test for whiteness of residuals

Now we test to see if there is a significant dependence structure remaining in the noise sequence.

Box.test(x, type="Ljung-Box", lag=20)  # Strong yes
Box-Ljung test

data: x X-squared = 93.167, df = 20, p-value = 2.061e-11

turning.point.test(x)                  # No
Turning Point Test

data: x statistic = 0, n = 80, p-value = 1 alternative hypothesis: non randomness

difference.sign.test(x)                # No
Difference Sign Test

data: x statistic = -0.19245, n = 80, p-value = 0.8474 alternative hypothesis: nonrandomness

rank.test(x)                           # No
Mann-Kendall Rank Test

data: x statistic = 0.21604, n = 80, p-value = 0.829 alternative hypothesis: trend

We see an emphatic “yes” from the first test, and no evidence of dependence from the last three tests. However, the last three primarily detect trend, so I’ll conclude an overall answer of “yes”.

Model the noise

As a preliminary investigation, we inspect the ACF and PACF of the noise sequence.

acf(x)                      

acf(x, type = 'partial')

The ACF decreases gradually, so it does not appear a moving average model is appropriate. The PACF drops off rapidly after a lag of 3, so an AR(3) model may be appropriate. However, this is not a very systematic approach, so we will test the AICC of many models.

bestAICC <- Inf
best_pq <- c(0,0)
for(p in 0:4){
  for(q in 0:4){
    tempAICC <- arima(x, order=c(p,0,q), include.mean=F)$aic
    print(paste0('The AICC of an ARMA(',as.character(p),',',as.character(q),') model is ',as.character(tempAICC),'.'))
    if(tempAICC < bestAICC){
      bestAICC <- tempAICC
      best_pq <- c(p,q)
    }
  }
}
print(paste0('The bestAICC value of ', bestAICC, ' occurs for an ARMA(',best_pq[1],',',best_pq[2],' model.'))
## [1] "The AICC of an ARMA(0,0) model is -172.029768258359."
## [1] "The AICC of an ARMA(0,1) model is -231.329886574165."
## [1] "The AICC of an ARMA(0,2) model is -236.736769339962."
## [1] "The AICC of an ARMA(0,3) model is -242.152203682292."
## [1] "The AICC of an ARMA(0,4) model is -246.257409267."
## [1] "The AICC of an ARMA(1,0) model is -183.935318127026."
## [1] "The AICC of an ARMA(1,1) model is -230.692978992472."
## [1] "The AICC of an ARMA(1,2) model is -229.392146801971."
## [1] "The AICC of an ARMA(1,3) model is -240.173199605519."
## [1] "The AICC of an ARMA(1,4) model is -251.447778247133."
## [1] "The AICC of an ARMA(2,0) model is -197.267117369976."
## [1] "The AICC of an ARMA(2,1) model is -246.208346988066."
## [1] "The AICC of an ARMA(2,2) model is -250.754188681511."
## [1] "The AICC of an ARMA(2,3) model is -262.924328648299."
## [1] "The AICC of an ARMA(2,4) model is -262.382501442061."
## [1] "The AICC of an ARMA(3,0) model is -267.202137161034."
## [1] "The AICC of an ARMA(3,1) model is -266.329071476036."
## [1] "The AICC of an ARMA(3,2) model is -272.786262993381."
## [1] "The AICC of an ARMA(3,3) model is -275.080648024962."
## [1] "The AICC of an ARMA(3,4) model is -273.32391803768."
## [1] "The AICC of an ARMA(4,0) model is -265.744076575595."
## [1] "The AICC of an ARMA(4,1) model is -275.585187211414."
## [1] "The AICC of an ARMA(4,2) model is -274.309309397555."
## [1] "The AICC of an ARMA(4,3) model is -273.368428901423."
## [1] "The AICC of an ARMA(4,4) model is -271.368426149774."
## [1] "The bestAICC value of -275.585187211414 occurs for an ARMA(4,1 model."

We see that the best value of AICC occurs for an ARMA(4,1) model. Let’s now extract the coefficients for this model.

model <- arima(x, order=c(4,0,1), include.mean=F)
model
phi_hat <- model$coef[1:4]
theta_hat <- model$coef[5]
sigma2_hat <- model$sigma2
## 
## Call:
## arima(x = x, order = c(4, 0, 1), include.mean = F)
## 
## Coefficients:
##           ar1      ar2      ar3     ar4      ma1
##       -0.1922  -0.3076  -0.2820  0.4810  -1.0000
## s.e.   0.1024   0.1037   0.1058  0.1092   0.0528
## 
## sigma^2 estimated as 0.001466:  log likelihood = 143.79,  aic = -275.59

The estimated parameters are \[\hat{\phi} = (-0.1921703, -0.3076474, -0.281964, 0.480996), \hat{\theta} = -0.9999749, \widehat{\sigma^2} = 0.0014662\]

Forecast the noise sequence

The next bit of code will do an innovation style prediction. We want to predict two quarters into the future, but because the moving average resulted in two missing values at the beginning and end of the noise sequence, we actually need to predict four quarters into the future.

H <- 4; n <- length(x)
xhat_n_hs <- hStepPred(x, 4, phi_hat, theta_hat)
plot(NA, NA, xlim = c(1,n+H), ylim = c(min(x)-(max(x)-min(x))*.05,max(x)+(max(x)-min(x))*.05), xlab='Time', ylab='J&J stock residuals', main='Forecasted residuals', type='l')
abline(h=0)
points(1:n,x,type='l')
points((n+1):(n+H), xhat_n_hs, col = 'blue', pch='+')

Reconstruct series

To begin reconstructing the series, we first forecast the trend with Holt’s method. Then add seasonality to the forecasted noise and mean.

holt_fit <- holt(mu, h=H) 
plot(holt_fit, ylab='Trend of log earnings')

log_pred = holt_fit$mean[4]+seas[2]+xhat_n_hs[4]

Undo the log transformation to obtain the final prediction.

pred <- exp(log_pred)
pred      # Here is the final prediction
## [1] 16.83805

The forecasted earning for Q2 1981 is 16.838055 per share.