library(TSA)

## 
## Attaching package: 'TSA'

## The following objects are masked from 'package:stats':
## 
##     acf, arima

## The following object is masked from 'package:utils':
## 
##     tar

library(forecast)

## Warning: package 'forecast' was built under R version 4.0.3

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## Registered S3 methods overwritten by 'forecast':
##   method       from
##   fitted.Arima TSA 
##   plot.Arima   TSA

data(JJ)
JJ

##           Qtr1      Qtr2      Qtr3      Qtr4
## 1960  0.710000  0.630000  0.850000  0.440000
## 1961  0.610000  0.690000  0.920000  0.550000
## 1962  0.720000  0.770000  0.920000  0.600000
## 1963  0.830000  0.800000  1.000000  0.770000
## 1964  0.920000  1.000000  1.240000  1.000000
## 1965  1.160000  1.300000  1.450000  1.250000
## 1966  1.260000  1.380000  1.860000  1.560000
## 1967  1.530000  1.590000  1.830000  1.860000
## 1968  1.530000  2.070000  2.340000  2.250000
## 1969  2.160000  2.430000  2.700000  2.250000
## 1970  2.790000  3.420000  3.690000  3.600000
## 1971  3.600000  4.320000  4.320000  4.050000
## 1972  4.860000  5.040000  5.040000  4.410000
## 1973  5.580000  5.850000  6.570000  5.310000
## 1974  6.030000  6.390000  6.930000  5.850000
## 1975  6.930000  7.740000  7.830000  6.120000
## 1976  7.740000  8.910000  8.280000  6.840000
## 1977  9.540000 10.260000  9.540000  8.729999
## 1978 11.880000 12.060000 12.150000  8.910000
## 1979 14.040000 12.960000 14.850000  9.990000
## 1980 16.200000 14.670000 16.020000 11.610000

Part (a)

Plot the time series and also the logarithm of the series. Argue that we should transform by logs to model the series.

ts.plot(JJ, main="The quarterly earnings per share for 1960-1980 
        of the U.S. company Johnson & Johnson", xlab = "Time", ylab= "Quarterly Earnings")

ts.plot(log(JJ), main="The LOGARITHMIC quarterly earnings per share for 1960-1980 
        of the U.S. company Johnson & Johnson", xlab = "Time", ylab= "Quarterly Earnings")

It can be seen that the ordinary time series plot shows higher variation at the higher values of the data series. On the other hand, the plot of the logarithmic values shows much more equal variation at all levels of the series. Hence, the logarithmic values will be used for all of the remaining modelling.

Part(b)

Investigate the stationarity of the series.

plot(diff(log(JJ)),  main="The differenced logarithmic quarterly earnings per share for 1960-1980 
     of the U.S. company Johnson & Johnson", ylab = 'Difference of Logarithmic Quarterly Earnings', type="o")

The data series is clearly not stationary as its variability and distribution is inconsistent throughout the years. We do not expect stationay series to have less variability in the middle of the series as this one does but we might entertain a stationary model and see where it leads us.

Part(c)

Calculate and graph the sample ACF of the first differences. Interpret the results.

acf(diff(log(as.vector(JJ))), main = "Sample ACF of the first differences")

In this quarterly series, the strongest autocorrelations are at the seasonal lags of 4, 8, 12, and 16. Clearly, we need to address the seasonality in this series.

Part(d)

Display the plot of seasonal differences and first differences. Interpret the plot.

seasonal.diff = diff(diff(log(JJ),lag=4))

plot(seasonal.diff,  main = "The seasonal & differenced logarithmic quarterly earnings per share 
     for 1960-1980 of the U.S. company Johnson & Johnson", ylab = 'Seasonal & First Difference', type='l')

points(y = seasonal.diff, x = time(seasonal.diff), pch = as.vector(season(seasonal.diff)))

The various quarters seem to be quite randomly distributed among high, middle, and low values, so that most of the seasonality is accounted for in the seasonal difference.

Part (e)

Graph and interpret the sample ACF of the seasonal differences with the first differences.

acf(as.vector(seasonal.diff), main="Sample ACF of the seasonal differences with the first differences")

They only significant autocorrelations are at lags 1 and 7. Lag 4 (the quarterly lag) is nearly significant.

Part (f)

Suggest a model to fit the data. Assess the significance of the estimated coefficients.

Model = arima(log(JJ), order = c(0,1,1), seasonal = list(order=c(0,1,1), period=4))
Model

## 
## Call:
## arima(x = log(JJ), order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 4))
## 
## Coefficients:
##           ma1     sma1
##       -0.6809  -0.3146
## s.e.   0.0982   0.1070
## 
## sigma^2 estimated as 0.007931:  log likelihood = 78.38,  aic = -152.75

Since both ACF of the differenced and seasonal differenced shows a cut off at lag 1, Hence, the model ARIMA(0,1,1)×(0,1,1)4 is fitted.

According to the output, it can be seen that both the seasonal and nonseasonal MA parameters are significant in this model.

Part (g)

Perform all of the diagnostic tests on the residuals.

checkresiduals(Model)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)(0,1,1)[4]
## Q* = 7.1195, df = 6, p-value = 0.3099
## 
## Model df: 2.   Total lags used: 8

qqnorm(residuals(Model))
qqline(residuals(Model))

shapiro.test(residuals(Model))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Model)
## W = 0.98583, p-value = 0.489

These diagnostic plots for residuals do not show any inadequacies with the model. We can see that from the ACF plot that most of the lags are within the significance level. On the other hand, the residuals are distributed with mean 0 and can be seen to be random as there is not clear pattern.

The Ljung-Box Test generates a p-value of 0.3099 > 0.05 = alpha. This supports that the residuals are independently distributed as the null hypothesis is not rejected.

The Shapiro-Wilk Normality test has a p-value = 0.489 > 0.05 = alpha which supports the SARIMA model used to fit the data. The QQ-plot also showed that the residuals are approximately distributed on the line.

Hence this means that there are no outliers are detected and there is little autocorrelation in the residuals and the normality of the error terms looks like a very good assumption. This concludes that the suggested model is a good and suitable model as it has a residual with a strict white noise property of a randomly and normally distribution with zero mean and constant variance.

Part(h)

Calculate and plot forecasts for the next two years of the series. Be sure to include forecast limits.

library(TSA)
    
plot(Model,n1=c(1978,1),n.ahead=8,pch=19,ylab='Logarithmic Quarterly Earnings', main = "Forecasted values of the logarithmic quarterly earnings per share of the U.S. company Johnson & Johnson for 2 YEARS")

## Warning in plot.window(...): "n1" is not a graphical parameter

## Warning in plot.window(...): "n.ahead" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "n1" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "n.ahead" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n1" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n1" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter

## Warning in box(...): "n1" is not a graphical parameter

## Warning in box(...): "n.ahead" is not a graphical parameter

## Warning in title(...): "n1" is not a graphical parameter

## Warning in title(...): "n.ahead" is not a graphical parameter

plot(Model,n1=c(1978,1),n.ahead=8,pch=19,ylab='Quarterly Earnings',transform=exp, main = "Forecasted values of the quarterly earnings per share of the U.S. company Johnson & Johnson for 2 YEARS")

## Warning in plot.window(...): "n1" is not a graphical parameter

## Warning in plot.window(...): "n.ahead" is not a graphical parameter

## Warning in plot.window(...): "transform" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "n1" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "n.ahead" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "transform" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n1" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "transform" is not
## a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n1" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "transform" is not
## a graphical parameter

## Warning in box(...): "n1" is not a graphical parameter

## Warning in box(...): "n.ahead" is not a graphical parameter

## Warning in box(...): "transform" is not a graphical parameter

## Warning in title(...): "n1" is not a graphical parameter

## Warning in title(...): "n.ahead" is not a graphical parameter

## Warning in title(...): "transform" is not a graphical parameter

### The forecasts follow the general pattern of seasonality and “trend” in the earnings series and the forecast limits give a good indication of the confidence in these forecasts. ### Converting to the original values makes the uncertainty in the forecasts to be more understandable.

Time Series Take Home Test

Lim Jin Yoong

12/1/2020