The dataset use in this project describes the minimum daily temperature over 10 years (19981 and 1990) in Melbourne, Australia. The units are in degree celsius and there are 3650 observations. The source of the data was created by Australian Bureau of meteorology. We will analyze this dataset to capture the trend, applied transformation on the data where required, plot the ACF & PACF’s and coefficient tests on suitable models, and forecast for the next five years.
# loading relevant packages into rstudio
library(readr)
library(dplyr)
library(tidyr)
library(TSA)
library(MTS)
library(tseries)
library(lmtest)
library(forecast)
library(fUnitRoots)
library(FitAR)
library(astsa)
library(urca)
# reading the dataset into R
Mel_Min_temp <- read.csv("Minimum Daily Temperature.csv", header = TRUE)
head(Mel_Min_temp,5)
# converting the dataframe to time series object
min_TempTS <- ts(Mel_Min_temp$Minimum.Daily.Temperature.Degree..Degree.C..,
start=c(1981, 1), end=c(1990, 365), frequency = 365)
#checking the class of the dataset
class(min_TempTS)
[1] "ts"
#plotting time Series data
plot(min_TempTS , col=c("blue"), ylab='Min temperature', xlab='Time', type='o',
main = "Time series plot of minimum temperature in Melbourne.")
NA
NA
Figure 1: shows the changes in daily minimum temperature in melbourne from 1981 to 1990. We can conclude the plot as well as some strong cyclic behaviour. there is no apparent trend in the data over this period.
#converting daily minimum temperature to a monthly mean teamperature to better see seasonality
monthlyMinTemp <- aggregate(Mel_Min_temp$Minimum.Daily.Temperature.Degree..Degree.C..~Month+Mel_Min_temp$ï..Year,
Mel_Min_temp, mean )
head(monthlyMinTemp,5)
NA
# converting monthly temperature data into times series data
monthlyMinTempTS <- ts(monthlyMinTemp$`Mel_Min_temp$Minimum.Daily.Temperature.Degree..Degree.C..`,
start=c(1981, 1), end=c(1990, 12), frequency = 12)
monthlyMinTempTS
Jan Feb Mar Apr May Jun Jul
1981 17.712903 17.678571 13.500000 12.356667 9.490323 7.306667 7.577419
1982 16.567742 15.921429 14.935484 11.470000 9.583871 5.606667 4.641935
1983 13.180645 16.807143 15.777419 10.596667 10.116129 6.600000 6.890323
1984 14.309677 14.944828 12.867742 10.750000 8.112903 7.730000 5.987097
1985 14.219355 14.032143 15.877419 12.976667 9.419355 7.073333 6.135484
1986 13.825806 14.196429 14.690323 11.653333 10.274194 7.526667 6.961290
1987 13.235484 13.889286 12.619355 12.250000 9.806452 8.273333 5.983871
1988 16.493548 14.524138 14.748387 12.833333 11.387097 8.386667 8.232258
1989 15.180645 16.371429 15.803226 12.563333 10.725806 6.560000 6.332258
1990 15.577419 15.417857 14.835484 13.433333 9.748387 7.720000 8.183871
Aug Sep Oct Nov Dec
1981 7.238710 10.143333 10.087097 11.890000 13.680645
1982 7.903226 7.280000 9.545161 12.486667 13.754839
1983 8.706452 9.210000 10.312903 11.993333 14.396774
1984 8.696774 8.046667 10.632258 12.623333 12.643333
1985 7.635484 8.803333 10.490323 13.073333 14.109677
1986 7.387097 8.933333 9.683871 11.793333 12.935484
1987 8.022581 9.810000 10.238710 13.150000 13.254839
1988 8.725806 9.883333 10.890323 12.253333 15.436667
1989 6.770968 8.486667 9.867742 12.876667 13.951613
1990 7.825806 9.166667 11.345161 12.656667 14.367742
plot(monthlyMinTempTS, col=c("blue"), ylab='Min temperature', xlab='Time', type='o',
main = "Time series plot of monthly minimum temperature in Melbourne.")
After converting the daily minimum temperature to a monthly mean max temperature, we can clearly see a strong seasonality pattern. Further Analysis will need to be performed.
# scatterPlot to Check correlations of changes in minimum daily temperature
y = min_TempTS # read the daily min temp series into y
x = zlag(min_TempTS) # generate first lag of the min temp series
head(y)
Time Series:
Start = c(1981, 1)
End = c(1981, 6)
Frequency = 365
[1] 20.7 17.9 18.8 14.6 15.8 15.8
head(x)
[1] NA 20.7 17.9 18.8 14.6 15.8
index = 2:length(x) # Create an index to get rid of the first NA value in x
cor(y[index],x[index]) # Calculate correlation between numerical values in x and y
[1] 0.7748702
plot(y[index],x[index],ylab='Min temperature series', xlab='The first lag of min temperature series', main = 'Scatter plot of min temperature series and its first lag')
NA
NA
As Scatter plot shows, the correlation value 0.774 proved that there is strong positive linear relationship between previous day’s min temperature on the next day’s min temperature
we observe a high correlation between temperatures of succeeding months
bb = monthlyMinTempTS
aa = zlag(monthlyMinTempTS)
index = 2: length(aa)
cor(bb[index], aa[index])
[1] 0.8020674
We can see an upward trend and high correlation
plot(y = monthlyMinTempTS, x = zlag(monthlyMinTempTS), ylab = 'Min temp', xlab = 'Previous Month temp')
Daily data
day.= season(min_TempTS)
model1_Daily = lm(min_TempTS~day.)
summary(model1_Daily)
Call:
lm(formula = min_TempTS ~ day.)
Residuals:
Min 1Q Median 3Q Max
-8.300 -1.808 -0.070 1.700 11.870
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.585e+01 8.760e-01 18.094 < 2e-16 ***
day.Season-2 -5.700e-01 1.239e+00 -0.460 0.645459
day.Season-3 -1.160e+00 1.239e+00 -0.936 0.349142
day.Season-4 -1.810e+00 1.239e+00 -1.461 0.144087
day.Season-5 -2.000e+00 1.239e+00 -1.614 0.106523
day.Season-6 -2.190e+00 1.239e+00 -1.768 0.077181 .
day.Season-7 -1.100e+00 1.239e+00 -0.888 0.374630
day.Season-8 -9.300e-01 1.239e+00 -0.751 0.452872
day.Season-9 -6.600e-01 1.239e+00 -0.533 0.594227
day.Season-10 -4.000e-01 1.239e+00 -0.323 0.746797
day.Season-11 -5.300e-01 1.239e+00 -0.428 0.668801
day.Season-12 -6.800e-01 1.239e+00 -0.549 0.583100
day.Season-13 -9.000e-01 1.239e+00 -0.727 0.467579
day.Season-14 5.500e-01 1.239e+00 0.444 0.657088
day.Season-15 8.800e-01 1.239e+00 0.710 0.477529
day.Season-16 -5.000e-02 1.239e+00 -0.040 0.967807
day.Season-17 -9.000e-01 1.239e+00 -0.727 0.467579
day.Season-18 -7.600e-01 1.239e+00 -0.613 0.539591
day.Season-19 -1.600e+00 1.239e+00 -1.292 0.196597
day.Season-20 -3.500e-01 1.239e+00 -0.283 0.777554
day.Season-21 -9.100e-01 1.239e+00 -0.735 0.462648
day.Season-22 -1.740e+00 1.239e+00 -1.405 0.160240
day.Season-23 -1.690e+00 1.239e+00 -1.364 0.172592
day.Season-24 -1.320e+00 1.239e+00 -1.066 0.286708
day.Season-25 -1.050e+00 1.239e+00 -0.848 0.396727
day.Season-26 -8.600e-01 1.239e+00 -0.694 0.487594
day.Season-27 -8.600e-01 1.239e+00 -0.694 0.487594
day.Season-28 -1.060e+00 1.239e+00 -0.856 0.392246
day.Season-29 -1.600e-01 1.239e+00 -0.129 0.897241
day.Season-30 4.600e-01 1.239e+00 0.371 0.710418
day.Season-31 -1.060e+00 1.239e+00 -0.856 0.392246
day.Season-32 -7.600e-01 1.239e+00 -0.613 0.539591
day.Season-33 -1.860e+00 1.239e+00 -1.501 0.133335
day.Season-34 -1.320e+00 1.239e+00 -1.066 0.286708
day.Season-35 -3.000e-01 1.239e+00 -0.242 0.808664
day.Season-36 -1.630e+00 1.239e+00 -1.316 0.188337
day.Season-37 -1.120e+00 1.239e+00 -0.904 0.366009
day.Season-38 -4.900e-01 1.239e+00 -0.396 0.692468
day.Season-39 1.180e+00 1.239e+00 0.953 0.340897
day.Season-40 5.600e-01 1.239e+00 0.452 0.651263
day.Season-41 -1.140e+00 1.239e+00 -0.920 0.357513
day.Season-42 -1.140e+00 1.239e+00 -0.920 0.357513
day.Season-43 -1.080e+00 1.239e+00 -0.872 0.383376
day.Season-44 -5.300e-01 1.239e+00 -0.428 0.668801
day.Season-45 1.180e+00 1.239e+00 0.953 0.340897
day.Season-46 2.120e+00 1.239e+00 1.711 0.087115 .
day.Season-47 1.420e+00 1.239e+00 1.146 0.251768
day.Season-48 8.000e-01 1.239e+00 0.646 0.518464
day.Season-49 -9.700e-01 1.239e+00 -0.783 0.433675
day.Season-50 -1.260e+00 1.239e+00 -1.017 0.309175
day.Season-51 -2.400e-01 1.239e+00 -0.194 0.846395
day.Season-52 5.000e-02 1.239e+00 0.040 0.967807
day.Season-53 -2.500e-01 1.239e+00 -0.202 0.840079
day.Season-54 -1.280e+00 1.239e+00 -1.033 0.301560
day.Season-55 -2.020e+00 1.239e+00 -1.631 0.103069
day.Season-56 -1.310e+00 1.239e+00 -1.057 0.290374
day.Season-57 -1.390e+00 1.239e+00 -1.122 0.261923
day.Season-58 -6.700e-01 1.239e+00 -0.541 0.588651
day.Season-59 2.100e-01 1.239e+00 0.170 0.865399
day.Season-60 6.700e-01 1.239e+00 0.541 0.588651
day.Season-61 1.700e-01 1.239e+00 0.137 0.890858
day.Season-62 -6.700e-01 1.239e+00 -0.541 0.588651
day.Season-63 -1.800e-01 1.239e+00 -0.145 0.884482
day.Season-64 4.800e-01 1.239e+00 0.387 0.698433
day.Season-65 -5.673e-13 1.239e+00 0.000 1.000000
day.Season-66 -2.800e-01 1.239e+00 -0.226 0.821196
day.Season-67 -4.600e-01 1.239e+00 -0.371 0.710418
day.Season-68 -1.220e+00 1.239e+00 -0.985 0.324784
day.Season-69 -2.370e+00 1.239e+00 -1.913 0.055817 .
day.Season-70 -1.750e+00 1.239e+00 -1.413 0.157852
day.Season-71 -1.410e+00 1.239e+00 -1.138 0.255122
day.Season-72 -2.770e+00 1.239e+00 -2.236 0.025417 *
day.Season-73 -1.750e+00 1.239e+00 -1.413 0.157852
day.Season-74 -1.430e+00 1.239e+00 -1.154 0.248445
day.Season-75 -1.240e+00 1.239e+00 -1.001 0.316917
day.Season-76 -2.080e+00 1.239e+00 -1.679 0.093239 .
day.Season-77 -1.380e+00 1.239e+00 -1.114 0.265370
day.Season-78 -7.500e-01 1.239e+00 -0.605 0.544940
day.Season-79 -4.200e-01 1.239e+00 -0.339 0.734603
day.Season-80 -1.170e+00 1.239e+00 -0.944 0.345004
day.Season-81 -6.300e-01 1.239e+00 -0.509 0.611098
day.Season-82 -8.500e-01 1.239e+00 -0.686 0.492669
day.Season-83 -1.350e+00 1.239e+00 -1.090 0.275898
day.Season-84 -2.160e+00 1.239e+00 -1.744 0.081319 .
day.Season-85 -2.550e+00 1.239e+00 -2.058 0.039627 *
day.Season-86 -1.910e+00 1.239e+00 -1.542 0.123216
day.Season-87 -2.830e+00 1.239e+00 -2.284 0.022408 *
day.Season-88 -3.370e+00 1.239e+00 -2.720 0.006555 **
day.Season-89 -2.120e+00 1.239e+00 -1.711 0.087115 .
day.Season-90 -1.780e+00 1.239e+00 -1.437 0.150850
day.Season-91 -2.320e+00 1.239e+00 -1.873 0.061188 .
day.Season-92 -2.670e+00 1.239e+00 -2.155 0.031210 *
day.Season-93 -2.800e+00 1.239e+00 -2.260 0.023871 *
day.Season-94 -1.780e+00 1.239e+00 -1.437 0.150850
day.Season-95 -1.980e+00 1.239e+00 -1.598 0.110068
day.Season-96 -2.100e+00 1.239e+00 -1.695 0.090135 .
day.Season-97 -3.080e+00 1.239e+00 -2.486 0.012958 *
day.Season-98 -3.360e+00 1.239e+00 -2.712 0.006717 **
day.Season-99 -3.840e+00 1.239e+00 -3.100 0.001953 **
day.Season-100 -3.990e+00 1.239e+00 -3.221 0.001291 **
day.Season-101 -2.560e+00 1.239e+00 -2.067 0.038858 *
day.Season-102 -2.160e+00 1.239e+00 -1.744 0.081319 .
day.Season-103 -1.610e+00 1.239e+00 -1.300 0.193815
day.Season-104 -1.890e+00 1.239e+00 -1.526 0.127189
day.Season-105 -2.630e+00 1.239e+00 -2.123 0.033827 *
day.Season-106 -4.940e+00 1.239e+00 -3.988 6.82e-05 ***
day.Season-107 -5.760e+00 1.239e+00 -4.650 3.46e-06 ***
day.Season-108 -4.830e+00 1.239e+00 -3.899 9.86e-05 ***
day.Season-109 -4.930e+00 1.239e+00 -3.980 7.05e-05 ***
day.Season-110 -5.600e+00 1.239e+00 -4.520 6.39e-06 ***
day.Season-111 -5.510e+00 1.239e+00 -4.448 8.96e-06 ***
day.Season-112 -4.810e+00 1.239e+00 -3.883 0.000105 ***
day.Season-113 -4.300e+00 1.239e+00 -3.471 0.000525 ***
day.Season-114 -4.390e+00 1.239e+00 -3.544 0.000400 ***
day.Season-115 -4.820e+00 1.239e+00 -3.891 0.000102 ***
day.Season-116 -5.250e+00 1.239e+00 -4.238 2.32e-05 ***
day.Season-117 -4.600e+00 1.239e+00 -3.713 0.000208 ***
day.Season-118 -4.540e+00 1.239e+00 -3.665 0.000251 ***
day.Season-119 -4.040e+00 1.239e+00 -3.261 0.001121 **
day.Season-120 -5.370e+00 1.239e+00 -4.335 1.50e-05 ***
day.Season-121 -5.190e+00 1.239e+00 -4.190 2.87e-05 ***
day.Season-122 -4.040e+00 1.239e+00 -3.261 0.001121 **
day.Season-123 -3.970e+00 1.239e+00 -3.205 0.001365 **
day.Season-124 -3.810e+00 1.239e+00 -3.076 0.002118 **
day.Season-125 -4.400e+00 1.239e+00 -3.552 0.000388 ***
day.Season-126 -4.250e+00 1.239e+00 -3.431 0.000609 ***
day.Season-127 -6.030e+00 1.239e+00 -4.868 1.18e-06 ***
day.Season-128 -6.290e+00 1.239e+00 -5.077 4.04e-07 ***
day.Season-129 -6.210e+00 1.239e+00 -5.013 5.65e-07 ***
day.Season-130 -5.070e+00 1.239e+00 -4.093 4.37e-05 ***
day.Season-131 -5.110e+00 1.239e+00 -4.125 3.80e-05 ***
day.Season-132 -4.880e+00 1.239e+00 -3.939 8.34e-05 ***
day.Season-133 -6.310e+00 1.239e+00 -5.094 3.71e-07 ***
day.Season-134 -5.860e+00 1.239e+00 -4.730 2.34e-06 ***
day.Season-135 -6.330e+00 1.239e+00 -5.110 3.41e-07 ***
day.Season-136 -6.760e+00 1.239e+00 -5.457 5.20e-08 ***
day.Season-137 -6.750e+00 1.239e+00 -5.449 5.44e-08 ***
day.Season-138 -6.380e+00 1.239e+00 -5.150 2.76e-07 ***
day.Season-139 -6.120e+00 1.239e+00 -4.940 8.19e-07 ***
day.Season-140 -7.570e+00 1.239e+00 -6.111 1.11e-09 ***
day.Season-141 -8.150e+00 1.239e+00 -6.579 5.49e-11 ***
day.Season-142 -8.300e+00 1.239e+00 -6.700 2.44e-11 ***
day.Season-143 -6.860e+00 1.239e+00 -5.538 3.31e-08 ***
day.Season-144 -5.550e+00 1.239e+00 -4.480 7.71e-06 ***
day.Season-145 -6.390e+00 1.239e+00 -5.158 2.64e-07 ***
day.Season-146 -6.980e+00 1.239e+00 -5.634 1.90e-08 ***
day.Season-147 -6.420e+00 1.239e+00 -5.182 2.32e-07 ***
day.Season-148 -6.150e+00 1.239e+00 -4.964 7.24e-07 ***
day.Season-149 -6.260e+00 1.239e+00 -5.053 4.58e-07 ***
day.Season-150 -5.630e+00 1.239e+00 -4.545 5.70e-06 ***
day.Season-151 -6.790e+00 1.239e+00 -5.481 4.55e-08 ***
day.Season-152 -7.660e+00 1.239e+00 -6.183 7.05e-10 ***
day.Season-153 -8.000e+00 1.239e+00 -6.458 1.22e-10 ***
day.Season-154 -8.840e+00 1.239e+00 -7.136 1.18e-12 ***
day.Season-155 -7.070e+00 1.239e+00 -5.707 1.25e-08 ***
day.Season-156 -8.060e+00 1.239e+00 -6.506 8.87e-11 ***
day.Season-157 -8.520e+00 1.239e+00 -6.878 7.26e-12 ***
day.Season-158 -8.290e+00 1.239e+00 -6.692 2.58e-11 ***
day.Season-159 -8.730e+00 1.239e+00 -7.047 2.22e-12 ***
day.Season-160 -8.300e+00 1.239e+00 -6.700 2.44e-11 ***
day.Season-161 -7.160e+00 1.239e+00 -5.780 8.18e-09 ***
day.Season-162 -7.980e+00 1.239e+00 -6.442 1.35e-10 ***
day.Season-163 -8.610e+00 1.239e+00 -6.950 4.38e-12 ***
day.Season-164 -8.590e+00 1.239e+00 -6.934 4.90e-12 ***
day.Season-165 -7.660e+00 1.239e+00 -6.183 7.05e-10 ***
day.Season-166 -7.800e+00 1.239e+00 -6.296 3.45e-10 ***
day.Season-167 -8.800e+00 1.239e+00 -7.104 1.48e-12 ***
day.Season-168 -9.730e+00 1.239e+00 -7.854 5.40e-15 ***
day.Season-169 -9.840e+00 1.239e+00 -7.943 2.68e-15 ***
day.Season-170 -8.620e+00 1.239e+00 -6.958 4.14e-12 ***
day.Season-171 -8.660e+00 1.239e+00 -6.991 3.30e-12 ***
day.Season-172 -8.700e+00 1.239e+00 -7.023 2.63e-12 ***
day.Season-173 -8.940e+00 1.239e+00 -7.217 6.59e-13 ***
day.Season-174 -9.080e+00 1.239e+00 -7.330 2.89e-13 ***
day.Season-175 -9.040e+00 1.239e+00 -7.297 3.66e-13 ***
day.Season-176 -9.250e+00 1.239e+00 -7.467 1.05e-13 ***
day.Season-177 -1.005e+01 1.239e+00 -8.113 6.92e-16 ***
day.Season-178 -9.360e+00 1.239e+00 -7.556 5.37e-14 ***
day.Season-179 -8.810e+00 1.239e+00 -7.112 1.40e-12 ***
day.Season-180 -9.280e+00 1.239e+00 -7.491 8.73e-14 ***
day.Season-181 -7.940e+00 1.239e+00 -6.409 1.67e-10 ***
day.Season-182 -8.030e+00 1.239e+00 -6.482 1.04e-10 ***
day.Season-183 -8.720e+00 1.239e+00 -7.039 2.35e-12 ***
day.Season-184 -8.970e+00 1.239e+00 -7.241 5.53e-13 ***
day.Season-185 -9.620e+00 1.239e+00 -7.766 1.08e-14 ***
day.Season-186 -9.780e+00 1.239e+00 -7.895 3.93e-15 ***
day.Season-187 -9.860e+00 1.239e+00 -7.959 2.36e-15 ***
day.Season-188 -9.270e+00 1.239e+00 -7.483 9.27e-14 ***
day.Season-189 -8.100e+00 1.239e+00 -6.539 7.18e-11 ***
day.Season-190 -8.250e+00 1.239e+00 -6.660 3.20e-11 ***
day.Season-191 -9.600e+00 1.239e+00 -7.749 1.22e-14 ***
day.Season-192 -9.160e+00 1.239e+00 -7.394 1.80e-13 ***
day.Season-193 -8.850e+00 1.239e+00 -7.144 1.11e-12 ***
day.Season-194 -9.470e+00 1.239e+00 -7.644 2.74e-14 ***
day.Season-195 -1.003e+01 1.239e+00 -8.097 7.89e-16 ***
day.Season-196 -9.400e+00 1.239e+00 -7.588 4.21e-14 ***
day.Season-197 -9.710e+00 1.239e+00 -7.838 6.13e-15 ***
day.Season-198 -9.380e+00 1.239e+00 -7.572 4.75e-14 ***
day.Season-199 -9.060e+00 1.239e+00 -7.314 3.25e-13 ***
day.Season-200 -8.760e+00 1.239e+00 -7.071 1.87e-12 ***
[ reached getOption("max.print") -- omitted 165 rows ]
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.77 on 3285 degrees of freedom
Multiple R-squared: 0.5834, Adjusted R-squared: 0.5372
F-statistic: 12.64 on 364 and 3285 DF, p-value: < 2.2e-16
#output hidden due to large output
month.=season(monthlyMinTempTS) # period added to improve table display and this line sets up indicators
model2=lm(monthlyMinTempTS~month.-1) # -1 removes the intercept term
summary(model2)
Call:
lm(formula = monthlyMinTempTS ~ month. - 1)
Residuals:
Min 1Q Median 3Q Max
-2.05065 -0.62012 0.03393 0.55556 2.68258
Coefficients:
Estimate Std. Error t value Pr(>|t|)
month.January 15.0303 0.3092 48.60 <2e-16 ***
month.February 15.3783 0.3092 49.73 <2e-16 ***
month.March 14.5655 0.3092 47.10 <2e-16 ***
month.April 12.0883 0.3092 39.09 <2e-16 ***
month.May 9.8665 0.3092 31.91 <2e-16 ***
month.June 7.2783 0.3092 23.54 <2e-16 ***
month.July 6.6926 0.3092 21.64 <2e-16 ***
month.August 7.8913 0.3092 25.52 <2e-16 ***
month.September 8.9763 0.3092 29.03 <2e-16 ***
month.October 10.3094 0.3092 33.34 <2e-16 ***
month.November 12.4797 0.3092 40.36 <2e-16 ***
month.December 13.8532 0.3092 44.80 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9779 on 108 degrees of freedom
Multiple R-squared: 0.9936, Adjusted R-squared: 0.9929
F-statistic: 1405 on 12 and 108 DF, p-value: < 2.2e-16
All of the parameters corresponding to months are statistically significant at 5% level
Including intercept parameter
model3=lm(monthlyMinTempTS~month.) # remove -1 to include the intercept term in the model
summary(model3)
Call:
lm(formula = monthlyMinTempTS ~ month.)
Residuals:
Min 1Q Median 3Q Max
-2.05065 -0.62012 0.03393 0.55556 2.68258
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.0303 0.3092 48.605 < 2e-16 ***
month.February 0.3480 0.4373 0.796 0.42792
month.March -0.4648 0.4373 -1.063 0.29019
month.April -2.9420 0.4373 -6.727 8.56e-10 ***
month.May -5.1639 0.4373 -11.808 < 2e-16 ***
month.June -7.7520 0.4373 -17.726 < 2e-16 ***
month.July -8.3377 0.4373 -19.065 < 2e-16 ***
month.August -7.1390 0.4373 -16.324 < 2e-16 ***
month.September -6.0540 0.4373 -13.843 < 2e-16 ***
month.October -4.7210 0.4373 -10.795 < 2e-16 ***
month.November -2.5507 0.4373 -5.832 5.77e-08 ***
month.December -1.1772 0.4373 -2.692 0.00824 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9779 on 108 degrees of freedom
Multiple R-squared: 0.912, Adjusted R-squared: 0.9031
F-statistic: 101.8 on 11 and 108 DF, p-value: < 2.2e-16
all of the differences between January and the other months are statistically significant at 5% level in both models model2 and model3
Plot the Time series data of Monthly minimum temperatures first
plot(monthlyMinTempTS, col=c("blue"), ylab='Min temperature', xlab='Time', type='o',
main = "Time series plot of monthly minimum temperature in Melbourne.")
We only see a seasonal trend and there is no other trend associated with the data
#Plotting ACF with max 36 lags Because we have strong correlation at lags 12,24,36 and so on we consider the existence of seasonal auto correlation relationship.There is also a substantial correlation that needs to be considered.
par(mfrow=c(1,2))
acf(monthlyMinTempTS,lag.max=36)
PACF
Pacf(monthlyMinTempTS,lag.max=36)
adf.test(monthlyMinTempTS, alternative = c("stationary"))
p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: monthlyMinTempTS
Dickey-Fuller = -10.105, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
Plot after taking seasonal differencing Since we have a seasonal effect pattern in time series plot, we take a seasonal difference
plot(diff(monthlyMinTempTS,lag=12),xlab='Time',ylab='First and Seasonal Difference of CO2',main= "Figure 11. Time series plot of the seasonal differences of Monthly temp levels.")
ACF of the first and seasonal differences of monthly Temperature levels
acf(as.vector(diff(monthlyMinTempTS,lag=12)),lag.max=36,ci.type='ma',main="Fig-ACF of seasonal differences of monthly temp levels")
PACF of the seasonal differences of monthly Temperature levels
pacf(as.vector(diff(monthlyMinTempTS,lag=12)),lag.max=36,ci.type='ma',main="Fig-PACF of seasonal differences of monthly temp levels")
##Residuals approach
###Specification of seasonal component When D=1 ###Fit m1 = SARIMA(0,0,0)x(0,1,0)12 model and display time series plots of the residuals
m1.monthlyMinTempTS = arima(monthlyMinTempTS,order=c(0,0,0),seasonal=list(order=c(0,1,0), period=12))
res.m1 = residuals(m1.monthlyMinTempTS);
plot(res.m1,xlab='Time',ylab='Residuals',main="m1-Time series plot of the residuals")
###ACF and PACF of residuals-m1
par(mfrow=c(1,2))
acf(res.m1, lag.max = 36, main = "m1-ACF of residuals")
PACF
pacf(res.m1, lag.max = 36, main = "m1-PACF")
From the above plots, there is no pattern implying the existence of a seasonal trend .But We still have one significant correlation at the first seasonal lag in both ACF and PACF.The significant spike at lag 12 in the ACF suggests a seasonal MA(1) component. So we start with SARMA(0,1) and see if we get rid of the effect of the seasonal component on the residuals
m2.monthlyMinTempTS = arima(monthlyMinTempTS,order=c(0,0,0),seasonal=list(order=c(0,1,1), period=12))
res.m2 = residuals(m2.monthlyMinTempTS);
plot(res.m2,xlab='Time',ylab='Residuals',main="m2-Time series plot of the residuals")
###ACF and PACF of residuals-m3
par(mfrow=c(1,2))
acf(res.m2, lag.max = 36, main = "m2-ACF of residuals")
pacf(res.m2, lag.max = 36, main = "m2-PACF of residuals")
At the seasonal lags, especially at the first seasonal lag in ACF, the autocorrelations became insignificant or slightly significant after adding the seasonal component. We will use the resulting ACF and PACF plots to specify the orders of ARMA component as there is no highly significant correlation at the lags s,2s,3s,….
The last ACF-PACF pair has significant pikes at the first lags implying an ARMA(1,1) model or due to the (insignificant) decrease of correlations in PACF, we would specify ARMA(0,1) for the residuals.
So we are settled with (0,1,1) for seasonal component as (P,D,Q).
The data may follow an ARIMA(p ,d ,0) model if the ACF and PACF plots of the differenced data show the following patterns:
(For the non seasoning component)Also ,there is a significant spike at lag 2 in the PACF, but none beyond lag 2 which suggests a (2,d,0) model.The pattern in the first two spikes is what we would expect from an ARIMA(2,0,0), as the PACF tends to decrease. So in this case, the ACF and PACF lead us to think an ARIMA(2,0,0) model might be appropriate
By getting p as 2 , we can add MA component upto order 1 maximum
Check stationarity by doing ADF test to identify if we need ordinary differencing or not
adf.test(monthlyMinTempTS, alternative = c("stationary"))
p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: monthlyMinTempTS
Dickey-Fuller = -10.105, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
The ADF unit root test shows a p-value of less than 0.05 which means the null hypothesis of non stationarity is now rejected and the data is proved to be stationary .Therefore we do not need differencing and fix d as 0 d=0 IS FIXED
Now we set out to specify models for non seasonal ARIMA component We begin with (2,0,1) as mentioned earlier
m3.monthlyMinTempTS = arima(monthlyMinTempTS,order=c(2,0,1),seasonal=list(order=c(0,1,1), period=12))
res.m3 = residuals(m3.monthlyMinTempTS);
plot(res.m3,xlab='Time',ylab='Residuals',main="m3-Time series plot of the residuals")
###ACF and PACF of residuals-m4
par(mfrow=c(1,2))
acf(res.m3, lag.max = 36, main = "m3-ACF of residuals")
pacf(res.m3, lag.max = 36, main = "m3-PACF of residuals")
Looks almost like a white noise series from the above plots, but we cannot confirm it until we check for (2,0,0) as well since MA component can be upto order 1. It can be 0 as well
m4.monthlyMinTempTS = arima(monthlyMinTempTS,order=c(2,0,0),seasonal=list(order=c(0,1,1), period=12))
res.m4 = residuals(m4.monthlyMinTempTS);
plot(res.m4,xlab='Time',ylab='Residuals',main="m4-Time series plot of the residuals")
###ACF and PACF of residuals-m5
par(mfrow=c(1,2))
acf(res.m4, lag.max = 36, main = "m4-ACF of residuals")
pacf(res.m4, lag.max = 36, main = "m4-PACF of residuals")
**We can clearly see that even though m3 and m4 looks closer, the residuals of m4-> SARIMA(2,0,0)*(0,1,1)s=12 are closer to the white noise .So we can pretty much conclude with the orders p=2,d=0,q=0,P=0,D=1,Q=1,and s=12**
EACF of the previous model The EACF suggests (0,1) and (1,1) as well for the non seasonal component
eacf(res.m4)
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 o o o o o o o o o o o o o o
1 x o o o o o o o o o o o o o
2 x x o o o o o o o o o o o o
3 o o x o o o o o o o o o o o
4 x o o o o o o o o o o o o o
5 o o o x o o o o o o o o o o
6 x o o x o o o o o o o o o o
7 x x x x x o o o o o o o o o
So our estimated models are - (0,0,1)(0,1,1)s=12 - (1,0,1)(0,1,1)s=12 - (2,0,1)(0,1,1)s=12 - (2,0,0)(0,1,1)s=12. Next we will fit these models
###**Fitting the SARIMA(0,0,1)*(0,1,1)S=12 model using ML method**
mf_001_ml = arima(monthlyMinTempTS,order=c(0,0,1),seasonal=list(order=c(0,1,1),
period=12),method = "ML")
coeftest(mf_001_ml)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
ma1 0.169651 0.085367 1.9873 0.04689 *
sma1 -0.802720 0.150343 -5.3393 9.332e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
###**Fitting the SARIMA(1,0,1)*(0,1,1)S=12 model**
mf_101_ml = arima(monthlyMinTempTS,order=c(1,0,1),seasonal=list(order=c(0,1,1),
period=12),method = "ML")
coeftest(mf_101_ml)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
ar1 0.79671 0.16339 4.8760 1.082e-06 ***
ma1 -0.58510 0.21173 -2.7634 0.005721 **
sma1 -0.80772 0.15120 -5.3420 9.192e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
###**Fitting the SARIMA(2,0,1)*(0,1,1)S=12 model**
mf_201_ml = arima(monthlyMinTempTS,order=c(2,0,1),seasonal=list(order=c(0,1,1),
period=12),method = "ML")
coeftest(mf_201_ml)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
ar1 0.41314 0.38629 1.0695 0.2848
ar2 0.17411 0.14834 1.1737 0.2405
ma1 -0.23140 0.39513 -0.5856 0.5581
sma1 -0.78151 0.14788 -5.2848 1.258e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
###**Fitting the SARIMA(2,0,0)*(0,1,1)S=12 model**
mf_200_ml = arima(monthlyMinTempTS,order=c(2,0,0),seasonal=list(order=c(0,1,1),
period=12),method = "ML")
coeftest(mf_200_ml)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
ar1 0.197848 0.099944 1.9796 0.04775 *
ar2 0.225413 0.098794 2.2817 0.02251 *
sma1 -0.791129 0.150271 -5.2647 1.404e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
sc.AIC=AIC(mf_001_ml, mf_101_ml, mf_201_ml, mf_200_ml)
sc.BIC=AIC(mf_001_ml, mf_101_ml, mf_201_ml, mf_200_ml, k = length(monthlyMinTempTS))
###AIC values comparison The SARIMA model (2,0,0)*(0,1,1) has the least AIC value and hence is considered to be the best model fit
sort.score(sc.AIC, score = "aic")
###BIC values comparison The saSARIMA model (0,0,1) * (0,1,1) has the lease BIC value closely followed by (2,0,0)*(0,1,1)
sort.score(sc.BIC, score = "aic")
**Over-parameterizations: SARIMA(3,0,0)x(1,1,1)_12**
plot(window(rstandard(mf_200_ml),start=c(1981,1)), ylab='Standardized Residuals',type='o', main="Residuals from the ARIMA(2,0,0)x(0,1,1)_12 model.")
abline(h=0)
Since we have some strange behaviour near 1983 , we will have to investigate this best model further for outliers
**plot the ACF of standardised residuals of our best model(0,1,1)*(0,1,1)s=12**
acf(as.vector(window(rstandard(mf_201_ml),start=c(1981,1))),lag.max=36,main="ACF of residuals from the ARIMA(2,0,0)x(0,1,1)_12 model.")
Except for the slightly significant autocorrelation at lag 25, there is not any sign of violation of the independence of residuals.
Function for residual analysis
residual.analysis <- function(model, std = TRUE){
library(TSA)
library(FitAR)
if (std == TRUE){
res.model = rstandard(model)
}else{
res.model = residuals(model)
}
par(mfrow=c(3,2))
plot(res.model,type='o',ylab='Standardized residuals', main="Time series plot of standardized residuals")
abline(h=0)
hist(res.model,main="Histogram of standardized residuals")
qqnorm(res.model,main="QQ plot of standardized residuals")
qqline(res.model, col = 2)
acf(res.model,main="ACF of standardized residuals")
print(shapiro.test(res.model))
k=0
LBQPlot(res.model, lag.max = length(model$residuals)-1 , StartLag = k + 1, k = 0, SquaredQ = FALSE)
}
Please perform Box test,hist,qqnorm,qqline,shapiro-wilk test for all 3 models given below
residual.analysis(model=mf_001_ml)
Shapiro-Wilk normality test
data: res.model
W = 0.97776, p-value = 0.04413
SARIMA(0,0,1)x(0,1,1)_12 1. Time Series plots does not have any trend except for some intervention. 2. Histogram shows residuals slightly skewed to the right. 3. QQplot is normally distributed with 1 or 2 outliers. 4. ACF plot highlights the white noise presence. 5. Ljung Box test shows the presence of residuals.
residual.analysis(model=mf_101_ml)
Shapiro-Wilk normality test
data: res.model
W = 0.98339, p-value = 0.1467
SARIMA(1,0,1)x(0,1,1)_12 1. Time Series plots does not have any trend except for some intervention. 2. Histogram is slightly skewed to the right. 3. QQplot is normally distributed except for some points not lying on the line. 4. ACF plot highlights the white noise presence. 5. Ljung Box test shows the presence of good amount of residuals.
residual.analysis(model=mf_201_ml)
Shapiro-Wilk normality test
data: res.model
W = 0.98111, p-value = 0.09036
SARIMA(2,0,1)x(0,1,1)_12 1. Time Series plots does not have any trend except for some intervention. 2. Histogram looks to be very slightly skewed to the right. 3. QQplot is normally distributed except for one or two outliers. 4. ACF plot highlights the white noise presence. 5. Ljung Box test shows the presence of residuals.
residual.analysis(model=mf_200_ml)
Shapiro-Wilk normality test
data: res.model
W = 0.98135, p-value = 0.09516
SARIMA(2,0,0)x(0,1,1)_12 1. Time Series plots does not have any trend except for some intervention. 2. Histogram looks to be very slightly skewed to the right. 3. QQplot is normally distributed except for one or two outliers. 4. ACF plot highlights the white noise presence. 5. Ljung Box test shows the presence of residuals.
With SARIMA(2,0,0) * (0,1,1) being the best model with the S is equal to 12, below is the forcast for next 10 years.
sarima.for(monthlyMinTempTS,120,2,0,0,0,1,1,12)
$pred
Jan Feb Mar Apr May Jun Jul
1991 15.248472 15.486091 14.845551 12.556087 10.210343 7.579259 7.102069
1992 15.258478 15.474305 14.868050 12.580553 10.242439 7.613236 7.138020
1993 15.296362 15.512218 14.905981 12.618493 10.280385 7.651184 7.175971
1994 15.334314 15.550171 14.943933 12.656446 10.318337 7.689137 7.213923
1995 15.372267 15.588124 14.981886 12.694399 10.356290 7.727090 7.251876
1996 15.410220 15.626076 15.019839 12.732351 10.394243 7.765042 7.289829
1997 15.448173 15.664029 15.057791 12.770304 10.432196 7.802995 7.327781
1998 15.486125 15.701982 15.095744 12.808257 10.470148 7.840948 7.365734
1999 15.524078 15.739935 15.133697 12.846209 10.508101 7.878900 7.403687
2000 15.562031 15.777887 15.171649 12.884162 10.546054 7.916853 7.441639
Aug Sep Oct Nov Dec
1991 7.992644 9.220982 10.615139 12.752758 14.195904
1992 8.029371 9.258276 10.652706 12.790498 14.233734
1993 8.067323 9.296228 10.690659 12.828450 14.271687
1994 8.105275 9.334181 10.728611 12.866403 14.309640
1995 8.143228 9.372134 10.766564 12.904356 14.347592
1996 8.181181 9.410086 10.804517 12.942308 14.385545
1997 8.219133 9.448039 10.842469 12.980261 14.423498
1998 8.257086 9.485992 10.880422 13.018214 14.461450
1999 8.295039 9.523944 10.918375 13.056166 14.499403
2000 8.332991 9.561897 10.956327 13.094119 14.537356
$se
Jan Feb Mar Apr May Jun Jul
1991 0.9893561 1.0073505 1.0368022 1.0404600 1.0427207 1.0431994 1.0434020
1992 1.0604507 1.0611095 1.0622007 1.0623390 1.0624243 1.0624424 1.0624497
1993 1.0790274 1.0796649 1.0807299 1.0808644 1.0809477 1.0809652 1.0809724
1994 1.0972701 1.0978970 1.0989444 1.0990766 1.0991585 1.0991758 1.0991828
1995 1.1152144 1.1158312 1.1168618 1.1169919 1.1170725 1.1170895 1.1170964
1996 1.1328745 1.1334817 1.1344963 1.1346244 1.1347037 1.1347204 1.1347272
1997 1.1502635 1.1508616 1.1518608 1.1519870 1.1520651 1.1520816 1.1520883
1998 1.1673936 1.1679828 1.1689674 1.1690917 1.1691687 1.1691849 1.1691916
1999 1.1842758 1.1848567 1.1858273 1.1859498 1.1860257 1.1860417 1.1860482
2000 1.2009208 1.2014936 1.2024508 1.2025716 1.2026464 1.2026622 1.2026687
Aug Sep Oct Nov Dec
1991 1.0434553 1.0434710 1.0434704 1.0434219 1.0433923
1992 1.0624510 1.0624478 1.0624418 1.0623923 1.0623627
1993 1.0809736 1.0809705 1.0809645 1.0809159 1.0808868
1994 1.0991840 1.0991809 1.0991751 1.0991273 1.0990987
1995 1.1170975 1.1170945 1.1170888 1.1170418 1.1170136
1996 1.1347283 1.1347254 1.1347197 1.1346734 1.1346457
1997 1.1520894 1.1520865 1.1520809 1.1520353 1.1520080
1998 1.1691927 1.1691898 1.1691843 1.1691394 1.1691125
1999 1.1860493 1.1860465 1.1860411 1.1859968 1.1859703
2000 1.2026697 1.2026670 1.2026616 1.2026179 1.2025918
###Using auto.arima function To automatically fit the data we have and check the best model that could be fitted for our data using auto.arima function The auto.arima() function uses nsdiffs() to determine
D (the number of seasonal differences to use), and ndiffs() to determine
d (the number of ordinary differences to use). The selection of the other model parameters (p,q,P and Q ) are all determined by minimizing the AICc, as with non-seasonal ARIMA models. This result also comes as SARIMA(2,0,0)*(0,1,1)S=12
autoArimaModel = auto.arima(monthlyMinTempTS)
autoArimaModel
Series: monthlyMinTempTS
ARIMA(2,0,0)(0,1,1)[12]
Coefficients:
ar1 ar2 sma1
0.1978 0.2254 -0.7911
s.e. 0.0999 0.0988 0.1503
sigma^2 estimated as 1.013: log likelihood=-158.37
AIC=324.74 AICc=325.13 BIC=335.47