In this study, We aim to analyse and forecast horizontal solar radiation reaching the Earth’s surface at specific global locations using monthly average solar radiation and precipitation data spanning from January 1960 to December 2014. The goal is to provide accurate two-year ahead forecasts based on the most suited model. We’ll explore various time series regression methods, including dLagM, exponential smoothing and the corresponding state-space models and measure performance based on MASE, AIC and residual check.
data <- read_csv("data1.csv")
## Rows: 660 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): solar, ppt
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(data)
## # A tibble: 6 × 2
## solar ppt
## <dbl> <dbl>
## 1 5.05 1.33
## 2 6.42 0.921
## 3 10.8 0.947
## 4 16.9 0.615
## 5 24.0 0.544
## 6 26.3 0.703
summary(data)
## solar ppt
## Min. : 1.582 Min. :0.0010
## 1st Qu.: 9.739 1st Qu.:0.2192
## Median :16.460 Median :0.4153
## Mean :17.788 Mean :0.4686
## 3rd Qu.:23.711 3rd Qu.:0.6251
## Max. :54.056 Max. :2.7213
solar.ts = ts(data$solar,start=c(1960,1), frequency=12)
head(solar.ts)
## Jan Feb Mar Apr May Jun
## 1960 5.051729 6.415832 10.847920 16.930264 24.030797 26.298202
ppt.ts= ts(data$ppt,start=c(1960,1), frequency=12)
head(ppt.ts)
## Jan Feb Mar Apr May Jun
## 1960 1.333 0.921 0.947 0.615 0.544 0.703
plot(solar.ts, main = " Solar Radiation Series", ylab = "Amount of solar radiation", xlab = "Time")
points(solar.ts,x=time(solar.ts), pch=as.vector(season(solar.ts)),col=rainbow(12))
The Solar radiation series plot shows no apparent trend for the given time period. There is obvious seasonality in the series, changing variance is not apparent and there is no intervention point between 1960 and 2014.
plot(ppt.ts, main = " Precipitation Series", ylab = "Amount of Precipitation", xlab = "Time")
points(ppt.ts,x=time(ppt.ts), pch=as.vector(season(ppt.ts)),col=rainbow(12))
The Precipitation series plot shows no apparent trend for the given time period. There is obvious seasonality in the series, changing variance is not apparent and there is no intervention point between 1960 and 2014.
data.ts = ts(data, start = 1960, frequency = 12) #converting data to time series object
data.scale=scale(data.ts) #scaling data
plot(data.scale, plot.type="s", col=c("gold","black"),
main = " All Series", xlab="Year", ylab="Scaled Data")
legend ("topleft", lty =1, text.width=1.5, col=c("gold","black"),
c('Solar Radiation', 'Precipitation'))
par(mfrow=c(1,2))
par(mar=c(4,4,4,4))
acf(solar.ts, main = " ACF for Solar Radiation series", )
pacf(solar.ts, main = "PACF for Solar Radiation series")
par(mfrow=c(1,2))
par(mar=c(4,4,4,4))
acf(ppt.ts, main = " ACF for Precipitation series", )
pacf(ppt.ts, main = " PACF for Precipitation series")
With respect to both ACF plots. We observe positive values which gradually decrease as the lags increase. This pattern suggests the presence of a trend in the series. Additionally, the wave like pattern strongly suggests the presence of seasonality.
k = ar(solar.ts)$order # length of lag for ADF
adf.test(solar.ts,k=k)
## Warning in adf.test(solar.ts, k = k): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: solar.ts
## Dickey-Fuller = -4.557, Lag order = 25, p-value = 0.01
## alternative hypothesis: stationary
k = ar(ppt.ts)$order # length of lag for ADF
adf.test(ppt.ts,k=k)
##
## Augmented Dickey-Fuller Test
##
## data: ppt.ts
## Dickey-Fuller = -3.2594, Lag order = 28, p-value = 0.07769
## alternative hypothesis: stationary
The ADF test for both series have a P-value < 0.05 therefore we reject the H0 (null hypothesis) and conclude all series are stationary at 5% level.
solar.x12 <- x12(solar.ts)
plot(solar.x12, sa= TRUE, trend= TRUE, main= "X12 Decomposition of Solar Radiation series")
ppt.x12 <- x12(ppt.ts)
plot(ppt.x12, sa= TRUE, trend= TRUE, main= "X12 Decomposition of Precipitation series")
The X12 decomposition for both series shows the seasonally adjusted and trend lines have different heights from the original data for most of the plot. This suggests both components have influence over the original series.
for (i in 1:10){
model1 = dlm(x = as.vector(ppt.ts), y = as.vector(solar.ts), q = i)
cat("q = ", i, "AIC = ", AIC(model1$model), "BIC = ", BIC(model1$model),
"MASE =", MASE(model1)$MASE , "\n")
}
## q = 1 AIC = 4728.713 BIC = 4746.676 MASE = 1.688457
## q = 2 AIC = 4712.649 BIC = 4735.095 MASE = 1.675967
## q = 3 AIC = 4688.551 BIC = 4715.478 MASE = 1.662703
## q = 4 AIC = 4663.6 BIC = 4695.003 MASE = 1.646357
## q = 5 AIC = 4644.622 BIC = 4680.499 MASE = 1.613848
## q = 6 AIC = 4637.489 BIC = 4677.837 MASE = 1.607532
## q = 7 AIC = 4632.716 BIC = 4677.532 MASE = 1.607042
## q = 8 AIC = 4625.986 BIC = 4675.267 MASE = 1.604806
## q = 9 AIC = 4615.084 BIC = 4668.827 MASE = 1.593121
## q = 10 AIC = 4602.658 BIC = 4660.858 MASE = 1.577996
The lag length q=10 denotes the lowest AIC ,BIC values and MASE values.
model1 = dlm(x = as.vector(ppt.ts), y = as.vector(solar.ts), q =10)
AIC_value <- AIC(model1)
## [1] 4602.658
BIC_value <- BIC(model1)
## [1] 4660.858
MASE_value <- MASE(model1)
model_accuracy <- data.frame(MASE = MASE_value,AIC = AIC_value, BIC = BIC_value)
model_accuracy
## MASE AIC BIC
## model1 1.577996 4602.658 4660.858
model2 = polyDlm(x = as.vector(ppt.ts), y = as.vector(solar.ts), q = 10, k = 2, show.beta = TRUE)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -5.040 0.435 -11.600 2.39e-28
## beta.1 -2.320 0.314 -7.400 4.33e-13
## beta.2 -0.188 0.251 -0.749 4.54e-01
## beta.3 1.370 0.237 5.770 1.23e-08
## beta.4 2.340 0.243 9.600 1.74e-20
## beta.5 2.720 0.248 11.000 7.32e-26
## beta.6 2.530 0.243 10.400 1.40e-23
## beta.7 1.750 0.235 7.430 3.39e-13
## beta.8 0.387 0.249 1.560 1.20e-01
## beta.9 -1.560 0.312 -4.990 7.68e-07
## beta.10 -4.090 0.433 -9.440 7.08e-20
AIC_value <- AIC(model2)
## [1] 4591.904
BIC_value <- BIC(model2)
## [1] 4614.289
MASE_value <- MASE(model2)
model_accuracy <- data.frame(MASE = MASE_value,AIC = AIC_value, BIC = BIC_value)
model_accuracy
## MASE AIC BIC
## model2 1.592555 4591.904 4614.289
model3 = koyckDlm(x = as.vector(ppt.ts), y = as.vector(solar.ts))
AIC_value <- AIC(model3)
## [1] 3946.476
BIC_value <- BIC(model3)
## [1] 3964.439
MASE_value <- MASE(model3)
model_accuracy <- data.frame(MASE = MASE_value,AIC = AIC_value, BIC = BIC_value)
model_accuracy
## MASE AIC BIC
## model3 1.032483 3946.476 3964.439
for(i in 1:5){
for(j in 1:5){
model4 = ardlDlm(x = as.vector(ppt.ts), y = as.vector(solar.ts), p = i, q = j)
cat("p = ", i, "q = ", j, "AIC = ", AIC(model4$model), "BIC = ", BIC(model4$model),"MASE =", MASE(model4)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = 3712.311 BIC = 3734.765 MASE = 0.8392434
## p = 1 q = 2 AIC = 3239.416 BIC = 3266.352 MASE = 0.4971918
## p = 1 q = 3 AIC = 3143.522 BIC = 3174.936 MASE = 0.4740063
## p = 1 q = 4 AIC = 3138.399 BIC = 3174.288 MASE = 0.4697571
## p = 1 q = 5 AIC = 3100.283 BIC = 3140.644 MASE = 0.450425
## p = 2 q = 1 AIC = 3639.223 BIC = 3666.159 MASE = 0.7834855
## p = 2 q = 2 AIC = 3229.051 BIC = 3260.476 MASE = 0.4951319
## p = 2 q = 3 AIC = 3137.634 BIC = 3173.535 MASE = 0.4738939
## p = 2 q = 4 AIC = 3132.962 BIC = 3173.337 MASE = 0.4702773
## p = 2 q = 5 AIC = 3097.288 BIC = 3142.134 MASE = 0.4503599
## p = 3 q = 1 AIC = 3608.793 BIC = 3640.207 MASE = 0.7572489
## p = 3 q = 2 AIC = 3226.623 BIC = 3262.524 MASE = 0.4955334
## p = 3 q = 3 AIC = 3139.409 BIC = 3179.798 MASE = 0.4737144
## p = 3 q = 4 AIC = 3134.777 BIC = 3179.638 MASE = 0.4701162
## p = 3 q = 5 AIC = 3098.808 BIC = 3148.139 MASE = 0.4502885
## p = 4 q = 1 AIC = 3602.664 BIC = 3638.553 MASE = 0.7580664
## p = 4 q = 2 AIC = 3224.285 BIC = 3264.66 MASE = 0.4959949
## p = 4 q = 3 AIC = 3131.289 BIC = 3176.15 MASE = 0.4695096
## p = 4 q = 4 AIC = 3131.424 BIC = 3180.772 MASE = 0.4665123
## p = 4 q = 5 AIC = 3096.024 BIC = 3149.839 MASE = 0.4479481
## p = 5 q = 1 AIC = 3599.402 BIC = 3639.764 MASE = 0.7572617
## p = 5 q = 2 AIC = 3221.853 BIC = 3266.699 MASE = 0.4954501
## p = 5 q = 3 AIC = 3127.103 BIC = 3176.434 MASE = 0.4675479
## p = 5 q = 4 AIC = 3127.868 BIC = 3181.684 MASE = 0.4651969
## p = 5 q = 5 AIC = 3097.877 BIC = 3156.177 MASE = 0.4479311
MASE and AIC are the lowest at p=4 and q=5.
model4 = ardlDlm(x = as.vector(ppt.ts), y = as.vector(solar.ts), p = 4, q = 5)
AIC_value <- AIC(model4)
## [1] 3096.024
BIC_value <- BIC(model4)
## [1] 3149.839
MASE_value <- MASE(model4)
model_accuracy <- data.frame(MASE = MASE_value,AIC = AIC_value, BIC = BIC_value)
model_accuracy
## MASE AIC BIC
## model4 0.4479481 3096.024 3149.839
models <- c("Finite DLM", "Poly DLM", "Koyck", "ARDL")
mase_values <- MASE(model1, model2, model3, model4)
# empty vectors to store AIC, BIC
aic_values <- numeric(length(models))
bic_values <- numeric(length(models))
# list of model
model_list <- list(model1, model2, model3, model4)
# Loop to calculate AIC & BIC for each model
for (i in 1:length(models)) {
aic_values[i] <- AIC(model_list[[i]])
bic_values[i] <- BIC(model_list[[i]])
}
## [1] 4602.658
## [1] 4660.858
## [1] 4591.904
## [1] 4614.289
## [1] 3946.476
## [1] 3964.439
## [1] 3096.024
## [1] 3149.839
results <- data.frame(ModelName = models, MASE = mase_values$MASE, AIC = aic_values, BIC = bic_values)
head(results)
## ModelName MASE AIC BIC
## 1 Finite DLM 1.5779955 4602.658 4660.858
## 2 Poly DLM 1.5925546 4591.904 4614.289
## 3 Koyck 1.0324829 3946.476 3964.439
## 4 ARDL 0.4479481 3096.024 3149.839
The ARDLM has the lowest MASE,AIC and BIC values therefore we further check the residuals and summary of the the model.
summary(model4)
##
## Time series regression with "ts" data:
## Start = 6, End = 660
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.5901 -1.3826 -0.2867 1.0526 18.5709
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.46103 0.43729 5.628 2.72e-08 ***
## X.t -0.61439 0.54768 -1.122 0.262364
## X.1 0.78469 0.77617 1.011 0.312409
## X.2 1.28470 0.79026 1.626 0.104509
## X.3 0.80457 0.77946 1.032 0.302355
## X.4 -1.20750 0.55570 -2.173 0.030148 *
## Y.1 1.27195 0.03849 33.049 < 2e-16 ***
## Y.2 -0.01868 0.06249 -0.299 0.765112
## Y.3 -0.40480 0.06019 -6.725 3.87e-11 ***
## Y.4 -0.23201 0.06221 -3.729 0.000209 ***
## Y.5 0.21746 0.03772 5.766 1.26e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.546 on 644 degrees of freedom
## Multiple R-squared: 0.9338, Adjusted R-squared: 0.9328
## F-statistic: 908.5 on 10 and 644 DF, p-value: < 2.2e-16
checkresiduals(model4$model)
##
## Breusch-Godfrey test for serial correlation of order up to 14
##
## data: Residuals
## LM test = 103.8, df = 14, p-value = 8.811e-16
From the ARDLM Model summary and residual diagnostics we observe:
Adjust R-squared is high at 0.9328 which means it only explains 93.28% of the variability.
The p-value: 2.2e-16 is < 0.05 therefore the model is significant.
The mean of the residuals is close to zero and there is no significant correlation in the residuals series.
ACF plot indicates presence of seasonality.
Histogram is normally distributed.
Breusch-Godfrey test p-value < 0.05 therefore there is significant autocorrelation remaining in the residuals.
Since there is presence of seasonality in the Solar Radiation series we will use Holt Winters modelling.
# Defined options for seasonal and damped
seasonal_options <- c("multiplicative", "additive")
damped_options <- c(FALSE, TRUE)
results <- list()
mase_values <- numeric()
model_names <- character()
aic_values <- numeric()
bic_values <- numeric()
# Loop all combinations of seasonal and damped
for (seasonal in seasonal_options) {
for (damped in damped_options) {
args <- list(seasonal = seasonal, damped = damped)
result <- hw(solar.ts, seasonal = args$seasonal, damped = args$damped, h=2*12)
sm <- summary(result)
mase <- accuracy(sm)
mase <- mase[,6]
aic <- sm$model$aic
bic <- sm$model$bic
model_name <- sm$method
# Store the values in the vectors
mase_values <- c(mase_values, mase)
model_names <- c(model_names, model_name)
aic_values <- c(aic_values, aic)
bic_values <- c(bic_values, bic)
}
}
# Data frame to display the results
results_table2 <- data.frame(
ModelName = model_names,
MASE = mase_values,
AIC = aic_values,
BIC = bic_values
)
# Display the all exponential smoothing model metrics in a table
print(results_table2)
## ModelName MASE AIC BIC
## 1 Holt-Winters' multiplicative method 0.2233077 6648.746 6725.114
## 2 Damped Holt-Winters' multiplicative method 0.2035619 6366.701 6447.561
## 3 Holt-Winters' additive method 0.2471600 5434.708 5511.076
## 4 Damped Holt-Winters' additive method 0.2461797 5428.422 5509.282
The Damped Holt-Winters’ multiplicative method has the lowest MASE therefore we further check the residuals and summary of the the model.
hw.mult.damped.model = hw(solar.ts, seasonal = "multiplicative", damped = TRUE, h=2*12)
summary(hw.mult.damped.model)
##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = solar.ts, h = 2 * 12, seasonal = "multiplicative", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.805
## beta = 1e-04
## gamma = 0.0124
## phi = 0.94
##
## Initial states:
## l = 9.9033
## b = 0.4703
## s = 0.4341 0.5649 0.8263 1.1716 1.4514 1.6316
## 1.5503 1.3834 1.1045 0.8837 0.5761 0.4221
##
## sigma: 0.3145
##
## AIC AICc BIC
## 6366.701 6367.768 6447.561
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0456208 2.045442 1.239102 -2.172476 10.00296 0.2035619
## ACF1
## Training set 0.04177493
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2015 5.611616 3.3502090 7.873023 2.1530925 9.070139
## Feb 2015 7.060438 3.3373603 10.783517 1.3664818 12.754395
## Mar 2015 10.515176 3.8550902 17.175263 0.3294535 20.700899
## Apr 2015 12.993446 3.5145037 22.472389 -1.5033454 27.490238
## May 2015 16.284381 2.9386340 29.630128 -4.1261776 36.694939
## Jun 2015 18.261020 1.7242212 34.797820 -7.0298315 43.551872
## Jul 2015 19.031597 0.2101276 37.853067 -9.7533565 47.816551
## Aug 2015 17.105654 -1.2074754 35.418784 -10.9018606 45.113169
## Sep 2015 13.764743 -2.0801991 29.609686 -10.4680049 37.997491
## Oct 2015 9.774733 -2.2585804 21.808047 -8.6286319 28.178099
## Nov 2015 6.581938 -2.0456267 15.209503 -6.6127835 19.776659
## Dec 2015 5.163049 -2.0168711 12.342970 -5.8176914 16.143790
## Jan 2016 5.616270 -2.6571166 13.889656 -7.0367828 18.269322
## Feb 2016 7.066360 -3.9144360 18.047156 -9.7273184 23.860039
## Mar 2016 10.524088 -6.6891829 27.737360 -15.8013383 36.849515
## Apr 2016 13.004566 -9.3405081 35.349641 -21.1692761 47.178409
## May 2016 16.298445 -13.0720043 45.668893 -28.6197807 61.216670
## Jun 2016 18.276925 -16.2140342 52.767885 -34.4724452 71.026296
## Jul 2016 19.048304 -18.5466197 56.643228 -38.4481704 76.544779
## Aug 2016 17.120781 -18.1782620 52.419825 -36.8644468 71.106009
## Sep 2016 13.777000 -15.8651593 43.419159 -31.5567705 59.110770
## Oct 2016 9.783493 -12.1627698 31.729756 -23.7804195 43.347405
## Nov 2016 6.587872 -8.8063951 21.982139 -16.9556278 30.131371
## Dec 2016 5.167730 -7.4021686 17.737629 -14.0562710 24.391731
checkresiduals(hw.mult.damped.model)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 23.57, df = 24, p-value = 0.4864
##
## Model df: 0. Total lags used: 24
From the Damped Holt-Winters’ multiplicative summary and residual diagnostics we observe:
The mean of the residuals is close to zero and there is no significant correlation in the residuals series.
ACF plot indicates presence of significant lags at lags 12 and 24 .
Histogram is not normally distributed.
Damped Holt-Winters’ multiplicative method has a mase value of 0.2035619 which is much smaller compared to ARDLM model therefore it can provide better forecast for the Solar Radiation series.
# Defined options for ETS
etsmodel_options <- c("AAA", "MAA", "MAM", "MMM")
damped_options <- c(FALSE, TRUE)
# Empty list to store the results
results <- list()
mase_values <- numeric()
model_names <- character()
aic_values <- numeric()
bic_values <- numeric()
# Loop through all combinations of ETS
for (etsmodel in etsmodel_options) {
for (damped in damped_options) {
args <- list(etsmodel = etsmodel, damped = damped)
result <- ets(solar.ts, model = args$etsmodel, damped = args$damped)
sm <- summary(result)
mase <- accuracy(sm)
mase <- mase[,6]
aic <- sm$aic
bic <- sm$bic
model_name <- sm$method
# Store values in the vectors
mase_values <- c(mase_values, mase)
model_names <- c(model_names, model_name)
aic_values <- c(aic_values, aic)
bic_values <- c(bic_values, bic)
}
}
# Data frame to display the results
results_table3 <- data.frame(
ModelName = model_names,
MASE = mase_values,
AIC = aic_values,
BIC = bic_values
)
# ETS auto model
Ets.auto.model <- ets(solar.ts)
sm<-summary(Ets.auto.model)
#ETS auto function detects the following model:
sm$method
## [1] "ETS(A,Ad,A)"
# Display all ETS model metrics
print(results_table3)
## ModelName MASE AIC BIC
## 1 ETS(A,A,A) 0.2471600 5434.708 5511.076
## 2 ETS(A,Ad,A) 0.2461797 5428.422 5509.282
## 3 ETS(M,A,A) 0.4748561 7602.755 7679.123
## 4 ETS(M,Ad,A) 0.3798095 6469.079 6549.940
## 5 ETS(M,A,M) 0.3721664 6105.959 6182.327
## 6 ETS(M,Ad,M) 0.3222574 5953.502 6034.363
## 7 ETS(M,M,M) 0.5292151 6670.168 6746.536
## 8 ETS(M,Md,M) 0.3201193 5995.550 6076.410
The ETS(A,Ad,A) and ETS(A,A,A) have the lowest MASE, AIC and BIC values respectively, therefore we further check the residuals and summary of the model.
ets.AAA.model <- ets(solar.ts, model = "AAA")
summary(ets.AAA.model)
## ETS(A,Ad,A)
##
## Call:
## ets(y = solar.ts, model = "AAA")
##
## Smoothing parameters:
## alpha = 0.9999
## beta = 1e-04
## gamma = 1e-04
## phi = 0.9388
##
## Initial states:
## l = 11.154
## b = 0.7632
## s = -10.4919 -8.137 -3.348 2.5794 8.08 11.1219
## 9.9586 6.9916 1.9573 -1.8565 -7.1607 -9.6946
##
## sigma: 2.3446
##
## AIC AICc BIC
## 5428.422 5429.489 5509.282
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.01091357 2.314163 1.498521 -1.468083 12.44796 0.2461797
## ACF1
## Training set 0.1700724
checkresiduals( ets.AAA.model)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,Ad,A)
## Q* = 210.76, df = 24, p-value < 2.2e-16
##
## Model df: 0. Total lags used: 24
From the state space model AAA summary and residual diagnostics we observe:
The mean of the residuals is close to zero and there is no significant correlation in the residuals series.
ACF plot indicates presence of significant lags and the presence of seasonality .
Histogram is normally distributed.
The mase value is higher than the Damped Holt-Winters’ multiplicative Model.
ets.AAdA.model <- ets(solar.ts, model = "AAA", damped = TRUE)
summary(ets.AAdA.model)
## ETS(A,Ad,A)
##
## Call:
## ets(y = solar.ts, model = "AAA", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.9999
## beta = 1e-04
## gamma = 1e-04
## phi = 0.9388
##
## Initial states:
## l = 11.154
## b = 0.7632
## s = -10.4919 -8.137 -3.348 2.5794 8.08 11.1219
## 9.9586 6.9916 1.9573 -1.8565 -7.1607 -9.6946
##
## sigma: 2.3446
##
## AIC AICc BIC
## 5428.422 5429.489 5509.282
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.01091357 2.314163 1.498521 -1.468083 12.44796 0.2461797
## ACF1
## Training set 0.1700724
checkresiduals(ets.AAdA.model)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,Ad,A)
## Q* = 210.76, df = 24, p-value < 2.2e-16
##
## Model df: 0. Total lags used: 24
From the state space model AAA Damped summary and residual diagnostics we observe:
The mean of the residuals is close to zero and there is no significant correlation in the residuals series.
ACF plot indicates presence of significant lags and the presence of seasonality .
Histogram is normally distributed.
The mase value is higher than the Damped Holt-Winters’ multiplicative Model.
All_models<- bind_rows(results, results_table2, results_table3)
All_models <- arrange(All_models,MASE)
All_models
## ModelName MASE AIC BIC
## 1 Damped Holt-Winters' multiplicative method 0.2035619 6366.701 6447.561
## 2 Holt-Winters' multiplicative method 0.2233077 6648.746 6725.114
## 3 Damped Holt-Winters' additive method 0.2461797 5428.422 5509.282
## 4 ETS(A,Ad,A) 0.2461797 5428.422 5509.282
## 5 Holt-Winters' additive method 0.2471600 5434.708 5511.076
## 6 ETS(A,A,A) 0.2471600 5434.708 5511.076
## 7 ETS(M,Md,M) 0.3201193 5995.550 6076.410
## 8 ETS(M,Ad,M) 0.3222574 5953.502 6034.363
## 9 ETS(M,A,M) 0.3721664 6105.959 6182.327
## 10 ETS(M,Ad,A) 0.3798095 6469.079 6549.940
## 11 ETS(M,A,A) 0.4748561 7602.755 7679.123
## 12 ETS(M,M,M) 0.5292151 6670.168 6746.536
We observe the Damped Holt-Winters’ multiplicative method has the best residual check in comparison to other models and has the lowest MASE value which is the main performance metrics to determine the best model for forecasting.
We use the Damped Holt-Winters’ multiplicative to model the forecast.
upper.95.int = hw.mult.damped.model$upper[,2] #CI upper bound
lower.95.int = hw.mult.damped.model$lower[,2] #CI lower bound
autoplot(solar.ts) +
autolayer(hw.mult.damped.model, series="Damped Holt-Winters' multiplicative method",
PI=FALSE) +
autolayer(upper.95.int, series="95% confidence interval",
PI=FALSE) +
autolayer(lower.95.int,series="95% confidence interval",
PI=FALSE) +
scale_color_manual(values = c("Damped Holt-Winters' multiplicative method" = "red",
"95% confidence interval" = "blue")) +
xlab("Year") +
ylab("Solar Radiation") +
ggtitle("Original series, forecasts and 95% forecast interval for the Solar Radiation Series") +
guides(colour=guide_legend(title="Forecast"))
The red line in the plot represents the forecast for the next 2 years,
while the blue lines indicate the 95% confidence intervals. The Damped
Holt-Winters’ multiplicative method uses an alpha value of 0.805, which
places more weight on the most recent data points from the original
solar radiation series when generating the forecast. As a result, the
forecast closely aligns with the most recent years in the data.
Conclusively, the Damped Holt-Winters’ multiplicative method is the most suited model for the 2-year forecast, supported by the lowest MASE value and observed residual checks.
[1]Forecasting: Principles and practice (2nd) 7.3 Holt-Winters’ seasonal method. Available at: https://otexts.com/fpp2/holt-winters.html (Accessed: 16 September 2023).
[2]Hudson, I 2023, lecture and lab notes, Forecasting, RMIT University, Melbourne