Question 1

# Load the dataset
load("visitors.rda")
# Convert the data into a time series object
visitors_ts <- ts(visitors$Arrivals, start = c(1981, 1), frequency = 4)

Question 1: Plot the time series object

Question 1: Describe the main features

Trend: Upward trend, suggesting growing interest in travel
Seasonality: Strong evidence of seasonality, indicating popular times of year to visit
Cyclical: Some evidence of cyclical behavior beyond the seasonal pattern
Irregularities: The period from the start of the data through 1984 shows irregular behavior
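
One way to back up these observations is to plot the series and summarise it by year and by quarter. The sketch below uses only base R and a synthetic quarterly series (`demo_ts`, a hypothetical stand-in so the block runs without `visitors.rda`); swapping in `visitors_ts` should give the corresponding views of the real data.

```r
# Synthetic stand-in for visitors_ts (assumption: NOT the real arrivals data)
set.seed(1)
n <- 128                                   # 32 years of quarterly observations
level <- seq(20000, 100000, length.out = n)
season <- rep(c(1.08, 0.91, 0.95, 1.06), length.out = n)
demo_ts <- ts(level * season * exp(rnorm(n, sd = 0.03)),
              start = c(1981, 1), frequency = 4)

# Time plot: trend, seasonality and irregular periods are visible at a glance
plot(demo_ts, main = "Quarterly arrivals (synthetic)",
     ylab = "Arrivals", xlab = "Year")

# Annual means expose the trend; means by quarter expose the seasonal pattern
annual_means  <- aggregate(demo_ts, nfrequency = 1, FUN = mean)
quarter_means <- tapply(demo_ts, cycle(demo_ts), mean)
print(round(quarter_means))
```

Rising annual means point to a trend, and quarter means that differ clearly from one another point to seasonality.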

Question 2

Additive vs. Multiplicative?

Trend: There is a clear upward trend, and as the level increases the fluctuations appear to increase as well
Seasonality: The amplitude of the seasonal fluctuations appears to grow with the level, suggesting the seasonal effects are not constant

Given these observations, multiplicative seasonality appears to be the better fit for this dataset
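
A rough numerical check of this reasoning is to compare, year by year, the within-year range (a proxy for seasonal amplitude) with the within-year mean (the level). If the range grows with the level while the ratio of the two stays roughly stable, a multiplicative form is the natural choice. The sketch uses a synthetic series `demo_ts` as a hypothetical stand-in for `visitors_ts`; all object names are illustrative.

```r
# Synthetic stand-in for visitors_ts (assumption: NOT the real arrivals data)
set.seed(1)
n <- 128
level <- seq(20000, 100000, length.out = n)
season <- rep(c(1.08, 0.91, 0.95, 1.06), length.out = n)
demo_ts <- ts(level * season * exp(rnorm(n, sd = 0.03)),
              start = c(1981, 1), frequency = 4)

# Level and seasonal swing for each calendar year
yearly_level <- aggregate(demo_ts, nfrequency = 1, FUN = mean)
yearly_range <- aggregate(demo_ts, nfrequency = 1,
                          FUN = function(x) diff(range(x)))

# A strongly positive correlation means the swings grow with the level
amp_vs_level <- cor(as.numeric(yearly_level), as.numeric(yearly_range))

# Swing expressed as a share of the level for each year
relative_amp <- yearly_range / yearly_level

print(amp_vs_level)
print(round(summary(as.numeric(relative_amp)), 3))
```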

Question 3

library(forecast)
# Forecasts using Holt-Winters methods

# Linear trend with additive seasonality
hw_additive <- hw(visitors_ts, seasonal = "additive", h = 20)

# Linear trend with multiplicative seasonality
hw_multiplicative <- hw(visitors_ts, seasonal = "multiplicative", h = 20)

# Linear trend with additive seasonality and damping
hw_additive_damped <- hw(visitors_ts, seasonal = "additive", damped = TRUE, h = 20)

# Linear trend with multiplicative seasonality and damping
hw_multiplicative_damped <- hw(visitors_ts, seasonal = "multiplicative", damped = TRUE, h = 20)

# Exponential trend with multiplicative seasonality
hw_exp_multiplicative <- hw(visitors_ts, seasonal = "multiplicative", exponential = TRUE, h = 20)



plot(hw_additive, main = "Additive Seasonality", ylab = "Arrivals", xlab = "Year")

plot(hw_multiplicative, main = "Multiplicative Seasonality", ylab = "Arrivals", xlab = "Year")

plot(hw_additive_damped, main = "Additive Seasonality with Damping", ylab = "Arrivals", xlab = "Year")

plot(hw_multiplicative_damped, main = "Multiplicative Seasonality with Damping", ylab = "Arrivals", xlab = "Year")

plot(hw_exp_multiplicative, main = "Exponential Trend with Multiplicative Seasonality", ylab = "Arrivals", xlab = "Year")

Question 4: Accuracy Function

# Calculate accuracy metrics
acc_additive <- accuracy(hw_additive)
acc_multiplicative <- accuracy(hw_multiplicative)
acc_additive_damped <- accuracy(hw_additive_damped)
acc_multiplicative_damped <- accuracy(hw_multiplicative_damped)
acc_exp_multiplicative <- accuracy(hw_exp_multiplicative)

# Extract RMSE values
rmse_values <- data.frame(
  Method = c("Additive", "Multiplicative", "Additive Damped", "Multiplicative Damped", "Exponential Multiplicative"),
  RMSE = c(acc_additive["Training set", "RMSE"], acc_multiplicative["Training set", "RMSE"], 
           acc_additive_damped["Training set", "RMSE"], acc_multiplicative_damped["Training set", "RMSE"], 
           acc_exp_multiplicative["Training set", "RMSE"])
)
# Print RMSE values
print(rmse_values)
##                       Method     RMSE
## 1                   Additive 7542.656
## 2             Multiplicative 7550.956
## 3            Additive Damped 7552.064
## 4      Multiplicative Damped 7460.002
## 5 Exponential Multiplicative 7870.123

Question 4: Which do you prefer and why?

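I prefer the Multiplicative Damped method because it has the lowest training-set RMSE (7460.002) of the five methods, indicating the best in-sample fit. The margin over the additive and undamped multiplicative methods is small, roughly 1%. The damping parameter flattens the trend as the forecast horizon grows, which guards against over-extrapolating the upward trend and likely leads to more reliable long-horizon predictions.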
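
Because these RMSE values are computed on the training set, a holdout comparison is a useful cross-check. The sketch below shows one way to do that with base R's `HoltWinters()`, which is a different implementation from `forecast::hw()` and has no damping option. It runs on a synthetic stand-in series, and the split point and all object names are illustrative assumptions.

```r
# Synthetic stand-in for visitors_ts (assumption: NOT the real arrivals data)
set.seed(1)
n <- 128
level <- seq(20000, 100000, length.out = n)
season <- rep(c(1.08, 0.91, 0.95, 1.06), length.out = n)
demo_ts <- ts(level * season * exp(rnorm(n, sd = 0.03)),
              start = c(1981, 1), frequency = 4)

# Hold out the last 20 quarters; fit on everything before them
train <- window(demo_ts, end = c(2007, 4))
test  <- window(demo_ts, start = c(2008, 1))

# RMSE is the square root of the mean squared forecast error
rmse <- function(actual, predicted) {
  sqrt(mean((as.numeric(actual) - as.numeric(predicted))^2))
}

fit_add  <- HoltWinters(train, seasonal = "additive")
fit_mult <- HoltWinters(train, seasonal = "multiplicative")

holdout_rmse <- c(
  Additive       = rmse(test, predict(fit_add,  n.ahead = length(test))),
  Multiplicative = rmse(test, predict(fit_mult, n.ahead = length(test)))
)
print(holdout_rmse)
```

A method that wins in-sample and also wins on the holdout is a safer choice than one that wins in-sample only.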

Question 5: Check whether the residuals from the best model look like white noise

# Generate the best forecast model (Multiplicative Damped)
best_model <- hw(visitors_ts, seasonal = "multiplicative", damped = TRUE, h = 20)

# Check the residuals of the best model
checkresiduals(best_model)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 21.921, df = 8, p-value = 0.005064
## 
## Model df: 0.   Total lags used: 8
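
With a p-value of 0.005064, well below 0.05, the Ljung-Box test rejects the hypothesis that the residuals are white noise, so some autocorrelation remains even though this model has the lowest RMSE. To illustrate how the test behaves, the sketch below applies base R's `Box.test()` to a simulated white-noise series and to a simulated autocorrelated series (both hypothetical, not the model's residuals), using the same 8 lags as the output above.

```r
set.seed(42)

# Independent draws: no autocorrelation by construction
white_noise <- rnorm(200)

# AR(1) series with coefficient 0.6: clearly autocorrelated
ar_series <- arima.sim(model = list(ar = 0.6), n = 200)

lb_white <- Box.test(white_noise, lag = 8, type = "Ljung-Box")
lb_ar    <- Box.test(ar_series,   lag = 8, type = "Ljung-Box")

# A small p-value is evidence against white noise
print(c(white_noise = lb_white$p.value, autocorrelated = lb_ar$p.value))
```
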
# Provide a summary of the model's smoothing parameters
summary(best_model)
## 
## Forecast method: Damped Holt-Winters' multiplicative method
## 
## Model Information:
## Damped Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = visitors_ts, h = 20, seasonal = "multiplicative", damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.52 
##     beta  = 0.0027 
##     gamma = 1e-04 
##     phi   = 0.98 
## 
##   Initial states:
##     l = 26914.2591 
##     b = 2002.8599 
##     s = 1.0638 0.9467 0.9133 1.0762
## 
##   sigma:  0.1034
## 
##      AIC     AICc      BIC 
## 2913.642 2915.538 2942.084 
## 
## Error measures:
##                     ME     RMSE     MAE        MPE     MAPE      MASE
## Training set -1.834419 7460.002 5363.22 -0.9701216 6.852661 0.7266756
##                      ACF1
## Training set -0.001971966
## 
## Forecasts:
##         Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
## 2012 Q4       122225.2 106033.32 138417.1 97461.86 146988.5
## 2013 Q1       123798.8 105276.90 142320.8 95471.97 152125.7
## 2013 Q2       105187.4  87816.96 122557.8 78621.62 131753.1
## 2013 Q3       109171.5  89578.75 128764.3 79206.97 139136.1
## 2013 Q4       122829.1  99138.55 146519.6 86597.54 159060.7
## 2014 Q1       124397.5  98830.37 149964.7 85295.93 163499.2
## 2014 Q2       105685.3  82691.88 128678.6 70519.92 140850.6
## 2014 Q3       109677.3  84552.31 134802.3 71251.94 148102.7
## 2014 Q4       123386.1  93754.23 153018.0 78068.05 168704.2
## 2015 Q1       124949.8  93607.44 156292.1 77015.81 172883.7
## 2015 Q2       106144.5  78421.35 133867.7 63745.60 148543.4
## 2015 Q3       110143.9  80270.02 140017.7 64455.77 155832.0
## 2015 Q4       123899.9  89083.75 158716.1 70653.18 177146.7
## 2016 Q1       125459.1  89009.06 161909.2 69713.57 181204.7
## 2016 Q2       106568.1  74614.21 138522.0 57698.83 155437.4
## 2016 Q3       110574.2  76411.57 144736.8 58326.96 162821.4
## 2016 Q4       124373.9  84836.81 163910.9 63907.17 184840.6
## 2017 Q1       125929.0  84794.32 167063.6 63018.96 188838.9
## 2017 Q2       106958.8  71100.32 142817.3 52117.97 161799.7
## 2017 Q3       110971.1  72828.41 149113.8 52636.88 169305.4
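
The fitted `phi = 0.98` in the summary above controls how quickly the trend flattens. In a damped-trend method the trend contribution at horizon h is scaled by phi + phi^2 + ... + phi^h rather than by h. The sketch below computes that multiplier for the 20-quarter horizon used here; the variable names are illustrative.

```r
phi <- 0.98   # damping parameter reported in the model summary
h   <- 1:20   # forecast horizons in quarters

# Trend multiplier with damping: phi + phi^2 + ... + phi^h
damped_multiplier <- cumsum(phi^h)

# Trend multiplier without damping: simply h
undamped_multiplier <- h

multipliers <- data.frame(h, undamped_multiplier, damped_multiplier)
print(round(multipliers[c(1, 4, 8, 20), ], 2))
```

At h = 20 the damped multiplier is roughly 16.3 instead of 20, which is why the point forecasts rise only slowly from one year to the next.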

Question 6: Forecast the next 20 quarters with the seasonal naive method and evaluate

# Generate seasonal naive forecast
snaive_forecast <- snaive(visitors_ts, h = 20)

# Calculate accuracy of the seasonal naive model
snaive_accuracy <- accuracy(snaive_forecast)

# Calculate accuracy of the best model (Multiplicative Damped)
best_model_accuracy <- accuracy(best_model)

# Print the RMSE values for comparison
rmse_comparison <- data.frame(
  Model = c("Best Model (Multiplicative Damped)", "Seasonal Naive"),
  RMSE = c(best_model_accuracy["Training set", "RMSE"], snaive_accuracy["Training set", "RMSE"])
)

print(rmse_comparison)
##                                Model      RMSE
## 1 Best Model (Multiplicative Damped)  7460.002
## 2                     Seasonal Naive 10298.985

Question 6: Did the best model beat the seasonal naive model?

Yes, the best model beat the seasonal naive model on the training set. The best model has an RMSE of 7460.002,
while the seasonal naive model has an RMSE of 10298.985.
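
For context, the seasonal naive method forecasts each quarter with the value observed in the same quarter one year earlier, so its in-sample errors are just the lag-4 differences of the series. A minimal base R sketch of that calculation, on a synthetic stand-in series rather than `visitors_ts`, is:

```r
# Synthetic stand-in for visitors_ts (assumption: NOT the real arrivals data)
set.seed(1)
n <- 128
level <- seq(20000, 100000, length.out = n)
season <- rep(c(1.08, 0.91, 0.95, 1.06), length.out = n)
demo_ts <- ts(level * season * exp(rnorm(n, sd = 0.03)),
              start = c(1981, 1), frequency = 4)

m <- frequency(demo_ts)                   # 4 quarters per year

# In-sample errors: y_t minus y_(t-m)
snaive_errors <- diff(demo_ts, lag = m)
snaive_rmse   <- sqrt(mean(snaive_errors^2))

# Future forecasts repeat the last observed year
h <- 20
last_year <- tail(as.numeric(demo_ts), m)
snaive_fc <- ts(rep(last_year, length.out = h),
                start = tsp(demo_ts)[2] + 1 / m, frequency = m)

print(snaive_rmse)
print(snaive_fc)
```

Any model worth keeping should beat this benchmark, since it uses no trend or smoothing at all.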

Question 7: Interpret the decomposition of the Australian civilian labor force

The decomposition of the civilian labor force in Australia from 1978 to 1995 reveals several key insights. The trend component indicates a steady increase in the labor force over the period, with a noticeable dip around the early 1990s, aligning with the 1991/1992 recession. This dip suggests that the recession had a significant impact on the labor market.

The seasonal component highlights consistent, recurring patterns within each year. For example, there are peaks around March and December, indicating higher labor force numbers during these months. Each month exhibits its own distinct seasonal effect, which repeats annually.

The residual component captures the irregular fluctuations not explained by the trend or seasonal components. A significant negative deviation in the residuals around the early 1990s further underscores the impact of the recession. This deviation indicates that the recession caused unusual changes in the labor force that the trend and seasonal components couldn't account for.

In summary, the decomposition shows a generally increasing trend in the labor force, with clear seasonal patterns and the recession of 1991/1992 visibly affecting both the trend and residual components.
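
A decomposition of this kind can be reproduced along the following lines. The original labor force data are not included here, so the sketch applies base R's `stl()` to a synthetic monthly series covering 1978 to 1995 with an artificial dip in 1991 and 1992. Every number in it is made up for illustration.

```r
set.seed(7)
n <- 12 * 18                               # Jan 1978 to Dec 1995
month_no <- 0:(n - 1)
year <- 1978 + month_no %/% 12

# Made-up components: rising trend, peaks in March and December, recession dip
trend    <- seq(6500, 9000, length.out = n)
seasonal <- rep(c(40, 20, 120, 10, -30, -60, -40, -50, -20, 0, 30, 110),
                length.out = n)
dip      <- ifelse(year %in% 1991:1992, -200, 0)

labour_demo <- ts(trend + seasonal + dip + rnorm(n, sd = 30),
                  start = c(1978, 1), frequency = 12)

# STL splits the series into seasonal, trend and remainder components
fit <- stl(labour_demo, s.window = "periodic")
components <- fit$time.series

plot(fit)
print(head(round(components, 1)))
```

The three components add back up to the original series, so anything the trend and seasonal parts miss, such as a sudden dip, ends up in the remainder.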