Introduction
In this project we analyze a time series with both trend and
seasonality. We use a Seattle weather data set and keep the 160 most
recent observations. We fit three families of exponential smoothing
models: simple exponential smoothing (SES), Holt models, and
Holt-Winters (HW) models. We also visualize the original series and the
forecasts from these models.
Data Description
- Month: month of the year as a number (e.g., 1 = Jan, 2 = Feb, …)
- Year: calendar year
- LowTemp: lowest temperature
- HighTemp: highest temperature
- WarmestMin: warmest minimum temperature
- ColdestHigh: coldest high temperature
- AveMin: average minimum temperature
- AveMax: average maximum temperature
- meanTemp: mean temperature
- TotPrecip: total precipitation
- TotSnow: total snowfall
- Max24hrPrecip: maximum precipitation in a 24-hour period
Research Question
The goal of the analysis is to use time series methods to forecast the
mean temperature in Seattle for the next year based on this data set.
Data turned into Smoothing Models
We restrict the analysis to the 160 most recent observations and split
them into a training set and a testing set. The training set holds 148
observations and the testing set holds the remaining 12, which is one
year of monthly data. We then fit each smoothing model to the training
data so that we can compare their accuracy measures later on.
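The split described above can be sketched in a few lines. This is a minimal illustration, not the code used for the report: the function names, the stand-in series, and the fixed `alpha=0.2` are all hypothetical, and the SES routine uses a simple first-observation initialization rather than estimated parameters.

```python
def train_test_split_recent(series, n_recent=160, n_test=12):
    """Keep the last n_recent points and hold out the final n_test for testing."""
    if n_test >= n_recent:
        raise ValueError("n_test must be smaller than n_recent")
    recent = series[-n_recent:]
    return recent[:-n_test], recent[-n_test:]


def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing with a fixed alpha; the forecast is flat."""
    level = train[0]  # assumption: initialize the level at the first observation
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


# Hypothetical stand-in for the meanTemp column (not the real data).
mean_temp = [50 + (i % 12) for i in range(200)]

train, test = train_test_split_recent(mean_temp)
forecast = ses_forecast(train, alpha=0.2, horizon=len(test))
print(len(train), len(test), len(forecast))
```

Because the test set is always the final block of observations, the models never see the year they are later evaluated on.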
Accuracy Measures for training data
We compute several accuracy measures to see which model fits the
training data best.
The accuracy measures of various exponential smoothing models
based on the training data

| Model | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| SES | 0.1088 | 5.5656 | 4.7334 | -0.4109 | 9.1980 | 2.1660 | 0.5295 |
| Holt Linear | -0.0025 | 5.6668 | 4.8193 | -0.6222 | 9.3625 | 2.2053 | 0.4889 |
| Holt Add. Damped | 0.0909 | 4.9184 | 3.9638 | 0.4974 | 7.7992 | 1.8138 | -0.0326 |
| Holt Exp. Damped | -0.4527 | 4.8314 | 3.9869 | -0.4701 | 7.7967 | 1.8244 | -0.0588 |
| HW Add. | -0.0370 | 1.9630 | 1.5911 | -0.2092 | 3.1935 | 0.7281 | 0.0465 |
| HW Exp. | -0.0228 | 2.0470 | 1.6313 | -0.1951 | 3.2656 | 0.7465 | 0.1123 |
| HW Add. Damp | 0.0340 | 1.9434 | 1.5751 | -0.0877 | 3.1666 | 0.7208 | 0.0594 |
| HW Exp. Damp | 0.0890 | 1.9721 | 1.5883 | 0.0096 | 3.1904 | 0.7268 | 0.0886 |
The output above shows that HW Add. Damp fits the training data best:
it has the lowest error measures of the eight models, for example an
RMSE of 1.9434 and a MAPE of 3.1666. Its ACF1 (0.0594) is close to
zero, so residual autocorrelation is not a concern.
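To make the error measures concrete, here is a minimal sketch of how five of them are computed from actual and fitted values. The function name and the two-point example are hypothetical. MASE and ACF1 are left out for brevity, and the percentage measures assume that no actual value is zero.

```python
import math


def accuracy_measures(actual, fitted):
    """Return ME, RMSE, MAE, MPE and MAPE for paired actual/fitted values."""
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    # Percentage errors are undefined when an actual value is zero.
    pct_errors = [100 * e / a for e, a in zip(errors, actual)]
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": sum(pct_errors) / n,
        "MAPE": sum(abs(p) for p in pct_errors) / n,
    }


# Hypothetical values, only to show the call.
print(accuracy_measures([50, 40], [45, 44]))
```

ME and MPE keep the sign of the errors, so they measure bias, while RMSE, MAE and MAPE measure the size of the errors regardless of direction.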
Visualization
We now visualize the original time series together with the forecasts
from the different smoothing models.
The visual above shows that the Holt-Winters additive damped model
(HW Add. Damp) is the best fit, which matches the accuracy measures
discussed earlier. Its prediction band is also comparatively narrow,
which indicates less uncertainty in its forecasts.
Accuracy Measures for testing data
The visualization and the training accuracy measures both point to HW
Add. Damp as the appropriate model. We next check the models against
the 12 held-out observations. To make a real-life forecast, the chosen
model then needs to be refit on the whole data set with all 160 recent
observations.
The accuracy measures of various exponential smoothing models
based on the testing data

| Model | MSE | MAPE |
|---|---|---|
| SES | 249.974419 | 18.921862 |
| Holt.Add | 253.333843 | 19.054993 |
| Holt.Add.Damp | 324.332191 | 21.565641 |
| Holt.Exp | 353.076074 | 22.650636 |
| HW.Add | 6.124606 | 4.041155 |
| HW.Exp | 4.338878 | 3.463678 |
| HW.Add.Damp | 7.190380 | 4.315925 |
| HW.Exp.Damp | 7.956855 | 4.642423 |
Looking at the table, HW.Exp has the lowest test MSE and MAPE, followed
by HW.Add and then HW.Add.Damp. We still select HW Add. Damped, which
is consistent with our conclusions earlier in the project. Its test MSE
(7.19) and MAPE (4.32) are among the lowest of the models and far below
those of the SES and Holt models. It also has the lowest error measures
on the training data, where HW.Exp ranks last of the four Holt-Winters
models. This means that HW.Add.Damp has the best all-around accuracy
across the training and testing data.
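The ordering of the models on the testing data can be checked directly from the table. The sketch below copies the table values into a dictionary and sorts by each measure. It assumes the first value in each row is the MSE and the second is the MAPE; the variable names are illustrative.

```python
# (MSE, MAPE) per model, copied from the testing-data table.
test_accuracy = {
    "SES": (249.974419, 18.921862),
    "Holt.Add": (253.333843, 19.054993),
    "Holt.Add.Damp": (324.332191, 21.565641),
    "Holt.Exp": (353.076074, 22.650636),
    "HW.Add": (6.124606, 4.041155),
    "HW.Exp": (4.338878, 3.463678),
    "HW.Add.Damp": (7.190380, 4.315925),
    "HW.Exp.Damp": (7.956855, 4.642423),
}

# Sort model names from best (smallest error) to worst on each measure.
by_mse = sorted(test_accuracy, key=lambda model: test_accuracy[model][0])
by_mape = sorted(test_accuracy, key=lambda model: test_accuracy[model][1])

for rank, model in enumerate(by_mse, start=1):
    mse, mape = test_accuracy[model]
    print(rank, model, mse, mape)
```

Both measures give the same ordering here, with the four Holt-Winters models well ahead of the models that have no seasonal component.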
Final Model
Here we refit the selected model on the whole data set of 160
observations and report the estimated smoothing parameters.
Estimated values of the smoothing parameters in Holt-Winters
linear trend with additive seasonality

| Parameter | Estimate |
|---|---|
| alpha | 0.1555996 |
| beta | 0.0001000 |
| gamma | 0.0001000 |
The output above shows a value of 0.156 for alpha, which indicates
that the model places moderate weight on recent observations when it
updates the level. Gamma (0.0001) is close to 0, which shows that the
seasonal pattern is stable and is barely updated over time. Beta
(0.0001) is also close to 0, so recent observations have almost no
influence on the trend, which stays nearly constant.
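The role of each parameter is easiest to see in the update equations. The sketch below implements the additive Holt-Winters recursions with a linear (non-damped) trend and fixed parameters. It is an illustration under stated assumptions, not the fitting routine used for the report: the function name, the simple initialization from the first two seasons, and the demo series are all hypothetical, and no parameters are estimated.

```python
def holt_winters_additive(series, period, alpha, beta, gamma):
    """One-step-ahead fits from additive Holt-Winters with fixed parameters."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    # Assumption: crude initial states taken from the first two seasons.
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y - level for y in first]

    fitted = []
    for t, y in enumerate(series):
        s = seasonal[t % period]
        fitted.append(level + trend + s)  # forecast made before seeing y
        prev_level, prev_trend = level, trend
        # alpha: how strongly the level reacts to the newest observation
        level = alpha * (y - s) + (1 - alpha) * (prev_level + prev_trend)
        # beta: how strongly the trend reacts to the change in level
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # gamma: how strongly the seasonal index for this month is revised
        seasonal[t % period] = (
            gamma * (y - prev_level - prev_trend) + (1 - gamma) * s
        )
    return fitted, level, trend, seasonal


# Hypothetical monthly series with a fixed seasonal shape (not the real data).
demo = [50.0 + (i % 12) for i in range(36)]
fitted, level, trend, seasonal = holt_winters_additive(
    demo, period=12, alpha=0.1555996, beta=0.0001, gamma=0.0001
)
print(round(level, 3), round(trend, 6))
```

With beta and gamma this close to 0, the second terms dominate the trend and seasonal updates, so both components stay almost exactly at their initial values while only the level adapts.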
Conclusion
The Seattle weather data set we used showed that the best model was the
Holt-Winters additive damped model. We reached this conclusion by
splitting the data into a training set and a testing set and comparing
the accuracy measures and the visuals for each model. For our final
model, the gamma and beta values were close to 0, which showed that the
seasonal pattern is stable and that the trend barely needs updating.
Our alpha was 0.156, which showed that the model places moderate weight
on recent observations.
For this project we used the most recent 160 observations because more
than 100 were required. We set the frequency to 12 to match the 12
months of the year.
The selected model had the lowest error measures on the training data,
and its MSE and MAPE on the testing data were among the lowest of the
models we compared. This means that its predictions are close to the
true values and that it gives accurate forecasts.
