Simple exponential smoothing is applied to a precipitation time series to forecast the annual rainfall in Los Angeles, California, USA, for the years 2016 to 2019.
The city of Los Angeles typically receives the bulk of its rainfall between October and April. Datasets (CSV format) containing monthly (October to April) climate observations for Los Angeles, covering the years 1945 to 2015, were originally extracted from the Weather Warehouse website. Weather Warehouse used to offer some free weather data samples to the public for personal use. However, the site no longer provides free access to weather samples for Los Angeles. The weblink above just shows Weather Warehouse’s archived website.
The data were cleaned in Microsoft Excel before being loaded into R. Further processing was done in R to create a data frame that contains total (October to April) rainfall in inches for Los Angeles from 1945 to 2015.
The data frame is then stored as a time series object in R.
# Load monthly Los Angeles precipitation data.
# Columns 1 and 9 of each file are kept; the second retained column holds
# the precipitation values used below.
LA_dat_oct <- read.csv("./LA_Precip_Oct.csv")[, c(1, 9)]
LA_dat_nov <- read.csv("./LA_Precip_Nov.csv")[, c(1, 9)]
LA_dat_dec <- read.csv("./LA_Precip_Dec.csv")[, c(1, 9)]
LA_dat_jan <- read.csv("./LA_Precip_Jan.csv")[, c(1, 9)]
LA_dat_feb <- read.csv("./LA_Precip_Feb.csv")[, c(1, 9)]
LA_dat_mar <- read.csv("./LA_Precip_Mar.csv")[, c(1, 9)]
LA_dat_apr <- read.csv("./LA_Precip_Apr.csv")[, c(1, 9)]
# Time series of Los Angeles monthly precip.
LA_dat_oct_ts <- ts(LA_dat_oct[,2], start = 1945, frequency = 1)
LA_dat_nov_ts <- ts(LA_dat_nov[,2], start = 1945, frequency = 1)
LA_dat_dec_ts <- ts(LA_dat_dec[,2], start = 1945, frequency = 1)
LA_dat_jan_ts <- ts(LA_dat_jan[,2], start = 1945, frequency = 1)
LA_dat_feb_ts <- ts(LA_dat_feb[,2], start = 1945, frequency = 1)
LA_dat_mar_ts <- ts(LA_dat_mar[,2], start = 1945, frequency = 1)
LA_dat_apr_ts <- ts(LA_dat_apr[,2], start = 1945, frequency = 1)
# Los Angeles annual precip.
LA_dat_annual <- LA_dat_oct[,2]+LA_dat_nov[,2]+LA_dat_dec[,2]+LA_dat_jan[,2]+LA_dat_feb[,2]+LA_dat_mar[,2]+LA_dat_apr[,2]
# Time series of Los Angeles annual precip.
LA_dat_annual_ts <- ts(LA_dat_annual, start = 1945, frequency = 1)
summary(LA_dat_annual_ts)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    2.87    7.30   10.68   11.28   15.16   26.23
Once the time series is loaded into R, plots of the time series data are generated for the monthly and total rainfall for the months of October to April.
The plot above shows that the mean total precipitation stays roughly constant at about 11 inches. The random fluctuations in the time series seem to be roughly constant in size over time.
This precipitation time series can therefore be described by an additive model with a constant level and no seasonality. Simple exponential smoothing is suitable for forecasting data with no trend or seasonal pattern. The method assigns exponentially decreasing weights as observations get older, so recent observations are given relatively more weight in forecasting than older ones.
Therefore, to make short-term forecasts for the time series of annual rainfall in Los Angeles, simple exponential smoothing with the “HoltWinters()” function (beta=FALSE and gamma=FALSE) was used in R.
LA_dat_annual_ts_forecasts <- HoltWinters(LA_dat_annual_ts, beta=FALSE, gamma=FALSE)
LA_dat_annual_ts_forecasts
## Holt-Winters exponential smoothing without trend and without seasonal component.
##
## Call:
## HoltWinters(x = LA_dat_annual_ts, beta = FALSE, gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.1703809
## beta : FALSE
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 10.03568
The estimated value of the alpha parameter is about 0.17. This is close to zero, which means the forecasts are based on both recent and less recent observations: the most recent observation receives only about 17% of the weight, and the remainder is spread over older values.
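To make the effect of a small alpha concrete, the following is a minimal sketch of the weights that simple exponential smoothing places on past observations. It assumes the standard weighting scheme, in which the observation k steps back from the most recent one receives weight alpha * (1 - alpha)^k; the variable names are illustrative and not part of the original analysis.

```r
# Estimated smoothing parameter, copied from the HoltWinters() output above
alpha <- 0.1703809

# k = 0 is the most recent observation, k = 1 the one before it, and so on
k <- 0:9
w <- alpha * (1 - alpha)^k

# Weight per observation, and the share of total weight covered so far
weights <- data.frame(lag = k, weight = w, cumulative = cumsum(w))
print(round(weights, 3))

# The cumulative weight of the m most recent observations is 1 - (1 - alpha)^m;
# whatever is left over falls on still older values and on the initial level
m <- length(k)
print(1 - (1 - alpha)^m)
```

The weights decline slowly when alpha is small, so each forecast draws on many past seasons, not only the latest one.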
In simple exponential smoothing as implemented by HoltWinters(), the first value in the time series is used as the initial value for the level. Hence, for this rainfall series, the first value of 3.36 inches (the 1945 precipitation value) is used as the forecast for 1946 to start the computation. Displayed below are the forecasts for the period covered by the original data; they run from 1946 to 2015 because the 1945 observation is consumed by the initialization.
LA_dat_annual_ts_forecasts$fitted
## Time Series:
## Start = 1946
## End = 2015
## Frequency = 1
## xhat level
## 1946 3.360000 3.360000
## 1947 4.169309 4.169309
## 1948 4.009269 4.009269
## 1949 4.839148 4.839148
## 1950 5.600896 5.600896
## 1951 8.049116 8.049116
## 1952 7.924889 7.924889
## 1953 8.433495 8.433495
## 1954 7.744561 7.744561
## 1955 7.866458 7.866458
## 1956 9.632207 9.632207
## 1957 10.764864 10.764864
## 1958 10.392605 10.392605
## 1959 9.438028 9.438028
## 1960 10.726443 10.726443
## 1961 10.764533 10.764533
## 1962 10.007270 10.007270
## 1963 12.476554 12.476554
## 1964 12.354467 12.354467
## 1965 12.982411 12.982411
## 1966 14.520540 14.520540
## 1967 13.407861 13.407861
## 1968 14.519108 14.519108
## 1969 14.810612 14.810612
## 1970 14.125576 14.125576
## 1971 12.466820 12.466820
## 1972 11.005494 11.005494
## 1973 10.471266 10.471266
## 1974 9.876421 9.876421
## 1975 10.647152 10.647152
## 1976 10.346063 10.346063
## 1977 9.849222 9.849222
## 1978 12.640193 12.640193
## 1979 12.664014 12.664014
## 1980 12.438427 12.438427
## 1981 13.987457 13.987457
## 1982 13.900996 13.900996
## 1983 15.654046 15.654046
## 1984 14.462394 14.462394
## 1985 13.202871 13.202871
## 1986 12.193727 12.193727
## 1987 12.286802 12.286802
## 1988 12.391279 12.391279
## 1989 11.465893 11.465893
## 1990 10.897521 10.897521
## 1991 11.075139 11.075139
## 1992 11.835866 11.835866
## 1993 11.100525 11.100525
## 1994 11.838185 11.838185
## 1995 11.378466 11.378466
## 1996 12.259596 12.259596
## 1997 11.228861 11.228861
## 1998 11.378990 11.378990
## 1999 11.989126 11.989126
## 2000 10.747198 10.747198
## 2001 10.734045 10.734045
## 2002 9.995607 9.995607
## 2003 10.911300 10.911300
## 2004 11.084867 11.084867
## 2005 11.095965 11.095965
## 2006 10.847896 10.847896
## 2007 11.185609 11.185609
## 2008 10.087400 10.087400
## 2009 11.525858 11.525858
## 2010 11.381740 11.381740
## 2011 10.553392 10.553392
## 2012 10.283613 10.283613
## 2013 9.599770 9.599770
## 2014 8.453146 8.453146
## 2015 9.798619 9.798619
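The fitted values above follow a simple recursion: each new level is alpha times the latest observation plus (1 - alpha) times the previous level, starting from the first observation. The sketch below illustrates this on a short made-up series (the numbers in `x` are hypothetical, not the Los Angeles data) and compares the result with HoltWinters() run at the same fixed alpha. The helper `ses_fitted()` is an illustrative name, not a function from any package.

```r
# Manual simple exponential smoothing, initialized with the first observation
ses_fitted <- function(x, alpha) {
  n <- length(x)
  level <- numeric(n)          # level[t] = smoothed level after observing x[1..t]
  level[1] <- x[1]
  for (t in 2:n) {
    level[t] <- alpha * x[t] + (1 - alpha) * level[t - 1]
  }
  list(
    fitted = level[-n],        # one-step-ahead forecasts for times 2..n
    next_forecast = level[n]   # forecast for time n + 1 (coefficient "a")
  )
}

# Hypothetical toy series, for illustration only
x <- c(3, 8, 4, 9, 6)
alpha <- 0.2
res <- ses_fitted(x, alpha)
print(res$fitted)
print(res$next_forecast)

# Same smoothing via HoltWinters() with alpha fixed instead of estimated
hw <- HoltWinters(ts(x), alpha = alpha, beta = FALSE, gamma = FALSE)
print(all.equal(as.numeric(hw$fitted[, "xhat"]), res$fitted))
```

In the rainfall output, the same pattern is visible: the 1946 fitted value equals the 1945 observation of 3.36, and the final level is reported as the coefficient `a`.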
The plot of the original time series against the in-sample precipitation forecasts is shown below. The original time series is in blue, and the forecasts are plotted as a red line. The time series of rainfall forecasts is much smoother than the original rainfall data.
plot(LA_dat_annual_ts_forecasts, col="blue", lwd=2)
Forecasts for further time points are generated with the “forecast.HoltWinters()” method of the R “forecast” package, which the generic forecast() function dispatches to for HoltWinters objects. Precipitation for the years 2016-2019 (4 more years) is predicted in R to generate the estimates shown below.
library(forecast)
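# forecast() dispatches to forecast.HoltWinters(); newer versions of the
# package do not export the method under its full name
LA_dat_annual_ts_forecasts2 <- forecast(LA_dat_annual_ts_forecasts, level=c(80,90), h=4)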
LA_dat_annual_ts_forecasts2
##      Point Forecast    Lo 80    Hi 80      Lo 90    Hi 90
## 2016       10.03568 2.588819 17.48255 0.47773600 19.59363
## 2017       10.03568 2.481502 17.58986 0.33999659 19.73137
## 2018       10.03568 2.375689 17.69568 0.20418672 19.86718
## 2019       10.03568 2.271317 17.80005 0.07022751 20.00114
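In addition to the rainfall forecasts for the years 2016-2019, an 80% prediction interval and a 90% prediction interval are supplied for each year. The point forecast is the same for all four years because simple exponential smoothing projects the final level forward unchanged, while the intervals widen with the forecast horizon.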
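The widening of the intervals can be checked by hand. The sketch below assumes normally distributed forecast errors and the standard simple exponential smoothing result that the h-step forecast variance is sigma^2 * (1 + (h - 1) * alpha^2). The error standard deviation `sigma` is not printed in the output, so it is back-calculated here from the 2016 interval; this is an assumption made for illustration, not a value reported by the package.

```r
# Values copied from the printed output
alpha   <- 0.1703809   # estimated smoothing parameter
point   <- 10.03568    # point forecast (coefficient a)
lo80_h1 <- 2.588819    # lower 80% bound for 2016

# Assumption: sigma back-calculated from the one-step 80% interval
sigma <- (point - lo80_h1) / qnorm(0.90)

# Standard error of the h-step-ahead forecast under simple exponential smoothing
h  <- 1:4
se <- sigma * sqrt(1 + (h - 1) * alpha^2)

# An 80% interval uses the 90th percentile, a 90% interval the 95th
intervals <- data.frame(
  year = 2015 + h,
  lo80 = point - qnorm(0.90) * se,
  hi80 = point + qnorm(0.90) * se,
  lo90 = point - qnorm(0.95) * se,
  hi90 = point + qnorm(0.95) * se
)
print(round(intervals, 4))
```

Under these assumptions the computed bounds closely match the table above, which suggests the intervals grow only slowly with the horizon because alpha is small.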
A plot of the predictions made by the forecast.HoltWinters() function is displayed below.
# plot() dispatches to the forecast package's plot method for forecast objects
plot(LA_dat_annual_ts_forecasts2, shadecols="oldstyle")
Here the forecasts for the years 2016-2019 are plotted as a blue line. The 80% prediction interval is the orange shaded area, and the 90% prediction interval is the yellow shaded area.
1: Business Analytics Using Forecasting, Videos by Prof. Galit Shmueli on forecasting analytics
2: Using R for Time Series Analysis, Avril Coghlan, pdf version of the book