Introduction

Time Series data refer to the historical representation of data points collected at periodic intervals of time (Austin, 2015). These historical data are used within statistical tools such as R or Rstudio to make statistical models that can predict future values at reasonable error rate. In this post Rstudio in combination with rmarkdown was used to forecast the exchange rate of Kenyan Shilling against US dollar using daily closing rate of the shilling collected over a period of 891 days (1/01/2017 to 7/23/2020).

Data

The data for this post was obtained from Central Bank of Kenya. The data is dynamic; thus, this original dataset was saved online for future retrieval here.

The first 10 observations of the data are presented in table 1.

data <- read.csv("Key CBK Indicative Exchange Rates.csv")

library(knitr)

kable(head(data, 10), caption = "Table 1: First 10 Exchange Rates")
Table 1: First 10 Exchange Rates
ï..Date Currency Mean Buy Sell
23/07/2020 US DOLLAR 108.1288 108.0288 108.2288
22/07/2020 US DOLLAR 107.8797 107.7800 107.9794
21/07/2020 US DOLLAR 107.6209 107.5212 107.7206
20/07/2020 US DOLLAR 107.4329 107.3329 107.5329
17/07/2020 US DOLLAR 107.4582 107.3582 107.5582
16/07/2020 US DOLLAR 107.3494 107.2494 107.4494
15/07/2020 US DOLLAR 107.2850 107.1853 107.3847
14/07/2020 US DOLLAR 107.1471 107.0476 107.2465
13/07/2020 US DOLLAR 107.0512 106.9512 107.1512
10/07/2020 US DOLLAR 106.9003 106.8006 107.0000

From table 1, it is clear that we only need the Mean column and convert the series into time series and plot the series. Figure 1 shows the time plot of the exchange rate.

rate <- data[3]

## Create a daily Date object - helps my work on dates

inds <- seq(as.Date("2017-01-01"), as.Date("2020-07-23"), by = "day")

## Create a time series object

rate.ts <- ts(rate, start = c(2017, as.numeric(format(inds[1], "%j"))),
           frequency = 365)


plot.ts(rate.ts, main = "Figure 1: Time Plot of Kenyan Shillings - US Dollar Exchange Rate", col = "blue")

The components of the time series can be seen in figure 1.

Stationarity and Differencing

A seasonal time series consists of a trend, seasonal and an error component. The process of decomposing the time series means separating the time series into these three components (Coghlan, 2018).

The decompose() function can be used to estimate the trend and seasonal component of a time series described using an additive model. The function “decompose()” returns a list object as its result, where the estimates of the seasonal, trend and error component are stored in named elements of that list objects, called “seasonal”, “trend”, and “random” respectively (Coghlan, 2018). The figure 2 shows a plot of decomposed series.

decomposed.series <- decompose(rate.ts)

# Define a plot function to allows us change the title

decomp.plot <- function(x, main = NULL, ...)
{
    if(is.null(main))
    main <- paste("Decomposition of", x$type, "time series")
    plot(cbind(observed = x$random + if (x$type == "additive")
        x$trend + x$seasonal
    else x$trend * x$seasonal, trend = x$trend, seasonal = x$seasonal,
        random = x$random), main = main, ...)
}

decomp.plot(decomposed.series, col = "blue", main = "Figure 2: Decomposed Series Plot")

The decomposed series shows that seasonality and trend. However, we need to perform a stationarity test to check if the data is stationary. The Augmented Dickey-Fuller unit root test was used as follows:

Null hypothesis \(H_{0}:\) Unit root exists

Alternative hypothesis \(H_{a}:\) stationary

library(tseries)

adf.test(rate.ts, alternative = "stationary")
## 
##  Augmented Dickey-Fuller Test
## 
## data:  rate.ts
## Dickey-Fuller = -2.5421, Lag order = 9, p-value = 0.3489
## alternative hypothesis: stationary

It was expected that the series is stationary at 95% confidence level, but the ADF test statistic = -2.54, p-value = 0.35 implies that the p-vale is greater than 0.05, thus, fail to reject the null hypothesis of existence of unit root. Therefore, the series is not stationary. One way of making the series stationary is through differencing.

differenced.series <- diff(rate.ts, differences = 1)

decomp.plot(decompose(differenced.series), col = "blue", main = "Figure 3: Decomposed Differences Series Plot")

Figure 3 shows the series to have a constant variance with no trend, an indication that the first differencing have made the series stationary. However, ADF test was performed to confirm stationarity.

adf.test(differenced.series, alternative = "stationary")
## 
##  Augmented Dickey-Fuller Test
## 
## data:  differenced.series
## Dickey-Fuller = -8.2375, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

Similarly, the expectation was that the series is stationary at 95% confidence level, the ADF test statistic = -8.24, p-value = 0.01 implies that the p-vale is less than 0.0; thus, reject the null hypothesis of existence of unit root. Therefore, the series is stationary.

Exponential Smoothing Model

The historical data has been plotted as time series and differenced to make it stationary. Now we move on to forecasting. The fluctuations in Kenyan Shillings time series plot is fairly consistent over time. Therefore, simple exponential smoothing can be used to make predictions. In R the HoltWinters() function is used for simple exponential smoothing.

model <- HoltWinters(differenced.series ,beta=FALSE,gamma=FALSE)

plot(model, main ="Figure 3: Holts-Winters Filtering", col = "blue")

The exponential smoothing is defined by setting the parameter beta=FALSE. Since we are dealing with non-seasonal model the parameter gamma=FALSE. The resulting filtered model is presented in figure 3.

Forecasting

The dataset is now ready for forecasting after estimation of the exponential smoothing model. The forecast.HoltWinters() function in forecast package was used for the forecasting.

library(forecast)

forecasts1 <- forecast:::forecast.HoltWinters(model,h=60)

The parameter h=30 indicates forecasting for the next 60 intervals which is next 60 days in this instance. The output of forecast.HoltWinters is shown in figure 4.

plot(forecasts1, main = "Figure 4: Forecasts from HoltWinters")

The dark grey shaded area in figure 4 represents the range of Kshs exchange rate for next 60 days with 80% prediction. The lighter shade of grey represents the Kshs exchange rate with 90% prediction.

The forecast model is subject to prediction errors. The validation of the model involves studying the distribution of forecast errors. From the The Holt Winters forecast model creates the residuals can be extracted into a dataset. A plot of errors should follow a normal distribution with mean of zero. If the errors does not shows bell shape then the errors are high and the model is not sufficient.

hist(forecasts1$residuals, col="blue", main = "Figure 5: Histogram of Forecast Residuals", xlab = "Residuals")

Figure 5 shows the histogram of forecast errors which are approximately bell-shaped. The mean is approximately zero. Therefore, the model fits the data well.

References

Austin, B. (2015, August 2). Forecasting Exchange Rates Using R Time Series. Retrieved July 24, 2020, from Benny Austin website: https://bennyaustin.com/2015/08/03/r-time-series/

Coghlan, A. (2018). Using R for Time Series Analysis — Time Series 0.2 documentation. Retrieved July 24, 2020, from a-little-book-of-r-for-time-series.readthedocs.io website: https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html