Introduction

Hello everyone! On this page, I would like to show you an analysis of foreign exchange rates. I am using time-series models to predict changes in EUR/USD. The dataset contains the daily closing rate, and I will use the observations from 2010 to 2019, so it is going to be a fairly long time series.
Without further ado, let’s get started!

Objective

Using the data from 2010 to 2019, I will try to forecast the US dollar to euro rate for the whole of 2020. I am going to use several R time-series functions along the way.

Pre-Start

As usual, load the necessary packages.

library(rmarkdown)
library(ggplot2)
library(plotly)
library(tidyverse)
library(tidyquant)
library(MLmetrics)
library(caret)
library(padr)
library(lubridate)
library(imputeTS)
library(tseries)
library(TSstudio)
library(forecast)
library(tsfeatures)

Data Wrangling

First, load the needed dataset.

read<-read_csv("Foreign_Exchange_Rates.csv")
head(read)

## # A tibble: 6 x 24
##      X1 `Time Serie` `AUSTRALIA - AU~ `EURO AREA - EU~ `NEW ZEALAND - ~
##   <dbl> <date>       <chr>            <chr>            <chr>           
## 1     0 2000-01-03   1.5172           0.9847           1.9033          
## 2     1 2000-01-04   1.5239           0.97             1.9238          
## 3     2 2000-01-05   1.5267           0.9676           1.9339          
## 4     3 2000-01-06   1.5291           0.9686           1.9436          
## 5     4 2000-01-07   1.5272           0.9714           1.938           
## 6     5 2000-01-10   1.5242           0.9754           1.935           
## # ... with 19 more variables: `UNITED KINGDOM - UNITED KINGDOM
## #   POUND/US$` <chr>, `BRAZIL - REAL/US$` <chr>, `CANADA - CANADIAN
## #   DOLLAR/US$` <chr>, `CHINA - YUAN/US$` <chr>, `HONG KONG - HONG KONG
## #   DOLLAR/US$` <chr>, `INDIA - INDIAN RUPEE/US$` <chr>, `KOREA -
## #   WON/US$` <chr>, `MEXICO - MEXICAN PESO/US$` <chr>, `SOUTH AFRICA -
## #   RAND/US$` <chr>, `SINGAPORE - SINGAPORE DOLLAR/US$` <chr>, `DENMARK -
## #   DANISH KRONE/US$` <chr>, `JAPAN - YEN/US$` <chr>, `MALAYSIA -
## #   RINGGIT/US$` <chr>, `NORWAY - NORWEGIAN KRONE/US$` <chr>, `SWEDEN -
## #   KRONA/US$` <chr>, `SRI LANKA - SRI LANKAN RUPEE/US$` <chr>, `SWITZERLAND -
## #   FRANC/US$` <chr>, `TAIWAN - NEW TAIWAN DOLLAR/US$` <chr>, `THAILAND -
## #   BAHT/US$` <chr>

Use glimpse() to get an overview of the data

glimpse(read)

## Observations: 5,217
## Variables: 24
## $ X1                                          <dbl> 0, 1, 2, 3, 4, 5, 6, 7,...
## $ `Time Serie`                                <date> 2000-01-03, 2000-01-04...
## $ `AUSTRALIA - AUSTRALIAN DOLLAR/US$`         <chr> "1.5172", "1.5239", "1....
## $ `EURO AREA - EURO/US$`                      <chr> "0.9847", "0.97", "0.96...
## $ `NEW ZEALAND - NEW ZELAND DOLLAR/US$`       <chr> "1.9033", "1.9238", "1....
## $ `UNITED KINGDOM - UNITED KINGDOM POUND/US$` <chr> "0.6146", "0.6109", "0....
## $ `BRAZIL - REAL/US$`                         <chr> "1.805", "1.8405", "1.8...
## $ `CANADA - CANADIAN DOLLAR/US$`              <chr> "1.4465", "1.4518", "1....
## $ `CHINA - YUAN/US$`                          <chr> "8.2798", "8.2799", "8....
## $ `HONG KONG - HONG KONG DOLLAR/US$`          <chr> "7.7765", "7.7775", "7....
## $ `INDIA - INDIAN RUPEE/US$`                  <chr> "43.55", "43.55", "43.5...
## $ `KOREA - WON/US$`                           <chr> "1128", "1122.5", "1135...
## $ `MEXICO - MEXICAN PESO/US$`                 <chr> "9.4015", "9.457", "9.5...
## $ `SOUTH AFRICA - RAND/US$`                   <chr> "6.126", "6.085", "6.07...
## $ `SINGAPORE - SINGAPORE DOLLAR/US$`          <chr> "1.6563", "1.6535", "1....
## $ `DENMARK - DANISH KRONE/US$`                <chr> "7.329", "7.218", "7.20...
## $ `JAPAN - YEN/US$`                           <chr> "101.7", "103.09", "103...
## $ `MALAYSIA - RINGGIT/US$`                    <chr> "3.8", "3.8", "3.8", "3...
## $ `NORWAY - NORWEGIAN KRONE/US$`              <chr> "7.964", "7.934", "7.93...
## $ `SWEDEN - KRONA/US$`                        <chr> "8.443", "8.36", "8.353...
## $ `SRI LANKA - SRI LANKAN RUPEE/US$`          <chr> "72.3", "72.65", "72.95...
## $ `SWITZERLAND - FRANC/US$`                   <chr> "1.5808", "1.5565", "1....
## $ `TAIWAN - NEW TAIWAN DOLLAR/US$`            <chr> "31.38", "30.6", "30.8"...
## $ `THAILAND - BAHT/US$`                       <chr> "36.97", "37.13", "37.1...

Since I want to analyze USD to EUR only, I have to select() that variable and convert EURO AREA - EURO/US$ from character to numeric. Then, I will filter the dates from 2010 onwards.

forex <- read %>% 
  mutate(`EURO AREA - EURO/US$`= as.numeric(`EURO AREA - EURO/US$`)) %>% 
  select(date = `Time Serie`, usd_to_eur = `EURO AREA - EURO/US$`) %>% 
  filter(date >= ymd("2010-01-01"))

head(forex)

## # A tibble: 6 x 2
##   date       usd_to_eur
##   <date>          <dbl>
## 1 2010-01-01     NA    
## 2 2010-01-04      0.694
## 3 2010-01-05      0.694
## 4 2010-01-06      0.694
## 5 2010-01-07      0.699
## 6 2010-01-08      0.696
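The NA on 2010-01-01 comes from the type conversion: as.numeric() returns NA for any string that is not a number. A minimal sketch, assuming the raw column marks days without a quote with a text placeholder (the "ND" value and the variable names below are hypothetical):

```r
# Hypothetical raw values; "ND" stands in for a non-numeric placeholder
raw <- c("ND", "0.6940", "0.6943")

# as.numeric() warns and returns NA for strings it cannot parse
rate <- suppressWarnings(as.numeric(raw))

print(rate)
print(is.na(rate))
```

Checking is.na() right after the conversion is a cheap way to see how many values were lost.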

Okay, next, I have to ensure that the date column has no gaps. I will use pad() to do it.

# Find the starting time
min(forex$date)

## [1] "2010-01-01"

# Find the end time
max(forex$date)

## [1] "2019-12-31"

Do the padding

forex<- forex %>% 
  pad(start_val = ymd("2010-01-01"), end_val = ymd("2019-12-31"))
head(forex)

## # A tibble: 6 x 2
##   date       usd_to_eur
##   <date>          <dbl>
## 1 2010-01-01     NA    
## 2 2010-01-02     NA    
## 3 2010-01-03     NA    
## 4 2010-01-04      0.694
## 5 2010-01-05      0.694
## 6 2010-01-06      0.694
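For readers who want to see what the padding step amounts to, here is a rough base R equivalent on a tiny made-up data frame. This is only a sketch of the idea, not how pad() is implemented, and the object names are assumptions:

```r
# Three observed rows with a weekend gap (made-up values)
observed <- data.frame(
  date = as.Date(c("2010-01-01", "2010-01-04", "2010-01-05")),
  usd_to_eur = c(NA, 0.694, 0.694)
)

# Build every calendar day between the first and last date
all_days <- data.frame(
  date = seq(min(observed$date), max(observed$date), by = "day")
)

# A left join keeps every day; days without a quote get NA
padded <- merge(all_days, observed, by = "date", all.x = TRUE)

print(padded)
print(colSums(is.na(padded)))
```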

Great!

Now, check for missing values

colSums(is.na(forex))

##       date usd_to_eur 
##          0       1150

Exploratory Data Analysis

In order to get a clearer view, I will plot the time-series data using ggplotly()

p1<- forex %>% 
  ggplot(mapping = aes(x = date, y = usd_to_eur))+
  geom_line(col = "DarkRed")+
  labs(title = "Time Series Model : USD to EUR Between 2010 and 2019", y ="USD to EUR Rate", x= NULL)+
  theme_minimal()
  

ggplotly(p1)

As we can see from the chart above, the line is not fully connected. This indicates that there are missing values in the time series data. To handle them, we have to build a time series object first.

Modelling

Building a time-series object is quite easy. Just pass the variable we want and set the start time. Do not forget to set the frequency, because it strongly affects the model. In this case, I will choose 365 (as the data is daily and I expect a yearly pattern).
Use the ts() function.

forex_ts<-ts(forex$usd_to_eur, start = c(2010,1), frequency = 365)
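One detail worth keeping in mind with frequency = 365: the decade contains two leap days, so the series is slightly longer than ten "years" of 365 steps. A small sketch with placeholder values (the names n_days and x are assumptions):

```r
# Number of calendar days from 2010-01-01 to 2019-12-31, inclusive
n_days <- as.integer(as.Date("2019-12-31") - as.Date("2010-01-01")) + 1
print(n_days)

# Placeholder series with the same length and time index settings
x <- ts(seq_len(n_days), start = c(2010, 1), frequency = 365)

print(frequency(x))
print(start(x))
print(end(x))
```

The time axis therefore drifts by two days over the period, which is usually negligible for plotting but good to know when matching forecasts back to calendar dates.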

Try to plot using autoplot()

forex_ts %>% 
  autoplot()+
  theme_minimal()

Nice. We have a chart similar to the previous one made with ggplot().

As we saw in the data frame, there are NA (missing) values in it. I will replace them using the imputeTS package, with na_kalman() and model = "auto.arima".

forex_kalman<-na_kalman(forex_ts, model = "auto.arima", smooth = T)
forex_kalman %>% 
  autoplot()+
  theme_minimal()

Cool! The lines are now fully connected. Now, I can continue to process and analyze the model.
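To make the imputation step less of a black box, here is a minimal base R sketch of the same idea on simulated data: fit an ARIMA model that tolerates gaps, then read the missing points off the Kalman smoother. The fixed ARIMA(0,1,1) order and all object names are assumptions for illustration; na_kalman() with model = "auto.arima" selects the order automatically, so its results will differ.

```r
set.seed(1)

# Simulated random-walk "rate" with a few gaps (not the real data)
x <- ts(0.7 + cumsum(rnorm(200, sd = 0.01)))
x[c(20, 21, 90)] <- NA
miss <- is.na(x)

# arima() can be fitted by maximum likelihood with missing values present
fit <- arima(x, order = c(0, 1, 1))

# Kalman smoothing gives state estimates for every time point;
# multiplying by Z maps the states back to the observed scale
sm <- KalmanSmooth(x, fit$model, nit = -1)
smoothed <- as.numeric(sm$smooth %*% fit$model$Z)

# Replace only the missing points, keep observed values untouched
filled <- x
filled[miss] <- smoothed[miss]

print(filled[miss])
```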

Decomposing

Decomposing is a process that splits the time-series data into three main components:

Seasonal : a pattern that repeats over a fixed period
Trend : the long-term upward or downward movement of the series
Remainder : the information that is not captured by the other two components.

We can use the decompose() function to do this. Its type argument determines how the components are combined.
There are two options for type :

Additive : DATA = TREND + SEASONALITY + ERROR
Multiplicative : DATA = TREND * SEASONALITY * ERROR

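Before picking a type, we have to inspect the time-series chart that has been built: additive suits seasonal swings of roughly constant size, multiplicative suits swings that grow with the level of the series. In this case, I will pick additive.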

dec <- decompose(x = forex_kalman, type = "additive")
dec %>% 
  autoplot()+
  theme_minimal()
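The additive formula can be checked directly: adding the three components returned by decompose() gives back the original series. A small sketch on simulated monthly data (the series and names are made up for illustration):

```r
set.seed(42)

# Simulated series: linear trend + yearly cycle + noise
t <- 1:120
x <- ts(10 + 0.05 * t + 2 * sin(2 * pi * t / 12) + rnorm(120, sd = 0.3),
        frequency = 12)

dec_demo <- decompose(x, type = "additive")

# DATA = TREND + SEASONALITY + ERROR
recon <- dec_demo$trend + dec_demo$seasonal + dec_demo$random

# The moving-average trend is NA at both ends, so compare the rest
ok <- !is.na(recon)
print(all.equal(as.numeric(recon[ok]), as.numeric(x[ok])))
```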

As we can see from the trend line, the series still fluctuates a lot. This pattern in the trend might come from extra seasonality at another natural period that was not captured, so the data can be treated as multi-seasonal. To examine this, we have to convert the forex ts object into an msts object.

forex_msts<- msts(forex$usd_to_eur, seasonal.periods = c(7,365), start = c(2010,1))
plot(forex_msts, main="USD To EUR", xlab="Year", ylab="USD To EUR")

Great! Now we have a multiple seasonal time series (msts) object. Do the NA imputation again.

msts_kalman <- na_kalman(forex_msts, model = "auto.arima", smooth = T)
msts_kalman %>% 
  autoplot()+
  theme_minimal()

Next, let’s do decomposing. We can use mstl() function.

msts_kalman %>% mstl() %>% autoplot() + theme_minimal()

As we can see from the chart above, this msts object has a fairly clear pattern at the 365-day frequency. Before going to the next step, I will split the data into train and test sets, keeping the last 365 days for testing.

msts_test <- tail(msts_kalman, 365) 
msts_train <- head(msts_kalman, length(msts_kalman)- length(msts_test))
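An alternative sketch of the same hold-out split uses window(), which cuts by time index and therefore always returns time series objects. The placeholder series and names below are assumptions, not the tutorial's data:

```r
# Placeholder daily series
x <- ts(seq_len(1000), start = c(2010, 1), frequency = 365)

h <- 365          # size of the test set
n <- length(x)

# Everything up to the cut point is train, the last h points are test
train <- window(x, end = time(x)[n - h])
test  <- window(x, start = time(x)[n - h + 1])

print(length(train))
print(length(test))
```

Keeping the test set at the end of the series matters: shuffling would leak future information into training.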

Forecasting For Validation

In order to forecast the incoming data, I will use several different time-series forecasting techniques:

SMA : Simple Moving Average
ETS : Error, Trend, and Seasonal
Holt-Winters : Holt-Winters method (data with trend and seasonality)
ARIMA : Autoregressive Integrated Moving Average

At the end, I am going to compare the results based on the errors they produce.

Simple Moving Average (SMA)

To use SMA, the first thing we need to do is fit the model. Use the SMA() function with 5 as the number of periods.

forex_sma <- SMA(x = msts_train, n = 5)
forex_sma <- msts (forex_sma, start = c(2010,1), seasonal.periods = c(7,365))

Visualize the model to do comparison

msts_train %>% 
  autoplot(series = "Actual") +
  autolayer(forex_sma, series ="SMA")+
  scale_color_manual(values = c("Black", "Red"))+
  theme_minimal()

From the chart above, we can see a slight difference between the actual series and the SMA series. If we raise the value of n in the SMA model, the line becomes smoother and more leading values are lost, since the first n - 1 points have no full window.
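What a trailing moving average does can be sketched with base R's stats::filter(); this is an illustration on toy numbers, not the SMA() call used above:

```r
x <- c(1, 2, 3, 4, 5, 6, 7)
n <- 5

# sides = 1 averages the current value and the n - 1 values before it
sma_demo <- as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# The first n - 1 entries are NA because their window is incomplete
print(sma_demo)
```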

Do the forecasting

forex_forecast_sma <- forecast(object = forex_sma, h = 365)

Visualize the forecast model

msts_kalman %>% 
  autoplot(series = "train") +
  autolayer(msts_test, series = "test") +
  autolayer(forex_forecast_sma$fitted, series = "forecast train") +
  autolayer(forex_forecast_sma$mean, series = "forecast test") +
  theme_minimal()

From the line graph above, the forecast differs slightly from the last part of the data: the SMA model produced a lower US dollar rate than the test set.

Check for the accuracy

sma_acc<-accuracy(f = forex_forecast_sma$mean, msts_test)
sma_acc

##                  ME       RMSE        MAE      MPE     MAPE      ACF1 Theil's U
## Test set 0.02629245 0.02978435 0.02650706 2.926293 2.950941 0.9851078  14.93062

This model has a low RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error), about 0.03 on a rate close to 0.9, which is good news.
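The headline metrics are simple to compute by hand, which helps when reading the accuracy() table. A sketch with made-up numbers (the vectors actual and pred are hypothetical); the error is taken as actual minus forecast, which as far as I know matches the sign convention accuracy() uses, so a positive ME means the forecast sits below the actual values:

```r
# Made-up actual values and a flat forecast
actual <- c(0.90, 0.91, 0.89, 0.88)
pred   <- c(0.88, 0.88, 0.88, 0.88)

err <- actual - pred

me   <- mean(err)                       # mean error (bias)
rmse <- sqrt(mean(err^2))               # root mean squared error
mae  <- mean(abs(err))                  # mean absolute error
mape <- mean(abs(err / actual)) * 100   # mean absolute percentage error

print(c(ME = me, RMSE = rmse, MAE = mae, MAPE = mape))
```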

Exponential Smoothing

Before choosing the type of exponential smoothing, we have to find out whether there is a seasonal pattern in the series or not. I can use the stl_features() function to do it.

stl_features(msts_kalman)

##           nperiods   seasonal_period1   seasonal_period2              trend 
##       2.000000e+00       7.000000e+00       3.650000e+02       9.304959e-01 
##              spike          linearity          curvature             e_acf1 
##       3.152743e-14       3.458855e+00      -3.451180e-01       9.853241e-01 
##            e_acf10 seasonal_strength1 seasonal_strength2              peak1 
##       8.313717e+00       4.926877e-03       1.134930e-01       1.000000e+00 
##              peak2            trough1            trough2 
##       1.460000e+02       5.000000e+00       2.910000e+02

From the result above, we know that the series has two seasonal periods (7 and 365) with measurable, though weak, seasonal strength (about 0.005 and 0.11). Therefore, we can build the exponential smoothing model with a seasonal component.
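Seasonal strength is commonly defined from an STL decomposition as max(0, 1 - Var(remainder) / Var(seasonal + remainder)), with the analogous formula for trend. The sketch below applies that definition to simulated data with base R's stl(); it is meant to show how to read the number, not to reproduce the output above, and the series is made up:

```r
set.seed(7)

# Simulated monthly series with a clear yearly cycle
t <- 1:240
x <- ts(5 + 0.01 * t + sin(2 * pi * t / 12) + rnorm(240, sd = 0.2),
        frequency = 12)

comp <- stl(x, s.window = "periodic")$time.series
seasonal  <- comp[, "seasonal"]
trend     <- comp[, "trend"]
remainder <- comp[, "remainder"]

# Values near 1 mean a strong component, values near 0 a weak one
seasonal_strength <- max(0, 1 - var(remainder) / var(seasonal + remainder))
trend_strength    <- max(0, 1 - var(remainder) / var(trend + remainder))

print(c(seasonal = seasonal_strength, trend = trend_strength))
```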

ETS (Error, Trend, Seasonal)

In the stlm() function, there are several parameters that have to be assigned :

y : time series object
method : method of modelling (“ets”, “arima”)

Now, I will create the model using the stlm() function

#model building
stlm_ets <- msts_train %>%
  stlm(lambda = 0, method = "ets") %>% 
  forecast(h = 365) 

#plot the model
autoplot(stlm_ets)+
  scale_x_continuous(labels = scales::number_format(accuracy = 1))+
  theme_minimal()
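With method = "ets", the smoothing is applied to the seasonally adjusted series. The simplest member of the exponential smoothing family can be sketched in a few lines; the function name ses_level and the fixed alpha are assumptions for illustration, whereas ets() estimates its parameters and may add trend terms:

```r
# Simple exponential smoothing: each level is a weighted average of the
# newest observation and the previous level
ses_level <- function(y, alpha) {
  level <- numeric(length(y))
  level[1] <- y[1]
  for (t in 2:length(y)) {
    level[t] <- alpha * y[t] + (1 - alpha) * level[t - 1]
  }
  level
}

y <- c(0.90, 0.91, 0.89, 0.88, 0.90)   # made-up rates
lv <- ses_level(y, alpha = 0.5)

print(lv)
# The forecast for every future step is the last level
print(lv[length(lv)])
```

A larger alpha reacts faster to new observations; a smaller one gives a smoother line.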

Create a forecast object to make a comparison using autoplot()

forex_forecast_ets <- forecast(object = stlm_ets, h= 365) #forecast for next 365 days

msts_kalman %>% 
  autoplot(series = "train") +
  autolayer(msts_test, series = "test") +
  autolayer(forex_forecast_ets$fitted, series = "forecast train") +
  autolayer(forex_forecast_ets$mean, series = "forecast test") +
  theme_minimal()

Great! The chart above shows the train and test partitions so we can compare the results. It seems that the forecast from stlm() with the ets method differs noticeably from the actual data.

Next step is model evaluation. I will use accuracy() function

ets_acc<-accuracy(f = forex_forecast_ets$mean, msts_test)
ets_acc

##                  ME       RMSE        MAE      MPE     MAPE      ACF1 Theil's U
## Test set 0.02554848 0.02896997 0.02577542 2.843545 2.869584 0.9792286   14.5232

With the accuracy() function, we can see the size of the errors made by the model. I will focus on RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error). Both are low for this ETS forecast (about 0.03), so the model is reasonably accurate on the test set.

Holt Winters

The Holt-Winters method is suited to series that have both trend and seasonality. It smooths the level, trend, and seasonal components. Let’s call the HoltWinters() function.

forex_holtwinters_holtwinters <- HoltWinters(x = msts_train)

Do forecasting for the data test

forex_forecast_holtwinters <- forecast(object = forex_holtwinters_holtwinters, h = 365)

I will try to visualize using autoplot().

msts_kalman %>% 
  autoplot(series = "train") +
  autolayer(msts_test, series = "test") +
  autolayer(forex_forecast_holtwinters$fitted, series = "forecast train") +
  autolayer(forex_forecast_holtwinters$mean, series = "forecast test") +
  theme_minimal()

The forecast produced by Holt-Winters seems slightly different from the actual data. To make sure, call the accuracy() function.

holt_acc<-accuracy(forex_forecast_holtwinters$mean, msts_test)
holt_acc

##                  ME       RMSE        MAE     MPE     MAPE      ACF1 Theil's U
## Test set 0.05306738 0.06057553 0.05394233 5.91461 6.014995 0.9887554  30.44576

Looking at the errors, this model's RMSE and MAE (about 0.05 to 0.06) are still small in absolute terms, but roughly twice those of SMA and ETS.
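To see the three smoothing parameters that Holt-Winters estimates, here is a small sketch on R's built-in co2 dataset rather than the exchange-rate data (the choice of dataset and the object names are mine, purely for illustration):

```r
# co2 ships with R: monthly atmospheric CO2 concentrations
hw_demo <- HoltWinters(co2)

# alpha smooths the level, beta the trend, gamma the seasonal component
print(c(alpha = unname(hw_demo$alpha),
        beta  = unname(hw_demo$beta),
        gamma = unname(hw_demo$gamma)))

# Forecast the next 12 months
hw_fc <- predict(hw_demo, n.ahead = 12)
print(hw_fc)
```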

Autoregressive Integrated Moving Average (ARIMA)

ARIMA combines two forecasting methods, Autoregressive (AR) and Moving Average (MA), applied to a series that has been differenced (the Integrated part) to make it stationary.
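The three parts map onto the order = c(p, d, q) argument of base R's arima(): p autoregressive terms, d rounds of differencing, q moving-average terms. A minimal sketch on the built-in Nile series, with an order picked by hand for illustration (the stlm() approach used here chooses the order automatically on the seasonally adjusted data):

```r
# Nile ships with R: annual flow of the river Nile
# The "I" part: differencing once removes a slowly changing level
d_nile <- diff(Nile)
print(length(Nile) - length(d_nile))   # one observation is lost

# ARIMA(0,1,1): no AR terms, one difference, one MA term
arima_demo <- arima(Nile, order = c(0, 1, 1))
print(coef(arima_demo))

# Point forecasts for the next 5 years
arima_fc <- predict(arima_demo, n.ahead = 5)$pred
print(arima_fc)
```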

Now fit the model. I am going to use the stlm() function to do it, with “arima” as the method argument.

forex_arima <- stlm(y = msts_train, method = "arima")

Do the forecasting

forex_forecast_arima <- forecast(object = forex_arima, h = 365)

Time to visualize!

msts_kalman %>% 
  autoplot(series = "train") +
  autolayer(msts_test, series = "test") +
  autolayer(forex_forecast_arima$fitted, series = "forecast train") +
  autolayer(forex_forecast_arima$mean, series = "forecast test") +
  theme_minimal()

Check the accuracy :

arima_acc<-accuracy(f = forex_forecast_arima$mean, msts_test)
arima_acc

##                  ME       RMSE        MAE      MPE     MAPE      ACF1 Theil's U
## Test set 0.02586933 0.02938157 0.02606508 2.879121 2.901588 0.9800462  14.72804

Based on the result, the RMSE and MAE values are also very low, which suggests a good forecasting model.

Forecasting For Incoming year 2020

After fitting all four forecasting methods, we can conclude that all of them produced usable models (low errors). Hence, I am going to build all four models for the year 2020.
In this part, I do not have to split the data, because I am forecasting the next 365 days using the whole dataset.

Simple Moving Average

Build the SMA model

forex_sma_2020 <- SMA(x = msts_kalman, n = 5)
forex_sma_2020 <- msts(data = forex_sma_2020, start = c(2010,1),seasonal.periods = c(7,365))

Forecasting

forex_forecast_sma_2020 <- forecast(object = forex_sma_2020, h = 365)

Visualization

ggplotly(msts_kalman %>% 
  autoplot(series = "Actual") +
  autolayer(forex_forecast_sma_2020$mean, series = "Forecast 2020") +
  labs(title = "USD to EUR Forecast for 2020 using SMA", y = "USD to EUR")+
  theme_minimal()
)

sma_result<-data.frame("Date" = rep(seq(as.Date('2020-01-01'), as.Date('2020-12-30'), by = 'days')), "SMA"=forex_forecast_sma_2020$mean)
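The end date of 2020-12-30 is not a typo: 2020 is a leap year with 366 days, so a 365-step forecast starting on 1 January stops one day before the end of the year. A quick check (the name dates_2020 is an assumption):

```r
# 365 consecutive days starting on 2020-01-01
dates_2020 <- seq(as.Date("2020-01-01"), by = "day", length.out = 365)

print(length(dates_2020))
print(range(dates_2020))
```

Using length.out ties the number of dates to the forecast horizon, which avoids a length mismatch if h is changed later.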

Exponential Smoothing

As before, I am going to build the exponential smoothing models with two methods.

ETS (Error, Trend, Seasonal)

Build The model

forex_ets_2020 <- stlm(y = msts_kalman, method = "ets")

Forecasting

forex_forecast_ets_2020 <- forecast(object = forex_ets_2020, h = 365)

Visualize!

ggplotly(msts_kalman %>% 
  autoplot(series = "Actual") +
  autolayer(forex_forecast_ets_2020$mean, series = "Forecast 2020") +
  labs(title = "USD to EUR Forecast for 2020 using ETS", y = "USD to EUR")+
  theme_minimal()
)

Store the result to an object:

ets_result <- data.frame("Date" = rep(seq(as.Date('2020-01-01'), as.Date('2020-12-30'), by = 'days')), "ETS"=forex_forecast_ets_2020$mean)

Holt Winters

Build the model:

forex_holtwinters_2020 <- HoltWinters(x = msts_kalman)

Forecasting :

forex_forecast_holwinters_2020 <- forecast(object = forex_holtwinters_2020, h = 365)

Visualize :

ggplotly(msts_kalman %>% 
  autoplot(series = "Actual") +
  autolayer(forex_forecast_holwinters_2020$mean, series = "Forecast 2020") +
  labs(title = "USD to EUR Forecast for 2020 using HoltWinters", y = "USD to EUR")+
  theme_minimal()
)

Store the result to an object

holtwtinters_result<-data.frame("Date" = rep(seq(as.Date('2020-01-01'), as.Date('2020-12-30'), by = 'days')), "HoltWinters"=forex_forecast_holwinters_2020$mean)

Autoregressive Integrated Moving Average (ARIMA)

Build the model :

forex_arima_2020 <- stlm(y = msts_kalman, method = "arima")

Forecasting :

forex_forecast_arima_2020 <- forecast(object = forex_arima_2020, h = 365)

Visualize :

ggplotly(msts_kalman %>% 
  autoplot(series = "Actual") +
  autolayer(forex_forecast_arima_2020$mean, series = "Forecast 2020") +
  labs(title = "USD to EUR Forecast for 2020 using ARIMA", y = "USD to EUR")+
  theme_minimal()
)

Store the result to an object

arima_result <- data.frame("Date" = rep(seq(as.Date('2020-01-01'), as.Date('2020-12-30'), by = 'days')), "ARIMA"=forex_forecast_arima_2020$mean)

Conclusion

To conclude, I am going to combine all of the result data frames into one data.frame

result<-sma_result %>% 
  left_join(ets_result, by = "Date") %>% 
  left_join(holtwtinters_result, by = "Date") %>% 
  left_join(arima_result, by = "Date")

paged_table(result, options = list(rows.print = 10))

I will show the error comparison

error <-data.frame("Model" = c("SMA","ETS","Holt Winters","ARIMA"),"RMSE" = c(sma_acc[[2]], ets_acc[[2]], holt_acc[[2]], arima_acc[[2]]), "MAE" = c(sma_acc[[3]], ets_acc[[3]], holt_acc[[3]], arima_acc[[3]]))
paged_table(error)

I will check the range of result

range(result$SMA)

## [1] 0.8726728 0.8955830

range(result$ETS)

## [1] 0.8765365 0.8995205

range(result$HoltWinters)

## [1] 0.8190207 0.9139066

range(result$ARIMA)

## [1] 0.8770886 0.8997408
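The four range() calls can also be collapsed into one table, which makes the models easier to compare side by side. A sketch on a small made-up data frame with the same column layout (the values are hypothetical, not the forecasts above):

```r
# Made-up stand-in for the combined result table
result_demo <- data.frame(
  Date  = seq(as.Date("2020-01-01"), by = "day", length.out = 3),
  SMA   = c(0.88, 0.89, 0.87),
  ETS   = c(0.88, 0.90, 0.89),
  ARIMA = c(0.89, 0.90, 0.88)
)

# Apply range() to every model column at once (drop the Date column)
ranges <- sapply(result_demo[, -1], range)
rownames(ranges) <- c("min", "max")

print(ranges)
```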

Overall, the forecasts from the SMA, ETS, and ARIMA models are quite similar (about 0.87 to 0.90), and their validation errors are low. Holt-Winters gives a wider range (about 0.82 to 0.91) and had the largest validation error. As a final result, the US dollar to euro rate in 2020 is forecast to be around 0.87 - 0.91.

Ending

So, that’s all for the process of time-series forecasting using packages in the R programming language. I hope this page helps you understand time-series problems and the solutions behind them.

See you on another page!

Author,
Alfado Sembiring


US Dollar to Euro Rate Forecast for 2020

Alfado Sembiring
