Prediction of Rupee Exchange Rate over the short term using a Neural Network

In this paper I try to model the exchange rate dynamics of the Indian Rupee against the US Dollar using neural networks. I develop two models and compare their performance.

Model 1: I will use a linear regression model for the long term trend in the exchange rate, and the shorter term variations will be modelled with a neural network. The reason for this is that the long term trend acts as a bias in the input data, which could lead to inaccurate predictions of the short term variations. In effect, the exchange rate is decomposed as Price(t) = trend(t) + residual(t): the linear model supplies the trend and the neural network is trained on the residuals.

Model 2: I will use a neural network to model the exchange rate variations directly, without first using a linear model to remove the bias.

There is a long term correlation between the Rupee exchange rate and factors such as the trade deficit and energy imports; this has been reported elsewhere. Ref: http://balabsicsr.blogspot.in/2016/03/rupee-exchange-rate-dynamics-from-1993.html

Short term exchange rates depend on currency trading, government interventions and market movements. This paper does not focus on the causal factors driving the long or short term exchange rates, but builds an empirical model using a time series approach.

The data on Rupee exchange rate vs US Dollar used in this paper is available here: https://in.investing.com/currencies/usd-inr-historical-data

The reference for the neural network model used in this paper is the R-bloggers article on neural networks: http://goo.gl/uF7F9i

The first step is to load the data exported from the above site in CSV format.
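The loading code itself is not shown here; the sketch below gives a minimal version of this step. The CSV filename is a placeholder for whatever name the export from investing.com uses, and the package start-up messages that follow come from loading dplyr and caret.

library(dplyr)   # data manipulation; produces the masking messages below
library(caret)   # createDataPartition(); also loads lattice and ggplot2
# filename is hypothetical -- replace with the name of the exported file
training <- read.csv("USD_INR Historical Data.csv", stringsAsFactors = FALSE)
str(training)    # expected columns include Date and Price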

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: lattice
## Loading required package: ggplot2

Model 1: Use a linear regression model to remove the input bias and train the neural network on the residuals

The data is sorted in chronological order and a linear regression model of the exchange rate as a function of time (the sequential index ID) is fitted.

training$Date <- as.Date(training$Date, "%m/%d/%y")
# sort chronologically and add a sequential index (ID) used as the time variable
training_sorted <- training[order(training$Date),]
training_sorted$ID <- seq.int(nrow(training_sorted))

#build a linear model for the rupee exchange rate over time
lm1 <- lm(Price~ID, data=training_sorted)
summary(lm1)
## 
## Call:
## lm(formula = Price ~ ID, data = training_sorted)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.7865 -1.2184  0.2925  1.4377 11.9942 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4.415e+01  1.171e-01   376.9   <2e-16 ***
## ID          1.327e-02  1.016e-04   130.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.616 on 1994 degrees of freedom
## Multiple R-squared:  0.8954, Adjusted R-squared:  0.8953 
## F-statistic: 1.707e+04 on 1 and 1994 DF,  p-value: < 2.2e-16
g1 = ggplot(training_sorted, aes(x=Date, y=Price))
g1 = g1 + geom_point(size=2, colour="black") + geom_smooth()
g1 = g1 + xlab("Date") + ylab("USD - INR EXCHANGE RATE")
g1

g2 = ggplot(training_sorted, aes(x=ID, y=Price))
g2 = g2 + geom_point(size=2, colour="black") + geom_smooth()
g2 = g2 + xlab("ID") + ylab("USD - INR EXCHANGE RATE")
g2 = g2 + geom_abline(intercept=coef(lm1)[1], slope=coef(lm1)[2], size=0.5)
g2

As can be seen, the linear regression model explains about 89.5% of the variation (R-squared = 0.895) with a p-value well below 0.0001, indicating a good fit. We can thus use this model to remove the bias (long term trend) in the data. The data starts in 2010, when the price of oil in India was de-regulated. The Rupee exchange rate for each trading day since 2010 is available from the source indicated above, giving fine granularity for the short term prediction.

The fitted values from this linear regression model are subtracted from the actual exchange rate values, and the residuals are stored in the data frame along with the index of each point.

# fitted values of the linear model give the long term trend (the bias)
bias <- predict(lm1, se.fit = TRUE)$fit
stde <- predict(lm1, se.fit = TRUE)$se
training_sorted$Mean <- bias
# residuals = actual price minus the long term trend
training_sorted$STDE <- training_sorted$Price - training_sorted$Mean

g2 = ggplot(training_sorted, aes(x=Date, y=STDE))
g2 = g2 + geom_point(size=1, colour="black") + geom_smooth()
g2 = g2 + xlab("Date") + ylab("USD - INR EXCHANGE RATE RESIDUAL")
g2

The data set for training the neural network is now ready. It will be split into two parts: training data and test data. The training set is used to train a neural network with two hidden layers of 3 and 2 neurons respectively, and the resulting model is evaluated on the test data set.

The data needs to be scaled to a range of 0-1 before being fed to the neural network. Before scaling, 100 empty rows are appended to the data frame (training_sorted_predict) to hold the predicted values for the next 100 days.
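The code that builds training_sorted_predict is not shown here; the sketch below is one possible construction. Copying the last observed row keeps the min-max scaling below well defined; the exact fill values used originally are not shown.

# hypothetical reconstruction of training_sorted_predict: append 100 placeholder
# rows (copies of the last observed row) that will later hold the predictions
n_obs  <- nrow(training_sorted)
future <- training_sorted[rep(n_obs, 100), ]
future$ID    <- n_obs + seq_len(100)                      # continue the time index
future$Date  <- max(training_sorted$Date) + seq_len(100)
future$Price <- NA                                        # filled in once predictions are made
training_sorted_predict <- rbind(training_sorted, future)
rownames(training_sorted_predict) <- NULL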

The data is scaled on the ID and residual (STDE) fields, which will be the input and output of the neural network respectively.

# STEP 1: SCALE THE RESIDUAL DATA TO THE 0-1 RANGE
data <- training_sorted_predict[,c("ID","STDE")]

maxs <- apply(data, 2, max)
mins <- apply(data, 2, min)

scaled <- as.data.frame(scale(data, center=as.numeric(mins), scale=maxs-mins))

training_sorted_predict$STDE_scaled <-  scaled$STDE
training_sorted_predict$ID_scaled <- scaled$ID

g3 = ggplot(training_sorted_predict, aes(x=Date, y=STDE_scaled))
g3 = g3 + geom_point(size=1, colour="black") + geom_smooth()
g3 = g3 + xlab("DATE") + ylab("USD - INR EXCHANGE RATE RESIDUAL SCALED")
g3

##SPLIT DATA SET INTO TRAINING AND TESTING DATA
inTrain <- createDataPartition(training_sorted$ID, p=0.76, list=FALSE)
myTraining <- training_sorted_predict[inTrain, ]
myTesting <- training_sorted_predict[-inTrain, ]
# keep only observed rows in the test set (exclude the 100 padding rows)
myTesting <- myTesting[myTesting$ID<=nrow(training_sorted),]
train <- myTraining[, c("STDE_scaled","ID_scaled")]
test <- myTesting[, c("STDE_scaled","ID_scaled")]

The neural network is trained on this training data using the neuralnet package in R.

library(neuralnet)
n <- names(train)
f <- as.formula(paste("as.matrix(STDE_scaled)~",paste(n[!n %in% "STDE_scaled"],collapse="+")))
# two hidden layers with 3 and 2 neurons; linear output for regression
nn <- neuralnet(f, data=train, hidden=c(3,2), linear.output=T)
# predict the scaled residuals for the test inputs
pr.nn <- compute(nn,test[,c(2)])

The mean squared error (MSE) of this prediction is quite low. Cross validation with different random splits of the data set gives an MSE in the range of 2-4.

##mean square error of neural network on testing data
pr.nn_ <- pr.nn$net.result*(max(training_sorted_predict$STDE)-min(training_sorted_predict$STDE))+min(training_sorted_predict$STDE)
test.r <- (test$STDE_scaled)*(max(training_sorted_predict$STDE)-min(training_sorted_predict$STDE))+min(training_sorted_predict$STDE)
MSE.nn1 <- sum((test.r - pr.nn_)^2/nrow(test))
print(paste(MSE.nn1))
## [1] "2.08943674034446"

The exchange rate for the next 100 days is predicted by combining the bias (long term trend) from the linear regression model with the shorter term prediction from the neural network. The result is close to what is expected for out-of-sample data.

# predict the scaled residuals for the 100 future IDs and un-scale them
pr.nn2 <- compute(nn,training_sorted_predict[training_sorted_predict$ID>nrow(training_sorted),c("ID_scaled")])
pr.nn2_unscaled <- pr.nn2$net.result*(max(training_sorted_predict$STDE)-min(training_sorted_predict$STDE))+min(training_sorted_predict$STDE)

# add the long term trend from the linear model to the predicted residuals
new <- data.frame(ID = training_sorted_predict[training_sorted_predict$ID > nrow(training_sorted),c("ID")])
predicted_linear <- predict(lm1, new, se.fit=TRUE)
predicted_net <- predicted_linear$fit + (pr.nn2_unscaled)
predicted_net
##               [,1]
##   [1,] 63.34397157
##   [2,] 63.31855141
##   [3,] 63.29319293
##   [4,] 63.26789801
##   [5,] 63.24266855
##   [6,] 63.21750641
##   [7,] 63.19241345
##   [8,] 63.16739150
##   [9,] 63.14244236
##  [10,] 63.11756785
##  [11,] 63.09276972
##  [12,] 63.06804975
##  [13,] 63.04340966
##  [14,] 63.01885118
##  [15,] 62.99437601
##  [16,] 62.96998581
##  [17,] 62.94568226
##  [18,] 62.92146698
##  [19,] 62.89734159
##  [20,] 62.87330768
##  [21,] 62.84936684
##  [22,] 62.82552060
##  [23,] 62.80177050
##  [24,] 62.77811804
##  [25,] 62.75456472
##  [26,] 62.73111199
##  [27,] 62.70776130
##  [28,] 62.68451406
##  [29,] 62.66137167
##  [30,] 62.63833551
##  [31,] 62.61540691
##  [32,] 62.59258722
##  [33,] 62.56987772
##  [34,] 62.54727972
##  [35,] 62.52479445
##  [36,] 62.50242316
##  [37,] 62.48016706
##  [38,] 62.45802733
##  [39,] 62.43600514
##  [40,] 62.41410163
##  [41,] 62.39231792
##  [42,] 62.37065510
##  [43,] 62.34911425
##  [44,] 62.32769640
##  [45,] 62.30640259
##  [46,] 62.28523381
##  [47,] 62.26419105
##  [48,] 62.24327524
##  [49,] 62.22248734
##  [50,] 62.20182824
##  [51,] 62.18129883
##  [52,] 62.16089996
##  [53,] 62.14063248
##  [54,] 62.12049721
##  [55,] 62.10049493
##  [56,] 62.08062642
##  [57,] 62.06089241
##  [58,] 62.04129365
##  [59,] 62.02183082
##  [60,] 62.00250461
##  [61,] 61.98331568
##  [62,] 61.96426467
##  [63,] 61.94535218
##  [64,] 61.92657880
##  [65,] 61.90794512
##  [66,] 61.88945168
##  [67,] 61.87109900
##  [68,] 61.85288760
##  [69,] 61.83481796
##  [70,] 61.81689054
##  [71,] 61.79910579
##  [72,] 61.78146413
##  [73,] 61.76396597
##  [74,] 61.74661169
##  [75,] 61.72940165
##  [76,] 61.71233620
##  [77,] 61.69541566
##  [78,] 61.67864034
##  [79,] 61.66201052
##  [80,] 61.64552647
##  [81,] 61.62918844
##  [82,] 61.61299665
##  [83,] 61.59695133
##  [84,] 61.58105266
##  [85,] 61.56530082
##  [86,] 61.54969596
##  [87,] 61.53423823
##  [88,] 61.51892776
##  [89,] 61.50376463
##  [90,] 61.48874896
##  [91,] 61.47388080
##  [92,] 61.45916022
##  [93,] 61.44458725
##  [94,] 61.43016193
##  [95,] 61.41588425
##  [96,] 61.40175421
##  [97,] 61.38777180
##  [98,] 61.37393696
##  [99,] 61.36024966
## [100,] 61.34670983
# write the model 1 predictions into the 100 padding rows
training_sorted_predict[1997:2096,c("Price")] <- predicted_net

g7 = ggplot(training_sorted_predict, aes(x=ID, y=Price))
g7 = g7 + geom_point(size=1, colour="black") + geom_smooth()
g7 = g7 + xlab("ID") + ylab("USD - INR EXCHANGE RATE PRICE")
g7

The last 100 points in the above figure are the predicted values from Model 1.

Model 2: A neural network is used to predict the exchange rate time series directly, without removing the bias (long term trend).

Here, the in-sample error is comparable to what we see for Model 1.

##MODEL THAT DOES NOT USE LINEAR REGRESSION TO REMOVE THE BIAS

# STEP 1: SCALE THE PRICE DATA TO THE 0-1 RANGE

data <- training_sorted_predict[,c("ID","Price")]

maxs <- apply(data, 2, max)
mins <- apply(data, 2, min)

scaled <- as.data.frame(scale(data, center=as.numeric(mins), scale=maxs-mins))

training_sorted_predict$Price_scaled <-  scaled$Price
training_sorted_predict$ID_scaled <- scaled$ID

g3 = ggplot(training_sorted_predict, aes(x=Date, y=Price_scaled))
g3 = g3 + geom_point(size=1, colour="black") + geom_smooth()
g3 = g3 + xlab("DATE") + ylab("USD - INR EXCHANGE RATE PRICE SCALED")
g3

##SPLIT DATA SET INTO TRAINING AND TESTING DATA
inTrain <- createDataPartition(training_sorted$ID, p=0.6, list=FALSE)
myTraining <- training_sorted_predict[inTrain, ]
myTesting <- training_sorted_predict[-inTrain, ]
# keep only observed rows in the test set (exclude the 100 padding rows)
myTesting <- myTesting[myTesting$ID<=nrow(training_sorted),]
dim(myTraining); dim(myTesting)
## [1] 1200   12
## [1] 796  12
##TRAIN THE NEURAL NETWORK USING TRAINING DATA
train <- myTraining[, c("Price_scaled","ID_scaled")]
test <- myTesting[, c("Price_scaled","ID_scaled")]

library(neuralnet)
n <- names(train)
f <- as.formula(paste("as.matrix(Price_scaled)~",paste(n[!n %in% "Price_scaled"],collapse="+")))
# same architecture as model 1: two hidden layers with 3 and 2 neurons
nn <- neuralnet(f, data=train, hidden=c(3,2), linear.output=T)

plot(nn)
# predict the scaled prices for the test inputs
pr.nn <- compute(nn,test[,c(2)])
##mean square error of neural network on testing data
pr.nn_ <- pr.nn$net.result*(max(training_sorted_predict$Price)-min(training_sorted_predict$Price))+min(training_sorted_predict$Price)
test.r <- (test$Price_scaled)*(max(training_sorted_predict$Price)-min(training_sorted_predict$Price))+min(training_sorted_predict$Price)
MSE.nn1 <- sum((test.r - pr.nn_)^2/nrow(test))
print(paste(MSE.nn1))
## [1] "3.22408915853453"

However, the out-of-sample predictions are very far from the actual values.

##predict values for the next 100 days and un-scale them
pr.nn2 <- compute(nn,training_sorted_predict[training_sorted_predict$ID>nrow(training_sorted),c("ID_scaled")])
pr.nn2_unscaled <- pr.nn2$net.result*(max(training_sorted_predict$Price)-min(training_sorted_predict$Price))+min(training_sorted_predict$Price)

predicted_net <- (pr.nn2_unscaled)
predicted_net
##               [,1]
##   [1,] 66.73814559
##   [2,] 66.73964219
##   [3,] 66.74113415
##   [4,] 66.74262150
##   [5,] 66.74410423
##   [6,] 66.74558237
##   [7,] 66.74705593
##   [8,] 66.74852492
##   [9,] 66.74998937
##  [10,] 66.75144927
##  [11,] 66.75290465
##  [12,] 66.75435552
##  [13,] 66.75580189
##  [14,] 66.75724379
##  [15,] 66.75868121
##  [16,] 66.76011417
##  [17,] 66.76154270
##  [18,] 66.76296679
##  [19,] 66.76438647
##  [20,] 66.76580175
##  [21,] 66.76721264
##  [22,] 66.76861915
##  [23,] 66.77002131
##  [24,] 66.77141911
##  [25,] 66.77281259
##  [26,] 66.77420174
##  [27,] 66.77558658
##  [28,] 66.77696712
##  [29,] 66.77834339
##  [30,] 66.77971538
##  [31,] 66.78108312
##  [32,] 66.78244662
##  [33,] 66.78380589
##  [34,] 66.78516094
##  [35,] 66.78651178
##  [36,] 66.78785844
##  [37,] 66.78920092
##  [38,] 66.79053923
##  [39,] 66.79187339
##  [40,] 66.79320340
##  [41,] 66.79452930
##  [42,] 66.79585107
##  [43,] 66.79716875
##  [44,] 66.79848233
##  [45,] 66.79979184
##  [46,] 66.80109728
##  [47,] 66.80239867
##  [48,] 66.80369603
##  [49,] 66.80498935
##  [50,] 66.80627866
##  [51,] 66.80756397
##  [52,] 66.80884528
##  [53,] 66.81012262
##  [54,] 66.81139600
##  [55,] 66.81266542
##  [56,] 66.81393089
##  [57,] 66.81519244
##  [58,] 66.81645007
##  [59,] 66.81770380
##  [60,] 66.81895363
##  [61,] 66.82019959
##  [62,] 66.82144167
##  [63,] 66.82267989
##  [64,] 66.82391427
##  [65,] 66.82514482
##  [66,] 66.82637154
##  [67,] 66.82759446
##  [68,] 66.82881357
##  [69,] 66.83002890
##  [70,] 66.83124046
##  [71,] 66.83244825
##  [72,] 66.83365229
##  [73,] 66.83485259
##  [74,] 66.83604916
##  [75,] 66.83724201
##  [76,] 66.83843115
##  [77,] 66.83961661
##  [78,] 66.84079838
##  [79,] 66.84197647
##  [80,] 66.84315091
##  [81,] 66.84432170
##  [82,] 66.84548885
##  [83,] 66.84665237
##  [84,] 66.84781228
##  [85,] 66.84896858
##  [86,] 66.85012130
##  [87,] 66.85127043
##  [88,] 66.85241598
##  [89,] 66.85355798
##  [90,] 66.85469643
##  [91,] 66.85583134
##  [92,] 66.85696273
##  [93,] 66.85809059
##  [94,] 66.85921496
##  [95,] 66.86033583
##  [96,] 66.86145321
##  [97,] 66.86256712
##  [98,] 66.86367757
##  [99,] 66.86478457
## [100,] 66.86588813
# write the model 2 predictions into the 100 padding rows
training_sorted_predict[1997:2096,c("Price")] <- predicted_net

g7 = ggplot(training_sorted_predict, aes(x=ID, y=Price))
g7 = g7 + geom_point(size=1, colour="black") + geom_smooth()
g7 = g7 + xlab("ID") + ylab("USD - INR EXCHANGE RATE PRICE")
g7

The last 100 values in the above chart are the predicted values from Model 2. As can be seen, there is a discontinuity between the observed data and the out-of-sample values for Model 2 which is not present for Model 1.
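The size of the discontinuity can be checked directly by comparing the last observed price with the first forecast value, as in the small sketch below (it uses the variables defined above; the same check could be run for Model 1 right after its prediction step).

# quantify the jump at the boundary between observed data and the model 2 forecast
last_observed <- tail(training_sorted$Price, 1)    # last actual exchange rate
jump_model2 <- predicted_net[1] - last_observed    # gap at the first forecast point
jump_model2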

Thus we can conclude that using a linear regression model to remove the bias in the input data before training the neural network improves the out-of-sample accuracy of the neural network's predictions.