The kyat is the national currency of Myanmar. The Central Bank of Myanmar sets one daily exchange rate for each of 38 currencies, though on some days it does not issue a new rate.
The country has no real credit system. I once helped my friend’s father make a down payment on some real estate… We borrowed seven garbage bags full of cash from two of the larger banks and literally carried them down the street to a third bank, which issued what might be called a mortgage.
All imports are purchased with USD, exchanged at official money changers, big banks, or the black market for gold and cash. Most people use a combination of these, and the rates generally average around the official ones.
library(dplyr)
library(zoo)
We use na.locf() from “zoo” to replace missing values (NA) with the previous non-missing value, since trading generally continues even when a new rate has not been released.
# Download the rates and carry the last observed value forward over gaps
rates <- read.csv(
  "https://raw.githubusercontent.com/TheWerefriend/exchange-rate-prediction/master/rates.csv") %>%
  na.locf()
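# Rename the column that read.csv() labelled "X" to "date"
colnames(rates)[colnames(rates) == "X"] <- "date"
# Convert the whole column; date[[1]] would recycle the first date into every row
rates <- mutate(rates, date = as.Date(date))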
str(rates)
## 'data.frame': 3113 obs. of 39 variables:
## $ date: Date, format: "2012-07-02" "2012-07-02" ...
## $ AUD : num 898 903 908 907 906 ...
## $ BDT : num 10.7 10.7 10.7 10.8 10.8 ...
## $ BND : num 686 694 696 699 698 ...
## $ BRL : num 436 439 443 438 436 ...
## $ CAD : num 862 866 870 872 870 ...
## $ CHF : num 923 922 924 921 910 ...
## $ CNY : num 138 139 139 139 139 ...
## $ CZK : num 43.6 43.5 43.5 43.5 43.1 ...
## $ DKK : num 150 149 149 149 148 ...
## $ EGP : num 146 146 145 146 146 ...
## $ EUR : num 1109 1107 1110 1107 1094 ...
## $ GBP : num 1374 1381 1382 1377 1371 ...
## $ HKD : num 113 113 114 114 114 ...
## $ IDR : num 9.35 9.38 9.42 9.44 9.4 ...
## $ ILS : num 225 225 225 225 225 ...
## $ INR : num 15.8 15.9 16.3 16.2 16.1 ...
## $ JPY : num 1100 1105 1106 1104 1105 ...
## $ KES : num 10.4 10.5 10.5 10.5 10.5 ...
## $ KHR : num 21.5 21.5 21.5 21.6 21.5 ...
## $ KRW : num 76.8 77 77.7 77.7 77.6 ...
## $ KWD : num 3136 3142 3144 3150 3145 ...
## $ LAK : num 11 11 11 11 11 ...
## $ LKR : num 6.59 6.6 6.61 6.61 6.6 ...
## $ MYR : num 278 278 280 280 279 ...
## $ NOK : num 147 147 148 148 147 ...
## $ NPR : num 10.3 10.2 10.2 10.3 10.2 ...
## $ NZD : num 703 707 709 710 712 ...
## $ PHP : num 20.9 21 21.1 21.2 21.1 ...
## $ PKR : num 9.28 9.3 9.31 9.34 9.36 ...
## $ RSD : num 9.6 9.61 9.63 9.64 9.53 ...
## $ RUB : num 27.1 27.1 27.2 27.4 27.2 ...
## $ SAR : num 234 235 235 235 235 ...
## $ SEK : num 127 127 127 127 127 ...
## $ SGD : num 692 694 697 698 697 ...
## $ THB : num 27.8 27.8 28 28 27.9 ...
## $ USD : num 878 880 881 883 883 883 883 883 884 885 ...
## $ VND : num 4.23 4.22 4.21 4.23 4.22 ...
## $ ZAR : num 108 108 108 109 108 ...
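To make the gap-filling step concrete, here is a minimal sketch of how na.locf() behaves on a made-up vector (the numbers are invented, not taken from the real rates). One detail worth knowing: by default it drops leading NAs that have no earlier value to carry forward, while na.rm = FALSE keeps them in place.

```r
library(zoo)

# Toy series with gaps; values are invented for illustration
toy <- c(NA, 878, NA, NA, 881, NA)

# Default (na.rm = TRUE): the leading NA is dropped,
# later gaps take the last observed value
filled_default <- na.locf(toy)

# na.rm = FALSE keeps the original length and leaves the leading NA alone
filled_keep <- na.locf(toy, na.rm = FALSE)

print(filled_default)
print(filled_keep)
```

If the first row of a real series were missing, the default would shorten that series, so it is worth checking row counts after filling.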
Every rate is treated as an independent variable, and the target variable is the “USD” rate a fixed number of days later (the lag). The date column is dropped.
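# Pair each day's rates (x) with the USD rate `days` days later (y)
xAndY <- function(days = 1, input = rates) {
  # Columns 2:39 are the 38 rates; column 1 is the date
  x <- input[1:(nrow(input) - days), 2:39]
  return(mutate(x, y = input[(days + 1):nrow(input), "USD"]))
}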
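# Chronological split: the first `ratio` share of rows trains, the rest tests
dataSplit <- function(data, ratio = 0.7) {
  # Whole row count; a fractional n would drop the final test row
  n <- floor(nrow(data) * ratio)
  train <- data[1:n, ]
  test <- data[(n + 1):nrow(data), ]
  return(list(train, test))
}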
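To check that the lag and the split line up the way I expect, here is a small sketch on an invented ten-day frame. The helpers make_lagged() and split_chrono() are hypothetical stand-ins written in base R for this illustration; they mirror the idea of the functions above but are not the same code.

```r
# Toy frame: 10 consecutive "days"; values are invented for illustration
toy <- data.frame(
  date = as.Date("2020-01-01") + 0:9,
  EUR  = seq(1000, 1090, by = 10),
  USD  = seq(900, 990, by = 10)
)

# Hypothetical helper: drop the date column and attach the USD value
# observed `days` rows later as the target y
make_lagged <- function(input, days = 1) {
  n <- nrow(input)
  x <- input[seq_len(n - days), setdiff(names(input), "date"), drop = FALSE]
  x$y <- input$USD[(days + 1):n]
  x
}

# Hypothetical helper: chronological split with a whole-number cut point
split_chrono <- function(data, ratio = 0.7) {
  n <- floor(nrow(data) * ratio)
  list(train = data[seq_len(n), ], test = data[(n + 1):nrow(data), ])
}

lagged <- make_lagged(toy, days = 2)
parts  <- split_chrono(lagged)

print(lagged)
print(nrow(parts$train))
print(nrow(parts$test))
```

USD rises by 10 per day in the toy data, so with a two-day lag every y should sit exactly 20 above the USD value in the same row. The split keeps time order, so the test rows all come after the training rows.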
Fit a linear model on the training split to predict the USD rate five days ahead, and check a summary of what happened.
data <- xAndY(5) %>% dataSplit()
model <- lm(formula = y ~ ., data = data[[1]])
summary(model)
##
## Call:
## lm(formula = y ~ ., data = data[[1]])
##
## Residuals:
## Min 1Q Median 3Q Max
## -971.20 -6.27 0.14 7.40 254.67
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 731.333974 20.645845 35.423 < 2e-16 ***
## AUD 0.151422 0.050567 2.994 0.00278 **
## BDT 9.558132 7.026031 1.360 0.17385
## BND 0.019637 0.045605 0.431 0.66681
## BRL 0.116824 0.066048 1.769 0.07707 .
## CAD -0.032364 0.047783 -0.677 0.49828
## CHF -0.055661 0.034690 -1.605 0.10874
## CNY 0.751970 0.542076 1.387 0.16552
## CZK 4.882774 1.159796 4.210 2.66e-05 ***
## DKK -0.515311 0.658887 -0.782 0.43425
## EGP 0.129335 0.084175 1.536 0.12456
## EUR 0.115569 0.087683 1.318 0.18763
## GBP 0.028130 0.019845 1.417 0.15649
## HKD 1.621013 4.192098 0.387 0.69903
## IDR 4.887404 4.319901 1.131 0.25803
## ILS 0.945622 0.161575 5.853 5.59e-09 ***
## INR 3.736317 6.761463 0.553 0.58060
## JPY -0.085245 0.027454 -3.105 0.00193 **
## KES -27.797521 5.643598 -4.925 9.06e-07 ***
## KHR -8.970387 2.949512 -3.041 0.00238 **
## KRW -0.002126 0.465226 -0.005 0.99636
## KWD -0.261693 0.065146 -4.017 6.10e-05 ***
## LAK 45.754378 10.858411 4.214 2.62e-05 ***
## LKR -11.895599 8.834494 -1.346 0.17829
## MYR -0.228767 0.114190 -2.003 0.04526 *
## NOK -1.326378 0.288748 -4.594 4.61e-06 ***
## NPR 3.385088 10.729146 0.316 0.75241
## NZD -0.038123 0.043031 -0.886 0.37575
## PHP -25.360148 2.719297 -9.326 < 2e-16 ***
## PKR -11.224259 4.658354 -2.409 0.01606 *
## RSD 12.178645 7.053832 1.727 0.08440 .
## RUB 1.273497 0.706271 1.803 0.07151 .
## SAR 0.811575 2.376808 0.341 0.73279
## SEK -1.910379 0.336862 -5.671 1.61e-08 ***
## SGD -0.496608 0.161564 -3.074 0.00214 **
## THB -12.554782 2.078591 -6.040 1.81e-09 ***
## USD 1.706139 0.820625 2.079 0.03773 *
## VND -20.844373 18.593320 -1.121 0.26238
## ZAR 0.464775 0.214310 2.169 0.03022 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.41 on 2136 degrees of freedom
## Multiple R-squared: 0.9828, Adjusted R-squared: 0.9825
## F-statistic: 3215 on 38 and 2136 DF, p-value: < 2.2e-16
According to this summary, the currencies whose coefficients are not statistically significant (Pr(>|t|) above 0.1, so no significance marker) are:
BDT, BND, CAD, CHF, CNY, DKK, EGP, EUR, GBP, HKD, IDR, INR, KRW, LKR, NPR, NZD, SAR, VND
These are mostly the currencies with free-floating values, realistic inflationary or monetary targets, and the highest volumes of exchange with the kyat. It kinda makes sense: if they tend to move together, then once the others are in the model each one adds little information of its own.
Oddly enough, USD, the very rate we are predicting, is only weakly significant (p ≈ 0.04). Maybe this means we should drop the non-significant subset from the inputs? I’m not sure.
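One caveat when reading the Pr(>|t|) column: every rate here is priced in kyat, so the columns plausibly move together, and correlated inputs inflate standard errors. The sketch below uses synthetic data (not the real rates) to show the effect. The variance inflation factor is computed by hand as 1 / (1 - R²), and all variable names are made up for the illustration.

```r
set.seed(1)

n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)   # nearly a copy of x1
y  <- x1 + x2 + rnorm(n)         # both inputs truly matter

fit_joint  <- lm(y ~ x1 + x2)    # both correlated inputs together
fit_single <- lm(y ~ x1)         # one input on its own

# Standard errors and p-values live in the coefficient table of summary()
tab_joint  <- coef(summary(fit_joint))
tab_single <- coef(summary(fit_single))

se_joint  <- tab_joint["x1", "Std. Error"]
se_single <- tab_single["x1", "Std. Error"]

# Variance inflation factor for x1: 1 / (1 - R^2), where R^2 comes
# from regressing x1 on the other predictor
r2_x1  <- summary(lm(x1 ~ x2))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

print(tab_joint[, c("Std. Error", "Pr(>|t|)")])
print(c(se_joint = se_joint, se_single = se_single, vif_x1 = vif_x1))
```

The standard error of x1 is far larger in the joint model even though x1 genuinely drives y. So a high p-value in a model with 38 related rates does not by itself show that a currency is irrelevant.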
pred <- predict.lm(model, data[[2]])
mean(pred - data[[2]][,"y"])
## [1] -72.64751
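The mean difference between the predicted and actual values (all at the temporal end of the dataset) is about -72.6 kyats, so the model underpredicts by roughly 72 kyats on average, or about 5% in terms of the most recent 75 data points. This means (I suppose) that the rate of inflation increases closer to the present time. Because positive and negative errors cancel in a signed mean, this number measures bias rather than the typical size of an error. I do not yet know a better way to do this in R! Andy, please shed some light.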
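One possible answer is to report a few standard error metrics next to the signed mean. This is a minimal sketch on invented numbers, chosen so that the signed errors cancel exactly; it is not a result from the model above.

```r
# Invented predictions and actual values, chosen so the errors cancel
pred   <- c(100, 120, 130, 150)
actual <- c(110, 110, 140, 140)

err <- pred - actual

bias <- mean(err)                       # signed mean: errors of opposite sign cancel
mae  <- mean(abs(err))                  # mean absolute error
rmse <- sqrt(mean(err^2))               # root mean squared error, penalises large misses
mape <- mean(abs(err) / actual) * 100   # mean absolute percentage error

print(c(bias = bias, mae = mae, rmse = rmse, mape = mape))
```

Here the bias is 0 while MAE and RMSE are both 10, so the signed mean alone would hide every miss. For the model above, the same lines should work with pred as computed and actual <- data[[2]][, "y"].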
Removing the currencies whose coefficients are not significant and introducing some higher-order variables would probably make this model stronger. This dataset spans multiple authoritarian regimes (rumored to set rates by astrology), and the exchange rate changed more slowly ten years ago, so perhaps we can introduce the dates of various political shifts as independent variables as well.
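As a rough sketch of what those two ideas could look like in lm(): a squared term via I() and a 0/1 indicator that switches on after a cutoff date. Everything below is invented and noise-free so the fit can be checked against known coefficients. In particular, shift_date is a placeholder and does not correspond to any real political event.

```r
# Invented, noise-free toy data: 20 days with a made-up "regime change" on day 11
toy <- data.frame(
  date = as.Date("2020-01-01") + 0:19,
  x    = 1:20
)

# Placeholder cutoff, NOT a real political event
shift_date <- as.Date("2020-01-11")
toy$after_shift <- as.integer(toy$date >= shift_date)

# Target built from known coefficients so the fit can be checked
toy$y <- 2 + 0.5 * toy$x + 0.1 * toy$x^2 + 30 * toy$after_shift

# I(x^2) adds a second-order term; after_shift moves the intercept
# for all rows on or after the cutoff
fit <- lm(y ~ x + I(x^2) + after_shift, data = toy)

print(round(coef(fit), 3))
```

With real rates the indicator would be built the same way from the date column, one column per political shift, though whether it helps is an open question.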