ABSTRACT

The following analysis uses simple Data Science techniques to explore, analyze, and measure the relationship between the drop of international oil prices and the steep devaluation of the Colombian Peso against the US dollar between the months of September 2014 and October 2015.

It has been a known fact by economist that such a relationship always existed since oil exports represent a large percentage of Colombian GDP. But the exact analysis and correlation of variables is little known.

Oil drilling in Colombia

INTRODUCTION

It was in August of 2014 that I first heard from several executives of the Colombian oil industry that the new techniques in fracking and the lower cost of shale oil drilling would have a tremendous impact on the capacity of the national oil industry to generate sales, thus kick starting the devaluation of the Colombian peso in lieu of the smaller influx of US dollars into the national reserves.

Despite having this knowledge, and despite warning many of my customers in the shoe industry, no one really saw the devaluation coming until the first days of February. Why? The US dollar usually had its peaks around the vacation period compromised between the second half of December and the third week of January. Used to years of a relatively cheap dollar currency, Colombian families would spend the year-end outside their country and demand a bigger quantity of US currency, only to return before February, when the peak would quiet and the Forex would return to normal.

But in 2015 the currency exchange rate never returned to normal. Without the usual income from oil sales, now affected by lower prices from shale oil and excess of reserves, Colombian traders began to find aggregate demand for US dollars in a market where the American currency began to grow scarce.

For the athletic shoe industry this effect had negative implications. Customers who purchased and shipped goods prior to November 2014 and the Christmas season, began defaulting payments. Most of them had worked their cost accounting based on a US dollar trading between 1,850 to 1,900 pesos. When the debts matured around January and February 2015, customers preferred to speculate and default, with the intent of avoiding an exchange rate of COP 2,150 and working some sort of payment agreement when the exchanged declined.

The exchange never declined and the Colombian peso continued to devaluate in free fall. The Colombian government - anxious to please national exporters of coffee and other raw materials - purposely stated that would use no national reserves to avoid the exchange rate ascension. For President’s Santos cash-strapped office, a higher currency exchange rate probably meant more Colombian pesos per barrel of oil to meet an ever-increasing agenda of social and infrastructure programs. Up-to-date, the State has kept the same monetary policy position.

The athletic shoe industry took a severe blow from decreasing product importation into Colombia. Because of the Colombian peso devaluation, price adjusted goods now suffered a price increase relatively equivalent to the rise of the US dollar. When the devaluation reached the 50% mark, shoes were re-liquidated at the new currency level with price increments that were above what the consumer was used - or prepared - to pay. Some analyst predict that inflation will force the Colombian government to adjust minimum wages to compensate for consumers losing purchasing power. But political analysts doubt the amount of minimum wage adjustments will be even close to the level of devaluation and inflation of imported goods.

MEASURING CORRELATION OF INTERNATIONAL OIL PRICES TO THE EXCHANGE RATE OF THE COLOMBIAN PESO

Everybody in the industry had a rough idea of the relationship between falling international oil prices and the currency peg. But nobody in the industry had actually measured mathematically said relationship, let alone establish some sort of predictive model to estimate future pricing of the Colombian Peso. The need to correctly measure such relationship corresponded to defend the position of our particular company of reducing offer in light of diminishing demand. If the strategy would seem rather obvious, as in all things corporate, such reduction of offer would negatively affect volume forecasts for the years 2015 through 2017, something all executives were trying to avoid even if that meant refuting all but the most strict and mathematically based model.

Creating a Data Set for Measuring the Correlation

It is estimated that 70% to 80% of a time spent by a Data Science is wrangling and cleaning data. This small essay is probably the purest proof to such statement. Finding accurate and readily available data for the daily quotes of WTI and Brent oil was rather simple. Finding historical and accurate information for the official exchange rate of the US dollar in Colombian Pesos was a little more complicated but not so strenuously so that it couldn’t be achieved in a few Google searches. Putting all the information together in a simple file format that could be easily manipulated in R was probably the only hard part of the exercise. Each data source had its preferred date format, and for some strange reason none coincided. The solution was to web scrape the information from several pages, clean them in Excel, and build one unique table using vertical look-up formulas to later export to CSV format.

The price of the US dollar in Colombian Pesos is known as the TRM (Tasa de Referencia Bancaria or Banking Reference Rate). After some data wrangling, the final file had precise and accurate information on daily quotes of the WTI, Brent and corresponding TRM. If the relationship truly existed, it would be easy to detect and visualize from the resulting data set.

Two different oil prices were used for the exercise: WTI and Brent.

Reading and Preparing the Data Set

The first step towards the explorative data analysis of the information was reading the required libraries for the analysis and studying the different vectors of the data set to get an idea of each particular behavior.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.1.3
library(reshape2)
library(Hmisc)
## Warning: package 'Hmisc' was built under R version 3.1.3
## Loading required package: grid
## Loading required package: lattice
## Warning: package 'lattice' was built under R version 3.1.3
## Loading required package: survival
## Warning: package 'survival' was built under R version 3.1.3
## Loading required package: Formula
## Warning: package 'Formula' was built under R version 3.1.3
## 
## Attaching package: 'Hmisc'
## 
## The following objects are masked from 'package:base':
## 
##     format.pval, round.POSIXt, trunc.POSIXt, units
#setwd("/home/user/project/main")
oilvscop = read.csv("oilvscop_wti_brent_numeric.csv")
oilvscop$date <- as.Date(oilvscop$date, origin = "1899-12-30")

It was mentioned before that the date values were probably the hardest part of putting together the data set. Information on quotes was scrapped from websites, and the resulting date formats did not play together well. A plethora of visualizations made little sense and the resulting charts were counter intuitive to the trained eye until the dates were finally corrected. The author found the best way to normalize dates was to export to Excel and then cast the dates into simple numeric format (i.e. turn 2014-09-01 to 41,883.) When exported to numeric format in such a way, it’s much easier to read the file into an R data-frame with the syntax oilvscop$date <- as.Date(oilvscop$date, origin = "1899-12-30"). This command provides R with a column (in this case date) and tells to treat the numeric value as a date with an origin point. R understands the translation easy enough to make the rest of the analysis hassle-free.

It is interesting to get a first hand idea of the data before any more profound analysis.

summary(oilvscop)
##       date                 wti            forex          brent       
##  Min.   :2014-09-01   Min.   :38.22   Min.   :1868   Min.   : 42.23  
##  1st Qu.:2014-11-29   1st Qu.:48.83   1st Qu.:2206   1st Qu.: 54.85  
##  Median :2015-03-02   Median :56.25   Median :2427   Median : 60.94  
##  Mean   :2015-03-01   Mean   :60.59   Mean   :2434   Mean   : 64.90  
##  3rd Qu.:2015-06-02   3rd Qu.:67.30   3rd Qu.:2576   3rd Qu.: 69.78  
##  Max.   :2015-09-02   Max.   :97.86   Max.   :3263   Max.   :102.56
tail(oilvscop, 20)
##           date   wti   forex brent
## 246 2015-08-06 44.69 2938.00 48.88
## 247 2015-08-07 43.87 2955.31 48.42
## 248 2015-08-10 44.94 2909.00 48.24
## 249 2015-08-11 43.11 2943.30 48.61
## 250 2015-08-12 43.22 2940.00 48.72
## 251 2015-08-13 42.27 2982.50 48.85
## 252 2015-08-14 42.45 2990.30 48.65
## 253 2015-08-17 41.93 2983.12 48.35
## 254 2015-08-18 42.58 3003.40 48.25
## 255 2015-08-19 40.75 3024.55 46.78
## 256 2015-08-20 41.00 3061.10 46.03
## 257 2015-08-21 40.45 3101.00 45.07
## 258 2015-08-24 38.22 3238.50 42.23
## 259 2015-08-25 39.15 3207.10 42.68
## 260 2015-08-26 38.50 3262.90 42.81
## 261 2015-08-27 42.47 3162.00 43.70
## 262 2015-08-28 45.29 3075.40 46.61
## 263 2015-08-31 49.20 3055.00 48.25
## 264 2015-09-01 45.41 3113.95 48.47
## 265 2015-09-02 46.25 3171.00 47.74
names(oilvscop)
## [1] "date"  "wti"   "forex" "brent"

There are several things we can see right away from the data set.

  1. The data set had data ranging from September 1, 2014 all the way to September 2, 2015. This is exactly one year of devaluation crisis, but the author is making efforts to updating the information on a monthly basis, since it is a great data-set of financial history in the making and lends itself for further predictive model analysis.
  2. The data includes the Forex (variable name given to the TRM or exchange rate, or value of the US dollar in Colombian pesos.) The exchange rate began the analysis at around COP 1,860 to reach its maximum price of COP 3,263 at the crisis worst point.
  3. The price of the WTI barrel is included, its fluctuation going from USD 97.86 at its climax and dropping to USD 38.22 at the bottom.
  4. The price of the Brent barrel is also included. It also fluctuates from USD 102.56 at its peak and dropping to USD 42.23 at its lowest point. The price of the Brent barrel was included at a later point of the correlation model when investigation on the subject turned out Colombian oil exports are determined more by Brent prices as opposed to WTI ones.

Variation of International Oil Prices

For a researcher, especially for a researcher of the fields of economy and international finances, understanding the core problem before attempting to explore the trend of the data and play with visualizations is foremost. But for Data Scientists with the power of R at their hands, visualizing data in the fly to study the problem is both more rewarding and simple.

How far did the prices of oil dropped in a mere 12 months? Every news channel in the US and elsewhere had probably already reported on this, but it makes interesting review one more time to better understand the overall picture.

First we will analyze the behavior of WTI prices in the period.

ggplot(data=oilvscop, aes(x=date, y=wti, group=1)) + geom_line(color="gray") + labs(title="Fluctuation of WTI Oil Prices (per Barril)", x="Date Period", y="USD per Barril")

If fracking and shale oil was the innovation of the industry that changed it all, it certainly changed things for the worst for countries like Colombia, who now saw their income from exports of oil dropped from more than USD 100 a barrel to less than USD 45.

Given that Colombia oil exports are more affected by prices of Brent oil, the analysis of the trend of international prices is even more pressing.

ggplot(data=oilvscop, aes(x=date, y=brent, group=1)) + geom_line(color="blue") + labs(title="Fluctuation of Brent Oil Prices (per Barril)", x="Date Period", y="USD per Barril")

The curves are very similar in trend. We can compare both in the same graph isolating the data sets for WTI and Brent oil prices and plotting them in the same graph. From a R point of view, one of the many possible ways is to use the melt command to group variable lines into the same plot (in this case the price fluctuations of WTI and Brent quotes.)

oilvscop_2 = oilvscop
oilvscop_2$forex <- NULL
oilvscop_2 <- melt(oilvscop_2, id="date")

ggplot(data=oilvscop_2, aes(x=date, y=value, group=variable, colour=variable)) + geom_line() + labs(title="Evolution of WTI and Brent Oil Prices (per Barril)", x="Date Period", y="USD per Barril")

The two lines match-up in pattern a lot, if the price of Brent oil was ever significantly higher. For the remainder of the analysis, we will stop using the WTI price and focus solely on the price per barrel of Brent oil.

Fluctuation of the COP to USD Exchange Rate

While the international price of oil kept falling, the exchange rate for the US dollar in Colombia started to rise, increasing the TRM as can be seen in the following chart.

ggplot(data=oilvscop, aes(x=date, y=forex, group=1)) + geom_line(color="red") + labs(title="Evolution of the Colombian TRM", x="Date Period", y="COP per US dollar")

Fueled by falling prices of oil and diminished income from oil exports, the US currency became scarce and the Colombian peso started to devaluate comparatively against the dollar, left to float freely with no state intervention.

Why did the Colombia government left the dollar to float freely? There are many theories around the issue.

Whatever the reasons behind the monetary policy, the charts show that the functions behind the falling prices of international oil and the rising TRM are indications of some of indirect relationship.

Visualization and Mathematical Calculation of the Correlation

The most succinct way to visualize a correlation between two variables is through a scatter plot, another task easily accomplished through the use of R and the ggplot2 library.

ggplot(data=oilvscop, aes(brent, forex)) + geom_point(color="red") + labs(title="Correlation of Brent International Price and Colombian TRM", x="International Price Brent Oil (per Barril)", y="TRM Valuation (in COP)")

If not perfect, the visualization of both variables in a scatter plot point to an interesting correlation with a strong negative trend. The trend line is more evident if one were to take the data points below the U$70 per barrel mark. At that point in time, when Brent oil was selling at or less than U$70 per barrel, the market had already decided on the destiny and devaluation of the Colombian peso, accelerating its rise against the US dollar.

We have mentioned that the plot visualization is that of a strong negative trend. But how strong? Using Pearson’s Correlation formula, we can calculate:

varcorr = cor(oilvscop$brent, oilvscop$forex)
cat("Correlation result among Brent and COP: ", varcorr)
## Correlation result among Brent and COP:  -0.8404745

If not perfect, the visualization of both variables in a scatter plot point to an interesting correlation with a strong negative trend. The result, a correlation value of -0.84, is a strong correlation indication and one that can give insights into constructing a predictive model with linear regression techniques should all variables remain the same.

The result, a correlation value of -0.84, is a strong correlation indication and one that can give insights into constructing a predictive model with linear regression techniques should all variables remain the same. This is however unlikely, since the already evident inflation in imported goods has called the attention of the Colombian government, and some kind of intervention of the market, either through dirty flotation or similar devices, is sure to take effect soon.

A probable - and very simple - model to fit the correlation can be readily obtained through R if one wanted to saciate the curiosity.

fit <- lm(oilvscop$brent ~ oilvscop$forex)
fit
## 
## Call:
## lm(formula = oilvscop$brent ~ oilvscop$forex)
## 
## Coefficients:
##    (Intercept)  oilvscop$forex  
##      169.39854        -0.04292
summary(fit)
## 
## Call:
## lm(formula = oilvscop$brent ~ oilvscop$forex)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.904  -4.567   1.277   5.232  14.453 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    169.398536   4.184419   40.48   <2e-16 ***
## oilvscop$forex  -0.042924   0.001706  -25.16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.188 on 263 degrees of freedom
## Multiple R-squared:  0.7064, Adjusted R-squared:  0.7053 
## F-statistic: 632.8 on 1 and 263 DF,  p-value: < 2.2e-16

The t-values alone of the correlation are mathematical proof (at -25.16) that one can reject the null hypothesis (which we did not state here and will follow-up with more time and data to support the model) and accept that there is a clear model tieing both variables.

CONCLUSION

Given the availability of such incredible tools for mathematical analysis and prediction, coupled to the also available wealth of data floating freely in the Internet which makes gathering and exploring large sets of data a much simpler task, one is left to wonder if the Colombian government carefully studied the potential case for correlation of oil prices and the TRM, and the resulting inflation effect on the market and subsequent loss of purchasing power from the consumer.

We have shown the existance of a strong correlation between variables, so if the devaluation of the Colombian peso continues, it is not outrageous to think this is not by free design, but rather because policy makers have in mind a bigger plan that involves no intervention to the free flotation of the US dollar.

It is unlikely that the price of oil sees significant increases, not at least in the short term, with the looming threat of fracking techniques just around the corner to estabilize prices (usually towards the lower boundary.) Unless a new variable comes to play, for example a significant replacement for GDP income, the peso will continue to lose parity power against the US dollar, and the emergence of double-digit inflation, could very well be the result for a nation well-used to the availability of imported goods.

The following work is solely based for academic purposes and no claims are made as to the accuracy of the mathematical methods within.

Panama Republic, September 2015 | www.rpubs.com/ameilij | Realeased under Attribution-NonCommercial 3.0 Unported - Creative Commons