Accessing Bitcoin exchanges with R

An implementation in R

Ray Brown

2018-06-07

Introduction

At present, I am interested in the blockchain technology that is the foundation of bitcoin and the possibility of working with some interesting data sets. There is a wealth of historical data located on sites like www.bitstamp.net and www.bitcoincharts.com, and most of it is easily accessible from R with just a little data processing. In this post, I present some code that may be helpful to people who wish to get started working with Bitcoin data in R.

VWAP

Large institutional buyers use VWAP to move into FX and stock positions in a way that does not disturb the natural market dynamics of price. If these buyers were to move into a position all at once, the order would artificially push up the price of the instrument. By buying below the intraday VWAP, they can build a position without too much price disruption.

The volume weighted average price (VWAP) is a trading benchmark. It is calculated by adding up the dollars traded in every transaction (price multiplied by the number of shares traded) and then dividing by the total number of shares traded for the day: \[VWAP = \frac{\sum_{i=1}^{n} Price_i \times Shares_i}{\sum_{i=1}^{n} Shares_i}\] where \(Price_i\) and \(Shares_i\) are the price and size of the \(i\)-th of the day's \(n\) transactions.
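As a small illustration of this calculation, the sketch below computes VWAP in base R for four trades and compares it with the unweighted mean price. The prices and amounts are copied from the first four ticks listed later in this post; the `trades` data frame and the `vwap()` helper are names made up for this example.

```r
# Four trades: price in USD and quantity traded
trades <- data.frame(
  price  = c(5.80, 5.83, 5.90, 6.00),
  amount = c(1, 3, 1, 20)
)

# VWAP = sum(price * quantity) / sum(quantity)
vwap <- function(price, amount) {
  sum(price * amount) / sum(amount)
}

vwap_day    <- vwap(trades$price, trades$amount)  # 149.19 / 25 = 5.9676
simple_mean <- mean(trades$price)                 # 23.53 / 4  = 5.8825

print(c(vwap = vwap_day, mean = simple_mean))
```

The large 20-unit trade at 6.00 pulls the VWAP above the simple mean, which is the point of weighting by volume. Base R's `weighted.mean(trades$price, trades$amount)` gives the same result.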

To manage the large file sizes generated by Forex exchanges, it is useful to save tick data files in R's more efficient binary (internal) data format. This enables higher data throughput and faster disk input/output operations.

The easiest way to access Bitcoin transaction data is from the cache of zipped .csv files at http://api.bitcoincharts.com/v1/csv/. I downloaded the data and then saved it in binary format using the save() function. Now 25 million rows of raw tick data can be loaded into R memory using the load() function. Subsequently I process the tick data to convert the Unix timestamp into a date.
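The steps just described can be sketched roughly as follows. This is a minimal example under stated assumptions rather than the exact code used for this post: it assumes the CSV files have no header row and hold three columns (Unix timestamp, price, amount), and it uses a three-line inline sample and a temporary file in place of the real download.

```r
# Assumed file layout: no header; columns are Unix timestamp, price, amount.
# A small inline sample stands in for a downloaded file. For a real file,
# something like read.csv(gzfile("some_exchange.csv.gz"), ...) should work,
# where the file name is hypothetical.
sample_csv <- "1315922016,5.80,1.0000
1315922024,5.83,3.0000
1315924373,5.95,12.4521"

tick_raw <- read.csv(text = sample_csv, header = FALSE,
                     col.names = c("unixtime", "price", "amount"))

# Convert seconds since 1970-01-01 UTC into a calendar date
tick_raw$date <- as.Date(as.POSIXct(tick_raw$unixtime,
                                    origin = "1970-01-01", tz = "UTC"))

# Save in R's binary format, then read it back (temporary file for the demo)
rdata_file <- tempfile(fileext = ".RData")
save(tick_raw, file = rdata_file)
rm(tick_raw)
load(rdata_file)  # restores the object under its original name, tick_raw

head(tick_raw)
```

Note that `load()` restores objects under the names they were saved with, so no assignment is needed on the call.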

load("/home/hduser/zdata/tick_raw_file.RData")
head(tick_raw,5)
##     unixtime price  amount       date
## 1 1315922016  5.80  1.0000 2011-09-13
## 2 1315922024  5.83  3.0000 2011-09-13
## 3 1315922029  5.90  1.0000 2011-09-13
## 4 1315922034  6.00 20.0000 2011-09-13
## 5 1315924373  5.95 12.4521 2011-09-13
knitr::kable(tick_raw[1:6,], caption="Subset of raw tick data")

Subset of raw tick data

unixtime price amount date
1315922016 5.80 1.0000 2011-09-13
1315922024 5.83 3.0000 2011-09-13
1315922029 5.90 1.0000 2011-09-13
1315922034 6.00 20.0000 2011-09-13
1315924373 5.95 12.4521 2011-09-13
1315924504 5.88 7.4580 2011-09-13

The chunk of code above loads the Bitstamp transaction data, priced in US dollars, reading 25 million rows of raw tick data into a data frame.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(xts)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
## 
##     first, last
library(dygraphs)

Now for the payoff: we use standard dplyr functions to aggregate the transactions by day, the xts() function to build an efficient time series, and the dygraphs library to visualize the results.

# Drop the raw timestamp and compute the dollar value of each trade
data1 <- select(tick_raw, -unixtime)
data1 <- mutate(data1, value = price * amount)
by_date <- group_by(data1, date)
# Name the summary columns with `=`; using `<-` inside summarise() does not name them
daily <- summarise(by_date, count = n(),
                   m_price  = mean(price, na.rm = TRUE),
                   m_amount = mean(amount, na.rm = TRUE),
                   m_value  = mean(value, na.rm = TRUE))

Daily Table

Now that we have the daily data frame with summaries of price, amount and value, we can convert it to an xts (eXtensible Time Series) object, the time series format that dygraphs accepts as input. xts also provides faster time series processing for subsequent analyses.

load("/home/hduser/zdata/daily_file.RData")
knitr::kable(daily[1:6,])
date count m_price m_amount m_value
2011-09-13 12 5.874167 4.864282 28.84145
2011-09-14 14 5.582143 4.367570 24.41820
2011-09-15 6 5.120000 13.356799 68.04317
2011-09-16 4 4.835000 9.978502 48.44079
2011-09-17 1 4.870000 0.300000 1.46100
2011-09-18 8 4.840000 14.976600 72.48039
# Build the time series from the mean daily price, indexed by date
daily_ts <- xts(daily$m_price, order.by = daily$date)
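The daily means above weight every trade equally. To tie the aggregation back to the VWAP section, here is a hedged sketch of a volume weighted daily price computed with the same dplyr and xts tools. The `ticks` data frame is invented for illustration and only mimics the columns of the tick data; `daily_vwap` and `vwap_ts` are likewise names made up for this example.

```r
library(dplyr)
library(xts)

# Hypothetical ticks spanning two days (same columns as the tick data)
ticks <- data.frame(
  date   = as.Date(c("2011-09-13", "2011-09-13", "2011-09-14", "2011-09-14")),
  price  = c(5.80, 6.00, 5.50, 5.70),
  amount = c(1, 3, 2, 2)
)

# Daily VWAP: traded value summed per day, divided by volume summed per day
daily_vwap <- ticks %>%
  group_by(date) %>%
  summarise(volume = sum(amount),
            vwap   = sum(price * amount) / sum(amount))

# xts needs a time-based index, supplied through order.by
vwap_ts <- xts(daily_vwap$vwap, order.by = daily_vwap$date)
print(vwap_ts)
```

For these made-up ticks the daily VWAP works out to 5.95 and 5.60. The resulting `vwap_ts` object could be passed to `dygraph()` in the same way as `daily_ts`.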

Graph with dygraphs package

# Plot with htmlwidget dygraph
dygraph(daily_ts,ylab="Price (US Dollars)", 
        main="Average Value of bitcoin US$ Buys \n Kraken Exchange") %>%
  dySeries("V1",label="Avg_Buy") %>%
  dyRangeSelector(dateWindow = c("2016-05-01","2018-04-30"))

Kraken Bitcoin

Summary

This graph illustrates the history of bitcoin to date: a prolonged slow start, then a rocket ride to a sharp peak, followed by a roller coaster ride down to what looks like a sideways move with some volatility while the markets pause for thought.

The level at which market prices have settled implies that bitcoin retains a strong core of believers. Prices may have dropped from the peak, but they are still well above the average from 2011 through 2016.

Finally, note that there are some R packages to help explore Bitcoin. Rbitcoin provides a unified API interface to the bitstamp, kraken, btce and bitmarket sites, while rbitcoinchartsapi provides an interface to the BitCoinCharts.com API.

