Summary

Tether, a cryptocurrency, is widely used on Bitcoin exchanges. While most of the transaction flow is centered on one exchange, Bitfinex, other exchanges also trade Tether among themselves.

This is a visualization of the transaction flow of Tether between identified exchanges.

Version Control

The most up-to-date version of this analysis can always be found in the GitHub repository: https://github.com/kerskine/tether-data-explore

Background

Tether is a “stablecoin” used in trading Bitcoin on exchanges around the world. A stablecoin is a digital currency that is pegged to a particular country’s currency (fiat). The advantage of using a stablecoin is that a trading account can be closed out and its value held at a lower fee. Without a stablecoin, a trader wishing to close their account for the day would incur higher fees converting Bitcoin to fiat, and then incur fees again the next day moving fiat back into Bitcoin.

In the case of Tether, the number of tokens is tied to US Dollars (USD) held in reserve: one token for one US Dollar. In 2017, Tether issued over two billion tokens that were used in the run-up of Bitcoin’s valuation. The use of Tether has come under scrutiny as the company hasn’t provided substantial proof that it holds the USD reserves needed to back those tokens. At the time of this writing, Tether comprises a third of Bitcoin’s transaction volume. If Tether turned out not to be backed by sufficient reserves, the impact on Bitcoin’s perceived value could be extreme.

Data Exploration

The data file for this analysis was created by Alex Vikati using an Omnicore node, which tracks Tether transactions on the Bitcoin blockchain. It’s a 68 MB compressed CSV (comma-separated values) file of 1,508,702 records with 12 variables (see Appendix for an explanation of the fields). All data is current to Bitcoin block #522647 (2018-05-14 15:48:10 UTC).

The file is downloaded and a tibble is created from the CSV file.

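library(tidyverse)  # read_csv, dplyr verbs, stringr helpers
library(igraph)     # graph_from_data_frame and network plotting

rawcsv <- "https://s3-us-west-2.amazonaws.com/data.blockspur.com/tether/tether_transactions_522647.csv.zip"

download.file(rawcsv, "raw.csv.zip", method = "curl")

# read_csv decompresses the zip archive on the fly
rawdata <- read_csv("raw.csv.zip")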

We only need to look at valid transactions, so invalid ones are removed:

valid.tether <- rawdata %>% filter(is_valid == 1)
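
Before discarding rows, it can be useful to check how many the filter drops. The following is a minimal sketch on a made-up stand-in for rawdata; toy.raw and its values are invented for illustration, and only the is_valid column name comes from the dataset.

```r
library(dplyr)

# Toy stand-in for rawdata (hypothetical values; the real file has 12 columns)
toy.raw <- data.frame(
  tx_hash  = c("a1", "b2", "c3", "d4"),
  amount   = c(100, 250, 75, 10),
  is_valid = c(1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Tally rows by validity flag to see what the filter will remove
validity.counts <- toy.raw %>% count(is_valid)

# Same filter as used on the real data
toy.valid <- toy.raw %>% filter(is_valid == 1)

validity.counts
```

Run against rawdata instead of toy.raw, the same count() call would report how many records were excluded from the rest of the analysis.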

Transaction Flow

In order to understand the extent to which Tether is used in Bitcoin trading, we need to see where it’s being traded. Is it just in one exchange or multiple exchanges? Also, are Tether transactions centralized with one exchange or do exchanges trade with each other?

First we’ll need to summarize the dataset, totaling the amount sent for each unique pair of sending and receiving addresses:

options(width = 85)
toptrans <- valid.tether %>%
        select(sending_address, reference_address, amount) %>% 
        group_by(sending_address, reference_address) %>%
        summarise(total = sum(amount)) %>%
        arrange(desc(total)) %>%
        ungroup(.)

toptrans
## # A tibble: 734,670 x 3
##                       sending_address                  reference_address      total
##                                 <chr>                              <chr>      <dbl>
##  1 1MZAayfFJ9Kki2csoYjFVRKHFFSkdoMLtX 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw 2961894344
##  2 168o1kqNquEJeR9vosUB5fw4eAwcVAgh8P 1LAnF8h3qMGx3TSwNUHVneBZUEpwE4gu3D 2516524200
##  3 1J1dCYzS5EerUuJCJ6iJYVPytCMVLXrgM9 1Po1oWkD2LmodfkBYiAktwh76vkF93LKnh 1998424303
##  4 1NTMakcgVwQpMdGxRQnFKyb3G1FAJysSfz 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw 1875032700
##  5 3MbYQMMmSkC3AgWkj9FMo5LsPTW1zBTwXL 1NTMakcgVwQpMdGxRQnFKyb3G1FAJysSfz 1875000000
##  6 1AA6iP6hrZfYiacfzb3VS5JoyKeZZBEYRW 1DUb2YYbQA1jjaNYzVXLZ7ZioEhLXtbUru 1788405074
##  7 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw 1J1dCYzS5EerUuJCJ6iJYVPytCMVLXrgM9 1576086283
##  8 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw 1AA6iP6hrZfYiacfzb3VS5JoyKeZZBEYRW 1311730408
##  9 1Po1oWkD2LmodfkBYiAktwh76vkF93LKnh 1MZAayfFJ9Kki2csoYjFVRKHFFSkdoMLtX 1171329199
## 10 1HckjUpRGcrrRAtFaaCAUaGjsPx9oYmLaZ 1LAnF8h3qMGx3TSwNUHVneBZUEpwE4gu3D 1168430900
## # ... with 734,660 more rows
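
To make the grouping step concrete, here is a small sketch of the same pipeline on invented data. The single-letter addresses and amounts are hypothetical, and the n.tx column is an addition not present in the original summary.

```r
library(dplyr)

# Invented transactions between three placeholder addresses
toy.tx <- data.frame(
  sending_address   = c("A", "A", "B", "A", "C"),
  reference_address = c("B", "B", "A", "C", "B"),
  amount            = c(10, 5, 7, 1, 20),
  stringsAsFactors  = FALSE
)

# One row per (sender, recipient) pair, largest total first
toy.pairs <- toy.tx %>%
  group_by(sending_address, reference_address) %>%
  summarise(total = sum(amount), n.tx = n()) %>%
  ungroup() %>%
  arrange(desc(total))

toy.pairs
```

Note that direction is preserved: A sending to B and B sending to A end up as two separate rows, which is what allows the pairs to be drawn later as directed edges.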

Each of the first ten rows of toptrans represents over a billion Tether moved between a pair of addresses, but who owns these addresses? After a good deal of investigation, a list of addresses and their associated exchanges was constructed (see Appendix > Address Identification):

addresses <- read_csv("address.csv")
addresses
## # A tibble: 103 x 3
##                               address         id exchange
##                                 <chr>      <chr>    <chr>
##  1 1FoWyxwPXuj4C6abqwhjDWdz6D4PZgYRjA Binance-01  Binance
##  2 12uhUkxpwkD2LGzKHUywoknoJ3fC9vev1x Binance-02  Binance
##  3 16wf3d47R2ENjF5UGwQcPmsshxPc1fYCNj Binance-03  Binance
##  4 1HsenWQk8UjPczo8DzVzqivLqWmBTfxC4V Binance-04  Binance
##  5 17Fn7FxX3rs87Nvuit47163dHGp34C2aox Binance-05  Binance
##  6 1DfaNGd66p5Ekvz3yUXrKwBF2d5hcBx1MQ Binance-06  Binance
##  7 14ZYFBHXaAgfKwdNyKpEfjr5RXJeUkfaRw Binance-07  Binance
##  8 14eEBHFkijnkizipw6qmAgrsMztcWa42FE Binance-08  Binance
##  9 1HtqbiRvXrWAG3gw6MxeE4fdL461p2qyED Binance-10  Binance
## 10 153saAmBANqkD3t9RMKb2GQXhfQSQWnCPH Binance-11  Binance
## # ... with 93 more rows

Matching up addresses with their owners allows us to produce a network graph showing the transaction flow:

# Let's only look at the top 100 transaction pairs and join it with the addresses

g <- toptrans[1:100, ] %>% 
        left_join(addresses, by = c("sending_address" = "address")) %>%
        left_join(addresses, by = c("reference_address" = "address")) 
        
# Display the id names if there's a match, otherwise just display
# the first 5 characters of the unknown address

g.id <- g %>%
        mutate(send.addr = if_else(is.na(id.x), 
                                   str_c(strtrim(sending_address, 5), "..."),
                                   id.x)) %>%
        mutate(recv.addr = if_else(is.na(id.y), 
                                   str_c(strtrim(reference_address, 5), "..."),
                                   id.y)) %>%
        select(send.addr, recv.addr) %>%
        
        # Use igraph to create the network diagram
        graph_from_data_frame(.)
# Now plot it

plot(g.id, 
     vertex.size = 10,
     vertex.shape = "none",
     asp = 0, 
     edge.arrow.size = 0.25, 
     vertex.label.cex = 0.8,
     vertex.label.family = "sans"
)
title("Tether Flow - Top 100 Transactions - Exchange Addresses")

The above network map is busy, as it shows all the different addresses (the “id” variable in addresses) used for transferring Tether between exchanges. If we show just the exchanges themselves, we get a clearer picture.

g.ex <- g %>%
        
        mutate(send.addr = if_else(is.na(exchange.x), 
                                   str_c(strtrim(sending_address, 5), "..."),
                                   exchange.x)) %>%
        mutate(recv.addr = if_else(is.na(exchange.y), 
                                   str_c(strtrim(reference_address, 5), "..."),
                                   exchange.y)) %>%
        select(send.addr, recv.addr) %>%
        # Get only distinct connections
        distinct(.) %>%
        
        # Remove connections that are inside exchange
        filter(send.addr != recv.addr) %>%

        # Use igraph to create the network diagram
        graph_from_data_frame(.)

plot(g.ex, 
     vertex.size = 10,
     vertex.shape = "none",
     asp = 0, 
     edge.arrow.size = 0.25, 
     vertex.label.cex = 0.8,
     vertex.label.family = "sans"
)        
title("Tether Flow - Top 100 Transactions - Exchanges")
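
Both graphs depend on the same labelling step: joining the lookup table twice (which produces the .x and .y suffixes), falling back to a truncated address when there is no match, and, for the exchange view, dropping duplicate and internal edges. A minimal sketch of that step follows; every address and exchange name in it is made up for illustration.

```r
library(dplyr)
library(stringr)

# Made-up edges and lookup table (hypothetical addresses and names)
toy.edges <- data.frame(
  sending_address   = c("1AAAAAfake", "1BBBBBfake", "1BBBBBfake", "1DDDDDfake"),
  reference_address = c("1BBBBBfake", "1CCCCCfake", "1DDDDDfake", "1BBBBBfake"),
  stringsAsFactors  = FALSE
)
toy.lookup <- data.frame(
  address  = c("1BBBBBfake", "1DDDDDfake"),
  id       = c("ExchA-01", "ExchA-02"),
  exchange = c("ExchA", "ExchA"),
  stringsAsFactors = FALSE
)

# Joining twice yields id.x/exchange.x (sender) and id.y/exchange.y (recipient)
toy.joined <- toy.edges %>%
  left_join(toy.lookup, by = c("sending_address" = "address")) %>%
  left_join(toy.lookup, by = c("reference_address" = "address"))

# Label by exchange, fall back to the first 5 characters of the address,
# then drop repeated edges and edges internal to one exchange
toy.labelled <- toy.joined %>%
  mutate(send.addr = if_else(is.na(exchange.x),
                             str_c(strtrim(sending_address, 5), "..."),
                             exchange.x),
         recv.addr = if_else(is.na(exchange.y),
                             str_c(strtrim(reference_address, 5), "..."),
                             exchange.y)) %>%
  select(send.addr, recv.addr) %>%
  distinct() %>%
  filter(send.addr != recv.addr)

toy.labelled
```

The two transfers between the ExchA addresses collapse into a single self-loop and are removed, leaving only the edges that cross an exchange boundary.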

Analysis

Bitfinex is the dominant center of Tether trading: All Tether is sent to the Bitfinex-01 address (1KYiK…), which then distributes it to other exchanges and its own exchange customers. This isn’t surprising as Tether and Bitfinex share management.

Tether is traded between exchanges: Huobi, Poloniex, Bittrex and Binance are all trading Tether with each other without Bitfinex being involved. Kraken, Gate.io and OKEx trade only with Bitfinex.
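
Observations like these can also be checked programmatically rather than read off the plot. The sketch below uses igraph on an invented edge list; the vertex names (Hub, ExA, ExB, ExC) and the helper only.hub are hypothetical and stand in for the exchange-level graph.

```r
library(igraph)

# Invented exchange-level edges; "Hub" plays the role of the central exchange
toy.ex <- data.frame(
  send.addr = c("Hub", "Hub", "ExA", "ExB", "ExC"),
  recv.addr = c("ExA", "ExC", "ExB", "Hub", "Hub"),
  stringsAsFactors = FALSE
)
toy.g <- graph_from_data_frame(toy.ex)

# Number of edges touching each vertex, in either direction
toy.degree <- degree(toy.g, mode = "all")

# Hypothetical helper: vertices whose only counterparty is the hub
only.hub <- function(graph, hub) {
  others <- setdiff(V(graph)$name, hub)
  keep <- vapply(others, function(v) {
    nb <- unique(neighbors(graph, v, mode = "all")$name)
    length(nb) > 0 && all(nb == hub)
  }, logical(1))
  others[keep]
}

only.hub(toy.g, "Hub")
```

Applied to g.ex with "Bitfinex" as the hub, a helper along these lines would list the exchanges that appear only in Bitfinex-connected edges within the top 100 pairs.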

Next Steps

Acknowledgements

First, I’d like to thank Alex Vikati for doing the hard work in constructing the Tether dataset used in this analysis.

I’d also like to thank Roger Peng, Brian Caffo, Jeff Leek, Johns Hopkins, and Coursera for giving me the working knowledge to attempt this project.

Appendix

Address Identification

Figuring out which address belongs to which exchange is currently an iterative process. To accomplish it, two functions, t.send and t.recv, were developed to examine the transaction flows of a specific address.
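
The definitions of these functions are not reproduced in this report. As a rough idea of what such helpers might look like, here is a hypothetical reconstruction inferred only from the output columns shown below. The names t.recv.sketch and t.send.sketch, the reading of the third argument as a cumulative-share cutoff, and the toy data are all assumptions, not the author's code.

```r
library(dplyr)

# Hypothetical sketch: who sent Tether to `addr`, largest senders first.
# `cutoff` is ASSUMED to be a cumulative share; rows are kept until it is reached.
t.recv.sketch <- function(addr, data, cutoff = 1) {
  data %>%
    filter(reference_address == addr) %>%
    group_by(reference_address, sending_address) %>%
    summarise(total.recv = sum(amount)) %>%
    ungroup() %>%
    arrange(desc(total.recv)) %>%
    mutate(cum.freq = cumsum(total.recv) / sum(total.recv)) %>%
    filter(lag(cum.freq, default = 0) < cutoff)
}

# Hypothetical sketch: where `addr` sent Tether, largest recipients first
t.send.sketch <- function(addr, data, cutoff = 1) {
  data %>%
    filter(sending_address == addr) %>%
    group_by(sending_address, reference_address) %>%
    summarise(total.sent = sum(amount)) %>%
    ungroup() %>%
    arrange(desc(total.sent)) %>%
    mutate(cum.freq = cumsum(total.sent) / sum(total.sent)) %>%
    filter(lag(cum.freq, default = 0) < cutoff)
}

# Invented data: three senders pay into "T", which forwards everything to "H"
toy.flows <- data.frame(
  sending_address   = c("S1", "S1", "S2", "S3", "T"),
  reference_address = c("T",  "T",  "T",  "T",  "H"),
  amount            = c(40, 20, 30, 10, 100),
  stringsAsFactors  = FALSE
)

toy.in  <- t.recv.sketch("T", toy.flows, 1)
toy.out <- t.send.sketch("T", toy.flows, 1)
```

With a cutoff of 1 every counterparty is returned; a smaller cutoff would keep only the largest counterparties, which is one plausible reason for exposing that argument.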

As an example, let’s look at address 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 to see total Tether received and sent:

t.recv("19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4", valid.tether, 1)
## # A tibble: 7 x 4
##                    reference_address                    sending_address
##                                <chr>                              <chr>
## 1 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1DcKsGnjpD38bfj6RMxz945YwohZUTVLby
## 2 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1MEPB525tEHRFLdq6aR8d2t8jaaRQj2iWX
## 3 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1G47mSr3oANXMafVrR8UC4pzV7FEAzo3r9
## 4 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1LAnF8h3qMGx3TSwNUHVneBZUEpwE4gu3D
## 5 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1ApkXfxWgJ5CBHzrogVSKz23umMZ32wvNA
## 6 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 115E3baxJZsJHeTay1jvUh3nSTHBJhkskc
## 7 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw
## # ... with 2 more variables: total.recv <dbl>, cum.freq <dbl>

This address received Tether from at least three different exchanges: Gate.io (1DcKs…), Huobi (1LAnF…), and OKEx (1ApkX…). Each is identified on the Tether Rich List (archived on April 6, 2018) as belonging to its respective exchange. Next, we look at where the Tether was sent:

t.send("19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4", valid.tether, 1)
## # A tibble: 1 x 4
##                      sending_address                  reference_address
##                                <chr>                              <chr>
## 1 19Qcmdh2FEZnTEFeEbQvWPSvfLuRBcjyo4 1KYiKJEfdJtap9QX2v9BXJMpz2SfU4pgZw
## # ... with 2 more variables: total.sent <dbl>, cum.freq <dbl>

Here we see only one address that Tether is sent to: 1KYiK…, which the “Rich List” identifies as belonging to Bitfinex.

Therefore, it’s a safe assumption that the address belongs to Bitfinex and was used in the course of their business transacting Tether with other exchanges. You’ll find it named “Bitfinex-07” in the addresses tibble.

Blockspur CSV Data Fields

From Blockspur: Download Tether Data

tether_transactions_507015.csv.zip is a 68MB compressed / 240MB uncompressed CSV file that contains every Tether transaction on the Omni blockchain up until block 507015.

The fields are listed below:

| Name | Description | Type |
|---|---|---|
| tx_hash | The unique id of the transaction; same as the BTC txid | string |
| block_height | The numeric height of the block in the BTC blockchain | integer |
| block_hash | The unique id of the BTC block the transaction is in | string |
| block_time | The timestamp of the BTC block the transaction is in | datetime, GMT 0 |
| position_in_block | The numeric position of the transaction within the block | integer |
| sending_address | The BTC address of the sender | string |
| reference_address | A BTC address used as reference. Same as the recipient address in the case of “Simple Send” | string |
| tx_type | The transaction type, with “Simple Send” being the most popular. Valid values are listed on Omni Layer’s spec | string |
| amount | The amount of token in the transaction | float |
| version | The transaction version number | integer |
| is_valid | 1 if the transaction is valid; 0 if it is not | integer |
| fee | The transaction fee in BTC | float |
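
The field types listed above can be passed to read_csv explicitly instead of relying on type guessing. The sketch below shows one way to do that with a readr column specification. The single data row is invented, and the timestamp layout (YYYY-MM-DD HH:MM:SS) is an assumption about the file, not something confirmed by the source.

```r
library(readr)

# Column specification mirroring the field list (types as documented above)
tether.cols <- cols(
  tx_hash           = col_character(),
  block_height      = col_integer(),
  block_hash        = col_character(),
  block_time        = col_datetime(),   # ASSUMES an ISO 8601-style timestamp
  position_in_block = col_integer(),
  sending_address   = col_character(),
  reference_address = col_character(),
  tx_type           = col_character(),
  amount            = col_double(),
  version           = col_integer(),
  is_valid          = col_integer(),
  fee               = col_double()
)

# Write a one-row toy file (all values invented) and read it back
toy.file <- tempfile(fileext = ".csv")
writeLines(c(
  paste(names(tether.cols$cols), collapse = ","),
  "abc,522647,def,2018-05-14 15:48:10,3,1AAAAAfake,1BBBBBfake,Simple Send,100.5,0,1,0.0001"
), toy.file)

toy.typed <- read_csv(toy.file, col_types = tether.cols)
```

Passing the same specification when reading raw.csv.zip would make a mismatch between the documented and actual types show up as parsing problems at load time.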