Motivation

Cryptocurrency is a digital currency that is created and managed through cryptography, the set of techniques used to secure information and verify transactions. Bitcoin was one of the earliest cryptocurrencies, and many others have been created since. I first heard about Bitcoin back in 2013, when I was a financially struggling college student, and I thought 80 dollars per coin was a bit too much. Today, one bitcoin is worth about 100 times that amount.

Below is the index for November 19th 2017 from https://www.coindesk.com

Today’s Open: $8,033.94 Today’s High: $8,049.12 Today’s Low: $8,021.33 Today’s Closed: $8,034.42

*Update: Since the time this proposal was written, Bitcoin has increased dramatically. Below is the index for December 9th 2017 from https://www.coindesk.com

Today’s Open: $16,057.15 Today’s High: $16,291.68 Today’s Low: $15,538.25

Even though Bitcoin seems like a great investment, many investors remain skeptical. The main reason is that bitcoin and other cryptocurrencies are virtual currencies that, except for a select few, are not backed by anything. Bitcoin’s supply is capped at 21 million coins, and since its price fluctuates purely on demand, it is extremely volatile. This gives observers reason to believe it is a bubble. Many new investors, attracted by its potential, are currently entering the market and pushing its value up significantly, but this can change if demand drops. There are many theories out there, but for now, I wanted to see how volatile bitcoin is compared to other currencies.

Currencies

Bitcoin: Created in 2009, it was the first decentralized cryptocurrency. Its supply is capped at 21 million coins.

Ethereum: Launched in 2015, Ethereum is a decentralized software platform that enables SmartContracts and Distributed Applications (DApps) to be built and run without any downtime, fraud, control or interference from a third party.

Litecoin: Created in 2011 by engineer Charlie Lee to be the silver to bitcoin’s gold. One of the main disparities between the two cryptocurrencies lies in their transaction speeds.

Bitcoin Cash: A hard fork of the cryptocurrency bitcoin; the fork occurred in August 2017. A hard fork is a radical change to the protocol that makes previously invalid blocks/transactions valid (or vice-versa), and as such requires all nodes or users to upgrade to the latest version of the protocol software.

Dash: An open source peer-to-peer cryptocurrency that aims to be the most user-friendly and most on-chain-scalable cryptocurrency in the world. It offers instant transactions and private transactions.

Tether: Backed by US Dollars, offers a way to own and move fiat currency across different cryptocurrencies and exchanges without the need to convert crypto assets into dollars.

Ripple: A privately held, cash flow positive company that aims to create and enable a global network of financial institutions and banks to use the Ripple software to lower the cost of international payments.

Obtain the Data

Bitcoin API from Quandl

Since bitcoin is the most widely known cryptocurrency, there is a lot of data available for it. I was able to find an API for Bitcoin’s historical data.

library(Quandl)
# Daily Bitcoin exchange rate (BTC vs. USD) on Bitstamp; data starts on 09/13/2011 and is requested through 12/07/2017
Quandl.api_key('mxcqtzYcf5Co4fbG3WAX')
Bitcoin <- Quandl('BCHARTS/BITSTAMPUSD', start_date = '2011-01-01', end_date = '2017-12-07')
head(Bitcoin)
##         Date     Open     High      Low    Close Volume (BTC)
## 1 2017-12-07 13623.00 16615.62 13085.90 16599.99    25787.677
## 2 2017-12-06 11676.99 13700.00 11659.80 13623.50    19784.873
## 3 2017-12-05 11613.07 11850.00 11384.25 11677.00    11875.034
## 4 2017-12-04 11250.00 11613.07 10850.00 11613.07    13621.482
## 5 2017-12-03 10875.68 11800.01 10513.16 11250.00    14238.526
## 6 2017-12-02 10840.45 11200.00 10637.69 10872.00     9267.161
##   Volume (Currency) Weighted Price
## 1         382694044       14840.19
## 2         250560790       12664.26
## 3         138370076       11652.18
## 4         154122918       11314.70
## 5         160176290       11249.50
## 6         101270084       10927.84
tail(Bitcoin)
##            Date Open High  Low Close Volume (BTC) Volume (Currency)
## 2273 2011-09-18 4.87 4.92 4.81  4.92    119.81280          579.8431
## 2274 2011-09-17 4.87 4.87 4.87  4.87      0.30000            1.4610
## 2275 2011-09-16 4.82 4.87 4.80  4.85     39.91401          193.7631
## 2276 2011-09-15 5.12 5.24 5.00  5.13     80.14080          408.2590
## 2277 2011-09-14 5.58 5.72 5.52  5.53     61.14598          341.8548
## 2278 2011-09-13 5.80 6.00 5.65  5.97     58.37138          346.0974
##      Weighted Price
## 2273       4.839576
## 2274       4.870000
## 2275       4.854515
## 2276       5.094272
## 2277       5.590798
## 2278       5.929231

Main Competitors

The second tier of cryptocurrencies in terms of popularity includes Ethereum and Litecoin. I was interested in those two, plus a couple more: Dash and Bitcoin Cash. I was able to find CSV files with their historical data at https://coinmetrics.io/data-downloads/

I saved the CSV files on GitHub.

Bc_cash <- read.csv("https://raw.githubusercontent.com/mikegankhuyag/607-Projects/master/Final/bch.csv", header = TRUE, stringsAsFactors = FALSE)
Dash <- read.csv("https://raw.githubusercontent.com/mikegankhuyag/607-Projects/master/Final/dash.csv", header = TRUE, stringsAsFactors = FALSE)
Ethereum <- read.csv("https://raw.githubusercontent.com/mikegankhuyag/607-Projects/master/Final/eth.csv", header = TRUE, stringsAsFactors = FALSE)
Litecoin <- read.csv("https://raw.githubusercontent.com/mikegankhuyag/607-Projects/master/Final/ltc.csv", header = TRUE, stringsAsFactors = FALSE)
head(Bc_cash)
##        date txVolume.USD. txCount marketcap.USD. price.USD.
## 1 7/31/2017    2406864986  183998              0     294.46
## 2  8/1/2017     906913244  230867              0     380.01
## 3  8/2/2017     603435293   76537     6302360000     452.66
## 4  8/3/2017      83677447    7416     7392030000     364.05
## 5  8/4/2017     218502200   20909     5969720000     233.05
## 6  8/5/2017     263414318   26517     3809330000     213.15
##   exchangeVolume.USD. generatedCoins       fees
## 1             1075960         1837.5 138.116822
## 2            65988800         1812.5 194.868866
## 3           416207000         1112.5  51.706019
## 4           161518000           87.5   6.288903
## 5           185038000          437.5  20.234731
## 6           144043000          237.5  21.324498
head(Dash)
##        date txVolume.USD. txCount marketcap.USD. price.USD.
## 1 2/14/2014     150641.42    3421         702537   0.374024
## 2 2/15/2014      78256.18    3663        1092120   0.314865
## 3 2/16/2014      96549.99    3236        1085280   0.406976
## 4 2/17/2014     367922.18    2766        1360260   1.450000
## 5 2/18/2014     838488.30    2631        3960320   1.040000
## 6 2/19/2014     675559.38    2551        3497850   0.941647
##   exchangeVolume.USD. generatedCoins     fees
## 1               15422          33849 3.355010
## 2               21119          33235 3.739035
## 3               28017          30944 5.238558
## 4              178618          20713 2.473010
## 5              160779          15940 2.827082
## 6               60551          13548 2.363510
head(Ethereum)
##        date txVolume.USD. txCount marketcap.USD. price.USD.
## 1  8/7/2015             0       0              0   2.770000
## 2  8/8/2015       1513209    2016      167911000   0.753325
## 3  8/9/2015       1180418    2807       42637600   0.701897
## 4 8/10/2015        825663    1298       43130000   0.708448
## 5 8/11/2015       1787874    1999       42796500   1.070000
## 6 8/12/2015       1812412    4945       64018400   1.220000
##   exchangeVolume.USD. generatedCoins     fees
## 1              164329       27075.47  0.00000
## 2              674188       27437.66 37.31841
## 3              532170       27943.44 68.09997
## 4              405283       27178.28 14.09895
## 5             1463100       27817.34 31.16514
## 6             2150620       28027.81 11.31145
head(Litecoin)
##        date txVolume.USD. txCount marketcap.USD. price.USD.
## 1 4/28/2013      39038951    8847       73773400       4.35
## 2 4/29/2013      48283929    9408       74952700       4.38
## 3 4/30/2013      38686090    9092       75726800       4.30
## 4  5/1/2013      33849471    9205       73901200       3.80
## 5  5/2/2013      58715299    8927       65242700       3.37
## 6  5/3/2013      13752345    8290       58607400       3.04
##   exchangeVolume.USD. generatedCoins     fees
## 1                   0          32800 511.0816
## 2                   0          31500 634.1212
## 3                   0          32450 597.0982
## 4                   0          31600 755.4951
## 5                   0          31450 689.1598
## 6                   0          28300 551.3278

More Competitors

I was also interested in two coins that are currently priced very low, Ripple and Tether. I was able to find their historical data on https://coinmarketcap.com. Since I couldn’t download it, I decided to scrape the data.

library(rvest)
# Download the historical-data pages
Rip <- read_html("https://coinmarketcap.com/currencies/ripple/historical-data/?start=20130428&end=20171209")
Tet <- read_html("https://coinmarketcap.com/currencies/tether/historical-data/?start=20130428&end=20171209")
# Extract every table cell (<td>) as text, then rebuild the 7-column table row by row
Ripp <- html_text(html_nodes(Rip, "td"))
Rippl <- matrix(Ripp, ncol = 7, byrow = TRUE)
Ripple <- data.frame(Rippl[2:1588,], stringsAsFactors = TRUE)
colnames(Ripple) <- c("Date","Open","High","Low","Close","Volume","Market Cap")
head(Ripple)
##           Date     Open     High      Low    Close      Volume
## 1 Dec 08, 2017 0.223636 0.278673 0.222168 0.252125 660,172,000
## 2 Dec 07, 2017 0.232623 0.233760 0.221340 0.222823 275,205,000
## 3 Dec 06, 2017 0.245416 0.245705 0.227742 0.232544 274,526,000
## 4 Dec 05, 2017 0.253598 0.253988 0.245234 0.246101 174,591,000
## 5 Dec 04, 2017 0.252919 0.255362 0.247160 0.253571 104,650,000
## 6 Dec 03, 2017 0.255530 0.263072 0.247391 0.252558 134,710,000
##      Market Cap
## 1 8,663,460,000
## 2 9,011,630,000
## 3 9,507,190,000
## 4 9,815,990,000
## 5 9,768,480,000
## 6 9,869,310,000
Teth <- html_text(html_nodes(Tet, "td"))
Tethe <- matrix(Teth, ncol = 7, byrow = TRUE)
Tether <- data.frame(Tethe[2:1013,], stringsAsFactors = TRUE)
colnames(Tether) <- c("Date","Open","High","Low","Close","Volume","Market Cap")
head(Tether)
##           Date     Open High      Low Close        Volume  Market Cap
## 1 Dec 08, 2017     1.04 1.06 0.986563  1.02 1,993,030,000 843,587,000
## 2 Dec 07, 2017     1.01 1.08     1.00  1.03 1,671,610,000 819,775,000
## 3 Dec 06, 2017 0.999760 1.02 0.995840  1.01 1,281,490,000 813,822,000
## 4 Dec 05, 2017     1.00 1.01 0.996458  1.00   814,146,000 816,872,000
## 5 Dec 04, 2017     1.00 1.01 0.992132  1.00   668,510,000 816,012,000
## 6 Dec 03, 2017     1.00 1.03 0.985320  1.00   946,749,000 814,847,000
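
The scraping step above works because the `td` cells come back as one flat character vector, which `matrix(..., ncol = 7, byrow = TRUE)` folds back into table rows. Here is a minimal sketch of that reshaping in base R. The `cells` vector is a made-up stand-in for the scraped page, cut down to three columns, and the column names are only for illustration.

```r
# Hypothetical stand-in for the flat vector of <td> cells returned by the scraper.
cells <- c("Dec 08, 2017", "0.25", "660,172,000",
           "Dec 07, 2017", "0.22", "275,205,000")

# Fill row by row: every 3 consecutive cells become one table row.
tbl <- as.data.frame(matrix(cells, ncol = 3, byrow = TRUE),
                     stringsAsFactors = FALSE)
colnames(tbl) <- c("Date", "Close", "Volume")

# Strip the thousands separators before converting the volume to numeric.
tbl$Volume <- as.numeric(gsub(",", "", tbl$Volume))
tbl$Close  <- as.numeric(tbl$Close)
print(tbl)
```

If the number of cells is not a multiple of the column count, `matrix()` recycles values and warns, so it is worth checking `length(cells) %% 3 == 0` first.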

Cleaning the data

I want all the data to be in the same format as the Bitcoin data from the API. The CSV files start from the earliest date, so I first reversed their row order to put the most recent date first.

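# Reverse the row order (newest first) without hard-coding the number of rows
Bc_cash <- Bc_cash[nrow(Bc_cash):1,]
Dash <- Dash[nrow(Dash):1,]
Ethereum <- Ethereum[nrow(Ethereum):1,]
Litecoin <- Litecoin[nrow(Litecoin):1,]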

The Bitcoin data from the API uses the YYYY-MM-DD format, so I needed to convert all the dates to that format.

Bc_cash$date <- as.Date(Bc_cash$date, format = "%m/%d/%Y")
Ethereum$date <- as.Date(Ethereum$date, format = "%m/%d/%Y")
Dash$date <- as.Date(Dash$date, format = "%m/%d/%Y")
Litecoin$date <- as.Date(Litecoin$date, format = "%m/%d/%Y")

Change column name ‘date’ to ‘Date’

colnames(Bc_cash)[1] <- "Date"
colnames(Ethereum)[1] <- "Date"
colnames(Dash)[1] <- "Date"
colnames(Litecoin)[1] <- "Date"

Since the two scraped tables come from an HTML page, every column was read in as text (here, as factors), so the price columns need to be converted to numbers.

# Convert Open, High, Low and Close from factor to character, then to numeric
Tether[, 2:5] <- lapply(Tether[, 2:5], function(x) as.numeric(as.character(x)))
Ripple[, 2:5] <- lapply(Ripple[, 2:5], function(x) as.numeric(as.character(x)))
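
The conversion goes through character first for a reason: calling `as.numeric()` directly on a factor returns its internal level codes, not the numbers it displays. A small sketch of the pitfall, using a few Open values from the Tether table:

```r
# A factor column such as the ones produced by stringsAsFactors = TRUE.
open_price <- factor(c("1.04", "1.01", "0.999760"))

# Pitfall: as.numeric() on a factor returns the internal level codes.
codes <- as.numeric(open_price)

# Converting to character first recovers the actual numbers.
values <- as.numeric(as.character(open_price))

print(codes)
print(values)
```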

Also, change the date format to YYYY-MM-DD format.

Tether$Date <- as.Date(Tether$Date, format = "%b %d, %Y")
Ripple$Date <- as.Date(Ripple$Date, format = "%b %d, %Y")
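
As a quick check of the two format strings used above, here is a sketch that parses one date of each style. Note that `%b` matches abbreviated month names in the current locale, so the second call assumes an English locale.

```r
# The CSV files use month/day/year; the scraped tables use e.g. "Dec 08, 2017".
csv_date     <- as.Date("7/31/2017", format = "%m/%d/%Y")
scraped_date <- as.Date("Dec 08, 2017", format = "%b %d, %Y")

# Both now print in the YYYY-MM-DD form and can be compared or joined.
print(csv_date)
print(scraped_date)
print(scraped_date > csv_date)
```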

Combining the data

Now that all of our data is in a consistent format, let’s get all the variables needed for analysis.

I want to average the daily high and low to get one price for the day.

library(magrittr)  # provides %<>%
library(dplyr)     # provides mutate(), select() and left_join()
Bitcoin %<>% mutate(bitcoin_price = (High + Low)/2,
                bitcoin_gain = Close - Open)
Ripple %<>% mutate(ripple_price = (High + Low)/2,
                ripple_gain  = Close - Open)
Tether %<>% mutate(tether_price = (High + Low)/2, 
                tether_gain =  Close - Open)

Selecting the data we need.

Bitcoin_data <- select(Bitcoin, Date,bitcoin_price)
Ripple_data <- select(Ripple, Date, ripple_price)
Tether_data <- select(Tether, Date, tether_price)
Ethereum_data <- select(Ethereum, Date, price.USD.)
Dash_data <- select(Dash, Date, price.USD.)
Bc_cash_data <- select(Bc_cash, Date, price.USD.)
Litecoin_data <- select(Litecoin, Date, price.USD.)

Since all of our data is in the same format, we can join the tables using the ‘Date’ field as the identifier. The Bitcoin data has the most rows, so I used it as the left side of the left joins.

# Left-join every price table onto the Bitcoin dates, one table at a time
cryptocurrencies <- Reduce(
  function(x, y) left_join(x, y, by = "Date"),
  list(Bitcoin_data, Ripple_data, Tether_data, Ethereum_data,
       Dash_data, Bc_cash_data, Litecoin_data))
colnames(cryptocurrencies) <- c("Date","bitcoin_price","ripple_price","tether_price",
                                 "ethereum_price", "dash_price","bc_cash_price","litecoin_price")
head(cryptocurrencies)
##         Date bitcoin_price ripple_price tether_price ethereum_price
## 1 2017-12-07      14850.76    0.2275500     1.040000         434.41
## 2 2017-12-06      12679.90    0.2367235     1.007920         428.59
## 3 2017-12-05      11617.12    0.2496110     1.003229         463.28
## 4 2017-12-04      11231.53    0.2512610     1.001066         470.20
## 5 2017-12-03      11156.58    0.2552315     1.007660         465.85
## 6 2017-12-02      10918.85    0.2549870     1.007805         463.45
##   dash_price bc_cash_price litecoin_price
## 1     697.90       1330.93          98.29
## 2     700.07       1430.10         100.35
## 3     756.36       1501.85         102.40
## 4     774.01       1576.92         104.24
## 5     768.88       1559.93         101.26
## 6     778.43       1434.98         100.28
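
To make the left join concrete, here is a minimal sketch on two tiny tables, with prices taken from the output above. It uses base R `merge()` with `all.x = TRUE` as a stand-in for `left_join()`, so that it runs without extra packages. The names `btc`, `eth` and `joined` are only for illustration.

```r
# Toy tables; btc covers one more day than eth.
btc <- data.frame(Date = as.Date(c("2017-12-07", "2017-12-06", "2017-12-05")),
                  bitcoin_price = c(14850.76, 12679.90, 11617.12))
eth <- data.frame(Date = as.Date(c("2017-12-07", "2017-12-06")),
                  ethereum_price = c(434.41, 428.59))

# all.x = TRUE keeps every row of the left table, like a left join.
joined <- merge(btc, eth, by = "Date", all.x = TRUE)

# merge() sorts by the key, so restore the newest-first order.
joined <- joined[order(joined$Date, decreasing = TRUE), ]
print(joined)
```

Days that exist only in the left table get `NA` in the joined column, which is why the older coins show `NA` prices for dates before a newer coin launched.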

Visualizing the currencies

library(ggplot2)
# which() drops rows where a coin has no price yet (NA) or a price of 0.
# Jitter is not used here, since it would add random noise to the price series.
ggplot(data = cryptocurrencies[which(cryptocurrencies$bitcoin_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= bitcoin_price), color = "Blue") +
  ggtitle("Bitcoin Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$ethereum_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= ethereum_price), color = "Green")+
  ggtitle("Ethereum Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$litecoin_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= litecoin_price), color = "Red")+
  ggtitle("Litecoin Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$dash_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= dash_price), color = "Orange")+
  ggtitle("Dash Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$bc_cash_price != 0),]) +
  geom_line(mapping = aes(x= Date, y=bc_cash_price))+
  ggtitle("Bitcoin Cash Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$ripple_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= ripple_price), color = "Pink")+
  ggtitle("Ripple Price")

ggplot(data = cryptocurrencies[which(cryptocurrencies$tether_price != 0),]) +
  geom_line(mapping = aes(x= Date, y= tether_price), color = "Purple")+
  ggtitle("Tether Price")

Since the currencies trade at very different price levels, it is hard to compare their prices directly. I decided to calculate the day to day relative change of each one instead.

I created another table holding the previous day’s prices by adding 1 to each date, then did a left join on the date so that each row has both the current and the prior day’s price.

Prior_day <- cryptocurrencies[2:2278,] 
Prior_day$Date <- as.Date(Prior_day$Date)+1
colnames(Prior_day) <- c("Date","yes_bitcoin_price","yes_ripple_price","yes_tether_price",
                                 "yes_ethereum_price", "yes_dash_price","yes_bc_cash_price","yes_litecoin_price")

r_cyptocurrencies <- left_join(cryptocurrencies, Prior_day, by ="Date")
r_cyptocurrencies <- select(r_cyptocurrencies, Date, bitcoin_price, yes_bitcoin_price, ripple_price, yes_ripple_price, tether_price, yes_tether_price,
                            ethereum_price, yes_ethereum_price, dash_price, yes_dash_price, bc_cash_price, yes_bc_cash_price, litecoin_price, yes_litecoin_price)
head(r_cyptocurrencies)
##         Date bitcoin_price yes_bitcoin_price ripple_price yes_ripple_price
## 1 2017-12-07      14850.76          12679.90    0.2275500        0.2367235
## 2 2017-12-06      12679.90          11617.12    0.2367235        0.2496110
## 3 2017-12-05      11617.12          11231.53    0.2496110        0.2512610
## 4 2017-12-04      11231.53          11156.58    0.2512610        0.2552315
## 5 2017-12-03      11156.58          10918.85    0.2552315        0.2549870
## 6 2017-12-02      10918.85          10160.00    0.2549870        0.2491330
##   tether_price yes_tether_price ethereum_price yes_ethereum_price
## 1     1.040000         1.007920         434.41             428.59
## 2     1.007920         1.003229         428.59             463.28
## 3     1.003229         1.001066         463.28             470.20
## 4     1.001066         1.007660         470.20             465.85
## 5     1.007660         1.007805         465.85             463.45
## 6     1.007805         1.006522         463.45             466.54
##   dash_price yes_dash_price bc_cash_price yes_bc_cash_price litecoin_price
## 1     697.90         700.07       1330.93           1430.10          98.29
## 2     700.07         756.36       1430.10           1501.85         100.35
## 3     756.36         774.01       1501.85           1576.92         102.40
## 4     774.01         768.88       1576.92           1559.93         104.24
## 5     768.88         778.43       1559.93           1434.98         101.26
## 6     778.43         797.53       1434.98           1462.68         100.28
##   yes_litecoin_price
## 1             100.35
## 2             102.40
## 3             104.24
## 4             101.26
## 5             100.28
## 6              99.00
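
The date-shift trick can be sketched on three rows. As before, base R `merge()` stands in for `left_join()`, and the names `today`, `prior` and `paired` are hypothetical.

```r
# Toy daily prices, newest first (Bitcoin values from the table above).
today <- data.frame(Date = as.Date(c("2017-12-07", "2017-12-06", "2017-12-05")),
                    price = c(14850.76, 12679.90, 11617.12))

# Copy the table, move every date forward by one day, and rename the price.
prior <- data.frame(Date = today$Date + 1, yes_price = today$price)

# Matching on Date now pairs each day with the previous day's price.
paired <- merge(today, prior, by = "Date", all.x = TRUE)
paired <- paired[order(paired$Date, decreasing = TRUE), ]
print(paired)
```

One benefit of joining on the shifted date, rather than simply offsetting rows by one, is that a missing calendar day produces an `NA` instead of silently pairing prices that are two days apart.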

Calculate relative change

r_cyptocurrencies %<>% mutate(bitcoin_change = ((bitcoin_price - yes_bitcoin_price)/yes_bitcoin_price)*100,
                              ripple_change =  ((ripple_price - yes_ripple_price)/yes_ripple_price)*100,
                              tether_change =  ((tether_price - yes_tether_price)/yes_tether_price)*100,
                              ethereum_change =  ((ethereum_price - yes_ethereum_price)/yes_ethereum_price)*100,
                              dash_change =  ((dash_price - yes_dash_price)/yes_dash_price)*100,
                              bc_cash_change =  ((bc_cash_price - yes_bc_cash_price)/yes_bc_cash_price)*100,
                              litecoin_change =  ((litecoin_price - yes_litecoin_price)/yes_litecoin_price)*100)
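
Each of these columns applies the same formula: today’s price minus the prior day’s price, divided by the prior day’s price, times 100. A small sketch with a hypothetical helper `pct_change()`, checked against the Bitcoin prices for 2017-12-07 and 2017-12-06 shown earlier:

```r
# Percentage change from the previous day's price to today's price.
pct_change <- function(today, yesterday) {
  (today - yesterday) / yesterday * 100
}

# Bitcoin on 2017-12-07 versus 2017-12-06.
print(pct_change(14850.76, 12679.90))

# A prior price of zero gives Inf, and a missing prior price gives NA.
print(pct_change(5, 0))
print(pct_change(5, NA))
```

The `Inf` and `NA` cases are the reason the later summaries filter on `is.finite()`.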

Create a new table with just the relative changes.

relative_change <- select(r_cyptocurrencies, "Date", ends_with("change"))
relative_change_ <- relative_change
# Replace missing values (days with no prior-day price) with 0
relative_change_[is.na(relative_change_)] <- 0
head(relative_change_)
##         Date bitcoin_change ripple_change tether_change ethereum_change
## 1 2017-12-07     17.1204820   -3.87519617    3.18279229       1.3579412
## 2 2017-12-06      9.1483478   -5.16303368    0.46759015      -7.4879123
## 3 2017-12-05      3.4331015   -0.65668767    0.21606967      -1.4717142
## 4 2017-12-04      0.6718006   -1.55564654   -0.65438739       0.9337770
## 5 2017-12-03      2.1773365    0.09588724   -0.01443731       0.5178552
## 6 2017-12-02      7.4689469    2.34974893    0.12756806      -0.6623226
##   dash_change bc_cash_change litecoin_change
## 1  -0.3099690      -6.934480      -2.0528151
## 2  -7.4422233      -4.777441      -2.0019531
## 3  -2.2803323      -4.760546      -1.7651573
## 4   0.6672042       1.089151       2.9429192
## 5  -1.2268284       8.707438       0.9772637
## 6  -2.3948942      -1.893784       1.2929293

Round the percentages

# Round every change column (all columns except Date) to 4 decimal places
relative_change_[-1] <- round(relative_change_[-1], 4)

Analysis

Relative Change

Tidy the data for analysis. The mean relative change shows the average changes.

library(tidyr)
# Reshape to long form: one row per Date and Currency
tidy_change <- gather(relative_change_, key = "Currency", value = "relative_change", -Date)
tidy_change %>% filter(relative_change != 0, is.finite(relative_change)) %>% 
  group_by(Currency) %>%  summarise(mean = mean(relative_change), count = n())
## # A tibble: 7 x 3
##          Currency        mean count
##             <chr>       <dbl> <int>
## 1  bc_cash_change  1.97931085   129
## 2  bitcoin_change -0.05745514  2249
## 3     dash_change  0.98085324  1360
## 4 ethereum_change  0.93719399   849
## 5 litecoin_change  0.44710412  1602
## 6   ripple_change  0.44249058  1581
## 7   tether_change  0.02202607   399
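
For readers less familiar with tidying, here is a rough base R sketch of the same two steps, reshaping wide to long and then averaging per currency. It uses `stack()` and `aggregate()` in place of `gather()` and `summarise()`, and the values are made up.

```r
# Toy wide table of daily percentage changes (hypothetical values).
wide <- data.frame(Date = as.Date(c("2017-12-07", "2017-12-06")),
                   bitcoin_change = c(17.12, 9.15),
                   tether_change  = c(3.18, 0.47))

# Long ("tidy") form: one row per Date and Currency.
long <- data.frame(Date = rep(wide$Date, times = 2),
                   stack(wide[, c("bitcoin_change", "tether_change")]))
names(long) <- c("Date", "relative_change", "Currency")

# Mean change per currency, the base R analogue of group_by() + summarise().
means <- aggregate(relative_change ~ Currency, data = long, FUN = mean)
print(long)
print(means)
```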

From the data, we can see that Bitcoin has a negative mean relative change. This might be due to a few extreme values skewing the mean. Let’s exclude very large percentage changes.

# Keep only finite, non-zero changes between -50% and 50%
tidy_change %>% filter(relative_change != 0, is.finite(relative_change), relative_change < 50, relative_change > -50) %>% 
  group_by(Currency) %>%  summarise(mean = mean(relative_change), count = n())
## # A tibble: 7 x 3
##          Currency       mean count
##             <chr>      <dbl> <int>
## 1  bc_cash_change 1.16503465   127
## 2  bitcoin_change 0.41088594  2233
## 3     dash_change 0.57381055  1355
## 4 ethereum_change 0.90714835   846
## 5 litecoin_change 0.22708623  1598
## 6   ripple_change 0.29571635  1578
## 7   tether_change 0.02202607   399

When we limit the relative changes to between -50 and 50, the means change substantially even though only a handful of observations are removed.
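
A toy illustration of why a few extreme days can move the mean so much. The numbers below are made up, not taken from the data.

```r
# Hypothetical daily changes in percent; one extreme day dominates the mean.
changes <- c(1.2, -0.8, 0.5, -1.1, 0.9, -95)

within_limit <- changes > -50 & changes < 50
mean_all     <- mean(changes)
mean_limited <- mean(changes[within_limit])
kept         <- sum(within_limit)

print(c(mean_all = mean_all, mean_limited = mean_limited, kept = kept))
```

Dropping a single observation out of six flips the sign of the mean here, which is the same pattern as the Bitcoin row in the tables above.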

# Keep only finite, non-zero changes between -10% and 10%
tidy_change %>% filter(relative_change != 0, is.finite(relative_change), relative_change < 10, relative_change > -10) %>% 
  group_by(Currency) %>%  summarise(mean = mean(relative_change), count = n())
## # A tibble: 7 x 3
##          Currency        mean count
##             <chr>       <dbl> <int>
## 1  bc_cash_change -0.71430515    97
## 2  bitcoin_change  0.36466306  2152
## 3     dash_change -0.12132624  1189
## 4 ethereum_change -0.12964241   738
## 5 litecoin_change -0.08903114  1461
## 6   ripple_change -0.14734908  1465
## 7   tether_change  0.02202607   399

When we limit the relative change to between -10 and 10, most of the means are negative. For 5 of the 7 currencies, this suggests that their overall gains come from a few large upward shifts, while on ordinary days they tend to lose value. The opposite holds for Bitcoin, which tends to gain on ordinary days and whose losses come from large drops in the market. Tether looks surprisingly stable. This may be due to it being a small player in the market, and it is also consistent with Tether being backed by US Dollars.

Visualization

ggplot(data =tidy_change[which(tidy_change$relative_change != 0),]) +
  geom_line(mapping = aes(x= Date, y= relative_change, color= Currency))+
  facet_wrap(~ Currency, nrow = 2)+
  ylim(-100,100)

From the visualization, we can see that the spread is very large for most of the currencies, with the exception of Tether. Next, let’s visualize the relative changes together.

ggplot(data =tidy_change[which(tidy_change$relative_change != 0),]) +
  geom_point(mapping = aes(x= Date, y= relative_change, color= Currency,fill = Currency),alpha = 1/5, position = "jitter") +
  ylim(-50,50)+
  theme(panel.background = element_rect(fill = 'white'))+
  ggtitle("Cryptocurrencies from 2011 to 2017",subtitle = "Relative Changes")
## Warning: Removed 33 rows containing missing values (geom_point).

Gain to Loss Ratio

Let’s dive deeper into how the currencies fluctuate and look at the gain to loss ratio. First, categorize each relative change as ‘gain’, ‘loss’ or ‘no change’.

tidy_change$lost_gain <- ifelse(tidy_change$relative_change > 0, "gain", 
                                ifelse(tidy_change$relative_change == 0, "no change", "loss"))
# Show a random sample of 10 of the first 150 rows (all of them Bitcoin)
tidy_change[sample(150, 10), ]
##           Date       Currency relative_change lost_gain
## 74  2017-09-25 bitcoin_change          3.0917      gain
## 27  2017-11-11 bitcoin_change         -5.3339      loss
## 11  2017-11-27 bitcoin_change          6.0537      gain
## 56  2017-10-13 bitcoin_change          9.6475      gain
## 114 2017-08-16 bitcoin_change          1.2907      gain
## 137 2017-07-24 bitcoin_change          0.1092      gain
## 55  2017-10-14 bitcoin_change          1.3287      gain
## 128 2017-08-02 bitcoin_change         -2.4200      loss
## 131 2017-07-30 bitcoin_change         -1.3529      loss
## 105 2017-08-25 bitcoin_change          3.4083      gain
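
As an aside, the same three-way labelling can be written without nested `ifelse()` calls by indexing a label vector with `sign()`. This is only an alternative sketch on made-up values, not the approach used in this analysis.

```r
# Hypothetical daily changes in percent.
changes <- c(3.09, -5.33, 0, 6.05, -2.42)

# sign() is -1, 0 or 1; adding 2 maps it to positions 1, 2 and 3.
labels <- c("loss", "no change", "gain")[sign(changes) + 2]

print(data.frame(changes, labels))
print(table(labels))
```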

Untidy the data for analysis

# All observations: count gains and losses per currency, then divide
ratio <- tidy_change %>% group_by(Currency) %>% count(lost_gain)
gain_loss_ratio <- ratio %>% spread(lost_gain, n) %>% mutate(ratio = gain/loss)
gain_loss_ratio <- select(gain_loss_ratio, Currency, ratio)

# Limit -100 to 100
ratio100 <- tidy_change %>% filter(relative_change < 100, relative_change > -100) %>% group_by(Currency) %>% count(lost_gain)
gain_loss_ratio100 <- ratio100 %>% spread(lost_gain, n) %>% mutate(ratio100 = gain/loss)
gain_loss_ratio100 <- select(gain_loss_ratio100, Currency, ratio100)

# Limit -10 to 10
ratio1 <- tidy_change %>% filter(relative_change < 10, relative_change > -10) %>% group_by(Currency) %>% count(lost_gain)
gain_loss_ratio1 <- ratio1 %>% spread(lost_gain, n) %>% mutate(ratio10 = gain/loss)
gain_loss_ratio1 <- select(gain_loss_ratio1, Currency, ratio10)

# Combine the three ratios; the rows are in alphabetical order of the *_change names
All_ratios <- inner_join(gain_loss_ratio, inner_join(gain_loss_ratio100, gain_loss_ratio1, by = "Currency"), by = "Currency")
All_ratios$Currency <- c("Bitcoin Cash", "Bitcoin", "Dash", "Ethereum","Litecoin","Ripple","Tether")
All_ratios
## # A tibble: 7 x 4
## # Groups:   Currency [?]
##       Currency     ratio  ratio100   ratio10
##          <chr>     <dbl>     <dbl>     <dbl>
## 1 Bitcoin Cash 0.8970588 0.8970588 0.7017544
## 2      Bitcoin 1.2507463 1.2520161 1.2510460
## 3         Dash 0.9209040 0.9180791 0.8405573
## 4     Ethereum 0.9698376 0.9698376 0.8403990
## 5     Litecoin 0.9418182 0.9406061 0.9023438
## 6       Ripple 0.8843862 0.8831943 0.8381430
## 7       Tether 1.1336898 1.1336898 1.1336898

The gain to loss ratio table shows that Bitcoin has the highest ratio, with about 25% more gain days than loss days. Second is Tether, with about 13% more gain days than loss days. The rest have had more loss days than gain days in their history.
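
The ratio counts days rather than the size of the moves: it is the number of gain days divided by the number of loss days. A minimal sketch on made-up labels:

```r
# Hypothetical day labels for one currency.
labels <- c("gain", "gain", "loss", "gain", "loss", "no change", "gain", "loss")

counts <- table(labels)
ratio  <- as.numeric(counts["gain"] / counts["loss"])

# A ratio above 1 means more gain days than loss days.
print(counts)
print(ratio)
```

Because only the counts matter, a currency can have a ratio below 1 and still rise overall if its gain days are much larger than its loss days.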

Let’s visualize the results.

ratio_bar <- ggplot(All_ratios, mapping = aes(y= ratio,x= Currency, fill=Currency))  +
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Gain to Loss Ratio")+
  labs(y= "Ratio")
ratio_bar + coord_flip()

ratio_bar100 <- ggplot(All_ratios, mapping = aes(y= ratio100,x= Currency, fill=Currency))  +
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Gain to Loss Ratio: Relative Change Limit (-100:100)")+
  labs(y= "Ratio")
ratio_bar100 + coord_flip()

ratio_bar1 <- ggplot(All_ratios, mapping = aes(y= ratio10,x= Currency, fill=Currency))+
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Gain to Loss Ratio: Relative Change Limit (-10:10)")+
  labs(y= "Ratio")
ratio_bar1 + coord_flip()

When we compare the ratio over all observations with the ratio limited to changes between -10 and 10, we can see a noticeable drop for Ripple, Litecoin, Ethereum, Dash and Bitcoin Cash.

Probability

Let’s look at the probability of a currency increasing on a given day, based on the count of gains and losses.

gain_loss_counts <- tidy_change %>% group_by(Currency, lost_gain) %>% summarise(n=n())
gain_loss_counts <- spread(gain_loss_counts, lost_gain, n)
gain_loss_counts <- mutate(gain_loss_counts, probability = (gain/(gain+loss))) 
probability <- select(gain_loss_counts, Currency, probability)

gain_loss_counts100 <- tidy_change %>% filter(relative_change < 100, relative_change > -100) %>% 
                      group_by(Currency, lost_gain) %>% summarise(n=n())
gain_loss_counts100 <- spread(gain_loss_counts100, lost_gain, n)
gain_loss_counts100 <- mutate(gain_loss_counts100, probability100 = (gain/(gain+loss)))
probability100 <- select(gain_loss_counts100, Currency, probability100)

gain_loss_counts10 <- tidy_change %>% filter(relative_change < 10, relative_change > -10) %>%
             group_by(Currency, lost_gain) %>% summarise(n=n())
gain_loss_counts10 <- spread(gain_loss_counts10, lost_gain, n)
gain_loss_counts10 <- mutate(gain_loss_counts10, probability10 = (gain/(gain+loss)))
probability10 <- select(gain_loss_counts10, Currency, probability10)

All_probability <- inner_join(probability,inner_join(probability100,probability10,by = "Currency"),by= "Currency")
All_probability$Currency <- c("Bitcoin Cash", "Bitcoin", "Dash", "Ethereum","Litecoin","Ripple","Tether")
All_probability
## # A tibble: 7 x 4
## # Groups:   Currency [?]
##       Currency probability probability100 probability10
##          <chr>       <dbl>          <dbl>         <dbl>
## 1 Bitcoin Cash   0.4728682      0.4728682     0.4123711
## 2      Bitcoin   0.5557029      0.5559534     0.5557621
## 3         Dash   0.4794118      0.4786451     0.4566863
## 4     Ethereum   0.4923439      0.4923439     0.4566396
## 5     Litecoin   0.4850187      0.4846971     0.4743326
## 6       Ripple   0.4693232      0.4689873     0.4559727
## 7       Tether   0.5313283      0.5313283     0.5313283

The probability table shows that Bitcoin has the highest chance of increasing on a given day, at about 55.6%, and that stays consistent when outliers are removed. Second is Tether with about 53%. The rest are in the 47%-49% range and decrease further when outliers are removed. Out of the remaining 5, Litecoin is the most consistent, dropping from 48.5% to 47.4%.
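
The probabilities are tied directly to the gain to loss ratios. With p = gain / (gain + loss) and r = gain / loss, it follows that p = r / (1 + r); days with no change are left out of both. A quick sketch using three ratios from the ratio table:

```r
# Gain to loss ratios from the ratio table (all observations).
ratio <- c(Bitcoin = 1.2507463, Tether = 1.1336898, Litecoin = 0.9418182)

# With p = gain / (gain + loss) and r = gain / loss, p = r / (1 + r).
probability <- ratio / (1 + ratio)
print(round(probability, 4))
```

A ratio of exactly 1 corresponds to a probability of 0.5, so the two tables always agree on which currencies gain more often than they lose.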

Let’s visualize the results.

probability_bar <- ggplot(All_probability, mapping = aes(y=probability,x= Currency, fill=Currency))  +
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Probability of Increase within a day")+
  labs(y= "Probability") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

probability_bar

probability_bar100 <- ggplot(All_probability, mapping = aes(y= probability100,x= Currency, fill=Currency))  +
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Probability of Increase within a day: Relative Change Limit (-100:100)")+
  labs(y= "Probability") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
probability_bar100

probability_bar1 <- ggplot(All_probability, mapping = aes(y=probability10,x= Currency, fill=Currency))+
  geom_bar(stat = "identity") +
  ggtitle("Cryptocurrencies from 2011-2017", subtitle = "Probability of Increase within a day: Relative Change Limit (-10:10)")+
  labs(y= "Probability")+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
probability_bar1

Conclusion

Based on the day to day relative change of the currencies, we have some insight into how the market fluctuates. The relative changes showed an inverse pattern between Bitcoin and most of the other currencies. Bitcoin Cash, Dash, Ethereum, Litecoin, and Ripple decrease on most days, with a few large increases making up for the losses, while Bitcoin and Tether increase on most days, with large drops evening things out.

When we looked at the gain to loss ratio, Bitcoin and Tether had more gains than losses, while the other 5 had a ratio below 1. This supports the previous conclusion, and the probabilities agree: Bitcoin and Tether each have a better than 50% chance of increasing on a given day.

For someone who is interested in investing in cryptocurrency, based on the analysis above, their best bet would be to invest in Bitcoin, since it has the highest gain to loss ratio and the highest probability of a daily increase. Since there isn’t much more information on Tether, more in-depth research needs to be done before drawing a conclusion. If someone is looking to invest in another currency, the data points to Litecoin, which had the highest probability of a daily increase among the remaining five once outliers were removed.

Overall, since the market moves so fast, a day to day analysis might not be enough to reach an accurate conclusion.