Introduction


After the packages comes some code that you will also need to copy if you want to reproduce this analysis on your own computer.

Initial Look


Let’s take a look at our time series:

Note that the first half of the data shows almost no variation in price, while the second half looks completely different. This matters when we fit a model: whatever outside forces shaped the price variation in the latter half were not at play before 2017. For this reason, it is probably safe to look only at data from 2017 onward, because whatever was happening to Bitcoin before then is no longer happening. The second plot shows the time series we will be studying.


All The Data

Date Range: 08/28/2011 - 10/05/2019

Starting 01/01/2017

Date Range: 01/01/2017 - 10/05/2019
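The restriction to data from 2017 onward can be sketched in a few lines of base R. This is only an illustration: the `prices` table below is simulated, and its column names (`date`, `close`) are assumptions, not the actual data set used in this analysis.

```r
# Simulated daily price table (hypothetical stand-in for the real Bitcoin data).
set.seed(1)
dates  <- seq(as.Date("2011-08-28"), as.Date("2019-10-05"), by = "day")
prices <- data.frame(
  date  = dates,
  close = exp(cumsum(rnorm(length(dates), mean = 0, sd = 0.03)))
)

# Keep only observations on or after 1 January 2017.
recent <- prices[prices$date >= as.Date("2017-01-01"), ]

range(recent$date)   # first and last date kept
nrow(recent)         # number of daily observations kept
```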

Initial Analysis


Now that we have the data we want to look at, let us see what it is trying to tell us.



Scatter Plot

This first plot below helps show the correlation between the prices of Bitcoin with lagged versions of itself. As you can see, it is clear that all the data is highly correlated with itself for at least the first few lags. This plot is helpful, but we can’t conclude anything until we look at the autocorrelation (ACF) and partial autocorrelation (PACF) plots…
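To make the idea of "correlation with lagged versions of itself" concrete, here is a small sketch in base R. It uses a simulated random walk rather than the real prices, and `lag_cor` is a hypothetical helper written for this illustration.

```r
set.seed(42)
x <- cumsum(rnorm(1000))   # simulated random walk standing in for the price series

# Correlation between the series and a copy of itself shifted back by k steps.
lag_cor <- function(x, k) {
  n <- length(x)
  cor(x[(k + 1):n], x[1:(n - k)])
}

lag_cors <- sapply(1:4, function(k) lag_cor(x, k))
round(lag_cors, 3)

# lag.plot(x, lags = 4) draws the corresponding lagged scatter plots.
```

For a series that behaves like a random walk, these correlations stay close to 1 for the first several lags, which is the pattern described above.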

ACF and PACF

The ACF and PACF suggest this data is taking a Random Walk, so a single difference should be applied, and the two plots should be reviewed again.
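The effect of a single difference on the autocorrelations can be sketched with a simulated random walk (an assumption made purely for illustration; the real series is not reproduced here). The base R functions `acf()` and `diff()` are used with plotting switched off so that the numbers can be inspected directly.

```r
set.seed(42)
x <- cumsum(rnorm(1000))   # simulated random walk

# Autocorrelations at lags 1 to 10, dropping the trivial lag-0 value.
acf_levels <- acf(x, lag.max = 10, plot = FALSE)$acf[-1]
acf_diff   <- acf(diff(x), lag.max = 10, plot = FALSE)$acf[-1]

round(acf_levels, 2)   # decays very slowly for a random walk
round(acf_diff, 2)     # close to zero once the series is differenced

# pacf(x, plot = FALSE) returns the partial autocorrelations in the same way.
```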





After differencing the data once, the series looks a little like noise, but not quite. Let’s take a closer look at the data after it has been differenced once.



The First Difference









A look at the data after it has been differenced once shows that it is not stationary, because its variance is nowhere near constant. This explains why the ACF and PACF of the differenced data did not look like white noise. A transformation must be applied to the data to obtain a constant variance.

Transformation



The Box Cox Transformation has been used to solve many problems exactly like this one.


Box Cox Transformation
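For reference, the standard one-parameter Box-Cox transformation is usually written as follows. The notation is assumed here: y is the original price and y^(λ) is the transformed value.

```latex
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
  \ln y, & \lambda = 0
\end{cases}
```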






As you can see from the formulas above, in order to find the value of y, you first have to find the value of λ. This can be done in several ways, including trial and error, but a more efficient way is to use the function BoxCox.lambda(). When used on our data, we get the following value for λ:

## [1] -0.09363108

When this function is used along with the function BoxCox(), we can transform the entire data into something that hopefully has a constant variance.
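As a rough sketch of what the transformation does, the Box-Cox formula can be written by hand in base R. The helper `box_cox` and the example prices are hypothetical and exist only for this illustration; the value of λ is the one reported above.

```r
# Hand-written Box-Cox transform (hypothetical helper, for illustration only).
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

lambda <- -0.09363108                  # value of lambda reported above
prices <- c(1000, 4000, 8000, 19000)   # made-up prices in dollars

transformed <- box_cox(prices, lambda)
transformed

# With the forecast package installed, the equivalent calls would be
# forecast::BoxCox.lambda(prices) and forecast::BoxCox(prices, lambda).
```

Because λ is close to zero, the result behaves much like a log transform: large prices are compressed far more than small ones, which is what helps stabilise the variance.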




Before Transformation

After Transformation

Model Fitting


Although slightly fewer lags now have significant correlation values, our data still does not look completely stationary after it has been differenced once. This can be confirmed when we try to fit a model to our transformed data. Here, the auto.arima() function comes in very handy. When applied to data, it suggests the best-suited ARIMA model by minimizing an information criterion (AICc by default; AIC and BIC are reported alongside it). When this function is applied to our transformed data, here is the result:

## Series: bit_ts_tran 
## ARIMA(3,1,2) with drift 
## 
## Coefficients:
##          ar1      ar2      ar3      ma1     ma2  drift
##       0.0328  -0.9080  -0.0220  -0.0528  0.9467  1e-03
## s.e.  0.0456   0.0824   0.0329   0.0331  0.0657  6e-04
## 
## sigma^2 estimated as 0.00039:  log likelihood=2523.74
## AIC=-5033.48   AICc=-5033.36   BIC=-4999.08

Apparently, an ARIMA(3, 1, 2) model is best at the moment. When we use this model to fit our data, checkresiduals() is another useful function that will show us whether or not it’s a good fit. Let’s fit an ARIMA(3, 1, 2) model and see how it looks.
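To make the selection step more concrete, here is a simplified sketch of the kind of comparison that auto.arima() automates. It is not the function's actual search: it uses base R's `arima()` on a simulated series, compares only three hand-picked orders, and ranks them by plain AIC. Note also that `arima()` with d = 1 does not estimate a drift term.

```r
set.seed(7)
x <- cumsum(rnorm(500, mean = 0.001, sd = 0.02))   # simulated stand-in series

# Candidate (p, d, q) orders; auto.arima() searches a much larger space.
orders <- list(c(0, 1, 0), c(1, 1, 0), c(0, 1, 1))

# Fit each candidate and record its AIC.
aic_values <- sapply(orders, function(ord) AIC(arima(x, order = ord)))
names(aic_values) <- sapply(orders, paste, collapse = ",")

aic_values
names(which.min(aic_values))   # order with the lowest AIC
```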




So this function shows us a number of valuable things. At the top, you get a plot of the residuals. This allows you to visually inspect the outcome to see if it at least resembles something similar to white noise. Here we can see there is a constant mean, and possibly a constant variance.

Under that and to the left we see the ACF plotted for the residuals. Of all the ACF plots we’ve seen so far, this one most resembles white noise.

To the right we are shown a histogram of the residuals. Here our residuals look close to a normal distribution, which is the desired result, but there are still a good number of outliers.

Finally, a Ljung-Box test is performed and the results appear at the very bottom. The most important result of this test is the p-value. The null hypothesis is that the residuals are uncorrelated, so a p-value of 5% or lower is the usual sign of a poorly fitting model. The p-value obtained for this model is about 6.1%. Although this is above the threshold, it is still quite low.

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,2) with drift
## Q* = 8.9981, df = 4, p-value = 0.06115
## 
## Model df: 6.   Total lags used: 10
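The p-value in this output can be checked by hand. The test statistic is compared against a chi-squared distribution whose degrees of freedom are the number of lags used minus the model's degrees of freedom, here 10 - 6 = 4. A minimal sketch using only the numbers printed above:

```r
# Values taken from the test output above.
q_stat    <- 8.9981
lags_used <- 10
model_df  <- 6

test_df <- lags_used - model_df   # 4 degrees of freedom
p_value <- pchisq(q_stat, df = test_df, lower.tail = FALSE)
round(p_value, 5)                 # agrees with the reported p-value of 0.06115

# On a fitted model, the same test could be run directly with
# Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 6),
# where `fit` is a hypothetical fitted model object.
```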


Overall, this model is a decent fit for our data once it has been transformed, but we may be able to do a little better.



One approach to finding a better model is to revisit the stationarity of our transformed data. Taking another look at the differenced series, the claim that the variance is constant throughout may, in fact, be a stretch.



Here we can see that the left half of the data may have a larger variance than the right half, meaning that in 2017 and 2018 the price of Bitcoin was much more volatile than it has been recently. This explains why the ACF and PACF of this differenced data do not yet look like noise. We already removed a significant part of the data once before: in the beginning, we decided to remove the first few years of data because they did not at all resemble the current data. We removed them mainly for fear of overfitting any future model by providing it with data that doesn’t necessarily need to be analyzed. We may be able to remove some past data once more.
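A quick numerical way to back up this kind of visual impression is to compare the standard deviation of the two halves of the differenced series. The sketch below uses simulated data in which the first half is deliberately noisier, so it only illustrates the check and says nothing about the actual Bitcoin values.

```r
set.seed(3)
# Simulated differenced series whose first half is noisier than its second half.
d <- c(rnorm(500, sd = 0.03), rnorm(500, sd = 0.01))

half      <- length(d) %/% 2
sd_first  <- sd(d[1:half])
sd_second <- sd(d[(half + 1):length(d)])

c(first_half = sd_first, second_half = sd_second)
```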


Removing data may sound counterintuitive, but it is logical. The argument is this: if the data represents observations of some underlying model, then that model should be present in both past and current data. In other words, if we look only at the data for this year and it too is well fit by the model created using the entire data, in our case an ARIMA(3, 1, 2) model, then we have found our model. If a different model shows up more often across subsets of the data, that more frequent model may be the best model for the entire data.


Let’s look at the data for only 2019, and let’s transform it using the same method as before:

Just The Data

The Data Differenced

Results


The ACF and PACF of our differenced data may suggest that after we transform and difference the data one time, white noise may finally be what we are left with. To be certain, we can use the auto.arima() function again to see which model is recommended.

## Series: bit_ts_tran2 
## ARIMA(0,1,0) 
## 
## sigma^2 estimated as 8.162e-05:  log likelihood=910.85
## AIC=-1819.7   AICc=-1819.69   BIC=-1816.08

As suspected, a Random Walk may be the model we need to fit the Bitcoin data. Before we can conclude this, we must look at a few things.

First, let’s check the residuals:

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,0)
## Q* = 15.771, df = 10, p-value = 0.1064
## 
## Model df: 0.   Total lags used: 10

All of the residual plots seem to look good, and our p-value looks a little better than before as well. This is a great sign because a Random Walk is much simpler than an ARIMA(3, 1, 2), and it is always suggested that a model should be as simple as possible.

Although we have reason to believe 2019’s Bitcoin price takes a Random Walk, we should check whether the past price data shows similar signs.


Everything After 2017

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,0) with drift
## Q* = 15.445, df = 9, p-value = 0.07943
## 
## Model df: 1.   Total lags used: 10

Conclusion


A quick look at the residuals above provides strong evidence that, although the entire data may show signs of a complex trend, Bitcoin is intrinsically taking a Random Walk. The algorithm within auto.arima() at first suggested that Bitcoin was following an ARIMA(3, 1, 2) model, but a Random Walk was consistently present for every year following 2017. To be specific, it is a Random Walk with drift. Let’s take another look at this model:

## Series: bit_ts_tran 
## ARIMA(0,1,0) with drift 
## 
## Coefficients:
##       drift
##       1e-03
## s.e.  6e-04
## 
## sigma^2 estimated as 0.0003913:  log likelihood=2519.61
## AIC=-5035.22   AICc=-5035.21   BIC=-5025.39
## 
## Training set error measures:
##                        ME       RMSE        MAE          MPE      MAPE
## Training set 5.059098e-06 0.01976218 0.01359546 0.0002911483 0.2328272
##                   MASE        ACF1
## Training set 0.9946843 -0.02040916
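The AIC values reported for the two models can be reproduced from their log likelihoods, which makes the comparison between them easy to verify. The sketch below uses the definition AIC = 2k - 2 log L, where k counts every estimated parameter including the noise variance; the helper name `aic` is just for this illustration.

```r
# AIC = 2k - 2 * logLik, where k counts estimated parameters (including sigma^2).
aic <- function(loglik, k) 2 * k - 2 * loglik

aic_arima312 <- aic(2523.74, k = 7)   # 3 AR + 2 MA + drift + sigma^2
aic_rw_drift <- aic(2519.61, k = 2)   # drift + sigma^2

c(arima312 = aic_arima312, rw_drift = aic_rw_drift)
aic_rw_drift < aic_arima312           # TRUE: the simpler model has the lower AIC
```

The ARIMA(3, 1, 2) model has a slightly higher likelihood, but not by enough to pay for its five extra parameters, so the Random Walk with drift comes out ahead.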


Our drift is small, but positive. This suggests if a linear trend is present, it is very small and upward. A more formal way of presenting this model would look something like this:
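For reference, the usual textbook form of a Random Walk with drift, written for the transformed series, is given below. The symbols are notation assumed here: w_t is the Box-Cox-transformed price on day t, c is the drift, and ε_t is white noise. The numerical values are the estimates from the output above.

```latex
w_t = w_{t-1} + c + \varepsilon_t,
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2),
\qquad c \approx 0.001, \quad \sigma^2 \approx 0.0003913
```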

It is important to remember that this formula only works after the data has been transformed. In the Forecasting section, I will create a function that does all this automatically. To use this simple model to create better predictions, we can learn a few things about the “Random Noise” that plays a part.

## Standard Deviation =  0.019772
## Mean = 5.059098e-06


The average of the noise should be zero if it is white noise, which is what we were going for, but the standard deviation is deceptively small. This is because the value doesn’t mean much until it has been transformed back to the original scale. Let’s create an inverse BoxCox function. Then we can see what the standard deviation of the noise is in units of dollars.

## [1] 1.019987

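Read literally, this puts the standard deviation of the “Random Noise” at about $1. That figure should be treated with caution, however: the Box-Cox transformation is nonlinear, so the dollar size of a one-standard-deviation step depends on the current price level, and 1.02 is simply the price that a transformed value of 0.0198 maps back to.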
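A minimal sketch of such an inverse function is shown below. The helper names `box_cox` and `inv_box_cox` are hypothetical and are not the code used to produce the output above, although applying the inverse to the reported standard deviation does give the same number.

```r
# Forward and inverse Box-Cox transforms (hypothetical helpers).
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

lambda <- -0.09363108

back <- inv_box_cox(0.019772, lambda)
back   # approximately 1.02, matching the value printed above

# Round trip: transforming a price and inverting it recovers the price.
inv_box_cox(box_cox(8000, lambda), lambda)

# With the forecast package installed, forecast::InvBoxCox(w, lambda)
# performs the same inversion.
```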





The Efficient Market Hypothesis states that a financial asset’s value reflects all the information available and instantly responds only to recent and unexpected news. To put it simply, a stock’s price can always be viewed as the true value of the company it represents, and it changes over time in a way similar to a Random Walk. Like stocks, Bitcoin is widely traded in the hope of making a profit. Unlike stocks, however, Bitcoin does not represent any company and doesn’t produce any goods. This may lead one to assume that its price changes due to other factors that this infamous hypothesis has failed to point out, but it seems as though that is not the case. From this analysis, it is reasonable to conclude that Bitcoin is just another asset taking a Random Walk down Wall Street.

Forecasting


So what has happened since I started this project?



Well, 40 trading days have passed, to be exact. This means the price of Bitcoin has changed at most 40 times, so we can check whether our model successfully predicted those 40 days. If you recall, our data stopped on October 5th, so that is where we will pick up.

Date Range: 08/28/2011 - 11/14/2019

Just for fun, let’s see how the 100-day prediction compares to the past 40 days of actual data. Feel free to zoom into the plot so you can get a better look.
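As a rough idea of how a forecast from a Random Walk with drift can be produced, here is a sketch in base R. It is not the function used to make the plot: `forecast_rw_drift` and the two transform helpers are hypothetical, the starting price of $8,000 is made up, and the interval ignores the uncertainty in the estimated drift. The drift, variance, and λ are the estimates reported earlier.

```r
box_cox     <- function(y, lambda) if (lambda == 0) log(y) else (y^lambda - 1) / lambda
inv_box_cox <- function(w, lambda) if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)

# Hypothetical helper: h-step forecast from a random walk with drift on the
# transformed scale, back-transformed to dollars.
forecast_rw_drift <- function(last_price, h, lambda, drift, sigma, level = 0.95) {
  z      <- qnorm(1 - (1 - level) / 2)      # normal quantile for the interval
  steps  <- seq_len(h)
  w0     <- box_cox(last_price, lambda)     # last observation, transformed
  centre <- w0 + steps * drift              # point forecast grows linearly
  spread <- z * sigma * sqrt(steps)         # uncertainty grows with sqrt(h)
  data.frame(
    step  = steps,
    lower = inv_box_cox(centre - spread, lambda),
    point = inv_box_cox(centre, lambda),
    upper = inv_box_cox(centre + spread, lambda)
  )
}

fc <- forecast_rw_drift(last_price = 8000, h = 100, lambda = -0.09363108,
                        drift = 0.001, sigma = sqrt(0.0003913))
head(fc, 3)
tail(fc, 1)
```

The point forecast barely moves, while the interval widens steadily with the horizon, which is the characteristic shape of a Random Walk forecast.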







 




A work by Timothy Sumner