Forecasting Bitcoin Prices with Time Series

Natália Faraj Murad

The goal here is to predict Bitcoin prices using time series models. The Bitcoin data were downloaded through the Yahoo Finance API, then explored and preprocessed to serve as model input. Four models were fitted and evaluated: Naive Bayes, ARIMA, SARIMAX and Prophet. All of them produced good results except Naive Bayes. The models were built on daily data.

Load the data

Using the Yahoo Finance API to get the Bitcoin price data.

Setting Date as the index.
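The indexing step can be sketched as follows. This is a minimal illustration, assuming the downloaded table has a `Date` column and a `Close` column; the three rows are made-up placeholder values, not real Bitcoin prices, and the variable names are hypothetical.

```python
import pandas as pd

# Placeholder rows standing in for the downloaded data (values are invented).
raw = pd.DataFrame(
    {
        "Date": ["2021-01-03", "2021-01-01", "2021-01-02"],
        "Close": [99.8, 100.0, 101.5],
    }
)

# Parse the date strings, use them as the index, and sort chronologically.
raw["Date"] = pd.to_datetime(raw["Date"])
prices = raw.set_index("Date").sort_index()

# Declare the daily frequency explicitly so later steps know the spacing.
prices = prices.asfreq("D")

print(prices)
```

A `DatetimeIndex` with an explicit daily frequency makes date-based slicing and resampling straightforward in the later steps.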

Stationarity

The series must be stationary before it is passed to the model. A series is stationary when its statistical properties, such as mean and variance, remain constant over time. This series will be differenced to make it stationary.
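To make the definition concrete, the sketch below compares the mean and variance of the first and second half of a series, before and after differencing. It uses a synthetic random walk with drift rather than the Bitcoin data, and `half_stats` is a hypothetical helper. This is only an informal check; a formal unit-root test such as the augmented Dickey-Fuller test is the more common choice, and it is not confirmed here which one the notebook used.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic random walk with drift: its mean level keeps moving over time.
steps = 0.5 + rng.normal(0.0, 1.0, size=400)
levels = pd.Series(np.cumsum(steps))

# First difference: y[t] - y[t-1]; the leading NaN is dropped.
diffed = levels.diff().dropna()


def half_stats(series):
    """Return (mean, variance) for the first and the second half of a series."""
    mid = len(series) // 2
    first, second = series.iloc[:mid], series.iloc[mid:]
    return (first.mean(), first.var()), (second.mean(), second.var())


(lm1, lv1), (lm2, lv2) = half_stats(levels)
(dm1, dv1), (dm2, dv2) = half_stats(diffed)

print(f"levels: mean {lm1:.1f} -> {lm2:.1f}, variance {lv1:.1f} -> {lv2:.1f}")
print(f"diffs:  mean {dm1:.2f} -> {dm2:.2f}, variance {dv1:.2f} -> {dv2:.2f}")
```

The level series shows a clear shift in mean between the two halves, while the differenced series stays roughly constant, which is the behaviour differencing is meant to achieve.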

Decomposition

This series shows an increasing trend since 2020. It also has an annual seasonal component and random residuals.
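The three components can be illustrated with a hand-rolled classical additive decomposition. This is a sketch on synthetic data, not the decomposition used in the notebook: the period of 7 is an assumption chosen to keep the example small (an annual cycle on daily data would use a period of about 365), and library routines such as `seasonal_decompose` in statsmodels perform a similar calculation.

```python
import numpy as np
import pandas as pd

period = 7  # hypothetical short cycle to keep the example small
n = 10 * period
t = np.arange(n)

# Synthetic series: linear trend + repeating pattern + small noise.
rng = np.random.default_rng(1)
pattern = np.array([3.0, 1.0, -2.0, -4.0, -1.0, 1.0, 2.0])  # sums to zero
series = pd.Series(0.5 * t + pattern[t % period] + rng.normal(0.0, 0.1, n))

# Trend: centered moving average over one full cycle.
trend = series.rolling(window=period, center=True).mean()

# Seasonal: average detrended value at each position in the cycle.
detrended = series - trend
seasonal_means = detrended.groupby(t % period).mean()
seasonal_means = seasonal_means - seasonal_means.mean()  # center around zero
seasonal = pd.Series(seasonal_means.to_numpy()[t % period], index=series.index)

# Residual: whatever trend and seasonality do not explain.
residual = series - trend - seasonal

print(seasonal_means.round(2).tolist())
```

The trend is undefined for the first and last few points because the centered window does not fit there, so the residual has missing values at both ends.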

Autocorrelation

This series is strongly autocorrelated.
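As an illustration of what strong autocorrelation looks like, the sketch below computes the correlation between a series and lagged copies of itself with pandas `Series.autocorr`. The data are a synthetic random walk, not the Bitcoin series, and the lags of 1, 7 and 30 are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Synthetic random walk: each value is the previous one plus noise.
walk = pd.Series(np.cumsum(rng.normal(0.0, 1.0, size=500)))

# Its first difference is plain noise with no memory.
noise = walk.diff().dropna()

for lag in (1, 7, 30):
    print(
        f"lag {lag:>2}: walk={walk.autocorr(lag=lag):.3f}  "
        f"noise={noise.autocorr(lag=lag):.3f}"
    )
```

A price-like series typically behaves like the random walk here, with autocorrelation close to 1 at short lags, whereas the differenced series shows values near zero.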

Box-Cox Transformation

Applying the Box-Cox transformation because the series does not have constant variance.
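For reference, the Box-Cox transform of a positive value x is (x^λ - 1) / λ for λ ≠ 0 and log(x) for λ = 0. The sketch below implements the transform and its inverse directly in NumPy. The value λ = 0.2 and the sample values are assumptions for illustration; in practice λ is usually estimated from the data, for example by `scipy.stats.boxcox`.

```python
import numpy as np


def boxcox(x, lam):
    """Box-Cox transform for strictly positive values."""
    x = np.asarray(x, dtype=float)
    if lam == 0:
        return np.log(x)
    return (x ** lam - 1.0) / lam


def inv_boxcox(y, lam):
    """Inverse of the Box-Cox transform."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


lam = 0.2  # hypothetical value, not estimated from the Bitcoin data
values = np.array([100.0, 400.0, 1600.0, 6400.0])  # invented sample values

transformed = boxcox(values, lam)
restored = inv_boxcox(transformed, lam)

print(transformed.round(2))
print(restored.round(2))
```

The transform compresses large values more than small ones, which is why it helps when the variance grows with the level of the series.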

Differencing

Defining a function to difference the series and make it stationary, and a function to recover the original values later.
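A minimal sketch of such a pair of functions is shown below, assuming first-order differencing with lag 1. The names `difference` and `invert_difference` are hypothetical and may not match the notebook's own functions.

```python
import numpy as np


def difference(values):
    """First difference: values[t] - values[t-1]."""
    values = np.asarray(values, dtype=float)
    return values[1:] - values[:-1]


def invert_difference(diffs, start):
    """Rebuild the levels from differences.

    `start` is the original value just before the first difference.
    """
    return start + np.cumsum(diffs)


series = np.array([10.0, 12.0, 11.0, 15.0, 18.0])
diffs = difference(series)                      # [ 2., -1.,  4.,  3.]
rebuilt = invert_difference(diffs, series[0])   # [12., 11., 15., 18.]

print(diffs)
print(rebuilt)
```

The inverse needs one original value as an anchor, because differencing discards the level of the series and keeps only the changes.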

Function to evaluate performance of the model
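An evaluation function of this kind might look like the sketch below. MAPE is the metric quoted later in this document; MAE and RMSE are added here as common companions and are an assumption, as is the function name `evaluate`.

```python
import numpy as np


def evaluate(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for a set of predictions."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        # MAPE divides by the actual values, so they must be non-zero.
        "MAPE": float(np.mean(np.abs(errors / actual)) * 100.0),
    }


# Invented values for illustration only.
scores = evaluate([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(scores)
```

Lower values are better for all three metrics, and MAPE has the advantage of being scale-free, which makes it easy to compare across models.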

Split Train and Test
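For time series the split has to be chronological rather than random, so that the model is never trained on observations that come after the ones it is tested on. The sketch below assumes the last observations are held out; the helper name and the test size of 20 are hypothetical.

```python
import numpy as np
import pandas as pd


def train_test_split_ts(series, test_size):
    """Chronological split: the last `test_size` observations form the test set."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series.iloc[:-test_size], series.iloc[-test_size:]


# Synthetic daily series used only to demonstrate the split.
index = pd.date_range("2021-01-01", periods=100, freq="D")
series = pd.Series(np.arange(100.0), index=index)

train, test = train_test_split_ts(series, test_size=20)
print(len(train), len(test))
print(train.index.max(), test.index.min())
```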

Naive Bayes
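Naive Bayes is usually a classification method, while the standard baseline in forecasting is the naive (persistence) forecast, which repeats the last observed value. The sketch below assumes that this baseline is what is meant here; this is a guess, and the function name is hypothetical.

```python
import numpy as np


def naive_forecast(train, horizon):
    """Persistence baseline: repeat the last training value `horizon` times."""
    train = np.asarray(train, dtype=float)
    return np.full(horizon, train[-1])


train = np.array([10.0, 12.0, 11.0, 15.0])  # invented values
forecast = naive_forecast(train, horizon=3)  # [15., 15., 15.]
print(forecast)
```

A baseline like this is useful mainly as a reference point: a more complex model is only worth keeping if it beats it on the test set.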

ARIMA

Plot of the predictions and the original series, using transformed values.

Recovering the original values to plot the graph and compare predictions.

ARIMA Model Performance

SARIMA - Seasonal ARIMA

Grid search to find the best parameters for fitting the model.
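A grid search over the (p, d, q) orders can be sketched as below. The selection criterion (AIC), the helper names, and the parameter ranges are assumptions, since the notebook's own search code is not shown. `sarimax_aic` uses the `SARIMAX` class from statsmodels and imports it lazily, so the generic `grid_search` helper can be used with any scoring function.

```python
import itertools


def sarimax_aic(series, order, seasonal_order):
    """Fit one SARIMAX candidate and return its AIC (lower is better)."""
    # Imported here so that grid_search itself does not require statsmodels.
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(series, order=order, seasonal_order=seasonal_order)
    return model.fit(disp=False).aic


def grid_search(score, p_values, d_values, q_values):
    """Try every (p, d, q) combination and keep the one with the lowest score."""
    best_order, best_score = None, float("inf")
    for order in itertools.product(p_values, d_values, q_values):
        try:
            current = score(order)
        except Exception:
            # Some combinations fail to fit; skip them and keep searching.
            continue
        if current < best_score:
            best_order, best_score = order, current
    return best_order, best_score
```

With a training series `train`, a call could look like `grid_search(lambda o: sarimax_aic(train, o, (1, 1, 1, 7)), range(3), range(2), range(3))`, where the seasonal order `(1, 1, 1, 7)` is a hypothetical choice. The number of fits grows quickly with the size of each range, so small ranges are advisable.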

Transforming back to the original values using the functions defined previously. First we invert the differencing, then the Box-Cox transformation.
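The order of the two inversions matters because the transformations were applied as Box-Cox first and differencing second, so they are undone in reverse. The sketch below shows this for predicted differences, using invented values, a hypothetical λ = 0.5, and perfect "predictions" as a stand-in for model output.

```python
import numpy as np

lam = 0.5  # hypothetical Box-Cox lambda


def boxcox(x, lam):
    return np.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def inv_boxcox(y, lam):
    return np.exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)


train = np.array([100.0, 121.0, 144.0])   # invented training values
future = np.array([169.0, 196.0])         # values a perfect model would recover

# Forward pipeline: Box-Cox first, then first difference.
transformed = boxcox(np.concatenate([train, future]), lam)
diffs = np.diff(transformed)
predicted_diffs = diffs[len(train) - 1:]  # stand-in for the model's output

# Inverse pipeline, in reverse order.
# 1) Undo the differencing, anchored at the last transformed training value.
last_train_transformed = boxcox(train[-1], lam)
predicted_transformed = last_train_transformed + np.cumsum(predicted_diffs)
# 2) Undo the Box-Cox transformation.
predicted_prices = inv_boxcox(predicted_transformed, lam)

print(predicted_prices)
```

The cumulative sum must start from the last training value in the transformed scale, not the original scale, otherwise the recovered prices are shifted.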

SARIMA Performance

The SARIMA MAPE (99.7) was greater than the ARIMA MAPE (98.06), indicating that ARIMA has the smaller error.

Prophet

Plot with transformed values.

The Prophet MAPE is also greater than the ARIMA and SARIMA MAPEs, indicating that ARIMA is the model with the smallest percentage error.

Plot of the training set, the validation set, the predictions in red, and a 60-period forecast.

Plotting the real values