Time Series Analysis Project - Bitcoin Price

Aim

To forecast the Bitcoin price for the next 10 days from the data.

Introduction

Bitcoin is a decentralized digital currency (cryptocurrency) used to securely store and transfer any amount of value anywhere in the world without a central authority or intermediary. It was invented in 2009 by an anonymous person, or group of people, who used the name Satoshi Nakamoto. In 2011, the price started at 0.30 per bitcoin and grew to 5.27 over the year. It kept fluctuating in the following years, reaching an all-time high of 19,783.06 on 17 December 2017. As of 29 May 2019 the price of Bitcoin is 8,654.81. Bitcoin is highly volatile, which makes its price pattern challenging to predict. https://en.wikipedia.org/wiki/Bitcoin

Our dataset is sourced from https://coinmarketcap.com/ and comprises the daily closing price of Bitcoin from the 27th of April 2013 to the 24th of February 2019. We will analyse the data using methods learned in the Time Series Analysis course and predict the Bitcoin price for the next 10 days using the best of our candidate models.

Our aim is to apply Time Series Analysis to the given dataset. We focus on finding the best-fitting model among a set of candidate models for this dataset and on forecasting the variation in the Bitcoin price over the given period of time.

Data

1. Source - https://coinmarketcap.com/
2. Direct Link - https://coinmarketcap.com/currencies/bitcoin/historical-data/
3. The data has 2 variables and 2130 observations.
4. The data is a daily time series.
5. Variable 'Date' is a character variable holding dates from 27-04-2013 to 24-02-2019.
6. Variable 'Close' is a continuous variable holding the closing price corresponding to each date.

Method

The methodology used for the Time Series Analysis can be described as follows:

1. Pre-processing the data
2. Visualising the common features in the time series plot. 
3. Checking correlation in the observations.
4. Check model assumptions and make decisions about further analysis
5. Do data transformation if necessary
6. Fit models on transformed data (ARIMA/GARCH)
7. Check model assumptions and make decisions about further analysis
8. Try to fit a reduced model and check its corresponding statistics
9. Check assumptions for the reduced model and make decisions about further analysis.
10. Find out forecasts and make conclusions.

Setup & Preprocessing

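#Loading the packages whose functions are used below
library(readr); library(TSA); library(fUnitRoots); library(lmtest); library(tseries); library(fGarch)
#Reading the Data
BPH<- read_csv("Bitcoin_Historical_Price.csv")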

## Parsed with column specification:
## cols(
##   Date = col_character(),
##   Close = col_double()
## )

head(BPH)

## # A tibble: 6 x 2
##   Date     Close
##   <chr>    <dbl>
## 1 27-04-13 134. 
## 2 28-04-13 145. 
## 3 29-04-13 139  
## 4 30-04-13 117. 
## 5 01-05-13 105. 
## 6 02-05-13  97.8

#Checking for NA values
which(is.na(BPH$Close))

## integer(0)

# Convert data into a time series object
class(BPH)

## [1] "tbl_df"     "tbl"        "data.frame"

BPH.ts<- ts(as.vector(BPH$Close),start=c(2013 ,as.numeric(format(as.Date("2013-04-27"), "%j"))),frequency = 365)
class(BPH.ts)

## [1] "ts"
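The Date column is read as text (for example "27-04-13"), so it can be worth parsing it before relying on the start value passed to ts(). The following is a minimal sketch, not part of the original analysis: it uses a small hand-typed sample copied from the head(BPH) output above (the name dates_chr is an assumption) to parse the dates, check that observations are one day apart, and recover the day of the year used as the start of the series.

```r
# Sample dates copied from the head(BPH) output; dates_chr is a hypothetical name
dates_chr <- c("27-04-13", "28-04-13", "29-04-13", "30-04-13", "01-05-13", "02-05-13")

# Parse day-month-(two-digit year) strings into Date objects
dates <- as.Date(dates_chr, format = "%d-%m-%y")

# A daily series should have a gap of exactly one day between observations
gaps <- as.integer(diff(dates))
all_daily <- all(gaps == 1)

# Day of the year of the first observation, as used in ts(start = c(2013, ...))
start_doy <- as.numeric(format(dates[1], "%j"))

print(all_daily)
print(start_doy)
```

Applied to the full Date column, the same check would flag any missing or duplicated days before the series is treated as regularly spaced.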

Time Series Plot and Correlation

PLOT 1 - TIME SERIES PLOT

1. We observe a positive (upward) trend in the time series plot, which implies that the Bitcoin close price has increased over the years.
2. There is considerable variation in the Bitcoin close price over the years, especially in the last two years of the series.
3. There is no evident seasonality. Seasonality would show up as a pattern that repeats after a fixed period (shorter than a year), and the plot gives no evidence of such a pattern.
4. Autocorrelation can be seen in the plot, because neighbouring values in time tend to be of similar size.
5. There is no change point (no intervention), as there is no abrupt drop or rise in the time series plot.

## Plot 1- Time Series Plot
plot(BPH.ts,ylab='Closing price of Bitcoin in Dollars',xlab='Year',type='o' , main = "Plot 1:Time series Plot of Bitcoin Series")

PLOT 2 - SCATTER PLOT

The scatter plot is used to examine the relationship between pairs of consecutive Bitcoin closing prices.

1. We observe that there is a relationship between neighbouring values, as the points are not randomly spread over the plot. This implies correlation between the Bitcoin close prices of consecutive days.
2. We observe an upward trend in the plot, which indicates a strong positive correlation between neighbouring values.

## Plot 2 - Scatter Plot 
plot(y=BPH.ts,x=zlag(BPH.ts),ylab='Close Price', xlab='Previous Day Close Price', main ="Plot 2:Scatter Plot")

CORRELATION -

The correlation between the Bitcoin close price and the previous day's close price is 0.9976471, which shows a strong positive correlation between neighbouring values.

##Correlation
y1 = BPH.ts             
x1 = zlag(BPH.ts)        
index = 2:length(x1)   
cor(y1[index],x1[index])

## [1] 0.9976471
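In the code above, zlag() shifts the series back by one observation and leaves an NA in the first position, which is why the index starts at 2. The same pairing can be written in base R. The sketch below is illustrative only: it uses a simulated random walk (not the Bitcoin data) and the names current, previous and lag1_cor are assumptions.

```r
# Simulated random walk standing in for a trending price series (not the Bitcoin data)
set.seed(123)
x <- cumsum(rnorm(300))

n <- length(x)
# Pair each value with the value one step earlier
current  <- x[2:n]
previous <- x[1:(n - 1)]
lag1_cor <- cor(current, previous)

print(lag1_cor)
```

A lag-1 correlation close to 1 is typical of a trending series, so it is usually read together with the ACF and unit-root checks rather than on its own.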

McLeod-Li Test to check existence of ARCH

This test is used to check for the presence of ARCH effects in the series.

PLOT 3 - McLeod-Li Test

All the lags of the McLeod-Li test are significant at the 5% level of significance, suggesting strong ARCH effects and volatility clustering.

## PLOT 3 - McLeod-Li Test
McLeod.Li.test(y=BPH.ts,main="Plot 3:McLeod-Li Test Statistics for Bit Coin Close Price series")
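The idea behind the McLeod-Li test is, in essence, a Ljung-Box test applied to the squared series at a range of lags. The sketch below illustrates that idea with base R only; it is not the internals of McLeod.Li.test, which may differ in detail. The ARCH(1) series is simulated, and omega, alpha and max_lag are illustrative assumptions.

```r
# Simulate an ARCH(1) series in base R; omega and alpha are illustrative values
set.seed(2019)
n <- 1000
omega <- 0.2
alpha <- 0.6
e <- numeric(n)
sigma2 <- omega / (1 - alpha)          # start from the unconditional variance
for (i in 1:n) {
  if (i > 1) sigma2 <- omega + alpha * e[i - 1]^2
  e[i] <- sqrt(sigma2) * rnorm(1)
}

# Ljung-Box p-values for the squared series at lags 1..max_lag
max_lag <- 20
p_values <- sapply(1:max_lag, function(k) {
  Box.test(e^2, lag = k, type = "Ljung-Box")$p.value
})

# Lags at which the squared series shows significant autocorrelation
print(which(p_values < 0.05))
```

Small p-values across lags indicate autocorrelation in the squared values, which is the signature of conditional heteroscedasticity.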

QQPlot and Shapiro-Wilk Test

PLOT 4 - QQPlot

1. The QQ plot shows deviations from the straight line, so the series does not satisfy the normality assumption.
2. The fat tails suggest volatility clustering in the series.

## PLOT 4 - QQPlot 
qqnorm(BPH.ts,main="Plot 4:QQ Plot for Bitcoin close price series")
qqline(BPH.ts)

SHAPIRO-WILK TEST-

The Shapiro-Wilk normality test is a hypothesis test for normality. It measures the correlation between the ordered sample values and the corresponding normal quantiles.

According to this test, the p-value is less than the 5% level of significance, so we reject the null hypothesis that the stochastic component is normally distributed. The R code used is as follows:

## Shapiro Test
shapiro.test(BPH.ts)

## 
##  Shapiro-Wilk normality test
## 
## data:  BPH.ts
## W = 0.68136, p-value < 2.2e-16

ACF and PACF

PLOT 5 : ACF

To understand the dependence in the stochastic component, the autocorrelation function (ACF) is used.

1. The plot shows a slowly decaying pattern in the ACF, with all lags significant.
2. This pattern suggests non-stationarity.

PLOT 6 : PACF

To understand the dependence in the stochastic component, the partial autocorrelation function (PACF) is used.

1. The plot shows that the value at lag 1 is highly significant, which is consistent with non-stationarity in the series.

CONCLUSION:

1. There is a visible trend in the ACF and one significant lag in the PACF. This suggests non-stationarity in the series.
2. The ACF and PACF do not suggest white noise, and the McLeod-Li test suggests volatility clustering.
Therefore, we consider ARIMA + GARCH models for further analysis.

par(mfrow = c(1,2))
#PLOT5- ACF
acf(BPH.ts, main="Plot 5:Sample ACF")
#PLOT6 - PACF
pacf(BPH.ts, main="Plot 6:Sample PACF")

Dickey-Fuller Unit-Root Test (ADF Test)

The Dickey-Fuller unit-root test is used to check for non-stationarity in the data. The hypotheses are:

H0: The process is difference non-stationary.
HA: The process is stationary.

ADF TEST RESULT:

1. The p-value is greater than the 5% level of significance, hence we fail to reject the null hypothesis of non-stationarity.

## ADF test 
orders <- ar(diff(BPH.ts))$order
adfTest(BPH.ts, lags = orders, title = NULL,description = NULL)

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 33
##   STATISTIC:
##     Dickey-Fuller: -1.2548
##   P VALUE:
##     0.2164 
## 
## Description:
##  Thu Aug 08 14:08:50 2019 by user: Parvi
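The lag order passed to adfTest above comes from ar(diff(...))$order, that is, the AR order that ar() selects for the differenced series (by AIC, which is its default). The sketch below shows only that selection step on a simulated random walk; the data and the name rw are assumptions, and the order chosen for the Bitcoin series is the one reported in the output above.

```r
# Simulated random walk, used purely for illustration (not the Bitcoin data)
set.seed(42)
rw <- cumsum(rnorm(500))

# Fit AR models to the differenced series; ar() picks the order by AIC by default
fit <- ar(diff(rw))
orders <- fit$order

print(orders)
```

The selected order is then used as the number of lagged difference terms in the augmented Dickey-Fuller regression.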

BoxCox Curve

PLOT 7

The plot suggests the value of lambda as 0 which in turn suggests Log Transformation.

#PLOT 7
box <- BoxCox.ar(BPH.ts, method = "yule-walker")
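For reference, the Box-Cox family of power transformations is usually written as below, which is why a value of lambda near 0 points to the log transformation. The notation (y_t for the closing price, lambda for the power parameter) is an assumption, as the report does not state the formula.

```latex
y_t^{(\lambda)} =
\begin{cases}
\dfrac{y_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\log y_t, & \lambda = 0.
\end{cases}
```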

Log Transformation

PLOT 8 : QQPLOT

1. The QQ plot shows deviations from the straight line, so the log-transformed series does not satisfy the normality assumption.
2. We can confirm this with the Shapiro-Wilk test.

log.BPH.ts <- log(BPH.ts)
#PLOT 8 : QQPlot of the log-transformed Bitcoin series
qqnorm(log.BPH.ts, main = "Plot 8:QQPlot Log Transformed Series")
qqline(log.BPH.ts, col = 2, lwd = 1, lty = 2)

SHAPIRO-WILK TEST:

According to this test, the p-value is less than the 5% level of significance, so we reject the null hypothesis that the stochastic component is normally distributed.

# Applying shapiro test
shapiro.test(log.BPH.ts)

## 
##  Shapiro-Wilk normality test
## 
## data:  log.BPH.ts
## W = 0.92816, p-value < 2.2e-16

ADF test on log transformed series

The p-value is greater than the 5% level of significance, hence we fail to reject the null hypothesis of non-stationarity.
We therefore difference the transformed series.

# ADF TEST ON LOG TRANSFORMED SERIES
orders <- ar(diff(log.BPH.ts))$order
adfTest(log.BPH.ts, lags = orders, title = NULL,description = NULL)

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 31
##   STATISTIC:
##     Dickey-Fuller: 1.0713
##   P VALUE:
##     0.9232 
## 
## Description:
##  Thu Aug 08 14:08:51 2019 by user: Parvi

First Differencing

PLOT 9 : TIME SERIES PLOT

1. No trend can be seen in the series, which suggests stationarity.
2. The plot displays volatility clustering in this series.

#PLOT 9 : TIME SERIES PLOT
diff.BPH.ts <- diff(log.BPH.ts, differences = 1) 
plot(diff.BPH.ts,ylab='First difference of log close price',xlab='Year',type='o' , main = "Plot 9: Time series Plot of First differencing")
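The first difference of the log series is the series of daily log returns, since log(P_t) - log(P_(t-1)) = log(P_t / P_(t-1)). The short sketch below checks this identity on a handful of made-up prices; the vector price and its values are hypothetical and not taken from the dataset.

```r
# Hypothetical closing prices, for illustration only
price <- c(100, 105, 102, 110, 108)

# First difference of the log series, as in diff(log.BPH.ts, differences = 1)
diff_log <- diff(log(price))

# Log of the ratio of each price to the previous one (the log return)
n <- length(price)
log_ratio <- log(price[2:n] / price[1:(n - 1)])

print(all.equal(diff_log, log_ratio))
```

This is why the differenced series fluctuates around zero and why its clusters of large movements are read as volatility clustering.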

ADF Test for First Differencing

ADF TEST:

The p-value is less than the 5% level of significance, hence we reject the null hypothesis of non-stationarity.

#ADF TEST:
orders <- ar(diff(diff.BPH.ts))$order
adfTest(diff.BPH.ts, lags = orders, title = NULL,description = NULL)

## Warning in adfTest(diff.BPH.ts, lags = orders, title = NULL, description =
## NULL): p-value smaller than printed p-value

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 32
##   STATISTIC:
##     Dickey-Fuller: -7.4259
##   P VALUE:
##     0.01 
## 
## Description:
##  Thu Aug 08 14:08:51 2019 by user: Parvi

ACF and PACF for First Differencing

PLOT 10 : ACF for first differencing

To understand the dependence in the stochastic component, the autocorrelation function (ACF) is used.

1. The plot shows many highly significant lags.
2. These highly significant lags may be due to volatility clustering in the series.

PLOT 11 : PACF for first differencing

To understand the dependence in the stochastic component, the partial autocorrelation function (PACF) is used.

1. The plot shows many highly significant lags.
2. These highly significant lags may be due to volatility clustering in the series.

CONCLUSION:

1. These plots show many highly significant lags, suggesting an ARCH effect and volatility clustering in the series.
2. We check further by examining the EACF.

par(mfrow = c(1,2))
#PLOT 10 : ACF for the First Difference
acf(diff.BPH.ts,xaxp=c(0,24,12), main="PLOT 10 : ACF ")
#PLOT 11 : PACF for the First Difference
pacf(diff.BPH.ts,xaxp=c(0,24,12), main="PLOT 11 : PACF")

EACF for First Differencing

EACF :

Possible models: ARIMA(6,1,6), ARIMA(6,1,7), ARIMA(7,1,7)

CONCLUSION:

Candidate Models :ARIMA(6,1,6), ARIMA(6,1,7), ARIMA(7,1,7)

#EACF
eacf(diff.BPH.ts, ar.max = 15, ma.max = 15)

## AR/MA
##    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
## 0  o o o o x x o o o x x  o  o  o  o  o 
## 1  x o o o o x o o o o x  o  o  o  o  o 
## 2  o x o o o x o o o o x  o  o  o  o  o 
## 3  o x o o o x o o o o x  o  o  o  o  o 
## 4  x x o x o x o o o o x  o  o  o  o  o 
## 5  x x x x x o o o o o x  o  o  o  o  o 
## 6  x x x x x o o o o o o  o  o  o  o  o 
## 7  x x o x x x x o o o o  o  o  o  o  o 
## 8  x x x x x x x x o o o  o  o  o  o  o 
## 9  o x x o x x x x o o o  o  o  o  o  o 
## 10 x x x x x x x o x x o  o  o  o  o  o 
## 11 x o x o x x x x x x x  o  o  o  o  o 
## 12 x x x x o x x x x x x  o  o  o  o  o 
## 13 x x o o o x x x o o x  o  x  o  o  o 
## 14 x x o o o x x x x o x  o  x  o  o  o 
## 15 x o x x o x x o o o x  o  x  o  x  o

BIC Table For First Differencing

PLOT 12 : BIC table

The shaded columns of the BIC table point to AR(6) and MA(6) terms.
Candidate Models: ARIMA(6,1,6), ARIMA(6,1,7), ARIMA(7,1,7)

#PLOT 12 : BIC Table:
res = armasubsets(y=diff.BPH.ts,nar=10,nma=10,y.name='test',ar.method='ols')
plot(res)

Fitting model, finding Parameter Estimation and Residual Analysis

Candidate Models:ARIMA(6,1,6), ARIMA(6,1,7), ARIMA(7,1,7)

1)ARIMA(6,1,6)

FITTING MODEL AND FINDING PARAMETER ESTIMATION :

#Fitting model - CSS method
model.616.css = stats::arima(log.BPH.ts,order=c(6,1,6),method='CSS')
#Parameter Estimation
coeftest(model.616.css)

## 
## z test of coefficients:
## 
##      Estimate Std. Error z value  Pr(>|z|)    
## ar1 -0.335753   0.216210 -1.5529 0.1204466    
## ar2 -0.102417   0.168454 -0.6080 0.5431969    
## ar3  0.134972   0.149552  0.9025 0.3667866    
## ar4 -0.140161   0.108569 -1.2910 0.1967052    
## ar5  0.193837   0.134344  1.4428 0.1490659    
## ar6  0.441613   0.123689  3.5704 0.0003565 ***
## ma1  0.331001   0.224399  1.4751 0.1401979    
## ma2  0.088079   0.175442  0.5020 0.6156375    
## ma3 -0.132146   0.147344 -0.8969 0.3697984    
## ma4  0.180447   0.113207  1.5940 0.1109444    
## ma5 -0.119988   0.139784 -0.8584 0.3906800    
## ma6 -0.358845   0.121888 -2.9441 0.0032394 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#Fitting model - ML method
model.616.ml = stats::arima(log.BPH.ts,order=c(6,1,6),method='ML')

## Warning in stats::arima(log.BPH.ts, order = c(6, 1, 6), method = "ML"):
## possible convergence problem: optim gave code = 1

#Parameter Estimation
coeftest(model.616.ml)

## Warning in sqrt(diag(se)): NaNs produced

## 
## z test of coefficients:
## 
##      Estimate Std. Error z value Pr(>|z|)
## ar1 -0.221841         NA      NA       NA
## ar2 -0.273736         NA      NA       NA
## ar3  0.099923         NA      NA       NA
## ar4 -0.111787         NA      NA       NA
## ar5  0.495063         NA      NA       NA
## ar6  0.654386         NA      NA       NA
## ma1  0.223291         NA      NA       NA
## ma2  0.281240         NA      NA       NA
## ma3 -0.094127         NA      NA       NA
## ma4  0.131769         NA      NA       NA
## ma5 -0.441063         NA      NA       NA
## ma6 -0.595185         NA      NA       NA

1.1)RESIDUAL ANALYSIS of ARIMA(6,1,6)

PLOT 13-

TIME SERIES PLOT:

No trend can be seen in the time series plot.

QQPLOT:

The QQ plot shows deviations from the straight line, so the residuals do not satisfy the normality assumption.

ACF PLOT:

The plot suggests that the value at lag 5 is significant, whereas the rest of the lags are insignificant.

PACF PLOT:

The plot suggests that the value at lag 4 is significant, whereas the rest of the lags are insignificant.

LJUNG-BOX TEST:

One of the points just touches the confidence line, whereas the rest fall above it. Hence the plot suggests no autocorrelation in the residuals.

residual.analysis <- function(model, std = TRUE){
  library(TSA)
  library(FitAR)
  if (std == TRUE){
    res.model = rstandard(model)
  }else{
    res.model = residuals(model)
  }
  par(mfrow=c(3,2))
  plot(res.model,type='o',ylab='Standardised residuals', main="Time series plot of standardised residuals")
  abline(h=0)
  qqnorm(res.model,main="QQ plot of standardised residuals")
  qqline(res.model, col = 2)
  acf(res.model,main="ACF of standardised residuals")
  pacf(res.model,main="PACF of standardised residuals")
  print(shapiro.test(res.model))
  k=0
  LBQPlot(res.model, lag.max = length(model$residuals)-1 , StartLag = k + 1, k = 0, SquaredQ = FALSE)
  par(mfrow=c(1,1))
}

residual.analysis(model = model.616.ml)

## Loading required package: lattice

## Loading required package: leaps

## Loading required package: ltsa

## Loading required package: bestglm

## 
## Attaching package: 'FitAR'

## The following object is masked from 'package:forecast':
## 
##     BoxCox

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.89396, p-value < 2.2e-16

par(mfrow=c(1,1))

Box.test(residuals(model.616.ml), lag = 6, type = "Ljung-Box", fitdf = 0)

## 
##  Box-Ljung test
## 
## data:  residuals(model.616.ml)
## X-squared = 1.4835, df = 6, p-value = 0.9606
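The call above uses fitdf = 0. A common alternative convention, shown here only as a sketch and not as the method used in this report, is to set fitdf to the number of estimated ARMA coefficients (p + q) and to choose a lag larger than that, so the test's degrees of freedom account for the fitted model. The example uses a simulated ARMA(1,1) series; the data and the names fit and lb are assumptions.

```r
# Simulated ARMA(1,1) series, for illustration only (not the Bitcoin data)
set.seed(1)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)

# Fit the matching model: p = 1 and q = 1, so p + q = 2 estimated ARMA coefficients
fit <- stats::arima(x, order = c(1, 0, 1))

# Ljung-Box test on the residuals with degrees of freedom reduced by p + q
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 2)

# Degrees of freedom are lag - fitdf
print(lb$parameter)
print(lb$p.value)
```

For an ARIMA(6,1,6) fit this convention would mean fitdf = 12 and a lag greater than 12, which would change the reported degrees of freedom and p-value.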

2)ARIMA(6,1,7)

FITTING MODEL AND FINDING PARAMETER ESTIMATION :

#Fitting model - CSS method
model.617.css = stats::arima(log.BPH.ts,order=c(6,1,7),method='CSS')
#Parameter Estimation
coeftest(model.617.css)

## 
## z test of coefficients:
## 
##      Estimate Std. Error z value  Pr(>|z|)    
## ar1  0.590284   0.180735  3.2660 0.0010907 ** 
## ar2  0.028782   0.340432  0.0845 0.9326217    
## ar3 -0.099355   0.370611 -0.2681 0.7886350    
## ar4  0.239830   0.188222  1.2742 0.2025974    
## ar5  0.168719   0.248206  0.6798 0.4966588    
## ar6 -0.033442   0.201692 -0.1658 0.8683090    
## ma1 -0.598170   0.181388 -3.2977 0.0009747 ***
## ma2 -0.036073   0.342319 -0.1054 0.9160747    
## ma3  0.117964   0.371289  0.3177 0.7507006    
## ma4 -0.219025   0.188759 -1.1603 0.2459095    
## ma5 -0.134728   0.250625 -0.5376 0.5908753    
## ma6  0.081483   0.196391  0.4149 0.6782148    
## ma7 -0.060941   0.024998 -2.4379 0.0147741 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#Fitting model - ML method
model.617.ml = stats::arima(log.BPH.ts,order=c(6,1,7),method='ML')

## Warning in stats::arima(log.BPH.ts, order = c(6, 1, 7), method = "ML"):
## possible convergence problem: optim gave code = 1

#Parameter Estimation
coeftest(model.617.ml)

## Warning in sqrt(diag(se)): NaNs produced

## 
## z test of coefficients:
## 
##       Estimate Std. Error z value Pr(>|z|)
## ar1 -0.2337523         NA      NA       NA
## ar2 -0.2580872         NA      NA       NA
## ar3  0.0831107         NA      NA       NA
## ar4 -0.0937680         NA      NA       NA
## ar5  0.4937506         NA      NA       NA
## ar6  0.6811965         NA      NA       NA
## ma1  0.2295018         NA      NA       NA
## ma2  0.2631229         NA      NA       NA
## ma3 -0.0752452         NA      NA       NA
## ma4  0.1121979         NA      NA       NA
## ma5 -0.4405281         NA      NA       NA
## ma6 -0.6213112         NA      NA       NA
## ma7  0.0065774  0.0221423  0.2971   0.7664

2.1)RESIDUAL ANALYSIS of ARIMA(6,1,7)

PLOT 14-

TIME SERIES PLOT:

No trend can be seen in the time series plot.

QQPLOT:

The QQ plot shows deviations from the straight line, so the residuals do not satisfy the normality assumption.

ACF PLOT:

The plot suggests that the value at lag 5 is significant, whereas the rest of the lags are insignificant.

PACF PLOT:

The plot suggests that the value at lag 4 is significant, whereas the rest of the lags are insignificant.

LJUNG-BOX TEST:

One of the points just touches the confidence line, whereas the rest fall above it. Hence the plot suggests no autocorrelation in the residuals.

residual.analysis(model = model.617.ml)

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.894, p-value < 2.2e-16

par(mfrow=c(1,1))

Box.test(residuals(model.617.ml), lag = 6, type = "Ljung-Box", fitdf = 0)

## 
##  Box-Ljung test
## 
## data:  residuals(model.617.ml)
## X-squared = 1.4102, df = 6, p-value = 0.9652

3)ARIMA(7,1,7)

FITTING MODEL AND FINDING PARAMETER ESTIMATION :

#Fitting model - CSS method
model.717.css = stats::arima(log.BPH.ts,order=c(7,1,7),method='CSS')
#Parameter Estimation
coeftest(model.717.css)

## 
## z test of coefficients:
## 
##     Estimate Std. Error z value Pr(>|z|)   
## ar1  0.35893    0.24367  1.4730 0.140747   
## ar2  0.24750    0.14168  1.7469 0.080659 . 
## ar3  0.23185    0.11266  2.0579 0.039601 * 
## ar4 -0.17052    0.13370 -1.2754 0.202156   
## ar5  0.26127    0.13735  1.9021 0.057152 . 
## ar6  0.31912    0.15495  2.0595 0.039444 * 
## ar7 -0.37515    0.13980 -2.6836 0.007284 **
## ma1 -0.36534    0.24880 -1.4684 0.141997   
## ma2 -0.26490    0.14889 -1.7791 0.075222 . 
## ma3 -0.22242    0.11938 -1.8631 0.062447 . 
## ma4  0.21025    0.13851  1.5179 0.129041   
## ma5 -0.21466    0.14672 -1.4630 0.143463   
## ma6 -0.28635    0.15771 -1.8157 0.069416 . 
## ma7  0.31666    0.13685  2.3140 0.020670 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#Fitting model - ML method
model.717.ml = stats::arima(log.BPH.ts,order=c(7,1,7),method='ML')

## Warning in stats::arima(log.BPH.ts, order = c(7, 1, 7), method = "ML"):
## possible convergence problem: optim gave code = 1

#Parameter Estimation
coeftest(model.717.ml)

## Warning in sqrt(diag(se)): NaNs produced

## 
## z test of coefficients:
## 
##      Estimate Std. Error z value Pr(>|z|)
## ar1 -0.269044         NA      NA       NA
## ar2  0.074530         NA      NA       NA
## ar3 -0.509925         NA      NA       NA
## ar4  0.174441   0.247872  0.7038   0.4816
## ar5  0.690590         NA      NA       NA
## ar6  0.153987         NA      NA       NA
## ar7  0.300577         NA      NA       NA
## ma1  0.267329         NA      NA       NA
## ma2 -0.082508         NA      NA       NA
## ma3  0.516076         NA      NA       NA
## ma4 -0.133407   0.255599 -0.5219   0.6017
## ma5 -0.660232         NA      NA       NA
## ma6 -0.083916         NA      NA       NA
## ma7 -0.278383         NA      NA       NA

3.1)RESIDUAL ANALYSIS of ARIMA(7,1,7)

PLOT 15-

TIME SERIES PLOT:

No trend can be seen in the time series plot.

QQPLOT:

The QQ plot shows deviations from the straight line, so the residuals do not satisfy the normality assumption.

ACF PLOT:

The plot suggests that the value at lag 5 is significant, whereas the rest of the lags are insignificant.

PACF PLOT:

The plot suggests that the value at lag 4 is significant, whereas the rest of the lags are insignificant.

LJUNG-BOX TEST:

One of the points just touches the confidence line, whereas the rest fall above it. Hence the plot suggests no autocorrelation in the residuals.

residual.analysis(model = model.717.ml)

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.893, p-value < 2.2e-16

## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length

par(mfrow=c(1,1))

Box.test(residuals(model.717.ml), lag = 6, type = "Ljung-Box", fitdf = 0)

## 
##  Box-Ljung test
## 
## data:  residuals(model.717.ml)
## X-squared = 1.1262, df = 6, p-value = 0.9804

AIC and BIC values

Both AIC and BIC are lowest for the ARIMA(6,1,6) model, so we select it.
A GARCH model will then be fitted to the residuals of this model.

##AIC
stats::AIC(model.616.ml)

## [1] -7329.465

stats::AIC(model.617.ml)

## [1] -7326.973

stats::AIC(model.717.ml)

## [1] -7326.661

## BIC 
stats::BIC(model.616.ml)

## [1] -7255.841

stats::BIC(model.617.ml)

## [1] -7247.685

stats::BIC(model.717.ml)

## [1] -7241.71
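The comparison can be made explicit by collecting the values printed above into one table and picking the minimum of each criterion. This is a small sketch using the reported numbers; the data frame name ic and its column names are assumptions.

```r
# AIC and BIC values as reported in the output above
ic <- data.frame(
  model = c("ARIMA(6,1,6)", "ARIMA(6,1,7)", "ARIMA(7,1,7)"),
  AIC   = c(-7329.465, -7326.973, -7326.661),
  BIC   = c(-7255.841, -7247.685, -7241.71),
  stringsAsFactors = FALSE
)

# Lower values indicate a better trade-off between fit and complexity
best_aic <- ic$model[which.min(ic$AIC)]
best_bic <- ic$model[which.min(ic$BIC)]

print(ic)
print(c(best_aic = best_aic, best_bic = best_bic))
```

The differences between the three models are small, particularly in AIC, so the ranking is best read alongside the residual diagnostics.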

ARIMA(6,1,6) for further Analysis

m616_residuals = model.616.ml$residuals
abs.BPH = abs(m616_residuals)
sq.BPH = (m616_residuals)^2

Absolute Values Approach

1. The EACF does not indicate ARMA models.
2. ARMA(2,2), ARMA(2,3), ARMA(3,3) - these models correspond to parameter settings of [max(2,2),2], [max(2,3),2], [max(3,3),3].

So the corresponding tentative GARCH models are GARCH(2,2), GARCH(3,2), GARCH(3,3).

par(mfrow=c(1,2))
acf(abs.BPH, ci.type="ma",main="Plot 16: The sample ACF plot for abs series")
pacf(abs.BPH, main="Plot 17:The sample PACF plot for abs series")

eacf(abs.BPH)

## AR/MA
##   0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x x x x x x x x x x  x  x  x 
## 1 x o o o x o o o o x o  o  x  o 
## 2 x x o o o o o o o o o  o  o  o 
## 3 x x x o o o o o o o o  o  o  o 
## 4 x x x x o o o o o o o  o  o  o 
## 5 x x x x x o o o o o o  o  o  o 
## 6 x o x x x x o o o o o  o  o  o 
## 7 x x x x x x x o o o o  o  o  o

Squared Values Approach

1. ARMA(4,5), ARMA(4,6), ARMA(5,6) - these models correspond to parameter settings of [max(4,5),4], [max(4,6),4], [max(5,6),5].
2. So the corresponding tentative GARCH models are GARCH(5,4), GARCH(6,4), GARCH(6,5).

par(mfrow=c(1,2))
acf(sq.BPH, ci.type="ma",main="Plot 18:The sample ACF plot for squared Bitcoin series")
pacf(sq.BPH, main="Plot 19:The sample PACF plot for squared Bitcoin series")

eacf(sq.BPH)

## AR/MA
##   0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x x x x x x x x x x  x  x  x 
## 1 x x x x x x o o o x o  o  x  x 
## 2 x x x o o x o o o x o  o  x  o 
## 3 o x x x o x o o o o o  o  x  o 
## 4 o x x o o o o o o o o  o  x  o 
## 5 x x x x x x o o o o o  o  o  o 
## 6 x x x x x x o o o o o  o  o  o 
## 7 x x x x x x o x o o o  o  o  o

Possible models: GARCH(2,2), GARCH(3,2), GARCH(3,3), GARCH(5,4), GARCH(6,4), GARCH(6,5)
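The bracketed settings used in the two approaches follow the pattern [max(p,q), p] for an ARMA(p,q) model identified on the absolute or squared residuals. The sketch below applies that rule to the six ARMA orders listed above; the helper name arma_to_garch is hypothetical.

```r
# Map an ARMA(p, q) order identified on abs/squared residuals
# to the tentative GARCH order [max(p, q), p] used in this report
arma_to_garch <- function(p, q) c(max(p, q), p)

# ARMA orders read from the two EACF tables above
candidates <- list(c(2, 2), c(2, 3), c(3, 3), c(4, 5), c(4, 6), c(5, 6))

# One row per candidate: first column max(p, q), second column p
garch_orders <- t(sapply(candidates, function(o) arma_to_garch(o[1], o[2])))

print(garch_orders)
```

Each row of garch_orders corresponds to one of the tentative GARCH models listed above, in the same order.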

1) GARCH(2,2)

Method 1:

1. This method suggests that the alpha(2) coefficient (a2) of GARCH(2,2) is insignificant at the 5% level of significance.
2. The Jarque-Bera test gives a p-value less than the 5% level of significance. Hence we reject the null hypothesis of normality.
3. The Box-Ljung test on the squared residuals gives a p-value greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the squared residuals are uncorrelated.

## Method 1
m.22 = garch(m616_residuals,order=c(2,2),trace = FALSE)
summary(m.22)

## 
## Call:
## garch(x = m616_residuals, order = c(2, 2), trace = FALSE)
## 
## Model:
## GARCH(2,2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -8.25160 -0.38909  0.05634  0.48533  4.96972 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)    
## a0 6.023e-05   6.763e-06    8.906  < 2e-16 ***
## a1 1.809e-01   1.639e-02   11.038  < 2e-16 ***
## a2 6.076e-03   2.061e-02    0.295  0.76819    
## b1 2.076e-01   7.627e-02    2.722  0.00649 ** 
## b2 5.866e-01   6.456e-02    9.086  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 4352.4, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.094777, df = 1, p-value = 0.7582

Method 2:

1. This method suggests that the alpha(2) coefficient of GARCH(2,2) is insignificant at the 5% level of significance.
2. The Jarque-Bera and Shapiro-Wilk tests give p-values less than the 5% level of significance. Hence we reject the null hypothesis of normality.
3. The Ljung-Box tests on the squared standardised residuals (R^2) give p-values greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the squared residuals are uncorrelated. For the standardised residuals (R), Q(10) is borderline (p = 0.0495), while Q(15) and Q(20) are above 5%.

## Method 2
m.22_2 = garchFit(formula = ~garch(2,2), data =m616_residuals,trace = FALSE  )
summary(m.22_2)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(2, 2), data = m616_residuals, trace = FALSE) 
## 
## Mean and Variance Equation:
##  data ~ garch(2, 2)
## <environment: 0x00000000238364b8>
##  [data = m616_residuals]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1      alpha2       beta1       beta2  
## 5.0840e-04  5.9747e-05  1.8117e-01  6.2290e-03  2.0916e-01  5.8538e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.084e-04   6.752e-04    0.753   0.4514    
## omega  5.975e-05   1.427e-05    4.187 2.83e-05 ***
## alpha1 1.812e-01   2.540e-02    7.132 9.93e-13 ***
## alpha2 6.229e-03   2.686e-02    0.232   0.8166    
## beta1  2.092e-01   8.642e-02    2.420   0.0155 *  
## beta2  5.854e-01   7.248e-02    8.076 6.66e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  4009.999    normalized:  1.882629 
## 
## Description:
##  Thu Aug 08 14:09:04 2019 by user: Parvi 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value   
##  Jarque-Bera Test   R    Chi^2  4320.163  0         
##  Shapiro-Wilk Test  R    W      0.9163967 0         
##  Ljung-Box Test     R    Q(10)  18.33937  0.04950169
##  Ljung-Box Test     R    Q(15)  23.10033  0.08203284
##  Ljung-Box Test     R    Q(20)  30.42738  0.0632183 
##  Ljung-Box Test     R^2  Q(10)  8.141374  0.6150299 
##  Ljung-Box Test     R^2  Q(15)  10.39422  0.7942374 
##  Ljung-Box Test     R^2  Q(20)  12.38971  0.9020098 
##  LM Arch Test       R    TR^2   8.383625  0.754479  
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -3.759624 -3.743669 -3.759640 -3.753784

CONCLUSION:

1. Both methods suggest that the alpha(2) coefficient of GARCH(2,2) is insignificant at the 5% level of significance.

2. Hence we consider GARCH(2,1) as a candidate model instead of GARCH(2,2).

2) GARCH(2,1)

Method 1:

1. This method suggests that all coefficients of GARCH(2,1) are significant at the 5% level of significance.
2. The Jarque-Bera test gives a p-value less than the 5% level of significance. Hence we reject the null hypothesis of normality.
3. The Box-Ljung test on the squared residuals gives a p-value greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the squared residuals are uncorrelated.

## Method 1
m.21 = garch(m616_residuals,order=c(2,1),trace = FALSE)
summary(m.21) # All the coefficients are significant at 5% level of significance.

## 
## Call:
## garch(x = m616_residuals, order = c(2, 1), trace = FALSE)
## 
## Model:
## GARCH(2,1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -7.86744 -0.39132  0.05688  0.48591  5.18047 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)    
## a0 1.288e-04   1.222e-05   10.543  < 2e-16 ***
## a1 2.216e-01   1.889e-02   11.734  < 2e-16 ***
## b1 3.662e-01   7.168e-02    5.109 3.24e-07 ***
## b2 3.460e-01   6.074e-02    5.697 1.22e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 3878.3, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.0095362, df = 1, p-value = 0.9222

Method 2:

1. This method suggests that all coefficients of GARCH(2,1) except the mean (mu) are significant at the 5% level of significance.
2. The Jarque-Bera and Shapiro-Wilk tests give p-values less than the 5% level of significance. Hence we reject the null hypothesis of normality.
3. The Ljung-Box tests on both the standardised residuals (R) and the squared standardised residuals (R^2) give p-values greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the error terms are uncorrelated.

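## Method 2
## garchFit() takes garch(ARCH order, GARCH order), so garch(1,2) matches order=c(2,1) in Method 1
m.21_2 = garchFit(formula = ~garch(1,2), data = m616_residuals, trace = FALSE, include.mean = TRUE)
summary(m.21_2)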

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 2), data = m616_residuals, include.mean = TRUE, 
##     trace = FALSE) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 2)
## <environment: 0x0000000020088ec0>
##  [data = m616_residuals]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1       beta1       beta2  
## 5.0422e-04  5.8289e-05  1.8390e-01  2.2383e-01  5.7474e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.042e-04   6.748e-04    0.747 0.454966    
## omega  5.829e-05   1.255e-05    4.645 3.40e-06 ***
## alpha1 1.839e-01   2.257e-02    8.149 4.44e-16 ***
## beta1  2.238e-01   6.378e-02    3.510 0.000449 ***
## beta2  5.747e-01   5.997e-02    9.584  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  4009.973    normalized:  1.882616 
## 
## Description:
##  Thu Aug 08 14:09:04 2019 by user: Parvi 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value   
##  Jarque-Bera Test   R    Chi^2  4313.327  0         
##  Shapiro-Wilk Test  R    W      0.9164093 0         
##  Ljung-Box Test     R    Q(10)  18.26935  0.05058646
##  Ljung-Box Test     R    Q(15)  23.03124  0.08347874
##  Ljung-Box Test     R    Q(20)  30.35151  0.06435467
##  Ljung-Box Test     R^2  Q(10)  8.186071  0.6106673 
##  Ljung-Box Test     R^2  Q(15)  10.41302  0.7929875 
##  Ljung-Box Test     R^2  Q(20)  12.42819  0.9005503 
##  LM Arch Test       R    TR^2   8.424648  0.7511278 
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -3.760538 -3.747242 -3.760549 -3.755671
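
The Method 2 estimates also give the persistence of the volatility process and its long-run level. The following is a minimal sketch in base R, with the coefficient values copied by hand from the garchFit() output above. The variable names are introduced here for illustration, and the long-run formula assumes the persistence is below one.

```r
# Method 2 estimates copied from the garchFit() output above
omega  <- 5.8289e-05
alpha1 <- 1.8390e-01
beta   <- c(2.2383e-01, 5.7474e-01)

# Persistence: sum of the ARCH and GARCH coefficients
persistence <- alpha1 + sum(beta)

# Long-run (unconditional) variance and standard deviation,
# defined only when persistence < 1
uncond_var <- omega / (1 - persistence)
uncond_sd  <- sqrt(uncond_var)

print(round(c(persistence = persistence, uncond_sd = uncond_sd), 4))
```

The persistence is about 0.98 and the long-run standard deviation is about 0.058, so multi-step volatility forecasts from this model should drift slowly towards that level.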

CONCLUSION:

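All variance coefficients are significant under both methods and the squared residuals show no remaining autocorrelation, so GARCH(2,1) can be considered a good candidate model, although the residuals are not normally distributed.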

3) GARCH(3,2)

Method 1:

1. This method returns a "singular information" warning and all standard errors are NA, so the significance of the GARCH(3,2) coefficients cannot be assessed from this fit.
2. The Jarque Bera test for normality gives a p-value less than the 5% level of significance. Hence we reject the null hypothesis that the residuals are normally distributed.
3. The Box-Ljung test on the squared residuals gives a p-value greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the squared residuals are uncorrelated.

#Method 1
m.32 = garch(m616_residuals,order=c(3,2),trace = FALSE)

## Warning in garch(m616_residuals, order = c(3, 2), trace = FALSE): singular
## information

summary(m.32) # Standard errors are NA (singular information), so significance cannot be assessed.

## 
## Call:
## garch(x = m616_residuals, order = c(3, 2), trace = FALSE)
## 
## Model:
## GARCH(3,2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -8.29527 -0.38204  0.05534  0.48612  4.85771 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)
## a0 7.283e-05          NA       NA       NA
## a1 2.002e-01          NA       NA       NA
## a2 2.774e-02          NA       NA       NA
## b1 9.396e-02          NA       NA       NA
## b2 6.293e-01          NA       NA       NA
## b3 3.200e-02          NA       NA       NA
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 4547.9, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.0051768, df = 1, p-value = 0.9426

Method 2:

1. This method suggests that the alpha(1) and beta(2) coefficients of GARCH(3,2) are significant at the 5% level of significance, whereas mu, omega, alpha(2), beta(1) and beta(3) are insignificant.
2. The Jarque Bera and Shapiro-Wilk tests for normality give p-values less than the 5% level of significance. Hence we reject the null hypothesis that the residuals are normally distributed.
3. The Ljung-Box tests on the squared standardised residuals give p-values greater than the 5% level of significance, so we fail to reject the null hypothesis that they are uncorrelated. On the residuals themselves, Q(10) is marginally below 5% (p = 0.049), while Q(15) and Q(20) are above it.

#Method 2
m.32_2 = garchFit(formula = ~garch(2,3), data =m616_residuals, trace = FALSE )
summary(m.32_2)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(2, 3), data = m616_residuals, trace = FALSE) 
## 
## Mean and Variance Equation:
##  data ~ garch(2, 3)
## <environment: 0x0000000027fb4498>
##  [data = m616_residuals]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1      alpha2       beta1       beta2  
## 5.0538e-04  7.1772e-05  1.8236e-01  4.3652e-02  1.0000e-08  6.2984e-01  
##      beta3  
## 1.2269e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.054e-04   6.768e-04    0.747    0.455    
## omega  7.177e-05   4.463e-05    1.608    0.108    
## alpha1 1.824e-01   2.787e-02    6.542 6.05e-11 ***
## alpha2 4.365e-02   1.252e-01    0.349    0.727    
## beta1  1.000e-08   7.395e-01    0.000    1.000    
## beta2  6.298e-01   1.500e-01    4.200 2.67e-05 ***
## beta3  1.227e-01   4.494e-01    0.273    0.785    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  4010.115    normalized:  1.882683 
## 
## Description:
##  Thu Aug 08 14:09:04 2019 by user: Parvi 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value   
##  Jarque-Bera Test   R    Chi^2  4321.698  0         
##  Shapiro-Wilk Test  R    W      0.9164344 0         
##  Ljung-Box Test     R    Q(10)  18.36087  0.04917294
##  Ljung-Box Test     R    Q(15)  23.12752  0.08146964
##  Ljung-Box Test     R    Q(20)  30.45984  0.06273736
##  Ljung-Box Test     R^2  Q(10)  8.1595    0.6132605 
##  Ljung-Box Test     R^2  Q(15)  10.40759  0.7933486 
##  Ljung-Box Test     R^2  Q(20)  12.41391  0.9010932 
##  LM Arch Test       R    TR^2   8.400992  0.7530619 
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -3.758793 -3.740180 -3.758815 -3.751981

4) GARCH(3,3)

Method 1:

1. This method suggests that the alpha(2), beta(1) and beta(3) coefficients of GARCH(3,3) are insignificant at the 5% level of significance.
2. The Jarque Bera test for normality gives a p-value less than the 5% level of significance. Hence we reject the null hypothesis that the residuals are normally distributed.
3. The Box-Ljung test on the squared residuals gives a p-value greater than the 5% level of significance. Hence we fail to reject the null hypothesis that the squared residuals are uncorrelated.

#Method 1
m.33 = garch(m616_residuals,order=c(3,3),trace = FALSE)
summary(m.33) # alpha_2, beta_1 and beta_3 are insignificant

## 
## Call:
## garch(x = m616_residuals, order = c(3, 3), trace = FALSE)
## 
## Model:
## GARCH(3,3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -8.14551 -0.37874  0.05557  0.48234  4.87245 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)    
## a0 9.618e-05   2.004e-05    4.800 1.59e-06 ***
## a1 1.834e-01   1.950e-02    9.401  < 2e-16 ***
## a2 4.604e-02   4.898e-02    0.940  0.34723    
## a3 7.970e-02   2.740e-02    2.908  0.00363 ** 
## b1 4.830e-07   2.481e-01    0.000  1.00000    
## b2 5.429e-01   5.214e-02   10.412  < 2e-16 ***
## b3 1.350e-01   1.670e-01    0.808  0.41886    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 4373.8, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.08467, df = 1, p-value = 0.7711

#Method 2
m.33_2 = garchFit(formula = ~garch(3,3), data =m616_residuals, trace = FALSE)

## Warning in sqrt(diag(fit$cvar)): NaNs produced

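summary(m.33_2) # Only mu, alpha1 and beta2 have standard errors; alpha1 and beta2 are significant, the rest are NA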

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(3, 3), data = m616_residuals, trace = FALSE) 
## 
## Mean and Variance Equation:
##  data ~ garch(3, 3)
## <environment: 0x0000000020aba240>
##  [data = m616_residuals]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1      alpha2      alpha3       beta1  
## 5.0128e-04  7.2521e-05  1.8103e-01  4.3500e-02  3.6489e-03  1.0000e-08  
##      beta2       beta3  
## 6.2520e-01  1.2506e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.013e-04   6.741e-04    0.744    0.457    
## omega  7.252e-05          NA       NA       NA    
## alpha1 1.810e-01   2.641e-02    6.854 7.17e-12 ***
## alpha2 4.350e-02          NA       NA       NA    
## alpha3 3.649e-03          NA       NA       NA    
## beta1  1.000e-08          NA       NA       NA    
## beta2  6.252e-01   6.815e-02    9.173  < 2e-16 ***
## beta3  1.251e-01          NA       NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  4010.118    normalized:  1.882685 
## 
## Description:
##  Thu Aug 08 14:09:05 2019 by user: Parvi 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value   
##  Jarque-Bera Test   R    Chi^2  4311.587  0         
##  Shapiro-Wilk Test  R    W      0.9164449 0         
##  Ljung-Box Test     R    Q(10)  18.40059  0.04857062
##  Ljung-Box Test     R    Q(15)  23.16508  0.08069717
##  Ljung-Box Test     R    Q(20)  30.49756  0.0621826 
##  Ljung-Box Test     R^2  Q(10)  8.133648  0.6157842 
##  Ljung-Box Test     R^2  Q(15)  10.38602  0.7947816 
##  Ljung-Box Test     R^2  Q(20)  12.38786  0.9020795 
##  LM Arch Test       R    TR^2   8.380515  0.7547325 
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -3.757858 -3.736585 -3.757886 -3.750072
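
To compare the candidate fits side by side, the information criteria reported by the garchFit() summaries above can be collected in one table. The sketch below is in base R, with the AIC and BIC values copied by hand from those summaries. The data frame `ic` and the labels in it are introduced here for illustration.

```r
# AIC and BIC copied from the garchFit() summaries above
ic <- data.frame(
  model = c("GARCH(2,1)", "GARCH(3,2)", "GARCH(3,3)"),
  AIC   = c(-3.760538, -3.758793, -3.757858),
  BIC   = c(-3.747242, -3.740180, -3.736585),
  stringsAsFactors = FALSE
)

# Sort by AIC so the preferred (smallest) value comes first
ic <- ic[order(ic$AIC), ]
print(ic)

best_aic <- ic$model[which.min(ic$AIC)]
best_bic <- ic$model[which.min(ic$BIC)]
```

Both criteria are smallest for GARCH(2,1), though the differences between the three fits are small.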

RESIDUAL ANALYSIS

residual.analysis <- function(model, std = TRUE,start = 2, class = c("ARIMA","GARCH","ARMA-GARCH")[1]){
  # If you have an output from arima() function use class = "ARIMA"
  # If you have an output from garch() function use class = "GARCH"
  # If you have an output from ugarchfit() function use class = "ARMA-GARCH"
  # 'start' is the index of the first non-missing residual of a garch() fit
  library(TSA)
  library(FitAR)
  if (class == "ARIMA"){
    if (std == TRUE){
      res.model = rstandard(model)
    }else{
      res.model = residuals(model)
    }
  }else if (class == "GARCH"){
    res.model = model$residuals[start:model$n.used]
  }else if (class == "ARMA-GARCH"){
    res.model = model@fit$residuals
  }else {
    stop("The argument 'class' must be one of 'ARIMA', 'GARCH' or 'ARMA-GARCH'")
  }
  # Diagnostic plots: time series, histogram, ACF, PACF and normal QQ plot
  par(mfrow=c(3,2))
  plot(res.model,type='o',ylab='Standardised residuals', main="Time series plot of standardised residuals")
  abline(h=0)
  hist(res.model,main="Histogram of standardised residuals")
  acf(res.model,main="ACF of standardised residuals")
  pacf(res.model,main="PACF of standardised residuals")
  qqnorm(res.model,main="QQ plot of standardised residuals")
  qqline(res.model, col = 2)
  # Shapiro-Wilk normality test, then Ljung-Box p-values up to lag 30
  print(shapiro.test(res.model))
  k=0
  LBQPlot(res.model, lag.max = 30, StartLag = k + 1, k = 0, SquaredQ = FALSE)
}

residual.analysis(m.21,class="GARCH",start=3)

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.91784, p-value < 2.2e-16

## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length

residual.analysis(m.32,class="GARCH",start=4)

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.9155, p-value < 2.2e-16

## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length

residual.analysis(m.33,class="GARCH",start=4)

## 
##  Shapiro-Wilk normality test
## 
## data:  res.model
## W = 0.91556, p-value < 2.2e-16

## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length
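
The Ljung-Box panel drawn by LBQPlot() can also be approximated with base R, which may be useful if FitAR is not installed. The sketch below assumes the residuals are available as a numeric vector. `lb_pvalues` is a hypothetical helper name, and the simulated series `z` only stands in for model residuals.

```r
# Hypothetical helper: Ljung-Box p-values for lags 1..lag.max via stats::Box.test
lb_pvalues <- function(x, lag.max = 30) {
  x <- as.numeric(x)
  x <- x[!is.na(x)]  # garch() residuals start with NA values
  sapply(seq_len(lag.max), function(k) {
    Box.test(x, lag = k, type = "Ljung-Box")$p.value
  })
}

# Stand-in data: white noise in place of the model residuals
set.seed(123)
z <- rnorm(500)
p <- lb_pvalues(z, lag.max = 30)
print(round(head(p), 3))
```

With a fitted model one would pass its residuals instead, for example `lb_pvalues(m.21$residuals)`. Plotting the result with `plot(p, ylim = c(0, 1))` and `abline(h = 0.05, lty = 2)` gives a display similar in spirit to the LBQPlot() panel.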

Selected Model

On the basis of residual analysis, AIC values, BIC values and significant parameters, GARCH(2,1) is selected.

1. Overfitting is done to check for any anomalies in goodness of fit. After fitting the selected model, we fit a slightly more general model by increasing one order at a time and checking the significance of the coefficients.
2. In the more general model, the newly added parameter should be insignificant, whereas the original parameters should remain significant.
3. The selected model is GARCH(2,1).
4. Two possible models to check for overfitting: GARCH(2,2) and GARCH(3,1).

1)GARCH(2,2)

As mentioned above:

Method 1:

This method suggests that the alpha(2) coefficient of GARCH(2,2) is insignificant at the 5% level of significance.

Method 2:

This method suggests that the alpha(2) coefficient of GARCH(2,2) is insignificant at the 5% level of significance.

CONCLUSION: The added alpha(2) term is insignificant under both methods, which indicates that GARCH(2,2) overfits relative to GARCH(2,1).

2)GARCH(3,1)

Method 1:

This method suggests that the beta(3) coefficient of GARCH(3,1) is insignificant at the 5% level of significance.

#Method1
m.31 = garch(m616_residuals,order=c(3,1),trace = FALSE)
summary(m.31) # The added beta_3 term is insignificant

## 
## Call:
## garch(x = m616_residuals, order = c(3, 1), trace = FALSE)
## 
## Model:
## GARCH(3,1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -8.17252 -0.38375  0.05586  0.48451  4.87090 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)    
## a0 8.336e-05   9.289e-06    8.974  < 2e-16 ***
## a1 2.292e-01   2.225e-02   10.301  < 2e-16 ***
## b1 3.027e-01   7.332e-02    4.129 3.65e-05 ***
## b2 4.443e-01   6.521e-02    6.813 9.54e-12 ***
## b3 2.993e-03   6.507e-02    0.046    0.963    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 4579.5, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.045072, df = 1, p-value = 0.8319

Method 2:

This method suggests that the beta(3) coefficient of GARCH(3,1) is insignificant at the 5% level of significance.

CONCLUSION: The added beta(3) term is insignificant under both methods, which indicates that GARCH(3,1) overfits relative to GARCH(2,1).

#Method2
m.31_2 = garchFit(formula = ~garch(1,3), data =m616_residuals, trace = FALSE)
summary(m.31_2)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 3), data = m616_residuals, trace = FALSE) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 3)
## <environment: 0x000000002878a9c0>
##  [data = m616_residuals]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1       beta1       beta2       beta3  
## 0.00050551  0.00005817  0.18374786  0.21633322  0.58237646  0.00000001  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.055e-04   6.754e-04    0.748   0.4542    
## omega  5.817e-05   1.269e-05    4.582 4.60e-06 ***
## alpha1 1.837e-01   2.516e-02    7.305 2.78e-13 ***
## beta1  2.163e-01   9.916e-02    2.182   0.0291 *  
## beta2  5.824e-01   5.798e-02   10.045  < 2e-16 ***
## beta3  1.000e-08   7.902e-02    0.000   1.0000    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  4009.58    normalized:  1.882432 
## 
## Description:
##  Thu Aug 08 14:09:06 2019 by user: Parvi 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value   
##  Jarque-Bera Test   R    Chi^2  4307.675  0         
##  Shapiro-Wilk Test  R    W      0.916383  0         
##  Ljung-Box Test     R    Q(10)  18.25678  0.05078356
##  Ljung-Box Test     R    Q(15)  23.03634  0.08337128
##  Ljung-Box Test     R    Q(20)  30.37513  0.063999  
##  Ljung-Box Test     R^2  Q(10)  8.124235  0.6167032 
##  Ljung-Box Test     R^2  Q(15)  10.31627  0.7993879 
##  Ljung-Box Test     R^2  Q(20)  12.35116  0.9034594 
##  LM Arch Test       R    TR^2   8.393505  0.7536731 
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -3.759230 -3.743275 -3.759245 -3.753390

Forecasts

FINAL MODEL: ARIMA(6,1,6) + GARCH(2,1)

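From the forecast plot, the variation in the bitcoin series over the forecast period is expected to stay around the mean level. The mean forecast is constant at the estimated mu, and the forecast standard deviation changes only gradually, from about 0.04 to about 0.05 over the 100 steps.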

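par(mfrow=c(1,1))
# fitted() returns the conditional standard deviations; squaring gives the conditional variances
plot((fitted(m.21)[,1])^2,type='l',ylab='Conditional Variance',xlab='t',main="Plot 23: Estimated Conditional Variances of the Bitcoin Series")

# Forecast the next 100 steps from the garchFit() model, with prediction intervals
fGarch::predict(m.21_2,n.ahead=100,trace=FALSE,plot=TRUE)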

##     meanForecast  meanError standardDeviation lowerInterval upperInterval
## 1   0.0005042219 0.04642390        0.04642390   -0.09048496    0.09149340
## 2   0.0005042219 0.03722354        0.03722354   -0.07245257    0.07346101
## 3   0.0005042219 0.04314975        0.04314975   -0.08406774    0.08507618
## 4   0.0005042219 0.04017209        0.04017209   -0.07823163    0.07924008
## 5   0.0005042219 0.04226571        0.04226571   -0.08233504    0.08334349
## 6   0.0005042219 0.04140250        0.04140250   -0.08064319    0.08165163
## 7   0.0005042219 0.04223642        0.04223642   -0.08227764    0.08328608
## 8   0.0005042219 0.04208140        0.04208140   -0.08197381    0.08298226
## 9   0.0005042219 0.04249237        0.04249237   -0.08277930    0.08378775
## 10  0.0005042219 0.04257066        0.04257066   -0.08293275    0.08394119
## 11  0.0005042219 0.04283635        0.04283635   -0.08345349    0.08446193
## 12  0.0005042219 0.04298875        0.04298875   -0.08375218    0.08476062
## 13  0.0005042219 0.04320194        0.04320194   -0.08417002    0.08517846
## 14  0.0005042219 0.04337530        0.04337530   -0.08450981    0.08551825
## 15  0.0005042219 0.04356716        0.04356716   -0.08488584    0.08589428
## 16  0.0005042219 0.04374385        0.04374385   -0.08523216    0.08624060
## 17  0.0005042219 0.04392496        0.04392496   -0.08558711    0.08659556
## 18  0.0005042219 0.04409923        0.04409923   -0.08592869    0.08693713
## 19  0.0005042219 0.04427327        0.04427327   -0.08626979    0.08727823
## 20  0.0005042219 0.04444333        0.04444333   -0.08660311    0.08761156
## 21  0.0005042219 0.04461167        0.04461167   -0.08693305    0.08794149
## 22  0.0005042219 0.04477706        0.04477706   -0.08725720    0.08826564
## 23  0.0005042219 0.04494028        0.04494028   -0.08757711    0.08858555
## 24  0.0005042219 0.04510096        0.04510096   -0.08789203    0.08890047
## 25  0.0005042219 0.04525938        0.04525938   -0.08820254    0.08921098
## 26  0.0005042219 0.04541545        0.04541545   -0.08850843    0.08951688
## 27  0.0005042219 0.04556930        0.04556930   -0.08880997    0.08981841
## 28  0.0005042219 0.04572091        0.04572091   -0.08910712    0.09011557
## 29  0.0005042219 0.04587037        0.04587037   -0.08940004    0.09040849
## 30  0.0005042219 0.04601768        0.04601768   -0.08968877    0.09069721
## 31  0.0005042219 0.04616289        0.04616289   -0.08997339    0.09098183
## 32  0.0005042219 0.04630605        0.04630605   -0.09025397    0.09126242
## 33  0.0005042219 0.04644719        0.04644719   -0.09053060    0.09153904
## 34  0.0005042219 0.04658634        0.04658634   -0.09080333    0.09181178
## 35  0.0005042219 0.04672355        0.04672355   -0.09107225    0.09208069
## 36  0.0005042219 0.04685883        0.04685883   -0.09133740    0.09234585
## 37  0.0005042219 0.04699224        0.04699224   -0.09159887    0.09260731
## 38  0.0005042219 0.04712379        0.04712379   -0.09185671    0.09286515
## 39  0.0005042219 0.04725352        0.04725352   -0.09211098    0.09311943
## 40  0.0005042219 0.04738147        0.04738147   -0.09236175    0.09337020
## 41  0.0005042219 0.04750766        0.04750766   -0.09260908    0.09361752
## 42  0.0005042219 0.04763212        0.04763212   -0.09285302    0.09386146
## 43  0.0005042219 0.04775488        0.04775488   -0.09309363    0.09410207
## 44  0.0005042219 0.04787597        0.04787597   -0.09333096    0.09433941
## 45  0.0005042219 0.04799542        0.04799542   -0.09356508    0.09457352
## 46  0.0005042219 0.04811326        0.04811326   -0.09379603    0.09480447
## 47  0.0005042219 0.04822950        0.04822950   -0.09402386    0.09503230
## 48  0.0005042219 0.04834418        0.04834418   -0.09424863    0.09525707
## 49  0.0005042219 0.04845732        0.04845732   -0.09447038    0.09547882
## 50  0.0005042219 0.04856895        0.04856895   -0.09468917    0.09569761
## 51  0.0005042219 0.04867909        0.04867909   -0.09490503    0.09591348
## 52  0.0005042219 0.04878776        0.04878776   -0.09511803    0.09612647
## 53  0.0005042219 0.04889499        0.04889499   -0.09532820    0.09633664
## 54  0.0005042219 0.04900080        0.04900080   -0.09553558    0.09654403
## 55  0.0005042219 0.04910521        0.04910521   -0.09574023    0.09674867
## 56  0.0005042219 0.04920825        0.04920825   -0.09594218    0.09695062
## 57  0.0005042219 0.04930994        0.04930994   -0.09614148    0.09714992
## 58  0.0005042219 0.04941029        0.04941029   -0.09633816    0.09734660
## 59  0.0005042219 0.04950933        0.04950933   -0.09653227    0.09754072
## 60  0.0005042219 0.04960707        0.04960707   -0.09672385    0.09773229
## 61  0.0005042219 0.04970354        0.04970354   -0.09691293    0.09792138
## 62  0.0005042219 0.04979876        0.04979876   -0.09709956    0.09810800
## 63  0.0005042219 0.04989275        0.04989275   -0.09728377    0.09829221
## 64  0.0005042219 0.04998552        0.04998552   -0.09746559    0.09847404
## 65  0.0005042219 0.05007709        0.05007709   -0.09764507    0.09865351
## 66  0.0005042219 0.05016748        0.05016748   -0.09782224    0.09883068
## 67  0.0005042219 0.05025671        0.05025671   -0.09799712    0.09900557
## 68  0.0005042219 0.05034480        0.05034480   -0.09816977    0.09917821
## 69  0.0005042219 0.05043175        0.05043175   -0.09834020    0.09934864
## 70  0.0005042219 0.05051760        0.05051760   -0.09850846    0.09951690
## 71  0.0005042219 0.05060235        0.05060235   -0.09867457    0.09968301
## 72  0.0005042219 0.05068602        0.05068602   -0.09883856    0.09984700
## 73  0.0005042219 0.05076863        0.05076863   -0.09900047    0.10000891
## 74  0.0005042219 0.05085019        0.05085019   -0.09916033    0.10016877
## 75  0.0005042219 0.05093072        0.05093072   -0.09931816    0.10032660
## 76  0.0005042219 0.05101023        0.05101023   -0.09947400    0.10048244
## 77  0.0005042219 0.05108874        0.05108874   -0.09962787    0.10063631
## 78  0.0005042219 0.05116626        0.05116626   -0.09977980    0.10078825
## 79  0.0005042219 0.05124280        0.05124280   -0.09992982    0.10093827
## 80  0.0005042219 0.05131838        0.05131838   -0.10007796    0.10108641
## 81  0.0005042219 0.05139302        0.05139302   -0.10022425    0.10123269
## 82  0.0005042219 0.05146672        0.05146672   -0.10036870    0.10137714
## 83  0.0005042219 0.05153950        0.05153950   -0.10051135    0.10151979
## 84  0.0005042219 0.05161138        0.05161138   -0.10065222    0.10166066
## 85  0.0005042219 0.05168235        0.05168235   -0.10079133    0.10179977
## 86  0.0005042219 0.05175245        0.05175245   -0.10092871    0.10193716
## 87  0.0005042219 0.05182167        0.05182167   -0.10106439    0.10207283
## 88  0.0005042219 0.05189004        0.05189004   -0.10119839    0.10220683
## 89  0.0005042219 0.05195756        0.05195756   -0.10133072    0.10233917
## 90  0.0005042219 0.05202424        0.05202424   -0.10146142    0.10246987
## 91  0.0005042219 0.05209011        0.05209011   -0.10159051    0.10259895
## 92  0.0005042219 0.05215516        0.05215516   -0.10171801    0.10272645
## 93  0.0005042219 0.05221941        0.05221941   -0.10184393    0.10285238
## 94  0.0005042219 0.05228286        0.05228286   -0.10196831    0.10297675
## 95  0.0005042219 0.05234554        0.05234554   -0.10209116    0.10309960
## 96  0.0005042219 0.05240746        0.05240746   -0.10221250    0.10322095
## 97  0.0005042219 0.05246861        0.05246861   -0.10233236    0.10334081
## 98  0.0005042219 0.05252901        0.05252901   -0.10245075    0.10345920
## 99  0.0005042219 0.05258868        0.05258868   -0.10256770    0.10357614
## 100 0.0005042219 0.05264762        0.05264762   -0.10268322    0.10369166
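
The standardDeviation column above can be checked by hand. Beyond the first two steps, the variance forecast of a model with one ARCH and two GARCH terms follows sigma2[h] = omega + (alpha1 + beta1) * sigma2[h-1] + beta2 * sigma2[h-2], because the expected squared shock at a future step equals that step's forecast variance. The sketch below is in base R and uses the rounded Method 2 estimates, with the first two forecast rows as starting values. The variable names are introduced here for illustration.

```r
# Method 2 estimates copied from the garchFit() output (rounded as printed)
omega  <- 5.8289e-05
alpha1 <- 1.8390e-01
beta1  <- 2.2383e-01
beta2  <- 5.7474e-01

h_max  <- 10
sigma2 <- numeric(h_max)

# Starting values: squared standardDeviation for steps 1 and 2 of the table
sigma2[1:2] <- c(0.04642390, 0.03722354)^2

# Recursion for steps 3..h_max
for (h in 3:h_max) {
  sigma2[h] <- omega + (alpha1 + beta1) * sigma2[h - 1] + beta2 * sigma2[h - 2]
}

sd_forecast <- sqrt(sigma2)
print(round(sd_forecast, 4))
```

Steps 3, 4 and 5 of this recursion give roughly 0.0431, 0.0402 and 0.0423, in line with the corresponding rows of the forecast table.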

Time Series Project

Parvi

May 27, 2019
