# Seed for random number generation
set.seed(42)
knitr::opts_chunk$set(cache.extra = knitr::rand_seed)
Introduction
Microstructure noise (MN) can bias estimates that are sensitive to omitted variables. In the price discovery literature, the information share components are the focus of , together with how HFT and non-HFT markets deal with price discovery. The identification strategy uses limit orders to estimate the contribution of each kind of trade to price discovery but, once again, does not deal with the endogeneity problem. We can reproduce these results using a different database and revisit the findings of this literature.
A VAR model can lead to wrong and misleading interpretations. The current literature on this subject provides such evidence, . The main theoretical support for the proper identification of a VAR rests on its reduced form and on the exogeneity of its shocks. However, as shown by DFS (2019), a persistent error remains that derives from the market microstructure. Their solution relies on an IV-based estimator, and the paper establishes its asymptotic theory. Using Monte Carlo simulations, the authors compare the standard methodology with their proposed estimator. The results suggest an improvement, with bias reduced at every trading frequency.
This project is aimed at the Brazilian stock market, with the objective of measuring price discovery. It is divided into an introduction and justification; a second chapter with the literature overview; a third chapter on methodology; a fourth chapter on the database and simulations; and the project schedule.
Literature Overviews
Market microstructure may affect price formation. There are two traditional ways to measure this. The first derives from Hasbrouck (1995) and targets the information share (IS), which relates to the impact on the efficient price. If MN is neglected, the estimate of the efficient price is contaminated by the noise. The second is the component share (CS), whose analysis focuses on how fast markets are affected by price innovations. Using HFT data can be one way to achieve this purpose, but MN increases with frequency, so the component share analysis of price formation carries a great deal of noise (DFS, 2019).
MN has two consequences: the inconsistency of the LS estimators and the inconsistency of the variance-covariance matrix of the shocks (which is a function of MN). The latter is easily explained: the standard method ignores the variance-covariance of MN, so it typically estimates a lower-bound matrix, which can produce spuriously significant estimates and misleading conclusions.
By modelling a continuous-time vector equilibrium-correction (VEC) model, DFS (2019) argue that increasing the frequency does not reduce the bias. DFS (2016) showed that the CS is not affected by frequency; on the contrary, it is affected by the number of markets. A common belief is that the CS measure can be properly estimated by reducing the frequency. But the error may persist, and IV-based estimators are necessary in order to produce unbiased estimates. The main and disruptive argument is that the difference between the price of an asset and the price of its derivative can be used as a vector of instrumental variables.
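As a rough illustration of the information share idea, the sketch below computes Hasbrouck-type shares for two markets. The values of `psi` (the common row of the long-run impact matrix) and `Omega` (the covariance matrix of the reduced-form innovations) are hypothetical, and the function `information_share` is a helper written for this example, not part of any package. Because the innovations are correlated, the shares depend on the ordering used in the Cholesky factorisation, so the two orderings give an upper and a lower bound for each market.

```r
# Hypothetical inputs (assumptions made for illustration only)
psi   <- c(0.8, 0.2)                       # long-run impact of each market's innovation
Omega <- matrix(c(1.0, 0.3,
                  0.3, 0.5), nrow = 2, byrow = TRUE)

information_share <- function(psi, Omega, ordering = seq_along(psi)) {
  # Lower-triangular Cholesky factor of the reordered covariance matrix
  f_low   <- t(chol(Omega[ordering, ordering]))
  # Contribution of each orthogonalised innovation to the efficient-price variance
  contrib <- as.vector(psi[ordering] %*% f_low)^2
  share   <- contrib / sum(contrib)
  # Return the shares in the original market order
  share[match(seq_along(psi), ordering)]
}

is_market1_first <- information_share(psi, Omega, ordering = c(1, 2))
is_market1_last  <- information_share(psi, Omega, ordering = c(2, 1))

# Bounds on the information share of market 1
bounds_market1 <- c(lower = is_market1_last[1], upper = is_market1_first[1])
print(bounds_market1)
```

The gap between the two bounds widens as the correlation between the innovations grows, which is one reason why noise in the estimated covariance matrix matters for this measure.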
The price discovery literature points to several determinants of prices, such as speed and lower trading costs , which tests for cointegration between the prices of an asset across markets but fails to assume any cointegration relationship. concluded, using an AR model, that the seller's side is more informative for the price. In contrast, found, after Johansen's test, two cointegrating vectors for the same price vector of IBM in three different markets. used a VAR model to test whether HFTs produce better market quality, contributing positively to the price discovery process and providing more liquidity. applied an analogous methodology to evaluate HFT in the UK stock market and found that HFTs contribute about 6% of the total information through their aggressive trades, and that across all groups (aggressive, neutral and passive) the total trade-induced information contribution is 14%. None of these papers accounts for a specific common noise factor, so it is possible that all their conclusions are driven by inconsistent estimates.
Further literature on price discovery in stock markets has found that HFTs' limit orders have twice the influence on price discovery that market orders have and . However, in non-HFT markets, ordinary orders are more informative than limit orders. Comparing HFT with non-HFT markets, these are more informative than those with respect to price discovery, and limit orders play a larger role. Even orders that are submitted and later cancelled are informative, accounting for 30% and 15% of price discovery for HFTs and non-HFTs, respectively. One example of a limit order's role is when prices rise and a limit buy order is not executed but still contributes to price discovery.
The key issue for HFTs in choosing between limit and market orders is whether or not to pay the bid-ask spread. This spread is inversely related to the liquidity and volume of the asset, and it is related to the uncertainty of execution. As the gap between buyers and sellers narrows, market orders become more attractive in comparison with limit orders.
There is a certain profile of traders that prefer limit orders. They are fast traders with no intrinsic motivation to trade, and their preferences are related to lower adverse selection and . discuss the possibility of market failure, the case in which the market cannot open because of a lack of liquidity caused by the expected losses of dealers. These phenomena are related to adverse selection and to an increase in volatility; because of them, users reduce HFT limit orders, which reduces the contribution of those orders to price discovery. The conclusion is that HFTs' market orders play a smaller role than limit orders in the price discovery process, and no evidence is found that HFTs use their speed advantage over non-HFTs' orders. These results are consistent with the fact that volatility typically decreases market stability. However, this VAR-based modelling does not account for the microstructure noise described by . This noise could affect both markets and both types of orders, and a simple VAR cannot deal with it properly. It may be possible to reverse this conclusion using the (DIAS, FERNANDES AND SCHERRER, 2019) modelling.
On the other hand, a wide class of analyses is concerned with microstructure noise. introduced HFT into the Grossman and Miller (1988) model, using rapid information processing and quick execution ability as stylized facts. The argument that supports its conclusion is that HF trading strategies introduce "microstructure noise": in order to profit from intermediation, HFTs buy shares from one trader at a cheap price and sell them more dearly to another trader, generating price dispersion where before there was only a single price. The novelty of this topic has given rise to significant debate about the impact of this technique. At high frequency, forecasting opportunities that are different from those present at lower frequencies appear, calling for new strategies and a new generation of trading algorithms. New risks associated with the speed of HFT emerge. The notion of interaction between algorithms becomes critical, requiring the careful design of electronic markets.
Fabozzi and Focardi (2011) present five approaches to estimating price formation, taking an Itô process as the starting point. The worst are the estimators that ignore the microstructure noise: their realized volatility diverges to infinity as the sample size increases. uses an OU process, with a time-dependent reaction to shocks in the discrete case. When the process is contaminated by noise and the methodology ignores the common noise, the evidence from Monte Carlo simulations and from the data points to a biased estimator and to possibly wrong conclusions derived from such non-robust methods.
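The divergence of realized volatility when the noise is ignored can be seen in a small simulation. The sketch below is only illustrative: it assumes a driftless Brownian efficient log price observed once per second with i.i.d. Gaussian noise, and the values of `n`, `sigma` and `omega` are arbitrary choices, not calibrated to any market.

```r
set.seed(42)

n     <- 23400      # assumed: one observation per second in a 6.5-hour session
sigma <- 0.01       # assumed daily volatility of the efficient log price
omega <- 0.0005     # assumed standard deviation of the i.i.d. noise

efficient <- cumsum(rnorm(n, sd = sigma / sqrt(n)))  # efficient log price
observed  <- efficient + rnorm(n, sd = omega)        # price contaminated by noise

# Realized variance from returns sampled every `step` observations
realized_variance <- function(x, step) {
  idx <- seq(1, length(x), by = step)
  sum(diff(x[idx])^2)
}

steps <- c(1, 5, 30, 300)   # sampling intervals in seconds
rv_observed  <- sapply(steps, function(s) realized_variance(observed, s))
rv_efficient <- sapply(steps, function(s) realized_variance(efficient, s))

print(data.frame(step = steps, rv_observed, rv_efficient, true = sigma^2))
```

Under these assumptions the realized variance of the efficient price stays close to `sigma^2` at every sampling interval, while the realized variance of the observed price grows as the number of sampled returns increases, because each return adds roughly `2 * omega^2` of noise variance.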
Methodology
A simple intuition for the VEC is the following: suppose that a security is traded in two separate markets, \[\begin{align}
p_{1,t} = p_{1,t-1}+w_{t} \nonumber \\
p_{2,t} = p_{1,t-2}+E_{t}
\end{align}\]
The proper way to handle this problem is to introduce the VEC notation, given by:
\[\begin{align}
\Delta p_{2,t}= p_{1,t-2} - p_{2,t-1} + E_{t}
\end{align}\]
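The two-market example can be checked numerically. The sketch below simulates the two price equations with standard normal shocks (an assumption made here for illustration) and verifies that the change in the second price equals the equilibrium-correction term plus the shock, so that a regression of the price change on that term gives a coefficient close to one.

```r
set.seed(42)

n  <- 5000
w  <- rnorm(n)            # shocks to market 1 (assumed standard normal)
e  <- rnorm(n)            # shocks to market 2 (assumed standard normal)
p1 <- cumsum(w)                             # p_{1,t} = p_{1,t-1} + w_t
p2 <- c(NA, NA, p1[1:(n - 2)]) + e          # p_{2,t} = p_{1,t-2} + E_t

t_idx <- 4:n                                # first dates with all lags available
dp2   <- p2[t_idx] - p2[t_idx - 1]          # change in the second price
ec    <- p1[t_idx - 2] - p2[t_idx - 1]      # equilibrium-correction term

# The VEC form holds as an identity: dp2 = ec + E_t
identity_gap <- max(abs(dp2 - (ec + e[t_idx])))

# Regression without intercept: the loading on the correction term is close to 1
fit <- lm(dp2 ~ 0 + ec)
print(coef(fit))
```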
The relationship between these two markets can be extended to an M-dimensional VEC(0) model:
\[\begin{align}
\Delta p_{t_{i}} = \alpha_{\delta}\beta'p_{t_{i-1}} + v_{t_{i}}
\end{align}\]
where \(\Delta p_{t_{i}}\) is an \(M \times 1\) vector of observed price changes, and \(v_{t_{i}}\) is a linear combination of microstructure noise at 1 lag.
Now, \(Z_{k,t_{i-q-k}} \equiv \left( \beta ^{'}p_{t_{i-q-k}}, \ldots , \beta ^{'}p_{t_{i-q-k}} \right) ^{'}\) can be used as an instrument for \(\beta ^{'}p_{t_{i-1}}\). If \(Z\) is not a strong instrument under the MMN(CS) assumptions, the alternative is to use a GMM-based estimator. The Monte Carlo simulations show that the IV-based estimator has the lowest bias of all. The same conclusion applies to the IRFs.
The intuition behind the VEC(0) structure is that microstructure noise is an error component of both HFT and non-HFT market structures, alongside a long-run relationship. In every market, therefore, this error can bias the estimators of price formation and lead to controversial conclusions. This methodology may solve that problem.
The test for valid instruments relies on lagged values of the price vector. For weak instruments, the CU-GMM estimator, defined as \(\arg\min g^{'} \Psi g\), performs best among all IV-based estimators in the Monte Carlo simulations presented by (DIAS, FERNANDES AND SCHERRER, 2019).
The econometric framework is driven by , and we will reproduce empirical results that will be compared with existing estimates, such as , and with other results for the Brazilian market.
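A minimal simulation can illustrate why a lagged equilibrium-correction term helps. The sketch below is a simplified setting, not the estimator of the cited paper: it assumes two markets, a known cointegrating vector `beta = (1, -1)`, efficient prices that follow a VEC(0), i.i.d. Gaussian noise added to each price, and a single instrument, the correction term lagged twice. All parameter values are arbitrary.

```r
set.seed(42)

n     <- 20000
alpha <- c(-0.1, 0.1)     # assumed true adjustment coefficients
beta  <- c(1, -1)         # assumed known cointegrating vector

eps <- matrix(rnorm(2 * n, sd = 0.1), n, 2)   # efficient-price shocks
u   <- matrix(rnorm(2 * n, sd = 0.1), n, 2)   # microstructure noise

# Efficient prices follow a VEC(0); observed prices add the noise
m <- matrix(0, n, 2)
for (i in 2:n) {
  m[i, ] <- m[i - 1, ] + alpha * sum(beta * m[i - 1, ]) + eps[i, ]
}
p  <- m + u
ec <- as.vector(p %*% beta)                   # beta' p_t

idx <- 3:n
dp  <- p[idx, ] - p[idx - 1, ]                # observed price changes
x   <- ec[idx - 1]                            # regressor: beta' p_{t-1}
z   <- ec[idx - 2]                            # instrument: beta' p_{t-2}

# Least squares is contaminated because x and the error share the lagged noise
alpha_ols <- as.vector(crossprod(x, dp) / sum(x^2))
# Simple IV estimator, equation by equation
alpha_iv  <- as.vector(crossprod(z, dp) / sum(z * x))

print(rbind(true = alpha, ols = alpha_ols, iv = alpha_iv))
```

In this design the error of the observed-price equation contains the noise at dates t and t-1, so the regressor dated t-1 is endogenous while the term dated t-2 is not. With noise that is dependent over more lags, the instrument would have to be lagged further.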
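The continuously updated objective can be sketched for a single equation. The example below is a simplified illustration under stated assumptions, not the implementation of the cited paper: the data come from an assumed two-market design with i.i.d. noise, the instruments are three lags of the correction term, and the weighting matrix is the plain inverse covariance of the moment contributions, re-evaluated at each candidate value, without any correction for serial correlation.

```r
set.seed(42)

# Assumed data-generating process: VEC(0) efficient prices plus i.i.d. noise
n     <- 20000
alpha <- c(-0.1, 0.1)
beta  <- c(1, -1)
eps   <- matrix(rnorm(2 * n, sd = 0.1), n, 2)
u     <- matrix(rnorm(2 * n, sd = 0.1), n, 2)
m     <- matrix(0, n, 2)
for (i in 2:n) {
  m[i, ] <- m[i - 1, ] + alpha * sum(beta * m[i - 1, ]) + eps[i, ]
}
p  <- m + u
ec <- as.vector(p %*% beta)

idx <- 5:n
y   <- p[idx, 1] - p[idx - 1, 1]                   # price change in market 1
x   <- ec[idx - 1]                                 # endogenous regressor
Z   <- cbind(ec[idx - 2], ec[idx - 3], ec[idx - 4])  # lagged instruments

# CU-GMM objective g' Psi g, with Psi recomputed at every candidate value a
cu_objective <- function(a) {
  mom <- Z * (y - a * x)          # moment contributions, one row per date
  g   <- colMeans(mom)
  Psi <- solve(cov(mom))          # simplification: no HAC correction
  drop(crossprod(g, Psi %*% g))
}

# Coarse grid search followed by a local refinement
grid    <- seq(-1, 1, by = 0.01)
a_start <- grid[which.min(sapply(grid, cu_objective))]
a_hat   <- optimize(cu_objective, interval = a_start + c(-0.01, 0.01))$minimum
print(a_hat)
```

The grid search is used because the continuously updated objective is not guaranteed to have a single minimum, so a purely local optimiser could stop at the wrong point.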
Data Base and Simulations
For this project, we will use BM&FBovespa data. We will build a strategy related to that of , but the composition will rely on the common factor components of shares. explain this difference. derives the relationship between CS and IS. Our model will use the DFS (2019) CU-GMM estimator, taking as the IV the difference between the trades of two assets, the mini and the full contract of IBOV shares.
The estimation answers three questions: (1) Which is the leading market? (2) Which trading strategy is more informative? (3) Which kind of information-based trade leads the price formation process?
Using HFT data on assets or derivatives, we can better understand how the price discovery process is influenced. We compare the mini contracts of the Ibovespa with the full asset and ask which has more influence on price discovery. We will build an IV-based method that uses as its instrument the difference between the prices in the two markets, together with a VEC model that captures the long-run relationship between the asset and the derivative. The question of which of the two has more influence on the price is answered through the IRF.
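To make the role of the IRF concrete, the sketch below computes the responses of two prices to unit reduced-form shocks in a VEC(0) and compares the long-run response with the component shares obtained from the vector orthogonal to the adjustment coefficients. The values of `alpha` and `beta` are hypothetical, the shocks are not orthogonalised, and the helper `irf` is written for this example only.

```r
# Hypothetical adjustment coefficients and cointegrating vector
alpha <- c(-0.05, 0.2)
beta  <- c(1, -1)

# Level representation of the VEC(0): p_t = (I + alpha beta') p_{t-1} + v_t
A <- diag(2) + alpha %o% beta

# Response of the price vector at horizon h to unit shocks at horizon 0
irf <- function(h) {
  out <- diag(2)
  for (i in seq_len(h)) out <- out %*% A
  out
}

short_run <- irf(1)
long_run  <- irf(200)

# Component shares from the vector orthogonal to alpha, normalised to sum to 1
alpha_perp      <- c(alpha[2], -alpha[1])
component_share <- alpha_perp / sum(alpha_perp)

print(long_run)
print(component_share)
```

In the long run both prices respond identically to a given shock, and each column of the long-run response equals the component share of the market where the shock originated. Under these assumed values the first market adjusts less to the price gap and therefore carries the larger share.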


Creating the dummies for the trading periods
Model
References
r_refs(file = "r-references.bib")