Group Assignment 3 for the course Financial Markets at the University of Amsterdam

(a) Compute the correlation matrix

setwd("d:/my/documents/UvA/Financial Markets/Assignment 3")
data2 <- read.csv("Ch3_ex12_data_newlyadjusted.csv")

corMatrix<-round(cor(data2[,c("p", "vp","mktcap","bas100","ami100","ret", "vola")]),digits = 3)
corMatrix

##             p     vp mktcap bas100 ami100    ret   vola
## p       1.000  0.515  0.395 -0.210 -0.088 -0.077 -0.251
## vp      0.515  1.000  0.897 -0.087 -0.051 -0.017 -0.079
## mktcap  0.395  0.897  1.000 -0.078 -0.045 -0.036 -0.098
## bas100 -0.210 -0.087 -0.078  1.000  0.596 -0.077  0.339
## ami100 -0.088 -0.051 -0.045  0.596  1.000  0.055  0.168
## ret    -0.077 -0.017 -0.036 -0.077  0.055  1.000  0.308
## vola   -0.251 -0.079 -0.098  0.339  0.168  0.308  1.000

library(ellipse)
plotcorr(corMatrix, mar = c(0.1, 0.1, 0.1, 0.1))

The correlation between the bid-ask spread and the Amihud ratio is high (0.596), which is to be expected since both are measures of illiquidity.
Market capitalization is highly correlated with trading volume (0.897), which is natural, since the biggest companies are traded the most. However, the two variables never appear together in one regression, so the high correlation is not a problem here.
The correlation of volatility with the other variables is relatively low (at most 0.339 in absolute value), so multicollinearity should not be a serious concern when volatility enters alongside them. Covariates should be as weakly correlated with each other as possible; otherwise the standard errors are inflated and the individual coefficient estimates become unreliable.
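
The multicollinearity concern can be quantified with variance inflation factors. The following is a minimal sketch in base R on simulated data; the data frame `sim` and the helper `vif_manual` are hypothetical and not part of the assignment data, although the same helper could be pointed at the covariate columns of `data2`.

```r
# Hypothetical illustration: variance inflation factors computed by hand
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)  # strongly correlated with x1
x3 <- rnorm(n)                               # unrelated to x1 and x2
sim <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# covariate j on all the other covariates
vif_manual <- function(df) {
  sapply(names(df), function(v) {
    f  <- reformulate(setdiff(names(df), v), response = v)
    r2 <- summary(lm(f, data = df))$r.squared
    1 / (1 - r2)
  })
}

vifs <- vif_manual(sim)
print(round(vifs, 2))
```

A VIF close to 1 indicates a covariate that is nearly uncorrelated with the others; values far above 1 signal inflated standard errors.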

(b) Estimate OLS regressions in which the dependent variable is the bid-ask spread (bas100), in six specifications

model_b.1 <- lm(bas100 ~ vola + mktcap, data=data2)
model_b.2 <- lm(bas100 ~ vola + vp, data=data2)
model_b.3 <- lm(bas100 ~ vola + log(mktcap), data=data2)
model_b.4 <- lm(bas100 ~ vola + log(vp), data=data2)
model_b.5 <- lm(bas100 ~ vola + I(vo/ibnosh), data=data2)
model_b.6 <- lm(bas100 ~ vola + I(vo/ibnosh) + log(p), data=data2)

	bas100 (1)	bas100 (2)	bas100 (3)	bas100 (4)	bas100 (5)	bas100 (6)
(Intercept)	-0.117211	-0.102987	6.332807***	5.69191***	0.037093	2.587469***
	(0.197726)	(0.197249)	(0.375668)	(0.291486)	(0.195262)	(0.286039)
vola	40.717996***	40.67673***	20.31444***	24.607052***	42.229881***	24.546809***
	(3.429569)	(3.420528)	(3.149655)	(2.864974)	(3.367864)	(3.524644)
mktcap	-2.00E-05
	(1.20E-05)
vp		-2.00E-06**
		(1.00E-06)
log(mktcap)			-1.048985***
			(0.054118)
log(vp)				-0.758604***
				(0.031632)
I(vo/ibnosh)					-0.030235***	-0.020011***
					(0.005061)	(0.004862)
log(p)						-0.97891***
						(0.0839)
R^2	0.116802	0.118429	0.336379	0.414219	0.141974	0.234667
adj.R^2	0.115231	0.116861	0.3352	0.413177	0.140448	0.232624
N	1128	1128	1128	1128	1128	1128
Standard errors in parentheses

All the variables have the expected signs, although the level of mktcap in (1) carries no significance stars.
The higher the volatility, the higher the bid-ask spread, and the effect is strong and significant in every specification. That is to be expected, since dealers widen the spread when uncertainty is greater.
The higher the market capitalization and trading volume, the lower the bid-ask spread. This makes sense: the stocks of large corporations are more liquid, hence the lower spread. For trading volume the intuition is similar: higher volume goes with a lower spread.
The higher the turnover rate, the lower the bid-ask spread.
The log specifications (3) and (4) fit much better than the level specifications (1) and (2) (R^2 of 0.34 and 0.41 versus 0.12), suggesting a non-linear relation.
In regression (2), a one-unit increase in trading volume lowers bas100 by 2e-06 units. A percentage interpretation requires the log specification: in regression (4), a 1% increase in trading volume lowers bas100 by about 0.01*0.759 = 0.0076 units.
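
The difference between reading a level-level and a level-log coefficient can be checked with a few lines of arithmetic. The sketch below only reuses the estimates printed in the table above; the variable names are illustrative.

```r
# Coefficients copied from the table in (b)
b_vp     <- -2e-06      # specification (2): bas100 ~ vola + vp
b_log_vp <- -0.758604   # specification (4): bas100 ~ vola + log(vp)

# Level-level: a one-unit rise in vp changes bas100 by b_vp
effect_1unit <- b_vp

# Level-log: a 1% rise in vp changes bas100 by b * log(1.01),
# which is approximately b / 100
effect_1pct <- b_log_vp * log(1.01)

print(c(effect_1unit = effect_1unit, effect_1pct = effect_1pct))
```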

(c) Repeat the empirical analysis under (b) using the Amihud illiquidity measure as dependent variable. How do the results compare with those under (b)?

model_c.1 <- lm(ami100 ~ vola + mktcap, data=data2)
model_c.2 <- lm(ami100 ~ vola + vp, data=data2)
model_c.3 <- lm(ami100 ~ vola + log(mktcap), data=data2)
model_c.4 <- lm(ami100 ~ vola + log(vp), data=data2)
model_c.5 <- lm(ami100 ~ vola + I(vo/ibnosh), data=data2)
model_c.6 <- lm(ami100 ~ vola + I(vo/ibnosh) + log(p), data=data2)

	ami100 (1)	ami100 (2)	ami100 (3)	ami100 (4)	ami100 (5)	ami100 (6)
(Intercept)	0.01314	0.016109	1.365159***	1.568658***	0.055636	0.121592
	(0.077563)	(0.077424)	(0.163863)	(0.128923)	(0.077194)	(0.119705)
vola	7.506072***	7.499831***	3.239655**	3.18021**	7.89254***	7.435228***
	(1.34533)	(1.342619)	(1.37385)	(1.267161)	(1.331428)	(1.475038)
mktcap	-5.00E-06
	(5.00E-06)
vp		0
		(0)
log(mktcap)			-0.220125***
			(0.023606)
log(vp)				-0.202925***
				(0.013991)
I(vo/ibnosh)					-0.008034***	-0.007769***
					(0.002001)	(0.002035)
log(p)						-0.025316
						(0.035111)
R^2	0.028905	0.029487	0.097812	0.18119	0.041809	0.042252
adj.R^2	0.027178	0.027761	0.096208	0.179735	0.040106	0.039696
N	1128	1128	1128	1128	1128	1128
Standard errors in parentheses

The relations are broadly similar to those in (b): the significant coefficients have the same signs.
However, the magnitudes differ. The coefficient on volatility is smaller, as are all the other coefficients in absolute value. In addition, vp in levels and log(p) lose their significance here.
Since the Amihud ratio is a different measure of illiquidity, on a different scale, we would expect the size of each effect to differ. The direction should nevertheless be broadly similar, as observed, because the Amihud ratio and the bid-ask spread both measure illiquidity.
R^2 is substantially lower in every specification, suggesting lower explanatory power.
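
Because the two illiquidity measures are on different scales, raw coefficient magnitudes are not directly comparable across (b) and (c). One way to put them on a common footing, sketched here under stated assumptions, is to divide each dependent variable by its standard deviation before estimating. The data are simulated, and `vol`, `spread` and `amihud` are hypothetical stand-ins for the assignment's variables.

```r
# Hypothetical illustration with simulated data
set.seed(2)
n      <- 1000
vol    <- rexp(n, rate = 20)                  # simulated volatility
spread <- 1 + 40 * vol + rnorm(n, sd = 3)     # proxy on a larger scale
amihud <- 0.1 + 8 * vol + rnorm(n, sd = 0.6)  # proxy on a smaller scale

# Raw slopes: not comparable, since the dependent variables differ in scale
raw <- c(spread = unname(coef(lm(spread ~ vol))["vol"]),
         amihud = unname(coef(lm(amihud ~ vol))["vol"]))

# Slopes in "standard deviations of the proxy per unit of volatility"
std <- c(spread = unname(coef(lm(I(spread / sd(spread)) ~ vol))["vol"]),
         amihud = unname(coef(lm(I(amihud / sd(amihud)) ~ vol))["vol"]))

print(round(rbind(raw, std), 3))
```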

(d) Investigate whether financial stocks were more illiquid than non-financial stocks during the estimation period

data2$fin <- as.factor(as.numeric(data2$gics==40))
model_b.4_d <- lm(bas100 ~ vola + log(vp) + fin, data=data2)
model_b.6_d <- lm(bas100 ~ vola + I(vo/ibnosh) + log(p) + fin, data=data2)

#summary(model_b.4_d)
#summary(model_b.6_d)

	bas100	bas100
(Intercept)	5.511***	2.557***
	(0.3)	(0.28)
vola	24.346***	20.943***
	(2.861)	(3.483)
log(vp)	-0.745***
	(0.032)
fin1	0.505**	1.673***
	(0.207)	(0.231)
I(vo/ibnosh)		-0.016***
		(0.005)
log(p)		-1.068***
		(0.083)
R^2	0.417	0.269
adj.R^2	0.416	0.266
N	1128	1128
Standard errors in parentheses

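Both regressions show that the coefficient on the financial-sector dummy is positive and significant.
That suggests a higher bid-ask spread for financial firms than for non-financial ones, holding the other covariates fixed: by about 0.5 units of bas100 in the first regression and about 1.7 in the second.
The R^2 of the first regression is higher (0.417 versus 0.269), so the specification with log(vp) explains more of the variation in the spread. If the turnover ratio is nevertheless considered the better predictor of the bid-ask spread, the second regression could be the better model overall.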

(e) In the financial sector, the relationships of illiquidity with the explanatory variables may not have been the same as in other sectors, in April 2009 (why not?).

model_b.6_e <- lm(bas100 ~ vola + I(vo/ibnosh) + log(p) + fin*I(vo/ibnosh), data=data2)
summary(model_b.6_e)

## 
## Call:
## lm(formula = bas100 ~ vola + I(vo/ibnosh) + log(p) + fin * I(vo/ibnosh), 
##     data = data2)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.539 -1.381 -0.416  0.551 31.647 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        2.520730   0.275846   9.138  < 2e-16 ***
## vola              21.180895   3.433381   6.169 9.57e-10 ***
## I(vo/ibnosh)      -0.010328   0.004813  -2.146   0.0321 *  
## log(p)            -1.082510   0.081827 -13.229  < 2e-16 ***
## fin1               2.382996   0.258828   9.207  < 2e-16 ***
## I(vo/ibnosh):fin1 -0.142590   0.024554  -5.807 8.26e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.997 on 1122 degrees of freedom
## Multiple R-squared:   0.29,  Adjusted R-squared:  0.2869 
## F-statistic: 91.67 on 5 and 1122 DF,  p-value: < 2.2e-16

The relation between illiquidity and the explanatory variables could differ between the financial and non-financial sectors because of the financial crisis, which reached its nadir around that period. Major investment banks were in serious trouble, and as a result investors were shunning their securities, which would lead to higher bid-ask spreads.

Adding the financial-sector dummy and its interaction with turnover to regression (6) of (b) lets us test the stability of the model's coefficients across sectors. The output shows that the coefficients on both the dummy and the dummy-turnover interaction are highly significant, which points to structural differences between the financial and non-financial sectors: financial stocks have a higher intercept and a more negative turnover slope. The relation therefore differs between the two groups.

The same hypothesis can also be tested with a Chow test, which requires running two separate regressions.
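
The sector-specific turnover slopes implied by the interaction model follow directly from the printed estimates. The short sketch below only reuses numbers from the summary output above; the variable names are illustrative.

```r
# Estimates copied from summary(model_b.6_e)
b_turnover     <- -0.010328  # I(vo/ibnosh): turnover slope, non-financial stocks
b_turnover_fin <- -0.142590  # I(vo/ibnosh):fin1: additional slope, financial stocks
b_fin          <-  2.382996  # fin1: intercept shift, financial stocks

slope_nonfin <- b_turnover
slope_fin    <- b_turnover + b_turnover_fin

# Turnover at which the fitted spreads of the two groups coincide,
# holding vola and log(p) fixed: b_fin + b_turnover_fin * t = 0
breakeven_turnover <- -b_fin / b_turnover_fin

print(c(slope_nonfin = slope_nonfin, slope_fin = slope_fin,
        breakeven_turnover = breakeven_turnover))
```

Below the break-even turnover the fitted spread of a financial stock is higher than that of an otherwise identical non-financial stock; above it the ordering reverses.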

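library(gap)  # provides chow.test()
data2$logp <- log(data2$p)
data2$turnover <- data2$vo / data2$ibnosh
# Split the sample into financial and non-financial stocks
datafin <- subset(data2, fin == 1)
dataNonFin <- subset(data2, fin == 0)
# Dependent variable (bid-ask spread), selected by name in both subsamples
y1 <- as.matrix(datafin$bas100)
y2 <- as.matrix(dataNonFin$bas100)
# Regressors of specification (6)
x1 <- as.matrix(datafin[, c("logp", "turnover", "vola")])
x2 <- as.matrix(dataNonFin[, c("logp", "turnover", "vola")])
chow_test <- chow.test(y1, x1, y2, x2)
chow_test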

##      F value        d.f.1        d.f.2      P value 
## 6.964634e+01 4.000000e+00 1.120000e+03 1.065615e-52

Since the p-value of the Chow test is extremely low (about 1e-52), we reject the null hypothesis that the coefficients for the financial and non-financial sectors are equal in the sample period.
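
The Chow statistic can also be computed by hand from residual sums of squares, which makes the degrees of freedom explicit (k and n1 + n2 - 2k; with four parameters per group and 1128 observations this gives the 4 and 1120 reported above). The sketch below is a base-R illustration on simulated data with a single regressor. The data frame `sim`, the group indicator `g` and the helper `rss` are hypothetical and not part of the assignment.

```r
# Hypothetical illustration: Chow test from residual sums of squares
set.seed(3)
n1 <- 150; n2 <- 350
g  <- rep(c(1, 0), times = c(n1, n2))   # 1 = hypothetical "financial" group
x  <- rnorm(n1 + n2)
# Different intercept and slope per group, so the null is false by construction
y  <- ifelse(g == 1, 2 - 1.5 * x, 0.5 - 0.2 * x) + rnorm(n1 + n2)
sim <- data.frame(y, x, g)

rss <- function(fit) sum(residuals(fit)^2)

rss_pooled <- rss(lm(y ~ x, data = sim))
rss_1      <- rss(lm(y ~ x, data = sim, subset = g == 1))
rss_2      <- rss(lm(y ~ x, data = sim, subset = g == 0))

k   <- 2                      # parameters per group: intercept and slope
df2 <- n1 + n2 - 2 * k
F_stat <- ((rss_pooled - rss_1 - rss_2) / k) / ((rss_1 + rss_2) / df2)
p_val  <- pf(F_stat, k, df2, lower.tail = FALSE)

# Cross-check: the same F statistic from comparing the pooled model
# with the fully interacted model
F_anova <- anova(lm(y ~ x, data = sim), lm(y ~ x * g, data = sim))$F[2]

print(c(F = F_stat, p = p_val, F_anova = F_anova))
```

The cross-check shows that the Chow test is equivalent to an F test on the dummy and all of its interactions, which links it to the interaction regression estimated earlier in (e).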