Part 1 : Time-series dynamics of asset returns.

Question 1.1

Plot the log price level over the period (do not forget to add all necessary elements and add visual appeal to your graphs). Describe the company (its activities, sector/industry, any “big event” which might have had an impact on the stock prices, or any other information you find relevant for the analysis of its financial data). (Max 1/2 page; do not forget to report your sources.)

We first load our two datasets (LENOVO and HSI), downloaded from Yahoo Finance, and check them for errors using the summary function.

library(readr)
LENOVO <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/LENOVO.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

HSI <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/HSI.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

summary(HSI)

##       Date                 Open            High            Low       
##  Min.   :2017-12-29   Min.   :14831   Min.   :15113   Min.   :14597  
##  1st Qu.:2019-06-20   1st Qu.:21422   1st Qu.:21587   1st Qu.:21158  
##  Median :2020-11-30   Median :25733   Median :25867   Median :25503  
##  Mean   :2020-11-29   Mean   :24948   Mean   :25109   Mean   :24742  
##  3rd Qu.:2022-05-18   3rd Qu.:28171   3rd Qu.:28312   3rd Qu.:27986  
##  Max.   :2023-11-01   Max.   :33335   Max.   :33484   Max.   :32897  
##      Close         Adj Close         Volume         
##  Min.   :14687   Min.   :14687   Min.   :0.000e+00  
##  1st Qu.:21412   1st Qu.:21412   1st Qu.:1.651e+09  
##  Median :25714   Median :25714   Median :1.996e+09  
##  Mean   :24931   Mean   :24931   Mean   :2.143e+09  
##  3rd Qu.:28186   3rd Qu.:28186   3rd Qu.:2.463e+09  
##  Max.   :33154   Max.   :33154   Max.   :6.013e+09

summary(LENOVO)

##       Date                 Open             High            Low       
##  Min.   :2017-12-29   Min.   : 3.610   Min.   : 3.68   Min.   : 3.53  
##  1st Qu.:2019-06-20   1st Qu.: 5.170   1st Qu.: 5.25   1st Qu.: 5.10  
##  Median :2020-11-30   Median : 6.270   Median : 6.39   Median : 6.16  
##  Mean   :2020-11-29   Mean   : 6.535   Mean   : 6.66   Mean   : 6.43  
##  3rd Qu.:2022-05-18   3rd Qu.: 8.000   3rd Qu.: 8.16   3rd Qu.: 7.88  
##  Max.   :2023-11-01   Max.   :11.240   Max.   :11.60   Max.   :10.96  
##      Close          Adj Close         Volume         
##  Min.   : 3.600   Min.   :2.681   Min.   :        0  
##  1st Qu.: 5.160   1st Qu.:4.281   1st Qu.: 29291472  
##  Median : 6.300   Median :5.580   Median : 41814283  
##  Mean   : 6.549   Mean   :5.759   Mean   : 50042929  
##  3rd Qu.: 8.020   3rd Qu.:7.457   3rd Qu.: 61385236  
##  Max.   :11.280   Max.   :9.910   Max.   :410318644

No errors were detected in either dataset.

Lenovo Group Limited is a multinational technology company with headquarters in Beijing, China and Morrisville, North Carolina, USA. Founded in 1984 as Legend, the company changed its name to Lenovo in 2004 and is now a leading global technology company, primarily engaged in the design, development, manufacture and sale of personal computers (PCs), notebook computers, tablets, smartphones, servers and other electronic devices. The company operates in a number of segments, including personal and intelligent devices, data centres and mobile telephony. It is listed on the Hong Kong Stock Exchange (HKEX) under stock code 0992 and is a constituent of the Hang Seng Index (HSI).

Graphical presentation of Lenovo and HSI log prices

LENOVO$logprice<-log(LENOVO$Close)
HSI$logprice<-log(HSI$Close)

par(mar=c(4,4,3,5))
# Lenovo log price, read on the left-hand axis
plot(LENOVO$Date, LENOVO$logprice, axes=FALSE, xlab="", ylab="", type="l", col="chocolate4", main="Lenovo and HSI log price from Jan 2018 - Nov 2023", cex.main=1)
axis(2, col="chocolate4", col.axis="chocolate4")
mtext("LOG PRICE - Lenovo", side=2, col="chocolate4", line=2.5)
box()
# HSI log price overlaid on the same dates, read on the right-hand axis
par(new=TRUE)
plot(HSI$Date, HSI$logprice, xlim=range(LENOVO$Date), xlab="", ylab="", axes=FALSE, type="l", col="blue")
mtext("LOG PRICE - HSI", side=4, col="blue", line=2.5)
axis(4, col="blue", col.axis="blue")
# One tick per calendar year on the date axis
unique_years <- unique(format(LENOVO$Date, "%Y"))
axis(1, at = as.Date(paste0(unique_years, "-01-01")), labels = unique_years, las = 2)
mtext("Date", side=1, col="black", line=2.5)
legend("topright", legend=c("Lenovo", "HSI"), col=c("chocolate4","blue"), lty=1, cex=0.7)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

summary(LENOVO$logprice)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.281   1.641   1.841   1.843   2.082   2.423

summary(HSI$logprice)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   9.595   9.972  10.155  10.110  10.247  10.409

As part of our “Empirical Finance” course, we analyse the behaviour of the company’s shares over the period from 1 January 2018 to 1 November 2023. The graph above shows its logarithmic share price compared with that of its benchmark index (HSI). We have chosen to represent the logarithmic price of these shares in order to visualise their percentage variation (equal distances on the axis correspond to equal percentage variations). From the sample collected, we can see that Lenovo’s share price rose sharply: (1) from the beginning of 2018 to the beginning of 2019, (2) from the beginning of 2020 to the beginning of 2021 and (3) from the end of 2022 to the end of 2023. A minimum was reached in 2018 at a price of HKD 3.6 and a maximum in 2021 at HKD 11.28. Its benchmark index (HSI) appears to have experienced a general decline in price, with some sharp rises in late 2019 and in 2020.

Despite overall price growth over the period, Lenovo experienced events that significantly affected its share price. For example, the announcement of a disappointing dividend on 1 August 2022 caused Lenovo’s share price to fall by 10%.
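The reason for plotting log prices can be illustrated with a minimal sketch. The prices below are hypothetical and are not taken from the Lenovo data: two moves of +10% give different absolute changes but the same change in logs.

```r
# Hypothetical prices: both moves are +10%, but the absolute changes differ.
low  <- c(10, 11)
high <- c(100, 110)

abs_change <- c(diff(low), diff(high))            # absolute price changes
log_change <- c(diff(log(low)), diff(log(high)))  # log price changes

print(abs_change)
print(round(log_change, 6))
```

On a log scale, both moves therefore cover the same vertical distance on the graph.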

Question 1.2

Compute, plot, and interpret sample autocorrelation function (SACF).

Autocorrelation measures the correlation of a time series with itself at different lags. In the case of Lenovo, we calculated the autocorrelation between the log price at time t (p_t) and the log price lagged by k periods, for every lag k up to the maximum possible lag, i.e. over 1400 lags.
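As a minimal sketch of this calculation, the lag-k sample autocorrelation can be computed by hand and compared with R's `acf()`. The helper `sacf_manual` and the simulated series are illustrative assumptions; the series is a random walk standing in for a log price, not the Lenovo data.

```r
# Sample autocorrelation at lag k, using the same estimator as stats::acf():
# deviations from the full-sample mean, divided by the lag-0 sum of squares.
sacf_manual <- function(x, k) {
  n <- length(x)
  d <- x - mean(x)
  sum(d[(k + 1):n] * d[1:(n - k)]) / sum(d^2)
}

set.seed(1)
x <- cumsum(rnorm(500))   # simulated random walk (hypothetical data)

manual  <- sapply(1:5, function(k) sacf_manual(x, k))
builtin <- acf(x, lag.max = 5, plot = FALSE)$acf[2:6]  # element 1 is lag 0

print(round(cbind(manual, builtin), 4))
```

The two columns should coincide, which confirms what the plotted function represents.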

Based on this calculation, we have the following function:

# T is the sample size (note that this masks R's shorthand T for TRUE)
T <- length(LENOVO$Date)
acf(LENOVO$logprice, lag.max = T, type = "correlation", ylab="Autocorrelation", col = "chocolate4", plot = TRUE, na.action = na.contiguous, demean = TRUE, ylim=c(-1,1), main ="")
title(main = "Lenovo SACF - log prices",col.main="antiquewhite4",cex.main = 1)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

We can see that the shorter the lag, the more strongly the log price series is correlated with itself. This is expected, as the price of a share does not change dramatically from one day to the next (if the price is 100 today, it is very rare for it to be 5 tomorrow and 200 the next day). As the number of lags increases, the autocorrelation fades. An interesting feature of the graph is that from about 700 lags upwards, the autocorrelation seems to become significant again, which suggests an inverse relationship between prices observed far apart. This could indicate an economic or technological cycle or trend with a period of around 700 trading days (a little under 3 years).

Question 1.4

Run formal econometric tests in order to check whether your intuitions are verified or not. Explain the tests, state your hypothesis and finally carefully interpret the outcome of these tests. If they incline in the direction of non-stationarity, pose a clear diagnostic on the type of non-stationarity.

To answer this question, we check whether Lenovo’s closing prices and simple returns are stationary. To do this, we apply a Dickey-Fuller test to both series. The test is based on a first-order autoregression of the series on its own lagged value, with coefficient Φ.

We check whether Φ is equal to 1. If this is the case, the series has a unit root: the process is integrated of order 1 and is not stationary. If |Φ| < 1, the series is stationary. In practice, instead of testing Φ directly, we test whether the coefficient ψ = Φ - 1 is significantly different from 0, and we use the augmented version of the Dickey-Fuller test, which adds p lags of the differenced dependent variable to the regression.
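In standard notation, the autoregression and its augmented Dickey-Fuller form can be written as below. The symbols are assumptions chosen for illustration: $P_t$ is the closing price, $c$ a constant (matching `type = "c"` in the test call), $\alpha_i$ the coefficients on the lagged differences and $u_t$ the error term.

```latex
\begin{align}
P_t &= c + \Phi\, P_{t-1} + u_t \\
\Delta P_t &= c + \psi\, P_{t-1} + \sum_{i=1}^{p} \alpha_i\, \Delta P_{t-i} + u_t
\end{align}
```

In the simple case without lagged differences, subtracting $P_{t-1}$ from both sides of the first equation gives the second with $\psi = \Phi - 1$, so that $\Phi = 1$ corresponds to $\psi = 0$.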

Returns are obtained as the difference between the price at time t and the price at time t-1, divided by the price at time t-1. Their stationarity is checked with the same method, by applying the same regression to the return series.
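Under the same notational assumptions as for prices ($R_t$ for the simple return, $c$ a constant, $\alpha_i$ the coefficients on the lagged differences and $u_t$ the error term), the return definition and the corresponding test regression could be written as:

```latex
\begin{align}
R_t &= \frac{P_t - P_{t-1}}{P_{t-1}} \\
\Delta R_t &= c + \psi\, R_{t-1} + \sum_{i=1}^{p} \alpha_i\, \Delta R_{t-i} + u_t
\end{align}
```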

For both models, we therefore perform an Augmented Dickey-Fuller test with the following hypotheses:

Null hypothesis: ψ = 0, the time series has a unit root, which means that it is not stationary.
Alternative hypothesis: ψ < 0, the series is stationary

The test statistic is the t-ratio of the estimated ψ (the estimate divided by its standard error). It is compared with the critical values proposed by Fuller (1976) rather than with the usual Student critical values.
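To make the construction of the statistic concrete, here is a minimal sketch of the non-augmented Dickey-Fuller regression estimated with `lm()`. It uses a simulated random walk (hypothetical data with a unit root by construction), not the Lenovo series, and is not the internal code of `adfTest`.

```r
set.seed(42)
n <- 500
y <- cumsum(rnorm(n))            # simulated random walk (unit root by construction)

dy    <- diff(y)                 # y_t - y_{t-1}, for t = 2..n
y_lag <- y[-n]                   # y_{t-1}, aligned with dy

fit <- lm(dy ~ y_lag)            # Delta y_t = c + psi * y_{t-1} + u_t
est <- coef(summary(fit))["y_lag", ]

psi_hat <- unname(est["Estimate"])
tau     <- unname(est["Estimate"] / est["Std. Error"])   # Dickey-Fuller statistic

cat("psi_hat =", round(psi_hat, 4), " tau =", round(tau, 4), "\n")
```

The statistic is an ordinary t-ratio, but under the null it does not follow a Student distribution. With a constant in the regression, the large-sample 5% critical value is roughly -2.86, and the null is rejected only for values below it.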

library(fUnitRoots)
# Simple returns in percent: 100 * (P_t - P_{t-1}) / P_{t-1}
LENOVO$RRLENOVO = c(NA, 100*diff(LENOVO$Close)/LENOVO$Close[-nrow(LENOVO)])
adfTest(LENOVO$Close,lags = 10,type = "c")

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 10
##   STATISTIC:
##     Dickey-Fuller: -1.5943
##   P VALUE:
##     0.4647 
## 
## Description:
##  Sun Jan 21 12:09:20 2024 by user:

adfTest(LENOVO$RRLENOVO,lags = 10,type = "c")

## Warning in adfTest(LENOVO$RRLENOVO, lags = 10, type = "c"): p-value smaller
## than printed p-value

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 10
##   STATISTIC:
##     Dickey-Fuller: -11.6957
##   P VALUE:
##     0.01 
## 
## Description:
##  Sun Jan 21 12:09:20 2024 by user:

For Lenovo prices, the test gives a statistic of -1.5943 and a P-value of 0.4647 (see appendix 1.4). We cannot reject the null hypothesis at the 5% level (P-value > 5%), so the price series appears to contain a unit root and to be non-stationary. For share returns, we find a test statistic of -11.6957 and a P-value below 0.01 (see appendix 1.4). We can reject the null hypothesis (P-value < 5%), so the return series is stationary.

With regard to the type of non-stationarity in Lenovo’s stock price, the price has a unit root while its returns are stationary, so the series is integrated of order 1. We consider that it could be a non-stationary autoregressive process with a drift and a time trend because, as mentioned in question 1.3, there seems to be a positive correlation between present returns and returns lagged by a few periods. In addition, we believe that a positive time trend may exist, pushing prices up in the long term, notably due to inflation and other macroeconomic factors.

Question 1.5

Generate the first difference of log prices and plot the series. Compute, plot, and interpret the ACF for the first difference of log prices.

To calculate the daily logarithmic returns on Lenovo shares, we take the logarithm of the ratio of the price at time t to the price at time t-1, which gives the logarithmic return at time t. We multiply it by 100 to express it in percent.
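In formula form, with $P_t$ the closing price and $r_t$ the log return in percent (symbols assumed here for illustration), this is the first difference of log prices, as computed by `100*diff(log(LENOVO$Close))`:

```latex
\begin{equation}
r_t = 100 \times \ln\!\left(\frac{P_t}{P_{t-1}}\right) = 100 \times \left(\ln P_t - \ln P_{t-1}\right)
\end{equation}
```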

Here’s an overview of the period under study.

LENOVO$RLENOVO = c(NA,100*diff(log(LENOVO$Close)))
LENOVO = LENOVO[-1,]
plot(LENOVO$Date,LENOVO$RLENOVO,type='l',col = "chocolate4",col.main="antiquewhite4",ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("Lenovo log return from Jan 2018 - Nov 2023"),cex.main = 1, cex.lab = 1.5)
legend("topright", legend=c("Lenovo"),col=c("chocolate4"),lty= 1, cex = 0.6)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

summary(LENOVO$RLENOVO)

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -16.37040  -1.32234   0.00000   0.04921   1.41783  11.64541

sd(LENOVO$RLENOVO)

## [1] 2.501519

We can see from this graph that Lenovo’s returns are distributed around an average return close to 0 (0.049%), with a maximum of 11.6% reached on 30 March 2021 and a minimum of -16.37% on 05 October 2018. The daily standard deviation of returns is about 2.5%. However, this volatility does not appear to be constant: it varies according to the period (volatility clustering). In a period of low volatility (green circle), the returns of the following days also tend to be calm, whereas in a period of high volatility (red circle), the volatility of the following returns tends to remain high.

As far as the autocorrelation of logarithmic returns is concerned, the following two graphs show that today’s returns are essentially uncorrelated with past returns. As the graph on the left illustrates, the autocorrelations between today’s returns and past returns are on average below 0.05, which is very low. In the chart on the right, which compares yesterday’s returns with today’s, the regression line through the scatter plot (red line) has an almost zero slope, indicating a very low correlation.

T2 <- length(LENOVO$Date)
acf(LENOVO$RLENOVO, lag.max = T2, type = "correlation", col = "chocolate4", plot = TRUE, na.action = na.contiguous, demean = TRUE, ylim=c(-0.1,0.2), main ="")
title(main = "Lenovo ACF - log returns",col.main="antiquewhite4",cex.main = 1)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

plot(LENOVO$RLENOVO[-nrow(LENOVO)], LENOVO$RLENOVO[-1],
     xlab = "Yesterday's returns",
     ylab = "Today's returns",
     main = "Scatter plot - Yesterday's vs. today's returns", col.main="antiquewhite4",cex.main = 1,
     col = "chocolate4", pch = 16,
     axes = FALSE)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)
# Add a linear regression line
abline(lm(LENOVO$RLENOVO[-1] ~ LENOVO$RLENOVO[-nrow(LENOVO)]), col = "red")
# Add custom axes with one tick per percentage point
axis(1, at = seq(-16, 12, by = 1), col = "black", las = 1) 
axis(2, at = seq(-16, 12, by = 1), col = "black", las = 1)

This lack of correlation, which is a stylised fact in finance, was first documented by Working in 1934 and Kendall in 1953. It means that past returns are of little use for linearly predicting future returns.
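The link between the two graphs can be checked with a short sketch: the slope of the regression of today's returns on yesterday's is numerically very close to the lag-1 autocorrelation. The series below is simulated white noise (hypothetical data with no serial dependence by construction), not the Lenovo returns.

```r
set.seed(7)
r <- rnorm(1000)                     # simulated returns without serial dependence

today     <- r[-1]
yesterday <- r[-length(r)]

slope <- unname(coef(lm(today ~ yesterday))["yesterday"])
rho1  <- acf(r, lag.max = 1, plot = FALSE)$acf[2]   # element 1 is lag 0

cat("slope =", round(slope, 4), " lag-1 autocorrelation =", round(rho1, 4), "\n")
```

A nearly flat regression line in the scatter plot and a lag-1 bar close to zero in the ACF are therefore two views of the same quantity.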

Question 1.6

Compute the AIC information criteria for all ARMA(p,q) models with p = 0, 1, 2, 3 and q = 0, 1, 2, 3 (use the arma command). Which model is suggested by these model selection statistics ? Estimate the coefficients and provide a formally correct writing of the final model.

In order to determine which ARMA model best describes the logarithmic returns, we computed the AIC of every candidate ARMA(p,q) model and selected the one with the lowest AIC. On this basis, the best model is the ARMA(1,1) model, with the lowest AIC of 6710.03 (see appendix 1.6). This model combines one autoregressive term and one moving-average term.
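In general form, and using assumed notation ($r_t$ for the log return, $\mu$ for its mean, $\phi_1$ and $\theta_1$ for the autoregressive and moving-average coefficients, $\varepsilon_t$ for the innovation), an ARMA(1,1) model can be written as:

```latex
\begin{equation}
r_t - \mu = \phi_1 \left(r_{t-1} - \mu\right) + \varepsilon_t + \theta_1\, \varepsilon_{t-1}
\end{equation}
```

This is the parameterisation used by R's `arima()`, where the coefficient reported as `intercept` is the mean $\mu$.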

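# Grid search over the orders (i, j), keeping the model with the lowest AIC.
# As written, the AIC is evaluated on order c(i, 1, j) for i, j in 0:4,
# while the stored order and the refitted model use c(i, 0, j).
Lenovofinal.aic <- Inf
Lenovofinal.order <- c(0,0,0)
for (i in 0:4) for (j in 0:4) {
   Lenovocurrent.aic <- AIC(arima(LENOVO$RLENOVO, order=c(i, 1, j)))
   if (Lenovocurrent.aic < Lenovofinal.aic) {
     Lenovofinal.aic <- Lenovocurrent.aic
     Lenovofinal.order <- c(i, 0, j)
     Lenovofinal.arma <- arima(LENOVO$RLENOVO, order=Lenovofinal.order)
   }
 }
Lenovofinal.order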

## [1] 1 0 2

Lenovofinal.arma

## 
## Call:
## arima(x = LENOVO$RLENOVO, order = Lenovofinal.order)
## 
## Coefficients:
##          ar1      ma1      ma2  intercept
##       0.6844  -0.6570  -0.0583     0.0491
## s.e.  0.1993   0.2002   0.0269     0.0595
## 
## sigma^2 estimated as 6.23:  log likelihood = -3351.1,  aic = 6712.2
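For comparison, here is a minimal sketch of an AIC grid search restricted to p, q = 0..3 in which every candidate is fitted without differencing, so that all AIC values refer to the same series. It runs on a simulated ARMA(1,1) series (hypothetical data and parameters), and the `tryCatch` guard is an assumption added to skip any order that fails to fit.

```r
set.seed(123)
# Simulated ARMA(1,1) series standing in for the return series (hypothetical data)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 400)

grid <- expand.grid(p = 0:3, q = 0:3)
grid$aic <- NA_real_

for (k in seq_len(nrow(grid))) {
  fit <- tryCatch(
    arima(x, order = c(grid$p[k], 0, grid$q[k]), method = "ML"),
    error = function(e) NULL          # leave the AIC as NA if the fit fails
  )
  if (!is.null(fit)) grid$aic[k] <- AIC(fit)
}

best <- grid[which.min(grid$aic), ]   # which.min() ignores NA values
print(best)
```

Storing all sixteen AIC values in a table also makes it easy to see how close the competing models are to the selected one.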

By estimating the coefficients of the model selected for Lenovo, we obtain the following model:

As the intercept has a t-value below 1.96 in absolute value, it is not significantly different from 0 at the 5% level, unlike the other estimated coefficients. We therefore have the following final model:

Question 1.7

Test the presence of ARCH effects in asset prices with the appropriate procedure. Explain in detail your test and if the results coincide with your initial intuition.

Before answering this question, it is important to understand what the ARCH effect represents in finance. It is a phenomenon observed in financial time series, characterised by the presence of conditional volatility that varies over time. This means that the volatility of a time series is not constant, but depends on the past values of the series itself.

To get an overview of the potential ARCH effect in Lenovo’s share price, we squared the log returns to obtain a proxy for the volatility of Lenovo’s returns. Here’s what it looks like:

LENOVO$R2LENOVO = LENOVO$RLENOVO^2
plot(LENOVO$Date,LENOVO$R2LENOVO,type = 'l',xlab="Date",ylab="Squared log return", main="",col = "chocolate4")
title(main = "Historical squared first log difference of Lenovo's stock price",col.main="antiquewhite4",cex.main = 1)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

By analysing this graph, we can intuitively detect clusters of volatility in the company’s returns, i.e. periods when high volatility persists for several days in a row, and likewise for low volatility. For example, at the beginning of 2021 the company’s returns were very volatile, whereas at the beginning of our sample (early 2018) they were much less volatile.

To test for the presence of this volatility clustering, we first ran an ARCH test, which checks whether the volatility at a given time t depends on the volatility at the previous time t-1. The hypotheses are:

Null Hypothesis: Absence of ARCH Effect
Alternative hypothesis: Presence of ARCH Effect

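## ARCH(1): regress the squared returns on their first lag
## (T was set before the first row of LENOVO was dropped, so index T is NA and lm() discards that observation)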
arch1 <- lm(LENOVO$R2LENOVO[2:T] ~ LENOVO$R2LENOVO[1:(T-1)])
summary(arch1)

## 
## Call:
## lm(formula = LENOVO$R2LENOVO[2:T] ~ LENOVO$R2LENOVO[1:(T - 1)])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.084  -5.621  -4.279   0.329 262.275 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 5.71454    0.41538  13.757  < 2e-16 ***
## LENOVO$R2LENOVO[1:(T - 1)]  0.08714    0.02631   3.312 0.000951 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.45 on 1433 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.007595,   Adjusted R-squared:  0.006902 
## F-statistic: 10.97 on 1 and 1433 DF,  p-value: 0.0009507

summary(arch1)$r.squared*(T-1)

## [1] 10.90631

qchisq(0.05, 1, lower.tail=FALSE)

## [1] 3.841459

## The null hypothesis is rejected, so an ARCH(1) effect is present

To test these hypotheses, we regress Lenovo’s squared logarithmic returns on their own values lagged by one period. The coefficient on the lagged squared return is 0.08714 and is significantly different from 0 at the 5% level, since its P-value is about 0.1%. We then multiply the R2 of the regression by (T-1), the length of our lagged vector, to obtain the test statistic (T-1)*R2. We obtain a value of 10.91, which is greater than the 5% critical value of a chi-square distribution with 1 degree of freedom (3.84). We therefore reject the null hypothesis of no ARCH effect: an ARCH effect of order 1 is present.

##ARCH2
arch2 <-lm(LENOVO$R2LENOVO[3:T] ~ LENOVO$R2LENOVO[1:(T-2)] + LENOVO$R2LENOVO[2:(T-1)] )
summary(arch2)

## 
## Call:
## lm(formula = LENOVO$R2LENOVO[3:T] ~ LENOVO$R2LENOVO[1:(T - 2)] + 
##     LENOVO$R2LENOVO[2:(T - 1)])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.183  -5.635  -4.282   0.353 262.296 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 5.74364    0.44227  12.987  < 2e-16 ***
## LENOVO$R2LENOVO[1:(T - 2)] -0.00429    0.02646  -0.162 0.871209    
## LENOVO$R2LENOVO[2:(T - 1)]  0.08741    0.02644   3.306 0.000968 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.46 on 1431 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.007593,   Adjusted R-squared:  0.006206 
## F-statistic: 5.475 on 2 and 1431 DF,  p-value: 0.00428

summary(arch2)$r.squared*(T-2)

## [1] 10.8964

qchisq(0.05, 2, lower.tail=FALSE)

## [1] 5.991465

## The null hypothesis is rejected, so ARCH effects up to order 2 are present

In parallel, we tested for a second-order ARCH effect by regressing Lenovo’s squared log returns on their values lagged by 1 and 2 periods. Following the same procedure with the same hypotheses, we find a coefficient of 0.08741 for the squared return lagged by 1 period and of -0.00429 for the squared return lagged by 2 periods. The first is significantly different from 0, while the second is not (P-value = 87% > 5%). This means that the squared return from two days ago has no significant influence on today’s squared return, unlike yesterday’s squared return. The ARCH test statistic, (T-2)*R2, is 10.90, against a 5% critical value of 5.99 for a chi-square with 2 degrees of freedom. We therefore reject the null hypothesis of no ARCH effect up to order 2, although this rejection is driven by the first lag.

We could continue to extend the test up to q lags, but at how many lags should we stop before concluding that autoregressive conditional heteroskedasticity is present? This is a complicated question, which motivates the use of a GARCH model.
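The same procedure can be written once for any number of lags q. The sketch below is a minimal illustration: the helper name `arch_lm` is hypothetical, the multiplier is the number of observations actually used in the regression, and the data are simulated ARCH(1) returns with made-up parameters, so an effect exists by construction.

```r
# ARCH-LM test: regress e_t^2 on q of its own lags and compare n * R^2
# with a chi-square distribution with q degrees of freedom.
arch_lm <- function(e, q) {
  e2 <- e^2
  lagged <- embed(e2, q + 1)       # column 1 = e2_t, columns 2..q+1 = lags 1..q
  fit <- lm(lagged[, 1] ~ lagged[, -1])
  n <- nrow(lagged)                # observations used in the regression
  stat <- n * summary(fit)$r.squared
  c(statistic = stat,
    critical  = qchisq(0.95, df = q),
    p.value   = pchisq(stat, df = q, lower.tail = FALSE))
}

set.seed(10)
# Simulated ARCH(1) returns (hypothetical parameters 0.5 and 0.5)
n <- 2000
r <- numeric(n)
r[1] <- rnorm(1)
for (t in 2:n) r[t] <- rnorm(1) * sqrt(0.5 + 0.5 * r[t - 1]^2)

res1 <- arch_lm(r, 1)
res2 <- arch_lm(r, 2)
print(round(rbind(res1, res2), 4))
```

The critical value depends only on q (3.84 for q = 1 and 5.99 for q = 2 at the 5% level), not on the sample size.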

Question 1.8

Try to complement your model with a GARCH(1,1) for the variance equation. State the complete model after estimation. Is it useful to use such a complicated model in your case ?

library(rugarch)

## Loading required package: parallel

## 
## Attaching package: 'rugarch'

## The following object is masked from 'package:stats':
## 
##     sigma

garch11.spec = ugarchspec(mean.model = list(armaOrder=c(0,0)),
                          variance.model = list(garchOrder=c(1,1), 
                                                model="sGARCH"))
garch11=ugarchfit(garch11.spec,data=LENOVO$RLENOVO)
garch11

## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(0,0,0)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.040709    0.064000  0.63607 0.524731
## omega   1.617076    0.592468  2.72939 0.006345
## alpha1  0.097319    0.029885  3.25641 0.001128
## beta1   0.647936    0.111306  5.82121 0.000000
## 
## Robust Standard Errors:
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.040709    0.063512  0.64096 0.521548
## omega   1.617076    0.845946  1.91156 0.055933
## alpha1  0.097319    0.041198  2.36223 0.018165
## beta1   0.647936    0.158260  4.09413 0.000042
## 
## LogLikelihood : -3337.575 
## 
## Information Criteria
## ------------------------------------
##                    
## Akaike       4.6540
## Bayes        4.6687
## Shibata      4.6540
## Hannan-Quinn 4.6595
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                     0.4576  0.4987
## Lag[2*(p+q)+(p+q)-1][2]    1.4108  0.3822
## Lag[4*(p+q)+(p+q)-1][5]    2.9318  0.4197
## d.o.f=0
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                   0.006375  0.9364
## Lag[2*(p+q)+(p+q)-1][5]  1.382814  0.7687
## Lag[4*(p+q)+(p+q)-1][9]  3.632293  0.6516
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]    0.4305 0.500 2.000  0.5117
## ARCH Lag[5]    1.8626 1.440 1.667  0.5024
## ARCH Lag[7]    3.7902 2.315 1.543  0.3778
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  0.5608
## Individual Statistics:              
## mu     0.04066
## omega  0.33476
## alpha1 0.12893
## beta1  0.27979
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          1.07 1.24 1.6
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value   prob sig
## Sign Bias           0.9367 0.3490    
## Negative Sign Bias  0.3603 0.7187    
## Positive Sign Bias  0.4547 0.6494    
## Joint Effect        0.9035 0.8246    
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     67.65    2.252e-07
## 2    30     70.89    2.286e-05
## 3    40     77.93    2.119e-04
## 4    50    105.02    5.860e-06
## 
## 
## Elapsed time : 0.1505291

By estimating the model via maximum likelihood, we find a value for mu of 0.04 (see appendix 1.8), but its P-value is greater than 5%, so the conditional mean of the returns is not significantly different from 0. The parameters ω, α1 and β1 are estimated at 1.6171, 0.0973 and 0.6479 respectively. Based on the conventional standard errors, these coefficients are significantly different from 0, as their P-values are well below the 5% threshold (with robust standard errors, ω has a P-value of 5.6% and is only significant at the 10% level). These estimates define the variance equation of the GARCH(1,1) model.
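Using standard GARCH notation (assumed here: $r_t$ the log return, $\varepsilon_t$ the residual, $\sigma_t^2$ the conditional variance and $z_t$ a standard normal shock, in line with the `norm` distribution reported in the output), the estimated model can be written with the rounded estimates above as:

```latex
\begin{align}
r_t &= \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t \\
\sigma_t^2 &= \omega + \alpha_1\, \varepsilon_{t-1}^2 + \beta_1\, \sigma_{t-1}^2
            = 1.6171 + 0.0973\, \varepsilon_{t-1}^2 + 0.6479\, \sigma_{t-1}^2
\end{align}
```

Here $\mu$ is estimated at 0.0407 but is not significantly different from 0.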

The GARCH model is an extension of the ARCH model that includes not only the impact of past residuals on conditional volatility (as in the ARCH model), but also the impact of past conditional variance. This model can generally be used to capture and explain volatility clustering in stock returns. This model can also be used to predict the financial risk of an asset. The financial risk of a stock market asset (represented in particular by its variance) is a key factor in financial risk management. Investors and portfolio managers can use the volatility forecasts generated by the GARCH model to assess and mitigate potential risks.
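As a minimal sketch of how these estimates translate into volatility forecasts, the variance recursion can be iterated by hand. The parameter values are those reported in the output above; the sequence of shocks is hypothetical and chosen only to show how a large shock feeds into the following days.

```r
# Estimates reported in the rugarch output
omega <- 1.617076
alpha <- 0.097319
beta  <- 0.647936

persistence <- alpha + beta
uncond_var  <- omega / (1 - persistence)   # long-run variance implied by the model

# Conditional variance path for a hypothetical sequence of shocks:
# no shock, one large shock of 8%, then no shocks again.
eps <- c(0, 8, 0, 0, 0, 0)
sigma2 <- numeric(length(eps) + 1)
sigma2[1] <- uncond_var                    # start at the long-run level
for (t in seq_along(eps)) {
  sigma2[t + 1] <- omega + alpha * eps[t]^2 + beta * sigma2[t]
}

cat("persistence:", round(persistence, 4), "\n")
cat("long-run daily sd (%):", round(sqrt(uncond_var), 3), "\n")
print(round(sigma2, 3))
```

The implied long-run daily standard deviation is close to the sample standard deviation of about 2.5% found earlier. After the large shock, the conditional variance jumps and then decays gradually, which is the clustering behaviour the model is meant to capture.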

Question 1.9

Part 2 : Minimum-variance portfolio construction and factor models

Question 2.1

Choose (on any market) 4 other stocks and construct a portfolio made of these 4 stocks plus the stock of the company you focus on. To do so, use weights defined by the capitalization-weighted portfolio theory. + Compute the series of returns and risks of your portfolio from the 1st of January 2018 to the 1st of November 2023. + For simplicity, use the market-capitalization of the companies on the 1st of November 2023 for all dates. Pay attention to the currency. + Repeat the procedure to create a second portfolio, with the same five stocks but equally-weighted.

We have selected four companies: BYD, Tencent, Sunny Optical, and ACC Limited, all of which are active in different sectors in order to have a diversified portfolio. However, they all share a commitment to innovation and technology. Tencent is a leader in information technology and digital media, Sunny Optical specializes in optical and imaging technologies, BYD is involved in electric vehicles and energy technologies, while ACC Limited incorporates innovative construction solutions into its sector. This commitment to innovation is one of the reasons we made our choice.

LENOVO <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/LENOVO.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

HSI <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/HSI.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

ACC <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/4 entreprises asiatiques techno/ACC.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Tencent <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/4 entreprises asiatiques techno/Tencent.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

BYD <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/4 entreprises asiatiques techno/BYD.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Sunny <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/4 entreprises asiatiques techno/Sunny Optical.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

To create a portfolio based on the market values of these 5 companies (including Lenovo), we used their market capitalisation as at 1 November 2023 to estimate a weighting coefficient b (from b1 to b5) for each company, calculated as the company’s capitalisation divided by the sum of the capitalisations of the 5 companies. Subsequently, to prevent companies with a higher share price from being over-represented in the portfolio, we rebased each price series to 100, using the company’s share price on 29 December 2017 as the reference value:

# Each capitalisation on 1 November 2023 is estimated as
# (capitalisation / price on 17/11/2023) * closing price on 2023-11-01

# ACC: capitalisation of HKD 24.449 billion and price of HKD 20.4 on 17/11/2023
CapACC <- (24449000000/20.4)*(ACC$Close[ACC$Date == "2023-11-01"])

# Tencent: capitalisation of HKD 2997 billion and price of HKD 315.2 on 17/11/2023
CapTencent <- (2997000000000/315.2)*(Tencent$Close[Tencent$Date == "2023-11-01"])

# BYD: capitalisation of HKD 730.576 billion and price of HKD 244.2 on 17/11/2023
CapBYD <- (730576000000/244.2)*(BYD$Close[BYD$Date == "2023-11-01"])

# Sunny: capitalisation of HKD 80.338 billion and price of HKD 73.25 on 17/11/2023
CapSunny <- (80360000000/73.25)*(Sunny$Close[Sunny$Date == "2023-11-01"])

# Lenovo: capitalisation of HKD 116.915 billion and price of HKD 9.64 on 17/11/2023
CapLENOVO <- (116915000000/9.64)*(LENOVO$Close[LENOVO$Date == "2023-11-01"])

# Capitalisation coefficient
b1 <- CapACC/(CapACC+CapTencent+CapBYD+CapSunny+CapLENOVO)
b2 <- CapTencent/(CapACC+CapTencent+CapBYD+CapSunny+CapLENOVO)
b3 <- CapBYD/(CapACC+CapTencent+CapBYD+CapSunny+CapLENOVO)
b4 <- CapSunny/(CapACC+CapTencent+CapBYD+CapSunny+CapLENOVO)
b5 <- CapLENOVO/(CapACC+CapTencent+CapBYD+CapSunny+CapLENOVO)

# Computation of the capitalization weighted portfolio
ACC$base2018 <- (ACC$Close/ACC$Close[[1]])*100
Tencent$base2018 <- (Tencent$Close/Tencent$Close[[1]])*100
BYD$base2018 <- (BYD$Close/BYD$Close[[1]])*100
Sunny$base2018 <- (Sunny$Close/Sunny$Close[[1]])*100
LENOVO$base2018 <- (LENOVO$Close/LENOVO$Close[[1]])*100

# Capitalization weighted portfolio log return
LENOVO$capitalportfolio <- b1*ACC$base2018 + b2*Tencent$base2018 + b3*BYD$base2018 + b4*Sunny$base2018 + b5*LENOVO$base2018
LENOVO$Rcapitalportfolio = c(NA,100*diff(log(LENOVO$capitalportfolio)))

We did the same with the equally weighted portfolio, the only difference being that the b coefficients are all equal to 0.2, which gives :

# Equally weighted portfolio log return
LENOVO$equalportfolio <- 0.2*ACC$base2018 + 0.2*Tencent$base2018 + 0.2*BYD$base2018 + 0.2*Sunny$base2018 + 0.2*LENOVO$base2018
LENOVO$Requalportfolio = c(NA,100*diff(log(LENOVO$equalportfolio)))
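
The two weighting schemes differ only in the weight vector, so they can be expressed with one helper. The following is a minimal sketch on made-up numbers, not the code behind the results in this report; `weighted_index`, `toy_prices` and `toy_caps` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: builds a base-100 portfolio index from a matrix of
# closing prices (one column per stock) and a vector of weights.
weighted_index <- function(prices, weights) {
  stopifnot(ncol(prices) == length(weights))
  weights <- weights / sum(weights)                     # weights sum to 1
  base100 <- sweep(prices, 2, prices[1, ], "/") * 100   # rebase each stock to 100 on day 1
  as.vector(base100 %*% weights)                        # weighted sum for each day
}

# Toy data (made-up prices and capitalisations)
toy_prices <- cbind(A = c(10, 11, 12), B = c(200, 190, 210))
toy_caps   <- c(A = 300, B = 100)

cap_index   <- weighted_index(toy_prices, toy_caps)    # capitalisation weights
equal_index <- weighted_index(toy_prices, rep(1, 2))   # equal weights
```

Normalising inside the helper means raw capitalisations can be passed directly, without computing each b coefficient by hand.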

We calculated the return on these portfolios at time t as the logarithm of the ratio of the price at time t to the price at time t-1, expressed in percent: $r_t = 100 \times \ln(P_t / P_{t-1})$.

To calculate the risk, we estimate the standard deviation of these returns. We will discuss this in question 2.3.
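
As a small illustration of these two definitions, the sketch below computes percentage log returns and their standard deviation on made-up prices. `log_returns` and the toy values are hypothetical and only mirror the `100*diff(log(...))` pattern used in this report.

```r
# Hypothetical helper: percentage log returns from a price vector.
log_returns <- function(prices) {
  100 * diff(log(prices))
}

toy_prices  <- c(100, 102, 101, 105)   # made-up prices
toy_returns <- log_returns(toy_prices)

# Risk measured as the sample standard deviation of the returns
toy_risk <- sd(toy_returns)
```

Because log returns are additive, their sum over the sample equals the log return between the first and the last price.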

Question 2.2

Download daily prices of the index you choose your company from and compute daily returns for the same period.

::: {.justify}
Lenovo is listed on the Hong Kong Stock Exchange (HKEX) under stock code 0992 and is a constituent of the Hang Seng Index (HSI), the index we use as the market benchmark in this paper. We therefore downloaded the daily prices of the HSI from Yahoo Finance and computed the returns and risk of this index with the same methods as in the previous question. We discuss the results in question 2.3.
:::

Question 2.3

Compare, graphically and with summary statistics, the returns and risks of : your company, the market and the two portfolios you created. Explain the differences.

Here are graphical representations of the returns of the various portfolios created, of Lenovo and of its benchmark index, the Hang Seng Index (HSI).

# Lenovo and index log return
LENOVO$RLENOVO = c(NA,100*diff(log(LENOVO$Close)))
HSI$RHSI = c(NA,100*diff(log(HSI$Close)))
LENOVO = LENOVO[-1,]
HSI = HSI[-1,]

# Return plots
par(mfrow = c(1,1))
plot(LENOVO$Date,LENOVO$Requalportfolio,type='l',col = "darkmagenta",col.main="antiquewhite4",ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("Equally weighted portfolio log return from 1st Jan 2018 - 1st Nov 2023"),cex.main = 0.8)
abline(h = 0, col = "black", lty = 1)  # Add a solid black reference line at y = 0
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.5)

# Note: the common ylim of c(-17, 15) cuts off this series' maximum of 18.04%
plot(LENOVO$Date,LENOVO$Rcapitalportfolio,type='l',col = terrain.colors(4),ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("Capitalization weighted portfolio log return from 1st Jan 2018 - 1st Nov 2023"),col.main="antiquewhite4",cex.main = 0.8)
abline(h = 0, col = "black", lty = 1)  # Add a solid black reference line at y = 0
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.5)

plot(LENOVO$Date,LENOVO$RLENOVO,type='l',col = "chocolate4",col.main="antiquewhite4",ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("Lenovo log return from 1st Jan 2018 - 1st Nov 2023"),cex.main = 0.8)
abline(h = 0, col = "black", lty = 1)  # Add a solid black reference line at y = 0
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.5)

plot(LENOVO$Date,HSI$RHSI,type='l',col = "darkorange1",ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("HSI log return from 1st Jan 2018 - 1st Nov 2023"),col.main="antiquewhite4",cex.main = 0.8)
abline(h = 0, col = "black", lty = 1)  # Add a solid black reference line at y = 0
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.5)

Looking at these charts, it seems that the Hang Seng index is the least volatile of the 4 assets. As for Lenovo, the company’s shares seem to be much more volatile than all the other assets. As for the two portfolios, it’s difficult to tell them apart just by looking at them on these charts, so we need to calculate some statistics to compare them to each other and to the other assets. In order to compare these results, we have created box plots for each of the assets.

# Boxplot
stats1 <- summary(LENOVO$Rcapitalportfolio)
stats2 <- summary(LENOVO$Requalportfolio)
stats3 <- summary(LENOVO$RLENOVO)
stats4 <- summary(HSI$RHSI)

par(mfrow = c(1,1))
boxplot(LENOVO$Rcapitalportfolio,col = "azure2",ylab="Return (%)",main = ("Capitalization weighted portfolio log return Boxplot"),col.main="antiquewhite4",cex.main = 1)
grid()
important_values <- c(stats1[1], stats1[2], stats1[4:6], stats1[7]) 
text(rep(1, length(important_values)), important_values, 
     labels = round(important_values, 4), col = "black", pos = 3)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

boxplot(LENOVO$Requalportfolio,col = terrain.colors(4),ylab="Return (%)",main = ("Equally weighted portfolio log return Boxplot"),col.main="antiquewhite4",cex.main = 1)
grid()
important_values <- c(stats2[1], stats2[2], stats2[4:6], stats2[7])  
text(rep(1, length(important_values)), important_values, 
     labels = round(important_values, 4), col = "black", pos = 3)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

boxplot(LENOVO$RLENOVO,col = "chocolate2",ylab="Return (%)",main = ("LENOVO log return Boxplot"),col.main="antiquewhite4",cex.main = 1)
grid()
important_values <- c(stats3[1], stats3[2], stats3[4:6], stats3[7]) 
text(rep(1, length(important_values)), important_values, 
     labels = round(important_values, 4), col = "black", pos = 3)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

boxplot(HSI$RHSI,col = "bisque2",ylab="Return (%)",main = ("HSI log return Boxplot"),col.main="antiquewhite4",cex.main = 1)
grid()
important_values <- c(stats4[1], stats4[2], stats4[4:6], stats4[7]) 
text(rep(1, length(important_values)), important_values, 
     labels = round(important_values, 4), col = "black", pos = 3)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

# Box plot statistics
summary(LENOVO$Rcapitalportfolio)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -9.95920 -1.31246  0.03791  0.01641  1.35215 18.03682

summary(LENOVO$Requalportfolio)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -8.54630 -1.33592  0.02759  0.02248  1.36542 13.14596

summary(LENOVO$RLENOVO)

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -16.37040  -1.32234   0.00000   0.04921   1.41783  11.64541

summary(HSI$RHSI)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -6.56732 -0.83152  0.02200 -0.03895  0.74889  8.69278

# Risk
sd(LENOVO$Rcapitalportfolio)

## [1] 2.314814

sd(LENOVO$Requalportfolio)

## [1] 2.265344

sd(LENOVO$RLENOVO)

## [1] 2.501519

sd(HSI$RHSI)

## [1] 1.442611

As the box plots and the summary statistics above show, Lenovo has the highest average return of the 4 assets/portfolios, with a mean daily return of 0.0492%. Conversely, the HSI is the least profitable, with a mean daily return of -0.0390%. Given Lenovo's higher return, we would expect it to carry more risk than the other assets, and this is indeed the case: the stock has the highest standard deviation, at 2.50%. The HSI, the least profitable asset, also has the lowest standard deviation (1.44%). Of the two portfolios, the equally weighted one is the more attractive in both respects, offering the higher return (0.0225% daily, against 0.0164%) and the lower risk (standard deviation of 2.2653%, against 2.3148%). Lenovo also recorded the largest 1-day fall of the 4 assets, -16.3704%, compared with -6.5673% for the HSI (the smallest fall). Finally, the best 1-day return was achieved by the capitalization weighted portfolio, which rose by more than 18% in 1 day, compared with 8.6928% for the HSI.
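
A comparison of this kind is easier to read when the statistics sit side by side in one table. The sketch below shows one possible way to build such a table; `return_stats` is a hypothetical helper and the two series are simulated, so the numbers it prints are not those of this report.

```r
# Hypothetical helper: one row of descriptive statistics per return series.
return_stats <- function(series_list) {
  t(sapply(series_list, function(r) {
    r <- r[!is.na(r)]   # drop the leading NA left by diff()
    c(mean = mean(r), sd = sd(r), min = min(r), max = max(r))
  }))
}

# Simulated returns standing in for the real series
set.seed(1)
toy <- list(stock = c(NA, rnorm(250, mean = 0.05, sd = 2.5)),
            index = c(NA, rnorm(250, mean = 0.00, sd = 1.4)))

stats_table <- return_stats(toy)
round(stats_table, 3)
```

With the real data, the list would hold the four return columns computed above, one named entry per asset.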

Question 2.4

Go to Kenneth French's website to download the factors needed to estimate the 4-factor Carhart model on your company stock returns and on the market-weighted portfolio. Pay attention to select the right factors.

The Carhart model explains returns with 4 factors, where:

RMRF is the excess return of the market over a risk-free rate
SMB is the difference between the return on a small stock portfolio and a large stock portfolio
HML is the difference between the return on a value portfolio and a growth portfolio
UMD (WML in our database) is the difference between the return on a portfolio of the best-performing stocks and a portfolio of the worst-performing stocks. This factor is often referred to as momentum.
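
The model equation is missing from this copy of the report. For reference, the standard Carhart four-factor specification, written with the factor names defined above, is sketched below; the subscripts and the coefficient symbols are notational choices made here, not taken from the original.

```latex
\[
R_{i,t} - R_{f,t} = \alpha_i + \beta_{1,i}\,RMRF_t + \beta_{2,i}\,SMB_t
                  + \beta_{3,i}\,HML_t + \beta_{4,i}\,UMD_t + \varepsilon_{i,t}
\]
```

Note that the `lm()` calls in Question 2.5 use the raw return, not the return in excess of the risk-free rate, as the dependent variable.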

In order to deal with these factors we downloaded the Fama/French Asia Pacific ex Japan 5 Factors [Daily] and the Asia Pacific ex Japan Momentum Factor (Mom) [Daily]. We gather the two databases into one (Asia4) and estimate the model on our two portfolios.

# See https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html => 
# Download of the Fama/French Asia Pacific ex Japan 5 Factors [Daily] and Asia Pacific ex Japan Momentum Factor (Mom) [Daily]. We gather the two databases into one (Asia4)
library(readxl)
Asia4 <- read_excel("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/Fama-French Asia/Fama 4 factor Asia.xlsx",col_types = c("date", "numeric", "numeric", "numeric", "numeric"))
Asia4 = Asia4[-1,]
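
The regressions that follow pair returns and factors by row position, which assumes that both tables cover exactly the same trading days in the same order. If the two calendars could differ, joining on the date column is a safer option. The sketch below illustrates the idea on simulated data; all object and column names in it are hypothetical.

```r
# Toy stand-ins for the return series and the factor file (simulated values)
set.seed(42)
dates   <- as.Date("2023-01-02") + 0:9
returns <- data.frame(Date = dates, R = rnorm(10))
factors <- data.frame(Date  = dates[-3],   # one date missing on purpose
                      MktRF = rnorm(9), SMB = rnorm(9),
                      HML   = rnorm(9), WML = rnorm(9))

# An inner join keeps only the dates present in both tables
merged <- merge(returns, factors, by = "Date")

# Passing data = merged ties every regressor to the same row as the return
fit <- lm(R ~ MktRF + SMB + HML + WML, data = merged)
nrow(merged)
```

The date that is absent from the factor table is dropped from the regression sample instead of silently shifting every later observation by one row.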

Question 2.5

Estimate the models, describe your results, what it means for your stock and your portfolio and explain the differences.

# 4-factor Carhart model of Lenovo returns
ff_Lenovo <-lm(LENOVO$RLENOVO ~ Asia4$`Mkt-RF`+ Asia4$SMB + Asia4$HML + Asia4$WML)
summary(ff_Lenovo)

## 
## Call:
## lm(formula = LENOVO$RLENOVO ~ Asia4$`Mkt-RF` + Asia4$SMB + Asia4$HML + 
##     Asia4$WML)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.9968  -1.2734   0.0071   1.1970  11.5604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.05693    0.06192   0.919   0.3580    
## Asia4$`Mkt-RF`  1.04479    0.08639  12.095   <2e-16 ***
## Asia4$SMB       0.24305    0.15395   1.579   0.1146    
## Asia4$HML       0.26434    0.12423   2.128   0.0335 *  
## Asia4$WML      -0.10164    0.09906  -1.026   0.3051    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.337 on 1431 degrees of freedom
## Multiple R-squared:  0.1299, Adjusted R-squared:  0.1275 
## F-statistic: 53.43 on 4 and 1431 DF,  p-value: < 2.2e-16

By estimating our alpha and beta coefficients for Lenovo’s returns, we can estimate the following model:
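
The typeset equation is missing from this copy. Reading the point estimates off the `summary(ff_Lenovo)` output above and rounding to three decimals gives the fitted model below; the notation is an assumption, with UMD standing for the WML column of the output.

```latex
\[
\hat{R}_{Lenovo,t} = 0.057 + 1.045\,RMRF_t + 0.243\,SMB_t + 0.264\,HML_t - 0.102\,UMD_t
\]
```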

When the number is in red, this means that the absolute t-value is less than 1.96 and that the P-value is greater than 5%, indicating that the coefficient obtained is not significantly different from 0. On the basis of this result, we can estimate the following model:

We do the same thing with the market weighted portfolio.

# 4-factor Carhart model of market-weighted portfolio returns
ff_Rcapitalportfolio <-lm(LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF`+ Asia4$SMB + Asia4$HML + Asia4$WML)
summary(ff_Rcapitalportfolio)

## 
## Call:
## lm(formula = LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF` + Asia4$SMB + 
##     Asia4$HML + Asia4$WML)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0331  -1.0638  -0.0697   0.9846  13.8601 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.021476   0.050512   0.425    0.671    
## Asia4$`Mkt-RF`  1.276042   0.070468  18.108   <2e-16 ***
## Asia4$SMB       0.149940   0.125585   1.194    0.233    
## Asia4$HML      -0.269141   0.101341  -2.656    0.008 ** 
## Asia4$WML       0.004349   0.080811   0.054    0.957    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.906 on 1431 degrees of freedom
## Multiple R-squared:  0.3239, Adjusted R-squared:  0.322 
## F-statistic: 171.4 on 4 and 1431 DF,  p-value: < 2.2e-16

We obtain the following model:
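
The typeset equation is missing from this copy. Reading the point estimates off the `summary(ff_Rcapitalportfolio)` output above and rounding to three decimals gives the fitted model below; the notation is an assumption, with UMD standing for the WML column of the output.

```latex
\[
\hat{R}_{portfolio,t} = 0.021 + 1.276\,RMRF_t + 0.150\,SMB_t - 0.269\,HML_t + 0.004\,UMD_t
\]
```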

Finally, by removing the insignificant terms, we find a model similar to the one estimated before :

On the basis of these two models, we can conclude that the market-weighted portfolio is more sensitive to market variation than Lenovo, with a market beta of 1.28 compared with 1.04 for Lenovo. In other words, a decrease (or increase) of 2% in the excess return of the market is associated with a decrease (or increase) of about 2.55% for the portfolio, compared with 2.09% for the Lenovo share. Holding five stocks therefore does not reduce market exposure here, possibly because all five are Asian technology companies and the capitalisation weights concentrate the portfolio in the largest of them.

Next, as shown in the matrix below, Lenovo has a positive coefficient on the HML factor (0.264): when value stocks (high book-to-market, low P/E, etc.) outperform growth stocks (low book-to-market, high P/E, etc.), Lenovo's return tends to rise, which suggests that Lenovo behaves like a value stock. The opposite holds for our portfolio: its HML coefficient is negative (-0.269), so an increase in the HML factor is associated with a lower portfolio return, which suggests a portfolio tilted towards growth stocks.

Question 2.6

What can you say about the efficient market hypothesis ?

::: {.justify}
As a reminder, according to the efficient market hypothesis, markets are efficient in the sense that all available relevant information is rapidly incorporated into asset prices. Because the current price constantly reflects all the information available, it is difficult to predict future prices, and therefore difficult to make abnormal profits based on current and past information. There are several forms of the efficient market hypothesis:

The weak form: historical information is already reflected in asset prices
The semi-strong form: all publicly available information, historical and current, is already reflected in asset prices
The strong form: all types of information, including insider information, are already reflected in asset prices.

In general, the weak and semi-strong forms have found broad empirical support, whereas the strong form has not.

In our work, we analyse whether Lenovo and the market-weighted portfolio present market anomalies. As a reminder, a market anomaly in the context of stock market performance occurs when the observed return on an asset does not correspond to the market's expectations. These expectations can be derived from models such as the CAPM, the Fama-French 3-factor model or the Carhart model. Firstly, in the Carhart model presented in the previous question, the alpha in both regressions (Lenovo's and the market-weighted portfolio's) is not significantly different from 0 (t-values of 0.919 and 0.425).

For the Fama-French model, we obtain the same result (t-values of 0.847 and 0.43), and likewise for the CAPM (t-values of 0.794 and 0.321). These results are presented below.

:::

# 3-factor Fama-french model of Lenovo returns
ff2_Lenovo <-lm(LENOVO$RLENOVO ~ Asia4$`Mkt-RF`+ Asia4$SMB + Asia4$HML)
summary(ff2_Lenovo)

## 
## Call:
## lm(formula = LENOVO$RLENOVO ~ Asia4$`Mkt-RF` + Asia4$SMB + Asia4$HML)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0106  -1.2622  -0.0056   1.1921  11.6023 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.05231    0.06176   0.847   0.3971    
## Asia4$`Mkt-RF`  1.03723    0.08607  12.051   <2e-16 ***
## Asia4$SMB       0.21755    0.15193   1.432   0.1524    
## Asia4$HML       0.29140    0.12140   2.400   0.0165 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.337 on 1432 degrees of freedom
## Multiple R-squared:  0.1293, Adjusted R-squared:  0.1275 
## F-statistic: 70.88 on 3 and 1432 DF,  p-value: < 2.2e-16

# 3-factor Fama-french model of market-weighted portfolio returns
ff2_Rcapitalportfolio <-lm(LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF`+ Asia4$SMB + Asia4$HML)
summary(ff2_Rcapitalportfolio)

## 
## Call:
## lm(formula = LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF` + Asia4$SMB + 
##     Asia4$HML)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0360  -1.0633  -0.0690   0.9876  13.8547 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.02167    0.05036   0.430   0.6670    
## Asia4$`Mkt-RF`  1.27637    0.07019  18.185   <2e-16 ***
## Asia4$SMB       0.15103    0.12389   1.219   0.2230    
## Asia4$HML      -0.27030    0.09900  -2.730   0.0064 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.905 on 1432 degrees of freedom
## Multiple R-squared:  0.3239, Adjusted R-squared:  0.3225 
## F-statistic: 228.6 on 3 and 1432 DF,  p-value: < 2.2e-16

# CAPM of Lenovo returns
ff1_Lenovo <-lm(LENOVO$RLENOVO ~ Asia4$`Mkt-RF`)
summary(ff1_Lenovo)

## 
## Call:
## lm(formula = LENOVO$RLENOVO ~ Asia4$`Mkt-RF`)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0510  -1.2629   0.0039   1.1947  11.6952 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.04906    0.06175   0.794    0.427    
## Asia4$`Mkt-RF`  0.89876    0.06264  14.349   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.34 on 1434 degrees of freedom
## Multiple R-squared:  0.1256, Adjusted R-squared:  0.1249 
## F-statistic: 205.9 on 1 and 1434 DF,  p-value: < 2.2e-16

# CAPM of market-weighted portfolio returns
ff1_Rcapitalportfolio <-lm(LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF`)
summary(ff1_Rcapitalportfolio)

## 
## Call:
## lm(formula = LENOVO$Rcapitalportfolio ~ Asia4$`Mkt-RF`)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.2985  -1.0360  -0.0595   0.9789  14.0909 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.01619    0.05047   0.321    0.748    
## Asia4$`Mkt-RF`  1.32313    0.05120  25.845   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.913 on 1434 degrees of freedom
## Multiple R-squared:  0.3178, Adjusted R-squared:  0.3173 
## F-statistic:   668 on 1 and 1434 DF,  p-value: < 2.2e-16

As a result, in these three models, the alpha is not significantly different from 0, so it seems that there is no style anomaly linked to these models. However, it should be noted that, despite the results of our models, anomalies may still exist, and we recommend continuing the analysis by considering other aspects or using other models if necessary to obtain a more in-depth understanding of asset performance.
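
The alpha test used in this question can be read directly from the coefficient table of a fitted model. The sketch below shows this on simulated data whose true intercept is zero; the variable names and parameter values are hypothetical and unrelated to the series in this report.

```r
# Simulated data in which the true intercept (alpha) is zero
set.seed(123)
n     <- 500
mkt   <- rnorm(n, mean = 0, sd = 1.2)
asset <- 1.1 * mkt + rnorm(n, mean = 0, sd = 2)

capm_fit <- lm(asset ~ mkt)

# Row "(Intercept)" of the coefficient table holds alpha and its t-test
alpha_row <- coef(summary(capm_fit))["(Intercept)", ]
alpha_row[c("Estimate", "t value", "Pr(>|t|)")]

# Alpha is judged significant at the 5% level only if its p-value is below 0.05
alpha_significant <- unname(alpha_row["Pr(>|t|)"] < 0.05)
```

The same extraction applies to the three- and four-factor fits, since the intercept row keeps the same name in each coefficient table.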

Question 2.7

Test for a cointegration relationship between the market-weighted portfolio and the market returns from sub-question (2). Describe each step and conclude with the best econometric model suitable to study the impact of the market on your portfolio.

We therefore need to test whether there is a cointegrating relationship between the market-weighted portfolio and the Hang Seng Index, in both prices and returns. Below are a graphical representation of the returns of the HSI and the market-weighted portfolio, and a graphical representation of their price levels (base 100 on 29 December 2017).

# log-return comparison between capitalportfolio and HSI
plot(LENOVO$Date,LENOVO$Rcapitalportfolio,type='l',col = "azure3",ylim=c(-17,15),ylab="Return (%)",xlab="Date",main = ("Market portfolio and HSI log return from Jan 2018 - Nov 2023"),col.main="antiquewhite4",cex.main = 1)
lines(HSI$Date,HSI$RHSI, lwd = 1,col = "bisque4")
legend("topright", legend=c("market portfolio", "HSI"),col=c("azure3","bisque4"),lty= 1,cex = 0.6)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

# price comparison between capitalportfolio and HSI (HSI is reloaded to rebuild the base-100 series)
HSI <- read_csv("~/Documents/Master_1_Dorian_Adjanski Original/finance Q1/Empirical finance/Travail Empirical finance/HSI.csv")

## Rows: 1437 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (6): Open, High, Low, Close, Adj Close, Volume
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

HSI$base2018 <- (HSI$Close/HSI$Close[[1]])*100
HSI$RHSI = c(NA,100*diff(log(HSI$Close)))
HSI = HSI[-1,]
plot(LENOVO$Date,LENOVO$capitalportfolio,type='l',col = "azure3",ylim=c(-50,250),ylab="2018 base 100",xlab="Date",main = ("Market portfolio and HSI price (base 100 of 2018) from Jan 2018 - Nov 2023"),col.main="antiquewhite4",cex.main = 1)
lines(HSI$Date,HSI$base2018, lwd = 1,col = "bisque4")
legend("topright", legend=c("market portfolio", "HSI"),col=c("azure3","bisque4"),lty= 1,cex = 0.6)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

In order to check whether the prices and the returns of the two assets are cointegrated, we regress the logarithm of the market portfolio level on the logarithm of the Hang Seng index price (1), and the logarithmic returns of the portfolio on those of the HSI (2):
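
The two equations are missing from this copy of the report. Following the orientation of the `lm()` calls used in this question (portfolio on the left-hand side, HSI on the right), they can be written as below; the coefficient symbols a, b, c and d and the superscripts are notational assumptions, while ut and ωt are the residual names used in the text.

```latex
\begin{align}
\ln P^{pf}_t &= a + b\,\ln P^{HSI}_t + u_t \tag{1} \\
R^{pf}_t &= c + d\,R^{HSI}_t + \omega_t \tag{2}
\end{align}
```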

On the basis of these two regressions, we recover the residuals and test their stationarity using an augmented Dickey-Fuller test based on the following regressions:
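
The test regressions are missing from this copy. A sketch of the form they would take is given below, assuming a constant, a linear trend and 10 lagged differences to match the `type="ct"` and `lags=10` arguments passed to `adfTest` in this question; the symbols μ, λ, αi and vt are notational assumptions.

```latex
\begin{align}
\Delta u_t &= \mu + \lambda t + \psi\,u_{t-1} + \sum_{i=1}^{10} \alpha_i\,\Delta u_{t-i} + v_t \\
\Delta \omega_t &= \mu' + \lambda' t + \psi'\,\omega_{t-1} + \sum_{i=1}^{10} \alpha'_i\,\Delta \omega_{t-i} + v'_t
\end{align}
```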

If the test on the residuals (ut and ωt) of regressions (1) and (2) yields a P-value above 0.05, we cannot reject the hypothesis that the residuals have a unit root: we then treat them as non-stationary and conclude that the two series are not cointegrated. If the P-value is less than 0.05, we reject the unit root: the residuals are stationary and the two series are cointegrated. Here are the hypotheses of our test:

Null hypothesis: ψ = 0, the residuals have a unit root, which means that they are not stationary, so the series are not cointegrated.
Alternative hypothesis: ψ < 0, the residuals are stationary, so the series are cointegrated.

The test statistic is calculated as follows, and we compare it with the critical values proposed by Fuller (1976).
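
The formula is missing from this copy. For a Dickey-Fuller type test, the statistic is the usual t-ratio of the estimated coefficient on the lagged level, written here with the ψ of the hypotheses above; the symbol τ is a notational assumption.

```latex
\[
\tau = \frac{\hat{\psi}}{\mathrm{SE}(\hat{\psi})}
\]
```

The null hypothesis of a unit root is rejected when τ is more negative than the critical value.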

By calculating and extracting our residuals on the basis of the model specifications, we obtain the following graphs:

# For the prices
LENOVO$Lcapitalportfolio <- log(LENOVO$capitalportfolio)
HSI$LHSI <- log (HSI$Close)
coin_regprice = lm(LENOVO$Lcapitalportfolio ~ HSI$LHSI)
summary(coin_regprice)

## 
## Call:
## lm(formula = LENOVO$Lcapitalportfolio ~ HSI$LHSI)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.59796 -0.25484  0.00096  0.22874  0.74738 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.75506    0.46479  18.837   <2e-16 ***
## HSI$LHSI    -0.39320    0.04597  -8.554   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2947 on 1434 degrees of freedom
## Multiple R-squared:  0.04855,    Adjusted R-squared:  0.04789 
## F-statistic: 73.17 on 1 and 1434 DF,  p-value: < 2.2e-16

adfTest(coin_regprice$residuals,lags=10,type="ct")

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 10
##   STATISTIC:
##     Dickey-Fuller: -1.7594
##   P VALUE:
##     0.6801 
## 
## Description:
##  Sun Jan 21 12:09:27 2024 by user:

# In the final step, we will test the stationarity of the residuals and find that they are not stationary.  => Not cointegrated

# For the returns
coin_regreturn = lm(LENOVO$Rcapitalportfolio[-1] ~ HSI$RHSI[-1])
summary(coin_regreturn)

## 
## Call:
## lm(formula = LENOVO$Rcapitalportfolio[-1] ~ HSI$RHSI[-1])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.6581 -0.7359 -0.0475  0.6586  6.8939 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.06679    0.03601   1.855   0.0638 .  
## HSI$RHSI[-1]  1.29705    0.02497  51.951   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.363 on 1433 degrees of freedom
## Multiple R-squared:  0.6532, Adjusted R-squared:  0.6529 
## F-statistic:  2699 on 1 and 1433 DF,  p-value: < 2.2e-16

adfTest(coin_regreturn$residuals,lags=10,type="ct")

## Warning in adfTest(coin_regreturn$residuals, lags = 10, type = "ct"): p-value
## smaller than printed p-value

## 
## Title:
##  Augmented Dickey-Fuller Test
## 
## Test Results:
##   PARAMETER:
##     Lag Order: 10
##   STATISTIC:
##     Dickey-Fuller: -11.3515
##   P VALUE:
##     0.01 
## 
## Description:
##  Sun Jan 21 12:09:27 2024 by user:

# => cointegrated

plot(coin_regprice$residuals,type='l',col = "darkorange3",ylab="residuals",xlab="",main = ("Residuals from Market portfolio ~  HSI prices regression"),col.main="antiquewhite4",cex.main = 1)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

plot(coin_regreturn$residuals,type='l',col = "darkgoldenrod",ylab="residuals",xlab="",main = ("Residuals from Market portfolio ~  HSI returns regression"),col.main="antiquewhite4",cex.main = 1)
mtext("Source: extracted from https://finance.yahoo.com", side = 1, line = -1, adj = 0, cex = 0.9)

The residuals from regression (1) appear non-stationary, while those from regression (2) appear stationary. The augmented Dickey-Fuller test confirms this. For the logarithmic price regression, we obtain a P-value of 0.68 (see appendix 2.7), so we cannot reject the null hypothesis: the residuals ut are treated as non-stationary, and the logarithmic prices of HSI and the market-weighted portfolio are therefore not cointegrated. For the returns regression, we obtain a P-value of at most 0.01 (the test warns that the true P-value is smaller than the printed one), so we reject the null hypothesis: the residuals ωt are stationary, and the returns of HSI and the market-weighted portfolio are therefore cointegrated.
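
To make the mechanics of the residual test explicit, the sketch below computes a Dickey-Fuller t-statistic by hand in base R, with a constant, a linear trend and p lagged differences. It is an illustration on simulated series, not the `adfTest` implementation used above; `adf_stat` is a hypothetical helper, and it returns only the statistic, not a P-value.

```r
# Hypothetical helper: t-statistic on the lagged level in a regression of the
# first difference on a constant, a trend, the lagged level and p lagged differences.
adf_stat <- function(x, p = 10) {
  stopifnot(p >= 1)
  dx <- diff(x)
  n  <- length(dx)
  lagged <- embed(dx, p + 1)     # column 1 = dx_t, columns 2.. = dx_{t-1}, ..., dx_{t-p}
  y      <- lagged[, 1]
  x_lag  <- x[(p + 1):n]         # level one period before each dx_t
  trend  <- seq_along(y)
  fit <- lm(y ~ x_lag + trend + lagged[, -1])
  coef(summary(fit))["x_lag", "t value"]
}

# Simulated series: white noise (stationary) and its cumulative sum (unit root)
set.seed(7)
e <- rnorm(1000)
stat_stationary  <- adf_stat(e)
stat_random_walk <- adf_stat(cumsum(e))
```

The stationary series produces a strongly negative statistic, while the random walk produces one much closer to zero, which is the pattern the two residual series above display.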

In conclusion, the logarithmic prices of the two series are not cointegrated, so the most appropriate model is the one containing only the first differences of the variables, as they have no long-term relationship. Returns are cointegrated, so the most appropriate model is the error correction model (ECM), which includes a level term. The most appropriate models to represent the relationship between prices and returns can therefore be estimated as follows:

Part 3 : Event study

Question 3.1

Find an event during the last year of your sample that might have had an effect on your company stock prices (macro or micro event). Explain your intuition. ::: {.justify}

:::

Question 3.2

Compute the abnormal returns around this event (use an estimation window of 250 days and a buffer of 45 days; the event window runs from 10 days before the event to 10 days after the event) based on the following methodologies :

Constant-mean
Market-model
Three-Factor Fama-French model

Graph and comment on your results. ::: {.justify}

:::

Question 3.3

Formally test the significance of the abnormal returns computed with the constant-mean methodology at a 95% confidence level. Explain. ::: {.justify}

:::

Part 5 : Conclusion

Question

From your analysis and what has been seen in the course, develop advice for any investor willing to buy this stock (1/2 page maximum). Don’t forget your report should be targeted at an investor willing to have detailed information about the company.

Empirical finance work - group E3

Dorian Adjanski, Alexandre Kerbusch

2024-01-20