When we want to examine the relationship between two non-stationary variables by running a regression model, we have the risk to end up with a non-valid - spurious - regression.
## Warning: package 'wbstats' was built under R version 4.0.5
# Mexico - Infant mortality
infantm<-wb_data(indicator = c("SP.DYN.IMRT.IN"),
country="MEX", start_date = 1980, end_date = 2020)
# Mexico - Export value
exports<-wb_data(indicator = c("TX.VAL.MRCH.XD.WD"),
country="MEX", start_date = 1980, end_date = 2020)The wb function brings a data frame with the requested data. We can plot the data to have an idea of these 2 variables:
Now run a regression using these series. Report the result of the regression.
##
## Call:
## lm(formula = exports$TX.VAL.MRCH.XD.WD ~ infantm$SP.DYN.IMRT.IN)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.331 -39.723 -7.456 37.848 77.194
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 275.1783 16.0408 17.16 < 2e-16 ***
## infantm$SP.DYN.IMRT.IN -6.1847 0.5327 -11.61 4.6e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 41.65 on 38 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.7801, Adjusted R-squared: 0.7743
## F-statistic: 134.8 on 1 and 38 DF, p-value: 4.602e-14
Did you find significant relationship? is your result what you expected?
AFTER LOOKING AT THE RESULTS OF THE MODEL, I CAN SAY THAT THE INFANT MORTALITY IS SIGNIFICANTLY AND NEGATIVELY RELATED TO THE EXPORTS IN MEXICO, THE T-VALUE OF B1 IS MUCH BIGGER THAN 2 WHICH REPRESENTS A REALLY STRONG RELATIONSHIP BETWEEN THIS VARIABLES. “THE LOWER THE INFANT MORTALITY THE HIGHER THE EXPORTS” ALTHOUGH WE HAVE THIS STATISTICAL EVIDENCE, WE CAN EASILY SAY THAT THIS IS REALLY WEIRD DUE TO THE FACT THAT IT´S NOT EXPECTED TO FIND A RELATIONSHIP BETWEEN INFANT MORTALITY AND EXPORTS IN MEXICO.
Using daily data of Mexico IPCyC market index and the S&P 500, examine whether two series are cointegrated. Generate an index for each instrument. To do these indexes, create a variable that represents how 1.00 peso or 1.00 dollar invested in each instrument would be changing over time.
For both cases, run a cointegration test and INTERPRET your results.
## Warning: package 'quantmod' was built under R version 4.0.3
## Loading required package: xts
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.0.3
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
## [1] "^MXX" "^GSPC"
##
## Call:
## lm(formula = invmxx$MXX.Adjusted ~ invusa$GSPC.Adjusted)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.06656 -0.01835 0.00244 0.01944 0.06810
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.35668 0.01383 25.79 <2e-16 ***
## invusa$GSPC.Adjusted 0.69852 0.01311 53.29 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02702 on 672 degrees of freedom
## Multiple R-squared: 0.8086, Adjusted R-squared: 0.8083
## F-statistic: 2840 on 1 and 672 DF, p-value: < 2.2e-16
##
## Augmented Dickey-Fuller Test
##
## data: m1$residuals
## Dickey-Fuller = -3.5752, Lag order = 0, p-value = 0.03508
## alternative hypothesis: stationary
Están cointegrados, se ouede confar en el modelo
AFTER LOOKING AT THE RESULTS OF THE MODEL AND AT THE RESIDUALS WE CAN SAY THAT THERE´S COINTEGRATION DUE TO THE P-VALUE GREATER THAN 0.05 WHICH REPRESENTS STATIONARITY,AND WE CAN EASILY TRUST IN OUR MODEL. FOR EACH 1 MXN YOU GET 0.69852 USD.
getSymbols(Symbols<-c("^MXX", "^GSPC"), periodicity= "daily", from = "2017-10-03", to = "2018-02-28")## [1] "^MXX" "^GSPC"
firstmxx_1 = as.numeric(MXX$MXX.Adjusted[1])
firstusa_1 = as.numeric(GSPC$GSPC.Adjusted[1])
invmxx_1 <- data_1$MXX.Adjusted / firstmxx_1
invusa_1 <- data_1$GSPC.Adjusted / firstusa_1##
## Call:
## lm(formula = invmxx_1$MXX.Adjusted ~ invusa_1$GSPC.Adjusted)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.037454 -0.014008 -0.002567 0.017749 0.040777
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.79716 0.05928 13.447 < 2e-16 ***
## invusa_1$GSPC.Adjusted 0.16206 0.05640 2.874 0.00501 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02006 on 95 degrees of freedom
## Multiple R-squared: 0.07997, Adjusted R-squared: 0.07028
## F-statistic: 8.257 on 1 and 95 DF, p-value: 0.005008
##
## Augmented Dickey-Fuller Test
##
## data: m2$residuals
## Dickey-Fuller = -2.0418, Lag order = 0, p-value = 0.5592
## alternative hypothesis: stationary
AFTER LOOKING AT THE RESULTS OF THE MODEL AND AT THE RESIDUALS WE CAN SAY THAT THERE´S NO COINTEGRATION DUE TO THE P-VALUE SMALLER THAN 0.05 WHICH REPRESENTS A LACK OF STATIONARITY,AND WE CAN EASILY SAY THE RESULTS OF OUR MODEL ARE NOT RELIABLE. (SPURIOUS RELATIONSHIP)
Once we have an idea about cointegration, we will review how to form a portfolio of 2 assets and calculate its holding return over time. The holding period return of an asset from day 1 to day N (HPR) can be calculated with any of the following ways:
HPR = (Price of the stock at day N / Price of the stock at day 1) - 1
HPR = exp ( Sum of all continuous compounded returns from day 1 to day N) - 1
The expected return of a portfolio of 2 assets is estimated as a weighted average of the expected stock returns:
E[Rp]=w1∗E[R1]+w2∗E(R2)
The holding return of a portfolio of 2 assets for a specific period of time is estimated as a weighted average of the holding returns of the assets:
HPRp=w1∗HPR1+w2∗HPR2
Download daily prices from CEMEX and ALFA from Jan 1, 2015 to Dec 31, 2017.
Calculate their holding period returns of both stocks
getSymbols(Symbols<-c("CEMEXCPO.MX", "ALFAA.MX"), periodicity= "daily", from = "2015-01-01", to = "2017-12-31")## [1] "CEMEXCPO.MX" "ALFAA.MX"
CEMEX STOCK
n <- (nrow(CEMEXCPO.MX))
price0 <- as.numeric(CEMEXCPO.MX$CEMEXCPO.MX.Adjusted[1])
pricen <- as.numeric(CEMEXCPO.MX$CEMEXCPO.MX.Adjusted[n])
HPRCEMEX <- ((pricen /price0)-1)*100
print(paste("HPR of CEMEX = ", HPRCEMEX))## [1] "HPR of CEMEX = 12.6396524035649"
ALFA STOCK
n_1 <- (nrow(ALFAA.MX))
price0_1 <- as.numeric(ALFAA.MX$ALFAA.MX.Adjusted[1])
pricen_1 <- as.numeric(ALFAA.MX$ALFAA.MX.Adjusted[n])
HPRALFAA <- ((pricen_1 /price0_1)-1 )*100
print(paste("HPR of ALFAA = ", HPRALFAA))## [1] "HPR of ALFAA = -32.2857992258085"
Create a portfolio 1 assigning 30% to CEMEX and 70% for ALFA and calculate the HPR of the portfolio
## [1] "HPR of my portfolio 1 -18.8081637369965"
Create a portfolio 2 assigning -100% to CEMEX and +200% to ALFA and calcuate the HPR of this portfolio
## [1] "HPR of my portfolio 2 -77.2112508551819"
What does a negative sign mean a portfolio? Briefly explain with the previous example.
THE NEGATIVE SIGN REPRESENTS A SALE WE DO, TO BUY MORE OF THE OTHER STOCK.
dataset<- merge(ALFAA.MX$ALFAA.MX.Adjusted, CEMEXCPO.MX$CEMEXCPO.MX.Adjusted)
reg1 <- lm( dataset$ALFAA.MX.Adjusted ~ dataset$CEMEXCPO.MX.Adjusted)
summary(reg1)##
## Call:
## lm(formula = dataset$ALFAA.MX.Adjusted ~ dataset$CEMEXCPO.MX.Adjusted)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.9402 -1.0178 0.5681 1.8485 5.5081
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.9127 0.5181 82.83 <2e-16 ***
## dataset$CEMEXCPO.MX.Adjusted -1.0468 0.0382 -27.40 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.784 on 752 degrees of freedom
## Multiple R-squared: 0.4996, Adjusted R-squared: 0.4989
## F-statistic: 750.8 on 1 and 752 DF, p-value: < 2.2e-16
##
## Augmented Dickey-Fuller Test
##
## data: reg1$residuals
## Dickey-Fuller = -3.8726, Lag order = 0, p-value = 0.01537
## alternative hypothesis: stationary
Then, if this is the case, what can you do to take advantage in financial trading?
I CAN CONCLUDE THAT IF THE STOCKS SEPARATE CEMEX WOULD GO UP AND ALFA DOWN
If you were to invest from Jan 1, 2018 to Feb 28, 2018 in a portfolio of these stocks, which weights would you assign?
rm(list=ls())
getSymbols(("CEMEXCPO.MX"), periodicity= "daily", from = "2018-01-01", to = "2018-02-28")## [1] "CEMEXCPO.MX"
## [1] "ALFAA.MX"
n <- (nrow(CEMEXCPO.MX))
price0 <- as.numeric(CEMEXCPO.MX$CEMEXCPO.MX.Adjusted[1])
pricen <- as.numeric(CEMEXCPO.MX$CEMEXCPO.MX.Adjusted[n])
HPRCEMEX <- ((pricen /price0)-1)*100
print(paste("HPR of CEMEX = ", HPRCEMEX))## [1] "HPR of CEMEX = -15.7929656158527"
n_1 <- (nrow(ALFAA.MX))
price0_1 <- as.numeric(ALFAA.MX$ALFAA.MX.Adjusted[1])
pricen_1 <- as.numeric(ALFAA.MX$ALFAA.MX.Adjusted[n])
HPRALFAA <- ((pricen_1 /price0_1)-1 )*100
print(paste("HPR of ALFAA = ", HPRALFAA))## [1] "HPR of ALFAA = 1.18954999293077"
I WOULD ASIGN +200% TO ALFA AND -100% TO CEMEX
## [1] "HPR of my Portafolio = 18.1720656017143"