1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document (File -> New File -> R Notebook), where you have to write whatever is asked in this workshop.

You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

You have to keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file. Open R studio and open a new R-script. Write your Name and the Workshop name as comments at the top. Respond questions in CAPITAL LETTERS in your R-Script. Follow directions.

# To clear our environment we use the remove function rm:
rm(list=ls())
# To avoid scientific notation for numbers: 
options(scipen=999)

2 Run a regression with historical data

2.1 Data collection

We first load the quantmod package and download monthly price data for Alfa and the Mexican market index. We also merge both datasets into one:

# load package quantmod
library(quantmod)

# Download the data
getSymbols(c("ALFAA.MX", "^MXX"), from="2016-01-01", to= "2021-01-31", periodicity="monthly", src="yahoo")
## [1] "ALFAA.MX" "^MXX"
#Merge both xts-zoo objects into one dataset, but selecting only adjusted prices:

adjprices<-Ad(merge(ALFAA.MX,MXX))

2.2 Return calculation

We calculate continuously returns for both, Alfa and the IPCyC:

returns <- diff(log(adjprices)) 
#I dropped the na's:
returns <- na.omit(returns)

#I renamed the columns:
colnames(returns) <- c("ALFAA", "MXX")

2.3 Q Visualize the relationship

Do a scatter plot putting the IPCyC returns as the independent variable (X) and the stock return as the dependent variable (Y). We also add a line that better represents the relationship between the stock returns and the market returns.Type:

plot.default(x=returns$MXX,y=returns$ALFAA, xlim=c(-0.30,0.30) )
abline(lm(returns$ALFAA ~ returns$MXX),col='blue')

# As you see, I indicated that the Market returns goes in the X axis and 
#   Alfa returns in the Y axis. 
# The independent variable is the market returns, while
#   the dependent variable is the stock return

Using the lm() function, run a simple regression model to see how the monthly returns of the stock are related with the market return.

Assign your market model to an object named “reg”.

reg <- lm(ALFAA ~ MXX, data=returns)
# Or by calling the return objects by itself:
reg <- lm(returns$ALFAA ~ returns$MXX)
sumreg<-summary(reg)
sumreg
## 
## Call:
## lm(formula = returns$ALFAA ~ returns$MXX)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.31927 -0.04455 -0.01824  0.03004  0.34653 
## 
## Coefficients:
##             Estimate Std. Error t value      Pr(>|t|)    
## (Intercept) -0.01541    0.01235  -1.248         0.217    
## returns$MXX  1.90225    0.27196   6.995 0.00000000299 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09568 on 58 degrees of freedom
## Multiple R-squared:  0.4576, Adjusted R-squared:  0.4482 
## F-statistic: 48.92 on 1 and 58 DF,  p-value: 0.000000002993

The first parameter of the function is the DEPENDENT VARIABLE (in this case, the stock return), and the second parameter must be the INDEPENDENT VARIABLE, also named the EXPLANATORY VARIABLE (in this case, the market return).

3 Create simulated distributions of the independent variable with random data

3.1 Base-case scenario

First, I get the descriptive statistics from market returns

mean_rMXX <- mean(returns$MXX, na.rm=TRUE) # arithmetic mean
sd_rMXX <- sd(returns$MXX, na.rm=TRUE) # standard deviation

# Note that the na.rm argument is set to TRUE. This means that NA values will be removed.

With the real mean, and standard deviation of monthly returns of IPC, create (simulate) a random variable with that mean and standard deviation for the same time period. Use the rnorm function for this:

rMXX_sim <- rnorm(n=nrow(returns), mean = mean_rMXX, sd=sd_rMXX)

# We will use the same number of observations as returns
# The nrow function gets the number of rows of an R object

Do a histogram of the simulated returns:

# Calculate the histograms and store their information in variables (don't plot yet)
hist_rMXX_sim <- hist(rMXX_sim,plot = FALSE)
hist_rMXX <- hist(returns$MXX,plot = FALSE)

# Calculate the range of the graph
xlim <- range(hist_rMXX$breaks,hist_rMXX_sim$breaks)
ylim <- range(0,hist_rMXX$density,
              hist_rMXX_sim$density)

# Plot the first histogram
plot(hist_rMXX_sim,xlim = xlim, ylim = ylim,
     col = rgb(1,0,0,0.4),xlab = 'Lengths',
     freq = FALSE, ## relative, not absolute frequency
     main = 'Distribution of simulated and real IPC Returns')

# Plot the second histogram on top of the 1st one
opar <- par(new = FALSE)
plot(hist_rMXX,xlim = xlim, ylim = ylim,
     xaxt = 'n', yaxt = 'n', ## don't add axes
     col = rgb(0,0,1,0.4), add = TRUE,
     freq = FALSE) ## relative, not absolute frequency

# Add a legend in the corner
legend('topleft',c('Simulated Returns','Real Returns'),
       fill = rgb(1:0,0,0:1,0.4), bty = 'n')

3.2 Pessimistic and optimistic scenarios

Now, create (simulate) a random variable for a pessimistic scenario. The mean for this distribution will be one standard deviation below from the real mean of IPC and the standard deviation of monthly returns will be 10% higher than the original from IPC:

mean_rMXX_pes <- mean_rMXX-sd_rMXX
sd_rMXX_pes <- sd_rMXX*1.1

rMXX_sim_pes <- rnorm(n=nrow(returns), mean = mean_rMXX_pes, sd=sd_rMXX_pes)

Now, create (simulate) a random variable for a optimistic scenario. The mean for this distribution will be one standard deviation above from the real mean of IPC and the standard deviation of monthly returns will be 10% lower than the original from IPC:

mean_rMXX_op <- mean_rMXX+sd_rMXX
sd_rMXX_op <- sd_rMXX*.9

rMXX_sim_op <- rnorm(n=nrow(returns), mean = mean_rMXX_op, sd=sd_rMXX_op)

Calculate the mean, standard deviation and variance of retorns of three scenarios using the summary command:

summary(rMXX_sim)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -0.110911 -0.038114  0.001229 -0.004970  0.027905  0.085967
summary(rMXX_sim_pes)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -0.165673 -0.071793 -0.040584 -0.036294 -0.006951  0.123153
summary(rMXX_sim_op)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.04799  0.01823  0.04760  0.04727  0.07488  0.13470

Do a histogram of the simulated returns comparing pessimistic with base-case scenario:

# Calculate the histograms and store their information in variables (don't plot yet)
hist_rMXX_sim <- hist(rMXX_sim,plot = FALSE)
hist_rMXX_sim_pes <- hist(rMXX_sim_pes,plot = FALSE)

# Calculate the range of the graph
xlim <- range(hist_rMXX_sim_pes$breaks,hist_rMXX_sim$breaks)
ylim <- range(0,hist_rMXX_sim_pes$density,
              hist_rMXX_sim$density)

# Plot the first histogram
plot(hist_rMXX_sim,xlim = xlim, ylim = ylim,
     col = rgb(1,0,0,0.4),xlab = 'Lengths',
     freq = FALSE, ## relative, not absolute frequency
     main = 'Distribution of simulated IPC Returns')

# Plot the second histogram on top of the 1st one
opar <- par(new = FALSE)
plot(hist_rMXX_sim_pes,xlim = xlim, ylim = ylim,
     xaxt = 'n', yaxt = 'n', ## don't add axes
     col = rgb(0,0,1,0.4), add = TRUE,
     freq = FALSE) ## relative, not absolute frequency

# Add a legend in the corner
legend('topleft',c('Returns in Base-case scenario','Returns in pessimistic scenario'),
       fill = rgb(1:0,0,0:1,0.4), bty = 'n')

Do a histogram of the simulated returns comparing optimistic with base-case scenario:

# Calculate the histograms and store their information in variables (don't plot yet)
hist_rMXX_sim <- hist(rMXX_sim,plot = FALSE)
hist_rMXX_sim_op <- hist(rMXX_sim_op,plot = FALSE)

# Calculate the range of the graph
xlim <- range(hist_rMXX_sim_op$breaks,hist_rMXX_sim$breaks)
ylim <- range(0,hist_rMXX_sim_op$density,
              hist_rMXX_sim$density)

# Plot the first histogram
plot(hist_rMXX_sim,xlim = xlim, ylim = ylim,
     col = rgb(1,0,0,0.4),xlab = 'Lengths',
     freq = FALSE, ## relative, not absolute frequency
     main = 'Distribution of simulated IPC Returns')

# Plot the second histogram on top of the 1st one
opar <- par(new = FALSE)
plot(hist_rMXX_sim_op,xlim = xlim, ylim = ylim,
     xaxt = 'n', yaxt = 'n', ## don't add axes
     col = rgb(0,0,1,0.4), add = TRUE,
     freq = FALSE) ## relative, not absolute frequency

# Add a legend in the corner
legend('topleft',c('Returns in Base-case scenario','Returns in optimistic scenario'),
       fill = rgb(1:0,0,0:1,0.4), bty = 'n')

4 Predict using simulated data

4.1 Estimate the simulated returns for the dependent variable

Create a new object with the expected returns for ALFA in the base-case scenario. For that, substitute the base-case simulations in the regression:

base_ALFA <- sumreg$coefficients[1,1] + sumreg$coefficients[2,1]*rMXX_sim
summary (base_ALFA)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.22639 -0.08792 -0.01308 -0.02487  0.03767  0.14812

Create a new object with the expected returns for ALFA in the pessimistic scenario. For that, substitute the pessimistic simulations in the regression:

pes_ALFA <- sumreg$coefficients[1,1] + sumreg$coefficients[2,1]*rMXX_sim_pes
summary (pes_ALFA)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.33057 -0.15198 -0.09261 -0.08445 -0.02863  0.21886

Create a new object with the expected returns for ALFA in the optimistic scenario. For that, substitute the optimistic simulations in the regression:

op_ALFA <- sumreg$coefficients[1,1] + sumreg$coefficients[2,1]*rMXX_sim_op
summary (op_ALFA)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.10670  0.01926  0.07513  0.07450  0.12702  0.24082

4.2 Histograms of the simulated returns for the dependent variable

Do a histogram of the simulated returns of ALFA, comparing pessimistic with base-case scenario:

# Calculate the histograms and store their information in variables (don't plot yet)
hist_base_ALFA <- hist(base_ALFA,plot = FALSE)
hist_pes_ALFA <- hist(pes_ALFA,plot = FALSE)

# Calculate the range of the graph
xlim <- range(hist_pes_ALFA$breaks,hist_base_ALFA$breaks)
ylim <- range(0,hist_pes_ALFA$density,
              hist_base_ALFA$density)

# Plot the first histogram
plot(hist_base_ALFA,xlim = xlim, ylim = ylim,
     col = rgb(1,0,0,0.4),xlab = 'Lengths',
     freq = FALSE, ## relative, not absolute frequency
     main = 'Distribution of simulated ALFA Returns')

# Plot the second histogram on top of the 1st one
opar <- par(new = FALSE)
plot(hist_pes_ALFA,xlim = xlim, ylim = ylim,
     xaxt = 'n', yaxt = 'n', ## don't add axes
     col = rgb(0,0,1,0.4), add = TRUE,
     freq = FALSE) ## relative, not absolute frequency

# Add a legend in the corner
legend('topleft',c('Returns in Base-case scenario','Returns in pessimistic scenario'),
       fill = rgb(1:0,0,0:1,0.4), bty = 'n')

Do a histogram of the simulated returns of ALFA, comparing optimistic with base-case scenario:

# Calculate the histograms and store their information in variables (don't plot yet)
hist_base_ALFA <- hist(base_ALFA,plot = FALSE)
hist_op_ALFA <- hist(op_ALFA,plot = FALSE)

# Calculate the range of the graph
xlim <- range(hist_op_ALFA$breaks,hist_base_ALFA$breaks)
ylim <- range(0,hist_op_ALFA$density,
              hist_base_ALFA$density)

# Plot the first histogram
plot(hist_base_ALFA,xlim = xlim, ylim = ylim,
     col = rgb(1,0,0,0.4),xlab = 'Lengths',
     freq = FALSE, ## relative, not absolute frequency
     main = 'Distribution of simulated ALFA Returns')

# Plot the second histogram on top of the 1st one
opar <- par(new = FALSE)
plot(hist_op_ALFA,xlim = xlim, ylim = ylim,
     xaxt = 'n', yaxt = 'n', ## don't add axes
     col = rgb(0,0,1,0.4), add = TRUE,
     freq = FALSE) ## relative, not absolute frequency

# Add a legend in the corner
legend('topleft',c('Returns in Base-case scenario','Returns in optimistic scenario'),
       fill = rgb(1:0,0,0:1,0.4), bty = 'n')

5 Q Interpretation

Provide a detailed interpretation of your results.