library(quantmod)
library(tidyverse)
rm(list=ls())
= c("WMT","TSLA")
tickers getSymbols(Symbols = tickers, periodicity="monthly",from="2020-01-01", to="2024-10-31")
[1] "WMT" "TSLA"
<- lapply(tickers, get) datasets_list
Using an RNotebook you have to:
Replicate all the R Code along with its output.
You have to do whatever is asked in the workshop. It can be: a) Responses to specific questions and/or do an exercise/challenge.
Any QUESTION or any INTERPRETATION you need to do will be written in CAPITAL LETTERS. For ANY QUESTION or INTERPRETATION, you have to RESPOND IN CAPITAL LETTERS right after the question.
You have to submit ONLY the .html version of your .Rmd file.
A portfolio is a set of 2 or more financial assets. A financial asset can of any type such as stock, bond, risk-free instrument, derivative, commodity. The portfolio owner needs to decide how much money allocate to each individual asset. The main advantage of financial portfolios is the possibility to diversify risk, while maintaining an expected rate of return.
In this workshop we review practical easy examples of 2-asset and 3-asset portfolios with the purpose to illustrate portfolio theory and portfolio optimization.
It is recommended that you review the lecture notes posted in the course site where portfolio theory, matrix algebra, and portfolio optimization are explain in more detail.
Let’s start with an example of calculating historical returns of a 2-asset portfolio. Imagine that we want to calculate how much return you would have realized if you had invested in the following portfolio between January 2019 to December 2020:
Wal Mart: 70%
Tesla: 30%
We start downloading historical monthly prices from Yahoo Finance:
library(quantmod)
library(tidyverse)
rm(list=ls())
= c("WMT","TSLA")
tickers getSymbols(Symbols = tickers, periodicity="monthly",from="2020-01-01", to="2024-10-31")
[1] "WMT" "TSLA"
<- lapply(tickers, get) datasets_list
Now we merge all datasets using the function call. I do this to generalize our program for any number of tickers:
<- do.call(merge, datasets_list)
prices # We select only Adjusted prices:
<- Ad(prices)
prices
# We change the name of the columns to be equal to the tickers vector:
names(prices) = tickers
We calculate holding period returns for both stocks. We can divide the price of each stock by its own price in the first month and then subtract 1:
= as.vector(prices[1,])
firstprices
= sweep(prices,2,firstprices,'/') - 1
HPR
#Also, the HPR can be calculated in the prices dataset as:
#prices$HPRWMT = prices$WMT / as.numeric(prices$WMT[1]) - 1
#prices$HPRTSLA = prices$TSLA / as.numeric(prices$TSLA[1]) - 1
The sweep function divides each row of prices by the fistprices vector. The parameter 2 indicates that the division is by row.
We set the weight for each stock to form a portfolio:
= 0.70
w1 = 0.30 w2
The holding period return for each month of this portfolio would be the weighted average of the holding period stock returns:
PortfolioHPR_t=0.7*HPR1_t+0.3*HPR2_t At the end of the first month the weights assigned to each stock will change depending on the return of each stock. For example, if stock 1 had a much higher return than stock 2, then the weight w1 will increase to more than 70% and weight w2 will be less than 30%.
We can calculate the holding portfolio return as follows:
$Port = w1 * HPR$WMT + w2 * HPR$TSLA HPR
The historical holding portfolio monthly return would look like:
plot(HPR$Port)
We can calculate how much $1 invested in our portfolio would have grew over time. We can use the HPR of the portfolio and add 1 to get a growth factor for each month:
$invport = 1 * (1+HPR$Port)
HPR
plot(HPR$invport)
We can do the previous portfolio return calculations and the same plot using the PerformanceAnalytics package:
library(PerformanceAnalytics)
# I create a vector with the portfolio weights:
= c(w1,w2)
w
# I create the monthly return for each stock:
= prices / lag(prices,1) - 1
R
# I calculate the portfolio historical returns using the weight vector and the stock historical returns:
<- Return.portfolio(R,weights=w)
PR
# The charts.PerformanceSummary function calculates and plots the $1.0 performance invested in the portfolio:
charts.PerformanceSummary(PR,
main = "Performance of $1.00 over time",
wealth.index = TRUE)
This plot shows the same portfolio performance than the previous one, but with this we can also see the monthly return and the drowdowns, which are the negative % declines over time.
The historical risk of a portfolio can be measured with the standard deviation of the portfolio historical period returns. Another important measure for portfolio risk is Value at Risk, but we will not cover that measure in this workshop.
I recommend to use the continuously compounded (cc) returns to calculate the standard deviation of returns. The main reason is that cc returns behave more like a normal distribution, and also because cc returns are more conservative measure than simple returns: for negative returns, cc returns have higher magnitude, and for positive returns, cc returns are less than simple returns.
Then, we can calculate the cc returns of the monthly portfolio returns as follows. We can use the monthly returns calculated by the Return.Portfolio function and convert it to continuously compounded returns:
= log(1+PR$portfolio.returns) pr
Now we can use the table.Stats from PeformanceAnalytics to quickly calculate the standard deviation and also other important descriptive statistics of returns:
= table.Stats(pr)
returns_statistics returns_statistics
portfolio.returns
Observations 58.0000
NAs 0.0000
Minimum -0.2776
Quartile 1 -0.0597
Median 0.0151
Arithmetic Mean 0.0208
Geometric Mean 0.0144
Quartile 3 0.0837
Maximum 0.3188
SE Mean 0.0152
LCL Mean (0.95) -0.0097
UCL Mean (0.95) 0.0513
Variance 0.0134
Stdev 0.1159
Skewness 0.3066
Kurtosis 0.1916
"Stdev",] returns_statistics[
[1] 0.1159
Also we can use the sd and the mean function:
= sd(pr,na.rm = TRUE)
portfolio_volatility portfolio_volatility
[1] 0.1159008
= mean(pr,na.rm=TRUE)
portfolio_mean_return portfolio_mean_return
[1] 0.02081747
Remember that in Finance, the standard deviation of returns is usually called volatility.
We can say that the monthly volatility of this portfolio composed of 70% of Wal Mart and 30% of Tesla calculated from 2018 up to Oct 2022 was 11.5900759%.
We can complement this including the mean historical return of the portfolio. We can say that this portfolio had an average monthly return of 2.0817467%, and a monthly volatility of 11.5900759%.
We can also appreciate the historical risk of a portfolio if we do a Box Plot of its monthly returns:
chart.Boxplot(PR)
We can compare this portfolio risk with the individual stock risk:
chart.Boxplot(merge(R,PR))
The box contains the returns between the quartile 1 (Q1) and the quartile 3 (Q3), while the mean return is the red point and the mdian (Q2) is the mid vertical line.
Can you appreciate the risk of the stocks and the portfolio?
It is very important to be aware of the granularity of the historical data used to calculate volatility and mean returns. Granularity of a dataset refers to the frequency of periods. In this case, we are using monthly returns, so the granularity is monthly.
Then, we can have a better interpretation of portfolio volatility and portfolio return if we convert both values from monthly to annual. It is very common to report annual mean portfolio return and annual portfolio volatility based on monthly data.
To convert monthly portfolio mean return to annual portfolio mean return, we just multiply the monthly average return times 12:
= 12 * portfolio_mean_return
annual_portfolio_mean_return annual_portfolio_mean_return
[1] 0.2498096
However, to convert from monthly to annual volatility, we need to multiply by the square root of the numbers of the periods in the year, in this case, 12:
= sqrt(12) * portfolio_volatility
annual_portfolio_volatility annual_portfolio_volatility
[1] 0.401492
Now we can do a better interpretation using these annual figures:
We can say that this portfolio had an average annual return of 24.9809605%, and an annual volatility of 40.1492005%.
In the following sections we review what is variance and standard deviation of returns, and how they are calculated (without using a function)
Variance of any variable is actually the arithmetic mean of squared deviations. A squared deviation is the value resulting from subtracting the value of the variable minus its mean, and the square the value.
The mean of returns is estimated as:
\bar{r} =\frac{r_{1}+r_{2}+...+r_{N}}{N}
The variance is estimated as:
VAR(r)=\frac{(r_{1}-\bar{r})^{2}+(r_{2}-\bar{r})^{2}+...+(r_{N}-\bar{r})^{2}}{N}
Variance is a measure of dispersion. The higher the variance, the more dispersed the values from its mean. It is hard to interpret the magnitude of variance. That is the reason why we need to calculate standard deviation, which is basically the squared root of the variance.
Standard deviation of a variable is the squared root of the variance of the variable:
SD(r)=\sqrt{VAR(r)} Then, the standard deviation of returns can be interpreted as a standardized average distance from each value of the variable from its mean.
The standard deviation of returns is called volatility. Then, volatility of returns tells us an idea of how much on average (above or below) the period returns move from its mean.
We can calculate volatility of a single stock, or volatility of a portfolio composed of 2 or more stocks.
The Sharpe ratio is a standardized measure of portfolio premium return after considering its volatility.
A premium return is the return above the risk-free rate. In Mexico the risk-free rate is the CETES; in the US, the risk-free rate is the Treasury Bills.
Then, the Sharpe ratio tells us how much portfolio returns (above the risk free rate) we can expect for each percent point of volatility. Let’s see the formula:
SharpeRatio=\frac{(PortfolioReturn-riskfreeRate)}{PortfolioVolatility}
Up to now we have calculated historical portfolio returns and risk. Here we review how to estimate future expected portfolio returns and risk based on Portfolio Theory developed by Harry Markowitz.
We will use the simpler method to estimate the expected return of each stock, which is the geometric mean of historical returns. Other methods to estimate expected stock return are a) the CAPM regression model, b) ARIMA models.
The geometric mean of historical returns is the average period return needed so that holding an investment at that geometric mean return per period, we will get the final holding return of the stock.
The mathematical formula of geometric mean return is:
GeomMean(R)=\sqrt[N]{(1+R_{1})(1+R_{2})...(1+R_{N})}-1 Where R_t is the historical return for period t. In this formula we have N historical periods (can be months)
Another easier way to calculate geometric mean of returns is to calculate the arithmetic mean of continuously compounded returns, and then convert the result to simple return by applying the exponential function:
GeomMean(R)=e^{\bar{r}}-1 Where \bar{r} is the arithmetic mean of historical returns:
\bar{r}=\frac{r_{1}+r_{2}+...+r_{N}}{N}
Let’s do an example. Calculate the expected monthly return of Wal Mart and Tesla using the same historical data from 2018 to Oct 2022.
We need to calculate the continuously compounded returns of each stock:
#Calculating continuously compounded returns:
= diff(log(prices)) r
Now we get the expected return for each stock as the geometric mean of their historical returns:
= colMeans(r,na.rm=TRUE) ccmean_returns
colMeans calculate the arithmetic mean of 1 or more columns of a dataset.
Once we have the arithmetic mean cc returns, we convert them to simple returns:
= exp(ccmean_returns) - 1
ER ER
WMT TSLA
0.01479262 0.03119713
Now that we have individual expected returns we can estimate the expected return of a portfolio composed of the stocks.
The expected portfolio returns is the weighted average of the individual expected return of the stocks of the portfolio.
Imagine we have a 2-stock portfolio composed as follows:
WalMart: 70%
Tesla: 30%
We use the weights (%) allocated for each asset to calculate the weighted average as the portfolio expected return:
= 0.7
w1 = 0.3
w2 = w1 * ER[1] + w2 * ER[2]
ER_Portfolio # ER is a vector of 2 numbers: the expected return of stock 1 and the expected return of stock 2
names(ER_Portfolio)=c("ER_Portfolio")
ER_Portfolio
ER_Portfolio
0.01971397
Then, the expected return of this portfolio for the future month is 1.9713969%.
Another way to calculate the expected return of a portfolio is using matrix algebra. This is a very useful method when we have many assets in the portfolio since it is very easy to compute.
If you do not remember how to multiply matrices, it is strongly recommended to review this (you can read Note 2 of Portfolio Theory).
Matrix multiplication is used to compute sum of products or sum of multiplications.
For example, the way to estimate the expected return of our portfolio is the following:
ERPort=t\left(W\right)*ER
ERPort=\begin{bmatrix}w_{1} & w_{2}\end{bmatrix}*\begin{bmatrix}ER_{1}\\ ER_{2} \end{bmatrix}
We compute this in R as follows:
# We set a vector composed of the asset weights:
= c(w1,w2)
W # We multiply the transpose of the weight vector times the vector of expected returns
= t(W) %*% ER
ER_portfolio ER_portfolio
[,1]
[1,] 0.01971397
The transposed of a matrix or a vector is the conversion of the rows by columns and columns by rows. Transposing is like rotating a vector or a matrix 90 degrees:
t\left(\left[\begin{array}{c} w_1\\ w_2 \end{array}\right]\right)=\left[\begin{array}{cc} w_1 & w_2\end{array}\right]
Then, the previous matrix multiplication was:
\begin{bmatrix}w_{1} & w_{2}\end{bmatrix}*\begin{bmatrix}ER_{1}\\ ER_{2} \end{bmatrix}
\begin{bmatrix}0.7 & 0.3\end{bmatrix}*\begin{bmatrix}0.00775\\ 0.05588 \end{bmatrix}=0.7*0.00775+0.3*0.05588
With this multiplication we got the same expected portfolio return as above.
Before we calculate the expected portfolio risk we need to understand what is the expected portfolio variance.
According to Portfolio Theory, the expected variance of a 2-asset portfolio returns is given by:
VAR(PortReturns)=w_{1}^{2}VAR(r_{1})+w_{2}^{2}VAR(r_{2})+2w_{1}w_{2}COV(r_{1},r_{2}) r_1 refers to returns of stock 1, and r_2 refers to returns of stock 2.
COV(r_1,r_2) is the covariance of return 1 and return 2.
Check the lecture note Basics of Portfolio Theory-Note 1 to understand why this is the way to estimate the expected variance of a 2-asset portfolio.
It is worth to remember what is covariance.
The covariance of 2 stock returns is the arithmetic mean of the product return deviations. A deviation is the difference between the stock return in a period t and its mean. Here is the formula:
COV(r_{1},r_{2})=\frac{(r_{(1,1)}-\bar{r_{1}})(r_{(2,1)}-\bar{r_{2}})+(r_{(1,2)}-\bar{r_{1}})(r_{(2,2)}-\bar{r_{2}})+(r_{(1,3)}-\bar{r_{1}})(r_{(2,3)}-\bar{r_{2}})+...}{N}
Were:
r_{(1,1)} is the return of stock 1 in the period 1
r_{(2,1)} is the return of stock 2 in the period 1
Then:
r_{(i,j)} is the return of stock i in the period j
\bar{r_{1}} is the average return of stock 1
\bar{r_{2}} is the average return of stock 2
Then, in the numerator we have a sum of product deviations. Each product deviation is the deviation of the stock 1 return multiplied by the deviation of the stock 2 return.
The covariance is a measure of linear relationship between 2 variables. If covariance between stock return 1 and stock return 2 is positive this means that both stock returns are positively related. In other words, when stock 1 return moves up it is likely that stock 2 return moves up and vice versa; both returns move in the same direction (not always, but mostly).
If covariance is negative that means that stock 1 return is negatively related to stock 2 return; when stock 1 return moves up it is likely that stock 2 return moves down.
Covariance can be a negative or positive number (it is not limited to any number). It is very difficult to interpret the magnitude of covariance. It is much more intuitive if we standardize covariance.
The standardization of the covariance is called correlation.
Correlation of 2 stock returns is also a measure of linear relationship between both returns. The difference compared to covariance is that the possible values of correlation is between -1 and +1, and the correlation gives us a percentage of linear relationship.
Correlation between 2 returns is the covariance between the 2 returns divided by the product of standard deviation of return 1 and standard deviation of return 2.
We calculate correlation as follows:
CORR(r_{1},r_{2})=\frac{COV(r_{1},r_{2})}{SD(r_{1})SD(r_{2})}
Correlation has the following possible values:
-1<=CORR(r_{1},r_{2})<=+1
Now we can calculate the expected variance of our portfolio according to the previous formula:
= var(r$WMT,na.rm=TRUE)
VAR1 = var(r$TSLA,na.rm=TRUE)
VAR2 = var(x=r$WMT,y=r$TSLA,na.rm=TRUE)
COV
= w1^2 * VAR1 + w2^2 * VAR2 + 2*w1*w2*COV
VarPortfolio VarPortfolio
WMT
WMT 0.006121081
Another faster method to get the expected variance of a portfolio is by multiplying the following matrices:
t\left(W\right)*COV*W Where:
W=\begin{bmatrix}w_{1}\\ w_{2} \end{bmatrix}
And COV is the Variance-Covariance matrix, which has the return variances in the diagonal and the pair correlations in the non-diagonal:
COV=\begin{bmatrix}VAR(r_{2}) & COV(r_{1},r_{2})\\ COV(r_{2},r_{1}) & VAR(r_{1}) \end{bmatrix}
We can easily calculate the expected portfolio variance with matrix multiplication in R as follows:
# We calculate the Variance-Covariance matrix with the var function:
=var(r,na.rm=TRUE)
COV # Note var function calculates the variance-covariance matrix if r has more than 1 column.
# If r has only 1 column, then var calculates only the variance of that column.
COV
WMT TSLA
WMT 0.002988332 0.002924079
TSLA 0.002924079 0.038096498
= t(W) %*% COV %*% W
VarPortfolio VarPortfolio
[,1]
[1,] 0.006121081
It is always good to review the correlation matrix of the asset returns of a portfolios. We can get the correlation matrix from the covariance matrix as follows:
= cov2cor(COV)
CORR CORR
WMT TSLA
WMT 1.0000000 0.2740515
TSLA 0.2740515 1.0000000
In this case we see 1’s in the diagonal since the correlation of an asset return with itself will always be equal to 1. In the diagonal we see the level of correlation between both asset returns.
We got the same result as above, but the result now is a matrix of 1 row and 1 element.
To get the expected risk of a portfolio, we simply get the squared root of the expected variance:
PortRisk=SD(PortReturns)=\sqrt{VAR(PortReturns)}
We do this in R as follows:
= sqrt(VarPortfolio[1,1])
PortRisk PortRisk
[1] 0.07823733
The expected portfolio monthly risk is 7.8237335%. Remember that portfolio risk is also called portfolio volatility. We can annualize portfolio volatility:
= sqrt(12) * PortRisk
AnnualPortRisk AnnualPortRisk
[1] 0.2710221
The expected portfolio annual risk is 27.1022077%.
Financial portfolios allow an investor to diversify the risk. The main drivers of portfolio diversification are:
N - the number of financial assets. The more the financial assets, the higher the expected diversification.
Combination of pair of correlations between asset returns. The less the correlation of the pair of assets, the more the diversification.
The first driver is just given by N, the number of assets.
The second driver can actually be manipulated by changing the weights allocated for each asset.
In the following section we illustrate how changing asset weights actually change both, the expected return of the portfolio and the expected portfolio volatility.
Weight combination determines both expected risk and expected return of a portfolio. Let’s do an example with the same 2-asset portfolio:
Let’s create a set of 11 portfolios where we change the weight assigned to the first stock from 0 to 1 changing by 0.10. We can create a matrix of weights, where each column represents one portfolio:
W=\begin{bmatrix}0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9\\ 1 & 0.9 & 0.8 & 0.7 & 0.6 & 0.5 & 0.4 & 0.3 & 0.2 & 0.1 \end{bmatrix}
We had already calculated the vector of expected return for each stock:
ER=\begin{bmatrix}ER_{1}\\ ER_{2} \end{bmatrix}=\begin{bmatrix}0.00775\\ 0.05588 \end{bmatrix}
Using the advantage of matrix algebra we can easily get the 11 portfolio expected returns as follows:
ERPortfolios=t(W)*ER
We compute this in R:
# I first create a vector for the weights of the first stock:
= seq(from=0,to=1,by=0.1)
w1 w1
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
# The weight vector for stock 2 is just the complement of w1:
= 1 - w1
w2 w2
[1] 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
# I construct the weight matrix with the cbind function:
= rbind(w1,w2)
W W
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
w1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
w2 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
# I compute the expected return of the 11 portfolios:
= t(W) %*% ER ERPortfolios
We see that the portfolios with higher expected returns are the first, which are the portfolios with higher weight in stock 2.
Now we can compute the expected risk of the 11 portfolio as follows:
= t(W) %*% COV %*% W VARPortfolios
When we have more than 1 portfolio in the weight matrix, the result of the previous matrix multiplication will be a matrix of 11 rows and 11 columns. The expected variance of the 11 portfolio result in the diagonal of the previous multiplication:
= diag(VARPortfolios)
VARPortfolios VARPortfolios
[1] 0.038096498 0.031414381 0.025436997 0.020164347 0.015596430 0.011733247
[7] 0.008574797 0.006121081 0.004372098 0.003327848 0.002988332
Finally, the expected portfolio volatility is the squared root of these 11 expected variances:
= sqrt(VARPortfolios)
RISKPortfolios RISKPortfolios
[1] 0.19518324 0.17724102 0.15948980 0.14200122 0.12488567 0.10832011
[7] 0.09260020 0.07823733 0.06612184 0.05768750 0.05466564
Now that we have the 11 portfolio risk and 11 portfolio returns we can plot the 11 portfolios in the risk-return space:
plot(x=RISKPortfolios,y= ERPortfolios,
xlabel="Portfolio Expected Risk", ylabel="Portfolio Expected Return")
text(x=RISKPortfolios,y= ERPortfolios,
labels=w1, cex=0.7, pos=3)
Weights of stock 1 (WMT) are shown in each portfolio. We see that less the w1, the less the expected portfolio risk, but the less the expected portfolio return.
With this illustration we can see how asset weight combination in a portfolio actually change the expected return and risk of a portfolio!
This plot is called the frontier of feasible portfolios of 2 assets. Interestingly, for portfolios with 3 or more assets, the possibilities of feasible portfolios become will not be a nice frontier like this. The feasible portfolios will be in between a frontier. In other words, for these portfolios some feasible portfolios will not be efficient since there will be other feasible portfolios with the same level of expected risk, but with higher expected return.
Let’s do an example of 4-asset portfolios.
We will add 2 stocks to our portfolio: Oracle (ORCL) and Freeport-McMoran, Inc (FCX). This last firm is a mining company. I will collect the data for the 4 assets using the same code we used above:
= c("WMT","TSLA","ORCL","FCX")
tickers getSymbols(Symbols = tickers, periodicity="monthly",from="2020-01-01", to="2024-10-31")
[1] "WMT" "TSLA" "ORCL" "FCX"
<- lapply(tickers, get)
datasets_list
# Merging the datasets:
<- do.call(merge, datasets_list)
prices # We select only Adjusted prices:
<- Ad(prices)
prices
# We change the name of the columns to be equal to the tickers vector:
names(prices) = tickers
#Calculating continuously compounded returns:
= diff(log(prices)) r
Since we have now 4 assets, let’s create many feasible portfolios with random weights, but with the condition that the sum of the weights must be 1 (100%). Let’s create 1,000 random portfolios:
# Initialize the Weight matrix
= c()
W # Create 4 vectors of 1000 random values from 0 to 1:
for(i in 1:4) {
= rbind(W,runif(1000))
W
}# The problem I have now is that some of the 1,000 portfolios
# might end up having a sum of weights higher than one.
# I can do a simple "trick" by
# dividing each of the 4 weights by the sum of these 4
# weights. And I can do this # for all 1000 portfolios:
# I first create a vector with the sum of weights for all portfolios:
<- colSums(W)
sumw # I do another loop to divide each weight by the sum of weights:
for(i in 1:4) {
<-W[i,]/sumw
W[i,]# In each iteration I divide one raw of W2 by the vector sumw,
# which is the sum of the weights of all 1000 portfolios
}
# I check that the sum of weights is 1 (I do this for 5 portfolios)
1:5] W[,
[,1] [,2] [,3] [,4] [,5]
[1,] 0.30533803 0.22863123 0.2467750 0.3076700 0.19416852
[2,] 0.02561224 0.10385961 0.2982268 0.1418303 0.46844257
[3,] 0.38843321 0.07696658 0.1547991 0.2489081 0.09539128
[4,] 0.28061653 0.59054258 0.3001990 0.3015917 0.24199763
colSums(W[,1:5])
[1] 1 1 1 1 1
# All sums are equal to 1, as expected
# Then each column of this matrix represents a random portfolio without allowing for short sales.
I calculate the expected return of each asset:
= exp(colMeans(r,na.rm=TRUE)) - 1
ER ER
WMT TSLA ORCL FCX
0.01479262 0.03119713 0.02195815 0.02579882
I calculate the variance-covariance matrix:
= var(r,na.rm=TRUE)
COV COV
WMT TSLA ORCL FCX
WMT 0.002988332 0.002924079 0.001617215 0.001719990
TSLA 0.002924079 0.038096498 0.004776271 0.012022469
ORCL 0.001617215 0.004776271 0.006880790 0.004332892
FCX 0.001719990 0.012022469 0.004332892 0.019741561
I calculate the correlation matrix to review the correlation levels of pair of assets:
= cov2cor(COV)
CORR CORR
WMT TSLA ORCL FCX
WMT 1.0000000 0.2740515 0.3566435 0.2239343
TSLA 0.2740515 1.0000000 0.2950036 0.4383898
ORCL 0.3566435 0.2950036 1.0000000 0.3717646
FCX 0.2239343 0.4383898 0.3717646 1.0000000
The lowest correlations are the one between ORCL and TSLA and the correlation between WMT and FCX (around 0.12).
We can calculate the expected return of the 1,000 portfolios:
= t(W) %*% ER ERPortfolios
We can calculate the expected risk of the 1,000 portfolios:
= diag(t(W) %*% COV %*% W)
VARPortfolios = sqrt(VARPortfolios) RISKPortfolios
Now we can visualize the feasible portfolios in the risk-return space:
plot(x=RISKPortfolios,y= ERPortfolios,
xlabel="Portfolio Expected Risk", ylabel="Portfolio Expected Return")
We can see that for each level of expected risk there are many possible portfolios. The best portfolios will be those that lie in the frontier of this feasible portfolios.
The problem we have when we have 3 or more assets in a portfolio is that we need to find the efficient frontier to get only those portfolios that maximize the expected return for each level of expected risk.
Here is where we need to find optimization algorithms to do this.
In the lecture note Basics of Portfolio Theory - Part III I explain what type of optimization methods can be used.
Fortunately, we can use R packages that already do these complicated optimization algorithms.
Out of all feasible portfolios we might be interested in the following:
The portfolio with the least expected risk of all - The Global Minimum Variance Portfolio (GMV)
The efficient frontier - all portfolios that offer the highest expected return for any level of expected risk
The tangent/optimal portfolio - The portfolio with the highest Sharpe Ratio
You need to install the IntroCompFinR package. You need to install this package using the Console, and typing:
install.packages(“IntroCompFinR”, repos=“http://R-Forge.R-project.org”)
If you copy.paste this line from this Workshop, re-type the double quotes (“) before you run in the Console.
We can easily get the GMV portfolio as follows:
library(IntroCompFinR)
= globalMin.portfolio(ER,COV)
GMVPort GMVPort
Call:
globalMin.portfolio(er = ER, cov.mat = COV)
Portfolio expected return: 0.01618212
Portfolio standard deviation: 0.05167417
Portfolio weights:
WMT TSLA ORCL FCX
0.7916 -0.0280 0.1961 0.0403
Allowing for short-sales, the portfolio with the minimum variance has an expected return of 0.0161821, and expected risk of 0.0516742. We see that FCX has a negative weight, meaning that we need to short FCX with 3.4%.
We can estimate the GMV portfolio without allowing for short sales:
= globalMin.portfolio(ER,COV, shorts = FALSE)
GMVPort GMVPort
Call:
globalMin.portfolio(er = ER, cov.mat = COV, shorts = FALSE)
Portfolio expected return: 0.01645
Portfolio standard deviation: 0.05189667
Portfolio weights:
WMT TSLA ORCL FCX
0.7828 0.0000 0.1908 0.0263
Now the GMV assigned 0% to FCX and 0% to TSLA. This makes sense since both stocks are very volatile. We can check the volatility of each of the stock returns:
table.Stats(r)
WMT TSLA ORCL FCX
Observations 57.0000 57.0000 57.0000 57.0000
NAs 1.0000 1.0000 1.0000 1.0000
Minimum -0.1734 -0.4578 -0.1941 -0.3890
Quartile 1 -0.0174 -0.1187 -0.0241 -0.0762
Median 0.0171 0.0214 0.0197 0.0263
Arithmetic Mean 0.0147 0.0307 0.0217 0.0255
Geometric Mean 0.0132 0.0124 0.0184 0.0155
Quartile 3 0.0508 0.1719 0.0790 0.1270
Maximum 0.1179 0.5547 0.2456 0.2993
SE Mean 0.0072 0.0259 0.0110 0.0186
LCL Mean (0.95) 0.0002 -0.0211 -0.0003 -0.0118
UCL Mean (0.95) 0.0292 0.0825 0.0437 0.0628
Variance 0.0030 0.0381 0.0069 0.0197
Stdev 0.0547 0.1952 0.0830 0.1405
Skewness -0.6311 0.2881 0.1072 -0.2875
Kurtosis 0.9903 0.0331 0.2674 0.1121
We can see that the historical volatility (Stdev) of Tesla and FCX are much higher than the volatility of ORCL and WMT.
<- efficient.frontier(ER, COV, nport = 100,
efrontier alpha.min = -0.5,
alpha.max = 1.5, shorts = FALSE)
plot(efrontier, plot.assets=TRUE, col="blue")
We need to define a risk-free rate before estimating the optimal portfolio, since the returns that are important for an investor is the premium returns, which are the returns above the risk-free instrument.
In this case, I check the current risk-free rate or the risk-free rate at the time of the portfolio formation. In this case, I assume that I am at the end of Oct 2024 (I downloaded stock price data until Oct 31, 2024), which was around 5.5% annual.
It is important that I have to use the corresponding rate at granularity of my data. In this case, I am using monthly data, so I need to convert the risk-free rate from annual to monthly:
= 0.055
annual_rfree # I convert from simple to continuously compounded:
= log(1+annual_rfree)
ccannual_rfree # I convert from annual cc rate to monthly cc rate:
= ccannual_rfree / 12
rfree rfree
[1] 0.004461731
= tangency.portfolio(ER, COV,rfree, shorts=FALSE)
tangentPort tangentPort
Call:
tangency.portfolio(er = ER, cov.mat = COV, risk.free = rfree,
shorts = FALSE)
Portfolio expected return: 0.01926056
Portfolio standard deviation: 0.05806525
Portfolio weights:
WMT TSLA ORCL FCX
0.4861 0.0461 0.3740 0.0937
= getPortfolio(ER, COV, weights=tangentPort$weights)
tangentPortWeights plot(tangentPortWeights, col="blue")
Finally, we can plot the efficient frontier, the tangent portfolio and the Capital Market Line:
plot(efrontier, plot.assets=TRUE, col="blue", pch=16)
points(GMVPort$sd, GMVPort$er, col="green", pch=16, cex=2)
points(tangentPort$sd, tangentPort$er, col="red", pch=16, cex=2)
text(GMVPort$sd, GMVPort$er, labels="GLOBAL MIN", pos=2)
text(tangentPort$sd, tangentPort$er, labels="TANGENCY", pos=2)
= (tangentPort$er - rfree)/tangentPort$sd
SharpeRatio
abline(a=rfree, b=SharpeRatio, col="green", lwd=2)
The Capital Market Line (CML) is the green line that goes from the risk-free rate to the tangent portfolio. One of the main findings of portfolio theory is that when we add 1 risk-free instrument to a portfolio composed of stocks, the new efficient frontier becomes the Capital Market Line (instead of the hyperbola), which is more efficient that the previous efficient frontier (the hyperbola).
Then, an investor can play between the tangent portfolio and the risk-free rate to move in the CML. If an investor has a middle-level aversion to risk, he/she might allocate 50% to the risk-free asset and the rest 50% in the tangent portfolio, locating the portfolio in a mid-point in the CML (the green line) between the risk-free rate and the tangent portfolio.
The main assumption of portfolio theory is that all investors behave as rational participants all the time. In other words, they always maximize return and minimize risk using the available information disclosed to the market in a rational way. In other words, they act with no emotions, no fear and they always understand what happen in the market.
If most of the investors are not rational all the time, how can you take advantage of this and define a disruptive portfolio? Which trends, beliefs, fears, regulations, opportunities do you think that few investors are looking for, so that you can beat the rational optimized portfolio?
Based on your own set of beliefs and your understanding of portfolio risk-return trade-off, propose weights for a “disruptive” portfolio that might beat most of the optimized portfolios in the market?
Create a vector of 10 tickers (select any you prefer). If you prefer, select the best 10 tickers you got from Workshop 2.
With this vector, bring monthly data from Yahoo from Jan 2020 to Oct 2024, and do the following:
1.- Generate the Global Minimum Variance Portfolio
2.- Generate 10,000 portfolios with random weights for each asset.
3.- (Optional) Propose 2 portfolios: Conservative, Aggressive. Include the description of each portfolio, expected return and volatility.
I will use the best stocks I got from Workshop 2. Then, I will do the same stock screening (Solution of Workshop 2) here before I optimize the portfolio with the selected stocks:
# Load the datasets
= read.csv("dataus2024.csv")
uspanel = read.csv("firmsus2024.csv") usfirms
I calculate the period EBIT for all stocks-quarters:
# I check the cases with sgae=NA and revenue>0:
|> filter(q=="2024q2", !is.na(revenue), is.na(sgae)) |> select(firm,q,revenue,cogs,sgae) uspanel
firm q revenue cogs sgae
1 AFL 2024q2 10575000 0 NA
2 AIG 2024q2 13323000 0 NA
3 AJX 2024q2 -28436 54044 NA
4 ALL 2024q2 30973000 0 NA
5 AMSF 2024q2 156319 0 NA
6 AMTB 2024q2 324942 273607 NA
7 ARR 2024q2 271405 305239 NA
8 ASB 2024q2 1183806 932729 NA
9 AUBN 2024q2 20596 16641 NA
10 AXP 2024q2 36204000 26732000 NA
11 BAC 2024q2 96600000 78951000 NA
12 BCBP 2024q2 97605 80773 NA
13 BFH 2024q2 2418000 1438000 NA
14 BFIN 2024q2 37737 32798 NA
15 BK 2024q2 19542000 16664000 NA
16 BOH 2024q2 508258 410187 NA
17 BOKF 2024q2 1738434 1404510 NA
18 BWB 2024q2 122860 100648 NA
19 BXMT 2024q2 269165 63041 NA
20 C 2024q2 86453000 72758000 NA
21 CCBG 2024q2 133289 98574 NA
22 CFB 2024q2 253627 203081 NA
23 CFFI 2024q2 81836 65557 NA
24 CFR 2024q2 1404352 1038203 NA
25 CINF 2024q2 5479000 0 NA
26 CMA 2024q2 2517000 2067000 NA
27 CNA 2024q2 6963000 0 NA
28 CNOB 2024q2 267861 212534 NA
29 COF 2024q2 26331000 17506000 NA
30 CPF 2024q2 173607 129861 NA
31 CRBG 2024q2 9546000 0 NA
32 DCOM 2024q2 342702 281600 NA
33 DFS 2024q2 11656000 6946000 NA
34 DX 2024q2 199844 168029 NA
35 EFSC 2024q2 447019 328628 NA
36 EIG 2024q2 440100 0 NA
37 EQH 2024q2 5740000 0 NA
38 ERIE 2024q2 357926 0 NA
39 FAF 2024q2 3036900 0 NA
40 FBIZ 2024q2 127875 100864 NA
41 FCBC 2024q2 91419 57560 NA
42 FFIN 2024q2 363818 228605 NA
43 FHN 2024q2 2546000 1927000 NA
44 FIBK 2024q2 733400 564000 NA
45 FLIC 2024q2 89680 79505 NA
46 FNM_old 2024q2 75100000 64543000 NA
47 FRE_old 2024q2 59507000 52018000 NA
48 FRME 2024q2 530253 404995 NA
49 GBCI 2024q2 615429 513073 NA
50 GNW 2024q2 3633000 0 NA
51 GS 2024q2 63089000 53336000 NA
52 HBAN 2024q2 5814000 4511000 NA
53 HBCP 2024q2 96888 73966 NA
54 HBT 2024q2 140021 92846 NA
55 HIG 2024q2 12905000 0 NA
56 HOMB 2024q2 728791 452487 NA
57 HTGC 2024q2 217855 85039 NA
58 HTH 2024q2 791680 715973 NA
59 IBOC 2024q2 513523 243673 NA
60 ISTR 2024q2 77010 67871 NA
61 JPM 2024q2 142257000 96593000 NA
62 KEY 2024q2 5394000 4580000 NA
63 KNSL 2024q2 757344 0 NA
64 KREF 2024q2 315704 255682 NA
65 L 2024q2 8498000 0 NA
66 LADR 2024q2 258691 198479 NA
67 LMND 2024q2 241100 0 NA
68 LNC 2024q2 9269000 0 NA
69 MBWM 2024q2 176152 120850 NA
70 MCB 2024q2 241239 191005 NA
71 MCY 2024q2 2579080 0 NA
72 MET 2024q2 33880000 0 NA
73 MKL 2024q2 8168498 0 NA
74 MTG 2024q2 599638 0 NA
75 MYFW 2024q2 90701 83301 NA
76 NAVI 2024q2 2252000 2090000 NA
77 NBHC 2024q2 295902 222484 NA
78 NLY 2024q2 2775496 2308167 NA
79 NMFC 2024q2 185144 108916 NA
80 NMIH 2024q2 318375 0 NA
81 NRIM 2024q2 90138 68049 NA
82 NTRS 2024q2 8263400 6799700 NA
83 NYMT 2024q2 81792 185701 NA
84 OCFC 2024q2 344299 269833 NA
85 OMF 2024q2 2746000 1444000 NA
86 ONB 2024q2 1424437 1060081 NA
87 ORC 2024q2 128037 113240 NA
88 ORI 2024q2 3887600 0 NA
89 PEBO 2024q2 307846 222333 NA
90 PFG 2024q2 8364100 0 NA
91 PMT 2024q2 488771 427781 NA
92 PNC 2024q2 17058000 13193000 NA
93 PPBI 2024q2 465481 340164 NA
94 PRK 2024q2 310538 214501 NA
95 PRU 2024q2 38392000 0 NA
96 RDN 2024q2 640565 0 NA
97 RGA 2024q2 11215000 0 NA
98 RLI 2024q2 861273 0 NA
99 SBSI 2024q2 228225 172634 NA
100 SFST 2024q2 105096 97479 NA
101 SIGI 2024q2 2360964 0 NA
102 SLM 2024q2 1620997 866094 NA
103 SNV 2024q2 1573989 1354650 NA
104 SOV_old 2024q2 8692481 7090498 NA
105 SRCE 2024q2 283014 190749 NA
106 SSB 2024q2 1195162 852221 NA
107 STBA 2024q2 282654 197576 NA
108 STEL 2024q2 312314 239700 NA
109 STT 2024q2 10728000 9218000 NA
110 SYF 2024q2 10850000 4723000 NA
111 TCBI 2024q2 931189 798657 NA
112 TFIN 2024q2 241128 222726 NA
113 THG 2024q2 3087800 0 NA
114 TIPT 2024q2 1044894 0 NA
115 TRTX 2024q2 184198 142430 NA
116 TRV 2024q2 22511000 0 NA
117 TWO 2024q2 686827 400731 NA
118 UBFO 2024q2 32421 20325 NA
119 UMBF 2024q2 1362510 1077676 NA
120 UNM 2024q2 6433700 0 NA
121 UVE 2024q2 748173 0 NA
122 UVSP 2024q2 245016 194729 NA
123 VBTX 2024q2 393074 312049 NA
124 VEL 2024q2 236625 192296 NA
125 VOYA 2024q2 4084000 0 NA
126 WAFD 2024q2 1042814 849924 NA
127 WAL 2024q2 2447600 1915600 NA
128 WFC 2024q2 63126000 49205000 NA
129 WRB 2024q2 6570805 0 NA
130 WSBC 2024q2 460311 367352 NA
131 WSFS 2024q2 694314 481972 NA
132 WTBA 2024q2 96783 83220 NA
= uspanel |>
uspanel mutate(sgae = ifelse(!is.na(revenue) & is.na(sgae),0,sgae),
# for some reason there are some cases of firms with revenue and cogs, but with sgae=NA;
# this should be the case for service firms, so I will replace sgae=0 when revenue>0;
# if I do not replace sgae=0 for these cases, then YTDebit cannot be calculated!
YTDebit = revenue - cogs - sgae,
ebitp = ifelse(fiscalq==1,YTDebit, YTDebit - lag(YTDebit)))
I create a column for the industry in the uspanel dataset by pulling the industry from the usfirms dataset:
# I create a temporal usfirms1 dataset with only 2 columns. I rename the empresa column with firm to have the same name for the ticker in both files:
= usfirms |>
usfirms1 mutate(firm = empresa) |>
select(firm,naics1, naics2)
# I merge the usfirms1 with the uspanel to pull the industry:
= left_join(uspanel,usfirms1,by='firm') uspanel
I calculate annual stock return for all firms-quarters:
= uspanel |>
uspanel # I sort by firm-quarter:
arrange(firm,q) |>
# I group by firm to make sure that the lag function works only on the
# quarters of the firm :
group_by(firm) |>
# I create market value, operating earnings per share deflated by stock price, and stock annual return:
mutate(marketvalue = originalprice * sharesoutstanding,
oepsp = (YTDebit / sharesoutstanding) / originalprice,
# The annual return is equal to the current stock price minus its
# stock price of 4 quarters ago:
r = log(adjprice) - dplyr::lag(log(adjprice),4)
|>
) # I do ungroup to undo the group_by
ungroup()
I winsorize the oepsp:
library(statar)
hist(uspanel$oepsp)
$oepspw= winsorize(uspanel$oepsp, probs=c(0.005,0.999))
uspanelhist(uspanel$oepspw)
For each quarter-industry I calculate the median return and the binary variable rabove that indicates whether the stock returns beats the industry returns:
= uspanel |>
uspanel group_by(q,naics1) |>
mutate(industry_ret = median(r,na.rm=TRUE),
rabove = ifelse(r>industry_ret,1,0)) |>
ungroup()
I calculate another variable equal to the rabove, but 1 quarter in the future (1 lag in the future):
= uspanel |>
uspanel group_by(firm) |>
mutate(F1rabove = lead(rabove,1))
# The lead function gets the future value of the variable
# I check how many rows/cases have 1 and 0's:
table(uspanel$F1rabove)
0 1
115114 114211
I run the logit model to examine whether oepspw is related to the probability of a stock to beat its industry return 1 quarter in the future:
<- glm(F1rabove ~ oepspw, data= uspanel, family="binomial",na.action=na.omit)
logitm1 summary(logitm1)
Call:
glm(formula = F1rabove ~ oepspw, family = "binomial", data = uspanel,
na.action = na.omit)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.020277 0.004256 -4.764 1.9e-06 ***
oepspw 0.884870 0.017116 51.699 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 311904 on 224995 degrees of freedom
Residual deviance: 308217 on 224994 degrees of freedom
(332185 observations deleted due to missingness)
AIC: 308221
Number of Fisher Scoring iterations: 4
= coef(logitm1)
coefficients # The coefficients for the odd ratio interpretation are:
exp(0.1*coefficients)
(Intercept) oepspw
0.9979744 1.0925201
We can use the last model to predict the probability whether the firm will beat the return of the industry 1 year in the future. For this prediction, it might be a good idea to select only the firms in the last quarter of the dataset (2024Q2), so we can predict which firm will beat the industry in 2025:
We create a dataset with only the Q2 of 2024:
<- uspanel |>
uspanel2024 select(firm,q,yearf, oepspw, F1rabove) |>
filter(q=="2024q2") |>
as.data.frame()
Now we run the prediction using the model 2 with this new dataset:
<- uspanel2024 |>
uspanel2024 mutate(pred=predict(logitm1,newdata=uspanel2024,type=c("response")) )
The type=c(“response”) parameter indicates to get the prediction in probability. If I do not specify this option, the prediction is calculated in the logarithm of ODDS RATIO.
We can sort the firms according to the predicted probability of beating the benchmark, and then select the top 100 with the highest probability:
<- uspanel2024 |>
top110merge(usfirms1,by="firm") |> # merge with firms1 to pull firm name and industry
arrange(desc(pred)) |> # sort by probability, from highest to lowest
#slice_head(prop=0.05) |> # select the top 5% firms
head(n=110)
# Show the # of firms selected:
nrow(top110)
[1] 110
I extract the 100 firms (tickers) column and put it into a char vector:
= as.vector(top110$firm) tickers
I selected the top 110 just in case there are some tickers that do not exist in Yahoo.
Now I write a for loop to get the historical prices of the 100 tickers. Here is a difficult problem to solve related to data validation. It is possible that some of the tickers are not in Yahoo Finance. If I use getSymbols with the ticker vector and it is one ticker that does not exist in Yahoo, then the getSymbols will stop running. Then, I can write a loop to run the getSymbols for each ticker and detect when a ticker is not in Yahoo.
The tryCatch function is used to run a set of R lines of code in the try mode, so if there is an error in any R code, R DOES NOT STOP RUNNING AND I can cach the error and do a specific action:
# I do a loop to run getSymbols and in each iteration I can detect which one has an error because
# it does not exist in Yahoo Finance:
# I initialize an empty list where I will be storing the tickers that exist in Yahoo Finance:
= c()
finaltickerlist # I initilize a counter i=0 and tottickers=0.
# The counter i will be the counter of ticker #, while the counter tottickers
# will count the tickers that are found in Yahoo to know when to stop the loop after finding 100 tickers in Yahoo:
=0
i= 0
tottickers library(quantmod)
#for (t in tickers) {
# I will use a while loop instead of a for to make sure that I get exactly 100 tickers:
while (tottickers<100 & i<110) {
# I increment i in 1:
= i + 1
i # I save the ticker i in t:
= tickers[i]
t tryCatch(
{getSymbols(t, from="2020-01-01", to="2024-10-31",periodicity="monthly")
# Since the ticker was found in Yahoo Finance, then:
=tottickers + 1
tottickers# I add the ticker found in Yahoo into the finaltickerlist vector:
= c(finaltickerlist,t)
finaltickerlist cat("The ticker ", t, " was found in Yahoo \n")
if (tottickers==1) {
# If I am in the ticker 1 I only get the Adjusted price, but I do not merge
# the historical prices with the prices dataset:
= get(t)
temp # prices will be the final big dataset with all prices for all stocks:
= Ad(temp)
prices
}else { # if tottickers>1, so this means that I already created the price dataset,
# so I need to do a merge:
=get(t)
temp= merge(prices,Ad(temp))
prices
}
}, error = function(e) {
cat("THE TICKER ",t," WAS NOT FOUND IN YAHOO FINANCE \n")
}
) }
The ticker AJG was found in Yahoo
The ticker APA was found in Yahoo
The ticker LNC was found in Yahoo
The ticker TIPT was found in Yahoo
The ticker UVE was found in Yahoo
The ticker GNW was found in Yahoo
THE TICKER OCN WAS NOT FOUND IN YAHOO FINANCE
The ticker CYH was found in Yahoo
The ticker PRU was found in Yahoo
The ticker MCY was found in Yahoo
The ticker RGA was found in Yahoo
The ticker ALL was found in Yahoo
The ticker THG was found in Yahoo
The ticker MET was found in Yahoo
The ticker UNM was found in Yahoo
The ticker ATUS was found in Yahoo
THE TICKER VIA__old WAS NOT FOUND IN YAHOO FINANCE
The ticker VOYA was found in Yahoo
The ticker CNA was found in Yahoo
The ticker FAF was found in Yahoo
The ticker CRBG was found in Yahoo
The ticker L was found in Yahoo
The ticker ORI was found in Yahoo
The ticker TRV was found in Yahoo
The ticker PFG was found in Yahoo
The ticker BFH was found in Yahoo
The ticker HIG was found in Yahoo
The ticker EQH was found in Yahoo
The ticker SIGI was found in Yahoo
The ticker EIG was found in Yahoo
The ticker UNIT was found in Yahoo
The ticker MKL was found in Yahoo
The ticker WRB was found in Yahoo
The ticker VNCE was found in Yahoo
The ticker SYF was found in Yahoo
The ticker THC was found in Yahoo
The ticker CINF was found in Yahoo
The ticker DTIL was found in Yahoo
The ticker ILPT was found in Yahoo
The ticker WLFC was found in Yahoo
The ticker AIG was found in Yahoo
The ticker VGZ was found in Yahoo
The ticker RM was found in Yahoo
The ticker BIPC was found in Yahoo
The ticker OMF was found in Yahoo
The ticker TWO was found in Yahoo
The ticker AFL was found in Yahoo
The ticker LMND was found in Yahoo
The ticker OI was found in Yahoo
The ticker AAL was found in Yahoo
The ticker AMSF was found in Yahoo
The ticker SLM was found in Yahoo
The ticker COF was found in Yahoo
The ticker ENVA was found in Yahoo
The ticker SABR was found in Yahoo
The ticker EZPW was found in Yahoo
The ticker CHTR was found in Yahoo
The ticker DIT was found in Yahoo
The ticker DFS was found in Yahoo
The ticker HZO was found in Yahoo
The ticker GMS was found in Yahoo
The ticker RDN was found in Yahoo
The ticker KEQU was found in Yahoo
The ticker GM was found in Yahoo
The ticker JACK was found in Yahoo
The ticker HOV was found in Yahoo
The ticker DAKT was found in Yahoo
The ticker RLI was found in Yahoo
The ticker SR was found in Yahoo
The ticker AMWD was found in Yahoo
The ticker GBX was found in Yahoo
The ticker JBL was found in Yahoo
The ticker UAL was found in Yahoo
The ticker AL was found in Yahoo
The ticker MIND was found in Yahoo
The ticker DLHC was found in Yahoo
The ticker INN was found in Yahoo
The ticker GPI was found in Yahoo
The ticker PDCO was found in Yahoo
The ticker CIVI was found in Yahoo
The ticker NMIH was found in Yahoo
The ticker CABO was found in Yahoo
The ticker CZR was found in Yahoo
The ticker C was found in Yahoo
The ticker SJM was found in Yahoo
The ticker PHM was found in Yahoo
The ticker VLGEA was found in Yahoo
The ticker TNL was found in Yahoo
The ticker OPY was found in Yahoo
The ticker FDP was found in Yahoo
The ticker MCB was found in Yahoo
The ticker MHO was found in Yahoo
The ticker VGR was found in Yahoo
The ticker TRN was found in Yahoo
The ticker DLX was found in Yahoo
The ticker MTG was found in Yahoo
The ticker CALM was found in Yahoo
The ticker SNCY was found in Yahoo
The ticker MTN was found in Yahoo
The ticker NAVI was found in Yahoo
The ticker COOP was found in Yahoo
The ticker CFFI was found in Yahoo
I check whether I ended up with a price dataset of 100 columns, one for each ticker:
ncol(prices)
[1] 100
I create a dataset for returns of these 100 stocks:
= diff(log(prices))
returns # finaltickerlist has the tickers that were found in Yahoo Finance:
colnames(returns) = finaltickerlist
Now I get the price data of the market index and calculate its monthly returns:
getSymbols(Symbol="^GSPC", from="2020-01-01", to="2024-10-31",periodicity="monthly")
[1] "GSPC"
= diff(log(Ad(GSPC))) sp500ret
I now run a loop to estimate all market regression models and store them in a data frame:
# I initialize an empty vector/matrix betas:
=c()
betas for (i in 1:ncol(returns)) {
# I run the market regression model for stock i:
<- lm(returns[,i] ~ sp500ret$GSPC)
model #print(summary(model))
# I store the summary of the model in a temporal object:
= summary(model)
smodel # From the coefficients matrix I extract the coefficients along with their standard errors and pvalues:
= smodel$coefficients[1,1]
b0 = smodel$coefficients[2,1]
b1 = smodel$coefficients[1,2]
seb0 =smodel$coefficients[2,2]
seb1 = smodel$coefficients[1,4]
pb0 = smodel$coefficients[2,4]
pb1 # I save the number of valid observations (months) used in the regression:
= model$df.residual + 2
N # I save N to later drop those tickers with very few history available since their beta estimations will not be reliable
# I put the values together in a temporal vector:
= c(b0,b1,seb0,seb1,pb0,pb1, N)
vectorbetas # I bind this vector with the betas cumulative matrix:
= rbind(betas,vectorbetas)
betas
}# Finally I rename columns and rows of the betas matrix:
# I use the finaltickerlist since it has the tickers that where FOUND IN YAHOO FINANCE:
rownames(betas) = finaltickerlist[1:ncol(returns)]
colnames(betas) =c("b0","b1","seb0","seb1","pb0","pb1","N")
# I create a data frame from the beta matrix:
= data.frame(betas) betasdf
betasdf
b0 | b1 | seb0 | seb1 | pb0 | pb1 | N | |
---|---|---|---|---|---|---|---|
AJG | 0.0113380 | 0.7393096 | 0.0071630 | 0.1330145 | 0.1191871 | 0.0000008 | 57 |
APA | -0.0345240 | 3.3370946 | 0.0360089 | 0.6686752 | 0.3418732 | 0.0000064 | 57 |
LNC | -0.0223796 | 1.8342302 | 0.0159093 | 0.2954323 | 0.1651429 | 0.0000001 | 57 |
TIPT | 0.0087913 | 1.1767828 | 0.0165934 | 0.3081359 | 0.5983811 | 0.0003425 | 57 |
UVE | -0.0089182 | 0.9499327 | 0.0154648 | 0.2871777 | 0.5665117 | 0.0016617 | 57 |
GNW | -0.0001226 | 0.8838090 | 0.0169328 | 0.3144380 | 0.9942496 | 0.0068344 | 57 |
CYH | -0.0185765 | 1.7556430 | 0.0313874 | 0.5828554 | 0.5563784 | 0.0039148 | 57 |
PRU | -0.0035397 | 1.2985853 | 0.0081189 | 0.1507666 | 0.6645558 | 0.0000000 | 57 |
MCY | 0.0006371 | 0.8561274 | 0.0099851 | 0.1854206 | 0.9493547 | 0.0000238 | 57 |
RGA | -0.0006143 | 0.9260021 | 0.0115070 | 0.2136816 | 0.9576220 | 0.0000629 | 57 |
ALL | 0.0052743 | 0.4791599 | 0.0087642 | 0.1627479 | 0.5497781 | 0.0047364 | 57 |
THG | -0.0041453 | 0.7352856 | 0.0080893 | 0.1502162 | 0.6103913 | 0.0000090 | 57 |
MET | 0.0005412 | 1.0361106 | 0.0099504 | 0.1847764 | 0.9568180 | 0.0000007 | 57 |
UNM | 0.0107548 | 0.8126782 | 0.0128720 | 0.2390300 | 0.4070410 | 0.0012612 | 57 |
ATUS | -0.0584363 | 1.5950116 | 0.0214332 | 0.3980079 | 0.0085722 | 0.0001862 | 57 |
VOYA | -0.0038743 | 1.0277504 | 0.0078811 | 0.1463498 | 0.6249630 | 0.0000000 | 57 |
CNA | 0.0007392 | 0.6721978 | 0.0073793 | 0.1370321 | 0.9205697 | 0.0000087 | 57 |
FAF | -0.0103280 | 1.3776780 | 0.0067748 | 0.1258053 | 0.1331170 | 0.0000000 | 57 |
CRBG | 0.0114967 | 0.6665819 | 0.0187263 | 0.4615249 | 0.5455552 | 0.1627457 | 24 |
L | -0.0005005 | 0.8390363 | 0.0067960 | 0.1261993 | 0.9415619 | 0.0000000 | 57 |
ORI | 0.0042975 | 0.9366384 | 0.0082018 | 0.1523056 | 0.6024031 | 0.0000001 | 57 |
TRV | 0.0063405 | 0.6503172 | 0.0074961 | 0.1392007 | 0.4013039 | 0.0000197 | 57 |
PFG | -0.0010812 | 1.2059125 | 0.0091295 | 0.1695326 | 0.9061573 | 0.0000000 | 57 |
BFH | -0.0287154 | 2.1501948 | 0.0216137 | 0.4013605 | 0.1894751 | 0.0000017 | 57 |
HIG | 0.0032758 | 0.9601573 | 0.0093990 | 0.1745376 | 0.7287730 | 0.0000010 | 57 |
EQH | -0.0006974 | 1.4085886 | 0.0098480 | 0.1828745 | 0.9438043 | 0.0000000 | 57 |
SIGI | 0.0009735 | 0.5749528 | 0.0084151 | 0.1562667 | 0.9083283 | 0.0005334 | 57 |
EIG | 0.0033086 | 0.2181561 | 0.0106267 | 0.1973352 | 0.7567106 | 0.2737510 | 57 |
UNIT | -0.0115029 | 1.4583344 | 0.0230401 | 0.4278492 | 0.6195909 | 0.0012288 | 57 |
MKL | -0.0029650 | 0.7759603 | 0.0075975 | 0.1410843 | 0.6978552 | 0.0000010 | 57 |
WRB | 0.0048429 | 0.6560228 | 0.0082789 | 0.1537367 | 0.5609585 | 0.0000787 | 57 |
VNCE | -0.0558057 | 1.8868295 | 0.0346208 | 0.6428990 | 0.1127056 | 0.0048607 | 57 |
SYF | -0.0054996 | 1.7062412 | 0.0127636 | 0.2370162 | 0.6682396 | 0.0000000 | 57 |
THC | 0.0064087 | 2.1458508 | 0.0155155 | 0.2881181 | 0.6811760 | 0.0000000 | 57 |
CINF | 0.0003360 | 0.7061352 | 0.0105423 | 0.1957681 | 0.9746900 | 0.0006687 | 57 |
DTIL | -0.0716242 | 1.2937344 | 0.0287758 | 0.5343592 | 0.0158637 | 0.0188011 | 57 |
ILPT | -0.0476199 | 1.7844572 | 0.0226104 | 0.4198693 | 0.0397732 | 0.0000833 | 57 |
WLFC | 0.0080878 | 1.1833150 | 0.0222588 | 0.4133396 | 0.7177354 | 0.0059303 | 57 |
AIG | -0.0025594 | 1.2047645 | 0.0124726 | 0.2316120 | 0.8381705 | 0.0000030 | 57 |
VGZ | -0.0169125 | 1.6111618 | 0.0187612 | 0.3483901 | 0.3712742 | 0.0000232 | 57 |
RM | -0.0133970 | 1.6511918 | 0.0163004 | 0.3026945 | 0.4146952 | 0.0000012 | 57 |
BIPC | -0.0060592 | 1.3036748 | 0.0118454 | 0.2457618 | 0.6111502 | 0.0000024 | 54 |
OMF | -0.0023353 | 1.5450772 | 0.0139289 | 0.2586555 | 0.8674658 | 0.0000002 | 57 |
TWO | -0.0439982 | 2.4489828 | 0.0205205 | 0.3810600 | 0.0364597 | 0.0000000 | 57 |
AFL | 0.0048980 | 0.9596294 | 0.0078087 | 0.1450050 | 0.5330908 | 0.0000000 | 57 |
LMND | -0.0343659 | 1.6666532 | 0.0302652 | 0.6291802 | 0.2618073 | 0.0108976 | 50 |
OI | -0.0143482 | 1.2178123 | 0.0149605 | 0.2778133 | 0.3417204 | 0.0000531 | 57 |
AAL | -0.0272285 | 1.5097844 | 0.0155172 | 0.2881509 | 0.0848761 | 0.0000026 | 57 |
AMSF | -0.0010850 | 0.3538946 | 0.0100081 | 0.1858480 | 0.9140626 | 0.0621183 | 57 |
SLM | 0.0015965 | 1.2379163 | 0.0121828 | 0.2262315 | 0.8962195 | 0.0000011 | 57 |
COF | -0.0049734 | 1.5078016 | 0.0122869 | 0.2281644 | 0.6872173 | 0.0000000 | 57 |
ENVA | 0.0075924 | 1.4216987 | 0.0150296 | 0.2790961 | 0.6154638 | 0.0000044 | 57 |
SABR | -0.0531091 | 2.0126269 | 0.0260138 | 0.4830690 | 0.0460022 | 0.0001102 | 57 |
EZPW | 0.0009183 | 0.9842858 | 0.0141193 | 0.2621924 | 0.9483806 | 0.0004213 | 57 |
CHTR | -0.0188152 | 1.0789512 | 0.0120383 | 0.2235479 | 0.1238023 | 0.0000115 | 57 |
DIT | 0.0048551 | 0.4953524 | 0.0161000 | 0.2989720 | 0.7641269 | 0.1032439 | 57 |
DFS | -0.0010414 | 1.4975452 | 0.0131692 | 0.2445494 | 0.9372599 | 0.0000001 | 57 |
HZO | -0.0125688 | 1.9216400 | 0.0165536 | 0.3073959 | 0.4509275 | 0.0000001 | 57 |
GMS | 0.0042707 | 1.7003253 | 0.0112299 | 0.2085371 | 0.7051886 | 0.0000000 | 57 |
RDN | -0.0024941 | 1.1346989 | 0.0112744 | 0.2093626 | 0.8257412 | 0.0000014 | 57 |
KEQU | 0.0088525 | 0.7396444 | 0.0179389 | 0.3331211 | 0.6236399 | 0.0305344 | 57 |
GM | -0.0068359 | 1.4796292 | 0.0122657 | 0.2277702 | 0.5795692 | 0.0000000 | 57 |
JACK | -0.0270817 | 1.9923881 | 0.0161809 | 0.3004750 | 0.0998722 | 0.0000000 | 57 |
HOV | 0.0050693 | 2.8768282 | 0.0299148 | 0.5555090 | 0.8660591 | 0.0000033 | 57 |
DAKT | 0.0027270 | 1.1335075 | 0.0165543 | 0.3074088 | 0.8697611 | 0.0005202 | 57 |
RLI | 0.0079262 | 0.3608799 | 0.0094364 | 0.1752320 | 0.4045702 | 0.0441983 | 57 |
SR | -0.0068911 | 0.5478918 | 0.0073282 | 0.1360830 | 0.3511483 | 0.0001751 | 57 |
AMWD | -0.0210336 | 1.7696686 | 0.0142693 | 0.2649776 | 0.1461706 | 0.0000000 | 57 |
GBX | 0.0042405 | 1.4256696 | 0.0156000 | 0.2896869 | 0.7867708 | 0.0000082 | 57 |
JBL | 0.0081466 | 1.2517573 | 0.0098525 | 0.1829589 | 0.4118950 | 0.0000000 | 57 |
UAL | -0.0147082 | 1.5492610 | 0.0183997 | 0.3416773 | 0.4275133 | 0.0000317 | 57 |
AL | -0.0144037 | 1.6639302 | 0.0120150 | 0.2231146 | 0.2357412 | 0.0000000 | 57 |
MIND | -0.0944896 | 1.7749395 | 0.0479676 | 0.8907463 | 0.0538982 | 0.0512723 | 57 |
DLHC | -0.0006415 | 1.1622485 | 0.0181187 | 0.3364592 | 0.9718843 | 0.0010695 | 57 |
INN | -0.0299584 | 2.1157986 | 0.0155649 | 0.2890368 | 0.0594412 | 0.0000000 | 57 |
GPI | 0.0083219 | 1.4810676 | 0.0143332 | 0.2661636 | 0.5638814 | 0.0000008 | 57 |
PDCO | -0.0082330 | 1.0626403 | 0.0117684 | 0.2185362 | 0.4871348 | 0.0000101 | 57 |
CIVI | 0.0081122 | 1.4536969 | 0.0187859 | 0.3488486 | 0.6675556 | 0.0001099 | 57 |
NMIH | -0.0099721 | 1.3334456 | 0.0147531 | 0.2739616 | 0.5019157 | 0.0000099 | 57 |
CABO | -0.0356232 | 0.8512587 | 0.0120768 | 0.2242626 | 0.0046638 | 0.0003689 | 57 |
CZR | -0.0397440 | 3.2698129 | 0.0243790 | 0.4527115 | 0.1087619 | 0.0000000 | 57 |
C | -0.0140461 | 1.4632022 | 0.0104668 | 0.1943658 | 0.1851202 | 0.0000000 | 57 |
SJM | 0.0019157 | 0.2254669 | 0.0077158 | 0.1432808 | 0.8048456 | 0.1213150 | 57 |
PHM | 0.0020400 | 1.7541108 | 0.0124673 | 0.2315148 | 0.8706261 | 0.0000000 | 57 |
VLGEA | 0.0048138 | 0.2857608 | 0.0090340 | 0.1677589 | 0.5962808 | 0.0941379 | 57 |
TNL | -0.0147136 | 1.7793751 | 0.0129833 | 0.2410970 | 0.2620173 | 0.0000000 | 57 |
OPY | 0.0034775 | 1.1347066 | 0.0107546 | 0.1997094 | 0.7476558 | 0.0000005 | 57 |
FDP | -0.0014145 | 0.3716576 | 0.0119085 | 0.2211379 | 0.9058804 | 0.0985005 | 57 |
MCB | -0.0102722 | 1.1753253 | 0.0186966 | 0.3471917 | 0.5849451 | 0.0013182 | 57 |
MHO | -0.0024442 | 2.3977187 | 0.0165943 | 0.3081516 | 0.8834399 | 0.0000000 | 57 |
VGR | 0.0040314 | 1.0809228 | 0.0111707 | 0.2074369 | 0.7195638 | 0.0000029 | 57 |
TRN | -0.0011655 | 1.3427477 | 0.0109939 | 0.2041545 | 0.9159584 | 0.0000000 | 57 |
DLX | -0.0274098 | 1.4926079 | 0.0126830 | 0.2355205 | 0.0350550 | 0.0000000 | 57 |
MTG | -0.0016569 | 1.4118013 | 0.0116711 | 0.2167298 | 0.8876269 | 0.0000000 | 57 |
CALM | 0.0198805 | -0.1541523 | 0.0108815 | 0.2020659 | 0.0731274 | 0.4487939 | 57 |
SNCY | -0.0363092 | 1.4843406 | 0.0200047 | 0.4175285 | 0.0770229 | 0.0009883 | 42 |
MTN | -0.0161188 | 1.1945486 | 0.0090680 | 0.1683896 | 0.0810056 | 0.0000000 | 57 |
NAVI | -0.0115599 | 1.5118820 | 0.0109989 | 0.2042466 | 0.2978507 | 0.0000000 | 57 |
COOP | 0.0197037 | 1.4805047 | 0.0141093 | 0.2620058 | 0.1681725 | 0.0000006 | 57 |
CFFI | 0.0031484 | 0.3621739 | 0.0115736 | 0.2149179 | 0.7866148 | 0.0976218 | 57 |
Based on the data of each company select a set of 10 companies based on any of these criteria:
CRITERIA A: Stocks that are SIGNIFICANTLY offering returns over the market
CRITERIA B: Stocks that are SIGNIFICANTLY less risky than the market according to the market model regressions.
Selecting the stocks of CRITERIA A:
<- betasdf |>
stocks_over_market filter(b0>0,pb0<=0.05) |>
arrange(desc(b0))
stocks_over_market
[1] b0 b1 seb0 seb1 pb0 pb1 N
<0 rows> (or 0-length row.names)
nrow(stocks_over_market)
[1] 0
There is only 0 stock(s) that have a positive and significant beta0.
Filtering the stocks according to CRITERIA B:
# I first add the 95% minimum and the maximum values of the C.I. for beta1:
<- betasdf |>
betasdf mutate(
minb1= b1 - 2*seb1,
maxb1= b1 + 2*seb1)
# I filter those that might be significantly less risky than the market:
<- betasdf |>
less_risky_market filter(maxb1<1) |>
arrange(b1)
less_risky_market
b0 b1 seb0 seb1 pb0 pb1 N
CALM 0.0198804656 -0.1541523 0.010881466 0.2020659 0.07312738 4.487939e-01 57
EIG 0.0033086412 0.2181561 0.010626712 0.1973352 0.75671063 2.737510e-01 57
SJM 0.0019156530 0.2254669 0.007715824 0.1432808 0.80484564 1.213150e-01 57
VLGEA 0.0048138060 0.2857608 0.009033997 0.1677589 0.59628083 9.413785e-02 57
AMSF -0.0010850108 0.3538946 0.010008117 0.1858480 0.91406258 6.211828e-02 57
RLI 0.0079261776 0.3608799 0.009436433 0.1752320 0.40457022 4.419835e-02 57
CFFI 0.0031483978 0.3621739 0.011573561 0.2149179 0.78661477 9.762175e-02 57
FDP -0.0014145242 0.3716576 0.011908517 0.2211379 0.90588040 9.850051e-02 57
ALL 0.0052742926 0.4791599 0.008764152 0.1627479 0.54977812 4.736408e-03 57
SR -0.0068911133 0.5478918 0.007328219 0.1360830 0.35114827 1.751330e-04 57
SIGI 0.0009734538 0.5749528 0.008415131 0.1562667 0.90832829 5.334472e-04 57
TRV 0.0063405497 0.6503172 0.007496107 0.1392007 0.40130388 1.968401e-05 57
WRB 0.0048429224 0.6560228 0.008278888 0.1537367 0.56095848 7.866353e-05 57
CNA 0.0007392236 0.6721978 0.007379326 0.1370321 0.92056970 8.666018e-06 57
minb1 maxb1
CALM -0.55828405 0.2499795
EIG -0.17651424 0.6128264
SJM -0.06109466 0.5120284
VLGEA -0.04975695 0.6212786
AMSF -0.01780149 0.7255907
RLI 0.01041584 0.7113439
CFFI -0.06766193 0.7920096
FDP -0.07061824 0.8139335
ALL 0.15366406 0.8046558
SR 0.27572577 0.8200579
SIGI 0.26241942 0.8874862
TRV 0.37191579 0.9287185
WRB 0.34854936 0.9634962
CNA 0.39813363 0.9462620
nrow(less_risky_market)
[1] 14
There is only 14 stock(s) that are significantly less risky than the market.
Applying these 2 criteria I only got 2 stocks. If I want to get the best 10 stocks with these criteria, I can relax both criteria and do a filter by steps. Let’s do the following:
I will first use beta0 information to select the best 20 stocks. I can sort all stocks by the minimum of the 95% CI of beta0 (by the maximum to the minimum value):
# I first add the 95% minimum and the maximum values of the C.I. for beta1:
<- betasdf |>
betasdf mutate(
minb0= b0 - 2*seb0,
maxb0= b0 + 2*seb0)
# I do the first filter selecting 20 stocks based on the min of b0.
# I will add the restriction that the estimation of the market regression model had more than 30 months, to ensure that the beta estimations are more stable:
<- betasdf |>
filter1 filter(N>30) |>
arrange(desc(minb0)) |>
head(20)
filter1
b0 | b1 | seb0 | seb1 | pb0 | pb1 | N | minb1 | maxb1 | minb0 | maxb0 | |
---|---|---|---|---|---|---|---|---|---|---|---|
CALM | 0.0198805 | -0.1541523 | 0.0108815 | 0.2020659 | 0.0731274 | 0.4487939 | 57 | -0.5582840 | 0.2499795 | -0.0018825 | 0.0416434 |
AJG | 0.0113380 | 0.7393096 | 0.0071630 | 0.1330145 | 0.1191871 | 0.0000008 | 57 | 0.4732807 | 1.0053385 | -0.0029880 | 0.0256639 |
COOP | 0.0197037 | 1.4805047 | 0.0141093 | 0.2620058 | 0.1681725 | 0.0000006 | 57 | 0.9564930 | 2.0045163 | -0.0085149 | 0.0479223 |
TRV | 0.0063405 | 0.6503172 | 0.0074961 | 0.1392007 | 0.4013039 | 0.0000197 | 57 | 0.3719158 | 0.9287185 | -0.0086517 | 0.0213328 |
AFL | 0.0048980 | 0.9596294 | 0.0078087 | 0.1450050 | 0.5330908 | 0.0000000 | 57 | 0.6696195 | 1.2496393 | -0.0107194 | 0.0205153 |
RLI | 0.0079262 | 0.3608799 | 0.0094364 | 0.1752320 | 0.4045702 | 0.0441983 | 57 | 0.0104158 | 0.7113439 | -0.0109467 | 0.0267990 |
JBL | 0.0081466 | 1.2517573 | 0.0098525 | 0.1829589 | 0.4118950 | 0.0000000 | 57 | 0.8858396 | 1.6176751 | -0.0115585 | 0.0278516 |
WRB | 0.0048429 | 0.6560228 | 0.0082789 | 0.1537367 | 0.5609585 | 0.0000787 | 57 | 0.3485494 | 0.9634962 | -0.0117149 | 0.0214007 |
ORI | 0.0042975 | 0.9366384 | 0.0082018 | 0.1523056 | 0.6024031 | 0.0000001 | 57 | 0.6320273 | 1.2412496 | -0.0121061 | 0.0207012 |
ALL | 0.0052743 | 0.4791599 | 0.0087642 | 0.1627479 | 0.5497781 | 0.0047364 | 57 | 0.1536641 | 0.8046558 | -0.0122540 | 0.0228026 |
VLGEA | 0.0048138 | 0.2857608 | 0.0090340 | 0.1677589 | 0.5962808 | 0.0941379 | 57 | -0.0497570 | 0.6212786 | -0.0132542 | 0.0228818 |
SJM | 0.0019157 | 0.2254669 | 0.0077158 | 0.1432808 | 0.8048456 | 0.1213150 | 57 | -0.0610947 | 0.5120284 | -0.0135160 | 0.0173473 |
CNA | 0.0007392 | 0.6721978 | 0.0073793 | 0.1370321 | 0.9205697 | 0.0000087 | 57 | 0.3981336 | 0.9462620 | -0.0140194 | 0.0154979 |
L | -0.0005005 | 0.8390363 | 0.0067960 | 0.1261993 | 0.9415619 | 0.0000000 | 57 | 0.5866376 | 1.0914350 | -0.0140924 | 0.0130915 |
UNM | 0.0107548 | 0.8126782 | 0.0128720 | 0.2390300 | 0.4070410 | 0.0012612 | 57 | 0.3346183 | 1.2907382 | -0.0149892 | 0.0364989 |
HIG | 0.0032758 | 0.9601573 | 0.0093990 | 0.1745376 | 0.7287730 | 0.0000010 | 57 | 0.6110821 | 1.3092324 | -0.0155222 | 0.0220739 |
SIGI | 0.0009735 | 0.5749528 | 0.0084151 | 0.1562667 | 0.9083283 | 0.0005334 | 57 | 0.2624194 | 0.8874862 | -0.0158568 | 0.0178037 |
EIG | 0.0033086 | 0.2181561 | 0.0106267 | 0.1973352 | 0.7567106 | 0.2737510 | 57 | -0.1765142 | 0.6128264 | -0.0179448 | 0.0245621 |
OPY | 0.0034775 | 1.1347066 | 0.0107546 | 0.1997094 | 0.7476558 | 0.0000005 | 57 | 0.7352878 | 1.5341255 | -0.0180316 | 0.0249866 |
MKL | -0.0029650 | 0.7759603 | 0.0075975 | 0.1410843 | 0.6978552 | 0.0000010 | 57 | 0.4937917 | 1.0581289 | -0.0181601 | 0.0122301 |
After this first screening using beta0 information, I now use the information of beta1 to further select only 10 stocks.
For beta1 I will consider conservative stocks that are not too risky compared to the market, assuming that I will create a portfolio in the sort-term:
<- filter1 |>
filter2 arrange(maxb1) |>
head(10)
filter2
b0 | b1 | seb0 | seb1 | pb0 | pb1 | N | minb1 | maxb1 | minb0 | maxb0 | |
---|---|---|---|---|---|---|---|---|---|---|---|
CALM | 0.0198805 | -0.1541523 | 0.0108815 | 0.2020659 | 0.0731274 | 0.4487939 | 57 | -0.5582840 | 0.2499795 | -0.0018825 | 0.0416434 |
SJM | 0.0019157 | 0.2254669 | 0.0077158 | 0.1432808 | 0.8048456 | 0.1213150 | 57 | -0.0610947 | 0.5120284 | -0.0135160 | 0.0173473 |
EIG | 0.0033086 | 0.2181561 | 0.0106267 | 0.1973352 | 0.7567106 | 0.2737510 | 57 | -0.1765142 | 0.6128264 | -0.0179448 | 0.0245621 |
VLGEA | 0.0048138 | 0.2857608 | 0.0090340 | 0.1677589 | 0.5962808 | 0.0941379 | 57 | -0.0497570 | 0.6212786 | -0.0132542 | 0.0228818 |
RLI | 0.0079262 | 0.3608799 | 0.0094364 | 0.1752320 | 0.4045702 | 0.0441983 | 57 | 0.0104158 | 0.7113439 | -0.0109467 | 0.0267990 |
ALL | 0.0052743 | 0.4791599 | 0.0087642 | 0.1627479 | 0.5497781 | 0.0047364 | 57 | 0.1536641 | 0.8046558 | -0.0122540 | 0.0228026 |
SIGI | 0.0009735 | 0.5749528 | 0.0084151 | 0.1562667 | 0.9083283 | 0.0005334 | 57 | 0.2624194 | 0.8874862 | -0.0158568 | 0.0178037 |
TRV | 0.0063405 | 0.6503172 | 0.0074961 | 0.1392007 | 0.4013039 | 0.0000197 | 57 | 0.3719158 | 0.9287185 | -0.0086517 | 0.0213328 |
CNA | 0.0007392 | 0.6721978 | 0.0073793 | 0.1370321 | 0.9205697 | 0.0000087 | 57 | 0.3981336 | 0.9462620 | -0.0140194 | 0.0154979 |
WRB | 0.0048429 | 0.6560228 | 0.0082789 | 0.1537367 | 0.5609585 | 0.0000787 | 57 | 0.3485494 | 0.9634962 | -0.0117149 | 0.0214007 |
These will be my selected stocks based on their beta0 and beta1 information. These stocks can be candidates to include them in a winning portfolio.
I can show the name of these firms and their industry:
= rownames(filter2)
selected_tickers = as.data.frame(selected_tickers)
selected_tickers colnames(selected_tickers) = c("firm")
= merge(selected_tickers,usfirms,by.x="firm", by.y="empresa") selected_tickers
selected_tickers
firm | Nombre | status | partind | naics1 | naics2 | SectorEconomatica |
---|---|---|---|---|---|---|
ALL | Allstate Corp | activo | 0.10 | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
CALM | Cal-Maine Foods, Inc | activo | NA | Agricultura, ganadería, aprovechamiento forestal, pesca y caza | Producción de Animales y Acuicultura | Agro & Pesca |
CNA | CNA Financial Corp | activo | NA | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
EIG | Employers Hldg Inc | activo | NA | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
RLI | Rli Corp | activo | NA | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
SIGI | Selective Insurance Group, Inc | activo | NA | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
SJM | The J. M. Smucker Company | activo | 0.03 | Industrias manufactureras | Conservación de frutas, verduras, guisos y otros alimentos preparados | Alimentos y Beb |
TRV | Travelers Companies, Inc | activo | 0.12 | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
VLGEA | Village Super Market, Inc | activo | NA | Comercio al por menor | Almacenes de alimentos | Comercio |
WRB | W. R. Berkley Corp | activo | 0.04 | Servicios financieros y de seguros | Instituciones de seguros | Finanzas y Seguros |
I select the tickers column:
= selected_tickers$firm tickers
I bring monthly prices from Yahoo Finance:
library(quantmod)
getSymbols(Symbols = tickers, periodicity="monthly",from="2020-01-01", to="2024-10-31")
[1] "ALL" "CALM" "CNA" "EIG" "RLI" "SIGI" "SJM" "TRV" "VLGEA"
[10] "WRB"
I do the data management to create a return dataset:
<- lapply(tickers, get)
datasets_list <- do.call(merge, datasets_list)
prices # We select only Adjusted prices:
<- Ad(prices)
prices
# We change the name of the columns to be equal to the tickers vector:
names(prices) = tickers
#Calculating continuously compounded returns:
= diff(log(prices)) r
I get the Expected Returns for the stocks:
= colMeans(r,na.rm=TRUE)
ccmean_returns = exp(ccmean_returns) - 1
ER ER
ALL CALM CNA EIG RLI SIGI
0.010119504 0.018507230 0.007493000 0.005506566 0.011603861 0.006748959
SJM TRV VLGEA WRB
0.004180331 0.012930345 0.007702569 0.011472225
I get the Variance-Covariance matrix:
# We calculate the Variance-Covariance matrix with the var function:
=var(r,na.rm=TRUE)
COV # Note var function calculates the variance-covariance matrix if r has more than 1 column.
# If r has only 1 column, then var calculates only the variance of that column.
COV
ALL CALM CNA EIG RLI
ALL 4.805863e-03 -0.0008049815 0.0028619725 0.0004942691 0.0010617025
CALM -8.049815e-04 0.0064675163 -0.0006040997 0.0013452828 0.0003903121
CNA 2.861973e-03 -0.0006040997 0.0042309221 0.0016923103 0.0019709493
EIG 4.942691e-04 0.0013452828 0.0016923103 0.0062392775 0.0040160842
RLI 1.061702e-03 0.0003903121 0.0019709493 0.0040160842 0.0051840367
SIGI 2.110362e-03 -0.0002059750 0.0026340015 0.0021269729 0.0026080845
SJM 1.922001e-06 0.0008226165 0.0001326381 0.0010027191 0.0008089074
TRV 2.835639e-03 -0.0004794668 0.0031211579 0.0020064495 0.0024491445
VLGEA 1.108544e-03 0.0015632195 0.0010005230 0.0021102787 0.0025779227
WRB 3.067534e-03 -0.0003185802 0.0033795999 0.0018220343 0.0020181785
SIGI SJM TRV VLGEA WRB
ALL 0.0021103621 1.922001e-06 0.0028356393 0.0011085443 0.0030675338
CALM -0.0002059750 8.226165e-04 -0.0004794668 0.0015632195 -0.0003185802
CNA 0.0026340015 1.326381e-04 0.0031211579 0.0010005230 0.0033795999
EIG 0.0021269729 1.002719e-03 0.0020064495 0.0021102787 0.0018220343
RLI 0.0026080845 8.089074e-04 0.0024491445 0.0025779227 0.0020181785
SIGI 0.0047695496 6.514068e-04 0.0026280577 0.0006681966 0.0026370310
SJM 0.0006514068 3.362650e-03 0.0002252621 0.0008141834 0.0001672133
TRV 0.0026280577 2.252621e-04 0.0042423514 0.0012092477 0.0035212997
VLGEA 0.0006681966 8.141834e-04 0.0012092477 0.0046438550 0.0009514900
WRB 0.0026370310 1.672133e-04 0.0035212997 0.0009514900 0.0049310149
I now generate 10,000 random portfolios:
# Initialize the Weight matrix
= c()
W # Create 4 vectors of 1000 random values from 0 to 1:
for(i in 1:10) {
= rbind(W,runif(10000))
W
}# The problem I have now is that some of the 1,000 portfolios
# might end up having a sum of weights higher than one.
# I can do a simple "trick" by
# dividing each of the 4 weights by the sum of these 4
# weights. And I can do this # for all 1000 portfolios:
# I first create a vector with the sum of weights for all portfolios:
<- colSums(W)
sumw # I do another loop to divide each weight by the sum of weights:
for(i in 1:10) {
<-W[i,]/sumw
W[i,]# In each iteration I divide one raw of W2 by the vector sumw,
# which is the sum of the weights of all 1000 portfolios
}
# I check that the sum of weights is 1 (I do this for 5 portfolios)
1:5] W[,
[,1] [,2] [,3] [,4] [,5]
[1,] 0.164194821 0.133227989 0.229972350 0.16296711 0.011870374
[2,] 0.238535227 0.077625858 0.126307719 0.01121542 0.178613086
[3,] 0.157410213 0.151120432 0.076791777 0.11815910 0.178386763
[4,] 0.225814659 0.103510809 0.004533933 0.09593400 0.000270053
[5,] 0.057418770 0.069527936 0.096700015 0.05455689 0.090638681
[6,] 0.041345846 0.081167242 0.046281369 0.09738743 0.191569866
[7,] 0.027976689 0.003707023 0.041371959 0.11494273 0.086374875
[8,] 0.015185007 0.056791268 0.165067279 0.08657030 0.043265799
[9,] 0.001387582 0.196604614 0.111840610 0.14369416 0.153140971
[10,] 0.070731186 0.126716827 0.101132988 0.11457285 0.065869531
I calculate the Expected Return of the 10,000 portfolios:
= t(W) %*% ER ERPortfolios
I calculate the expected risk of the 10,000 portfolios:
= diag(t(W) %*% COV %*% W)
VARPortfolios = sqrt(VARPortfolios) RISKPortfolios
Now we can visualize the feasible portfolios in the risk-return space:
plot(x=RISKPortfolios,y= ERPortfolios,
xlabel="Portfolio Expected Risk", ylabel="Portfolio Expected Return")
I now directly estimate the GMV portfolio:
= globalMin.portfolio(ER,COV, shorts = FALSE)
GMVPort GMVPort
Call:
globalMin.portfolio(er = ER, cov.mat = COV, shorts = FALSE)
Portfolio expected return: 0.009354169
Portfolio standard deviation: 0.03689871
Portfolio weights:
ALL CALM CNA EIG RLI SIGI SJM TRV VLGEA WRB
0.1624 0.1892 0.1250 0.0013 0.0309 0.0526 0.3152 0.0543 0.0691 0.0000