I have following stocks in my portfolio i.e. Petronet LNG,RIL, ICICI Bank, Infosys & S&P 500 NYSE. To avoid putting all eggs in one basket, my investment strategy is to invest in all of the five stocks rather than to invest in only one stock.

Basic Concepts

Before we dive into the R programming, let’s review some of the basic concepts about the financial portfolio management.

Expected Return

Expected Return is a weighted-average return of a portfolio. The return of each asset in the portfolio can be the mean of return values of the asset in a past period of time; it can also be calculated by the capital asset pricing model (CAPM), which is used to determine a theoretically appropriate required rate of return of an asset1. It can also calculated by other factor models. I will use CAPM to calculate the expected return of each stock in this article.

Risk

Risk is the variation of the return of an asset. It can be calculated by the standard deviation of the historical return data; it can also be calculated by factor models. I will use one-factor model to calculate the variance of an asset and the covariance of two assets in a portfolio.

Sharpe Ratio

The Sharpe ratio characterizes how well the return of an asset compensates the investor for the risk taken. The higher Sharpe Ratio means the better investment option. In a set of risky assets, I can find the optimal portfolio asset allocations so that the Sharpe Ratio is the largest.

Capital Asset Pricing Model (CAPM)

The CAPM is a model for pricing an individual security or portfolio. The formula of CAPM is in the following:-

E(Ri) = Rf + βi * (E(Rm) - Rf)

Where:

E(Ri) is the expect return of an asset

Rf is the risk-free return, we usually use the interest rate of T-Bills as the Rf

E(Rm) is the expected return of the market, we usually use SP500 index return as the US stock market

βi is the sensitivity of the expected excess asset return to the expected excess market return. We can get the value of βi by do a regression on the historical data of the excess returns of an asset and the market.

Reading the data

library(xlsx)
return <- read.xlsx("C:/Data Science/R/Exercises/Designing Investment Portfolio/Investment Portfolio.xlsx", sheetIndex = 1, header = TRUE)
head(return)
##        Month SP500NYSE Petronet Infosys     RIL   ICICI BSESensex
## 1 2017-02-01    0.0372   0.0873  0.0893  0.1867  0.0273    0.0393
## 2 2017-01-01    0.0179   0.0177 -0.0805 -0.0340  0.0539    0.0387
## 3 2016-12-01    0.0182  -0.0547  0.0361  0.0880 -0.0366   -0.0010
## 4 2016-11-01    0.0342   0.0044 -0.0270 -0.0556 -0.0433   -0.0457
## 5 2016-10-01   -0.0194   0.1204 -0.0343 -0.0286  0.0979    0.0023
## 6 2016-09-01   -0.0012  -0.0183  0.0013  0.0228 -0.0210   -0.0206
##   IndiaBond USTBills
## 1    0.0394   0.0052
## 2   -0.0148   0.0051
## 3    0.0685   0.0051
## 4   -0.0755   0.0045
## 5   -0.0223   0.0033
## 6   -0.0213   0.0029

The excess return is equal to monthly return minus monthly T-Bills / Indian Bond Yield interest rate.

library(xlsx)
Excessreturn <- read.xlsx("C:/Data Science/R/Exercises/Designing Investment Portfolio/Investment Portfolio.xlsx", sheetIndex = 2, header = TRUE)
head(Excessreturn)
##        Month BSESensex SP500NYSE Petronet Infosys     RIL   ICICI NA.
## 1 2017-02-01   -0.0001    0.0320   0.0479  0.0499  0.1473 -0.0121  NA
## 2 2017-01-01    0.0535    0.0128   0.0325 -0.0657 -0.0192  0.0687  NA
## 3 2016-12-01   -0.0695    0.0131  -0.1232 -0.0324  0.0195 -0.1051  NA
## 4 2016-11-01    0.0298    0.0297   0.0799  0.0485  0.0199  0.0322  NA
## 5 2016-10-01    0.0246   -0.0227   0.1427 -0.0120 -0.0063  0.1202  NA
## 6 2016-09-01    0.0007   -0.0041   0.0030  0.0226  0.0441  0.0003  NA

Removing the date column & calculating the mean and standard deviation

Excessreturn <- Excessreturn[,-1]
Mean <- sapply(Excessreturn,mean)
SD <- sapply(Excessreturn,sd)
cbind(Mean,SD)
##                  Mean         SD
## BSESensex 0.010959184 0.05828589
## SP500NYSE 0.008451020 0.02937673
## Petronet  0.025561224 0.08689000
## Infosys   0.013216327 0.07009431
## RIL       0.011818367 0.07440408
## ICICI     0.009795918 0.11365530
## NA.                NA         NA

I will use the mean and standard deviation of BSE Sensex excess returns as the market excess return and risk (standard deviation) in the following parts.

CAPM Analysis

According to the CAPM formula, I will first get the beta of each stock by regressions; then further calculate the expected return of each stock and the covariance matrix of the four stocks; finally I can calculate the optimal asset allocations (weights) of the portfolio consisting of the four stocks.

Beta (β)

I can conduct the regression on the excess returns of each stock and the excess return of market (BSE Sensex).

Regression of Petronet

lmPetronet <- lm(Petronet~BSESensex,Excessreturn)
summary(lmPetronet)
## 
## Call:
## lm(formula = Petronet ~ BSESensex, data = Excessreturn)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.202429 -0.036015 -0.000749  0.040169  0.206109 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.01601    0.01036   1.546    0.129    
## BSESensex    0.87127    0.17644   4.938 1.04e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.07125 on 47 degrees of freedom
## Multiple R-squared:  0.3416, Adjusted R-squared:  0.3276 
## F-statistic: 24.38 on 1 and 47 DF,  p-value: 1.037e-05

Therefore beta (β) for Petronet = 0.87127

The CAPM formula of Petronet is:-

E(RPetronet) - Rf = βPetronet * (E(Rm) - Rf), that is,

Expected excess return of Petronet =βPetronet * Expected excess return of BSESensex

Where:

The expected excess return of BSESensex is the mean value, which is 0.010959184, of the excess BSESensex return in the past 4 years.

So, the expected excess return of Petronet = 0.87127*0.010959184 = 0.009548408

According to Single-Factor Model, the variance of Petronet is:-

σPetronet2 = βPetronet2 * σBSESensex2+ σ(e)2

Where,

σBSESensex2 is the variance of BSESensex, which is 0.058285892

σ(e)2 is the variance of residual standard error, which is 0.07125, as shown in the above output.

So, σPetronet2 = 0.871272 * 0.058285892 + 0.071252 = 0.0874954282

Regression of Infosys

lmInfosys <- lm(Infosys~BSESensex,Excessreturn)
summary(lmInfosys)
## 
## Call:
## lm(formula = Infosys ~ BSESensex, data = Excessreturn)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.212659 -0.039667  0.007818  0.046923  0.106175 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.01022    0.01003   1.019    0.314
## BSESensex    0.27355    0.17082   1.601    0.116
## 
## Residual standard error: 0.06898 on 47 degrees of freedom
## Multiple R-squared:  0.05174,    Adjusted R-squared:  0.03156 
## F-statistic: 2.564 on 1 and 47 DF,  p-value: 0.116

Therefore beta (β) for Infosys = 0.27355

Expected excess return of Infosys = βInfosys * Expected excess return of BSESensex

So, the expected excess return of Infosys = 0.27355*0.010959184 = 0.0030

And σInfosys2 = 0.273552 * 0.058285892 + 0.068982 = 0.0756604952

Regression of RIL

lmRIL <- lm(RIL~BSESensex,Excessreturn)
summary(lmRIL)
## 
## Call:
## lm(formula = RIL ~ BSESensex, data = Excessreturn)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.093362 -0.052119 -0.007302  0.031959  0.144511 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.002870   0.008405   0.342    0.734    
## BSESensex   0.816491   0.143133   5.704 7.52e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0578 on 47 degrees of freedom
## Multiple R-squared:  0.4091, Adjusted R-squared:  0.3965 
## F-statistic: 32.54 on 1 and 47 DF,  p-value: 7.519e-07

Therefore beta (β) for RIL = 0.816491

Expected excess return of RIL = βRIL * Expected excess return of BSESensex

So, the expected excess return of RIL = 0.816491*0.010959184 = 0.0089

And σRIL2 = 0.8164912 * 0.058285892 + 0.05782 = 0.0808393232

Regression of ICICI

lmICICI <- lm(ICICI~BSESensex,Excessreturn)
summary(lmICICI)
## 
## Call:
## lm(formula = ICICI ~ BSESensex, data = Excessreturn)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.103964 -0.031254  0.000051  0.022188  0.125241 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.009781   0.006697   -1.46    0.151    
## BSESensex    1.786325   0.114055   15.66   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04606 on 47 degrees of freedom
## Multiple R-squared:  0.8392, Adjusted R-squared:  0.8358 
## F-statistic: 245.3 on 1 and 47 DF,  p-value: < 2.2e-16

Therefore beta (β) for ICICI = 1.786325

Expected excess return of ICICI = βICICI * Expected excess return of BSESensex

So, the expected excess return of ICICI = 1.786325*0.010959184 = 0.0196

And σICICI2 = 1.78632512 * 0.058285892 + 0.046062 = 0.0925697032

Regression of S&P 500 NYSE

lmSP500NYSE <- lm(SP500NYSE~BSESensex,Excessreturn)
summary(lmSP500NYSE)
## 
## Call:
## lm(formula = SP500NYSE ~ BSESensex, data = Excessreturn)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.060565 -0.014220 -0.001658  0.016714  0.073743 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 0.006721   0.004100   1.639   0.1078  
## BSESensex   0.157868   0.069818   2.261   0.0284 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02819 on 47 degrees of freedom
## Multiple R-squared:  0.09811,    Adjusted R-squared:  0.07892 
## F-statistic: 5.113 on 1 and 47 DF,  p-value: 0.02842

Therefore beta (β) for SP500NYSE = 0.157868

Expected excess return of SP500NYSE = βSP500NYSE * Expected excess return of BSESensex

So, the expected excess return of SP500NYSE = 0.157868*0.010959184 = 0.0017

And σSP500NYSE2 = 0.1578682 * 0.058285892 + 0.028192 = 0.049757552

Summarizing the Analysis

Portfolio <- matrix(c(0.8713,0.2736,0.8165,1.7863,0.1579,0.07125,0.06898,0.0578,0.04606,0.02819,0.0095,0.0030,0.0089,0.0196,0.0017,0.00766,0.00572,0.00653,0.00857,0.00248),nrow = 4,ncol=5,byrow = TRUE)
colnames(Portfolio) <- c("Petronet","Infosys","RIL","ICICI","S&P 500 NYSE")
rownames(Portfolio) <- c("Beta","Residual Standard Error","Expected Excess Return","Variance")
Portfolio
##                         Petronet Infosys     RIL   ICICI S&P 500 NYSE
## Beta                     0.87130 0.27360 0.81650 1.78630      0.15790
## Residual Standard Error  0.07125 0.06898 0.05780 0.04606      0.02819
## Expected Excess Return   0.00950 0.00300 0.00890 0.01960      0.00170
## Variance                 0.00766 0.00572 0.00653 0.00857      0.00248
library(stockPortfolio)
stocks <- c('PETRONET.BO','INFY.BO','RELIANCE.BO','ICICIBANK.BO','^GSPC')
returns <- getReturns(stocks,freq = "month", start = "2013-03-01", end = "2017-03-31")
summary(returns)
## 5 stocks, observed once per month between 2013-04-01 and 2017-03-01 
## 
##             PETRONET.BO INFY.BO RELIANCE.BO ICICIBANK.BO ^GSPC
## Mean Return       0.027   0.023       0.014         0.08 0.009

Calculating the covariance between portfolio stocks

## Covariance between Petronet & Infosys = &beta;Petronet*&beta;Infosys*&sigma;BSESensex^2^
## Cov(Petronet,Infosys) = 0.87130*0.27360*0.05828589^2^ = 0.000809685
## I calculated the variance in the table as follows
covar <- read.xlsx("C:/Data Science/R/Exercises/Designing Investment Portfolio/Investment Portfolio.xlsx",sheetIndex = 4)
covar
##         NA.     Petronet      Infosys          RIL        ICICI
## 1  Petronet 0.0025788874 0.0008096855 0.0024167461 0.0052873749
## 2   Infosys 0.0008096855 0.0002542145 0.0007587784 0.0016600611
## 3       RIL 0.0024167461 0.0007587784 0.0022647990 0.0049549439
## 4     ICICI 0.0052873749 0.0016600611 0.0049549439 0.0108404626
## 5 SP500NYSE 0.0004672763 0.0001467093 0.0004378974 0.0009580352
##      SP500NYSE
## 1 4.672763e-04
## 2 1.467093e-04
## 3 4.378974e-04
## 4 9.580352e-04
## 5 8.466718e-05

Mean-Variance Portfolio Optimization

Now with the expected excess returns and the covariance matrix ready, I can conduct the Mean-Variance Portfolio Optimization to get the optimal allocation (weight) of each stock so that the Sharpe Ratio of the portfolio is maximized.

Here I will use the functions effFrontier and maxSharpe from stockPrtfolio R package from which use the core function of optimalPort for the calculations of efficient frontier and maximum Sharpe Ratio.

I need the expected excess return matrix and covariance matrix for the parameters of maxSharpe function. The numbers are from the calculations in the previous parts.

Developing Efficient Frontier function with following assumptions

1. No short selling is allowed

2. max.allocation is the maximum % allowed for any one security (reduces concentration)

3. risk.premium.up is the upper limit of the risk premium modeled in the for loop

4. risk.increment is the increment (by) value used in the for loop

5. Dmat: The covariance table (matrix). I will calculate this based on the data frame of returns.

6. dvec: This is a vector of the average returns for each security. To find the minimium portfolio variance, I set these all to zero. To find points along the efficient frontier, I use a for loop to allow these returns to vary.

7. Amat: This is the matrix of constraints. This can be a bit complicated. Hang in there as I explain a little behind creating a matrix that imposes constraints. For those not steeped in algebraic matrix math, it may be easiest to learn by examing the examples. In the example, I use Amat to impose three distinct constraints: the portfolio weights must sum to 1, whether we allow any weights to be negative (implying short-selling and leverage), and whether there is a limit to any individual weight (to avoid high concentrations in just one security).

8.bvec: Think of this as the legend to the Amat. This is a vector of values that is matched up against the Amat matrix to enforce our constraints. Again, if you are not familiar with matrix math, it may be easiest to learn by looking at the example and playing around with the code.

9. meq: This simply tells the solve.QP function which columns in the Amat matrix to treat as equality constraints. In our example, we only have one (the weights must sum to 1). The others are all inequality constraints (for example, weight must be > 0). We simply assign 1 to meq.

eff.frontier <- function (returns, short="no", max.allocation=NULL, risk.premium.up=.5, risk.increment=.005){
        covariance <- cov(returns)
        print(covariance)
        n <- ncol(covariance)
 
# Create initial Amat and bvec assuming only equality constraint (short-selling is allowed, no allocation constraints)
Amat <- matrix (1, nrow=n)
bvec <- 1
meq <- 1
 
# Then modify the Amat and bvec if short-selling is prohibited
if(short=="no"){
Amat <- cbind(1, diag(n))
bvec <- c(bvec, rep(0, n))
}
# And modify Amat and bvec if a max allocation (concentration) is specified
if(!is.null(max.allocation)){
if(max.allocation > 1 | max.allocation <0){
stop("max.allocation must be greater than 0 and less than 1")
}
if(max.allocation * n < 1){
stop("Need to set max.allocation higher; not enough assets to add to 1")
}
Amat <- cbind(Amat, -diag(n))
bvec <- c(bvec, rep(-max.allocation, n))
}
 
# Calculate the number of loops based on how high to vary the risk premium and by what increment
loops <- risk.premium.up / risk.increment + 1
loop <- 1
 
# Initialize a matrix to contain allocation and statistics
# This is not necessary, but speeds up processing and uses less memory
eff <- matrix(nrow=loops, ncol=n+3)
# Now I need to give the matrix column names
colnames(eff) <- c(colnames(returns), "Std.Dev", "Exp.Return", "sharpe")
 
# Loop through the quadratic program solver
for (i in seq(from=0, to=risk.premium.up, by=risk.increment)){
dvec <- colMeans(returns) * i # This moves the solution up along the efficient frontier
sol <- solve.QP(covariance, dvec=dvec, Amat=Amat, bvec=bvec, meq=meq)
eff[loop,"Std.Dev"] <- sqrt(sum(sol$solution *colSums((covariance * sol$solution))))
eff[loop,"Exp.Return"] <- as.numeric(sol$solution %*% colMeans(returns))
eff[loop,"sharpe"] <- eff[loop,"Exp.Return"] / eff[loop,"Std.Dev"]
eff[loop,1:n] <- sol$solution
loop <- loop+1
}
 
return(as.data.frame(eff))
}

Applying Efficient frontier function

library(quadprog)
eff <- eff.frontier(returns=returns$R, short="no", max.allocation=NULL, risk.premium.up=.5, risk.increment=.001)
##                PETRONET.BO       INFY.BO   RELIANCE.BO  ICICIBANK.BO
## PETRONET.BO   0.0051853804 -0.0001555386  0.0010311263  0.0058554891
## INFY.BO      -0.0001555386  0.0238691276 -0.0016419514  0.0747884926
## RELIANCE.BO   0.0010311263 -0.0016419514  0.0048537135 -0.0065187494
## ICICIBANK.BO  0.0058554891  0.0747884926 -0.0065187494  0.3606697987
## ^GSPC         0.0005293896 -0.0001171956  0.0005990592 -0.0001692872
##                      ^GSPC
## PETRONET.BO   0.0005293896
## INFY.BO      -0.0001171956
## RELIANCE.BO   0.0005990592
## ICICIBANK.BO -0.0001692872
## ^GSPC         0.0008713377

Finding the optimum portfolio

eff.optimal.point <- eff[eff$sharpe==max(eff$sharpe),]*100
eff.optimal.point
##    PETRONET.BO  INFY.BO RELIANCE.BO ICICIBANK.BO    ^GSPC  Std.Dev
## 74    30.08916 8.227545    11.84337            0 49.83992 3.419415
##    Exp.Return   sharpe
## 74   1.603218 46.88574

Therefore, from the given choices, it is advisable to invest about 30% of money in Petronet LNG, 8% in Infosys, 0% in ICICI Bank and 50% in S&P 500 NYSE.

Plotting the graph

eff.optimal.point <- eff[eff$sharpe==max(eff$sharpe),]
library(ggplot2)
# Color Scheme
 ealred  <- "#7D110C"
 ealtan  <- "#CDC4B6"
 eallighttan <- "#F7F6F0"
 ealdark  <- "#423C30"
ggplot(eff, aes(x=Std.Dev, y=Exp.Return)) + geom_point(alpha=.1, color=ealdark) +
 geom_point(data=eff.optimal.point, aes(x=Std.Dev, y=Exp.Return, label=sharpe), color=ealred, size=5) +
 annotate(geom="text", x=eff.optimal.point$Std.Dev, y=eff.optimal.point$Exp.Return,
 label=paste("Risk: ", round(eff.optimal.point$Std.Dev*100, digits=3),"\nReturn: ",
 round(eff.optimal.point$Exp.Return*100, digits=4),"%\nSharpe: ",
 round(eff.optimal.point$sharpe*100, digits=2), "%", sep=""), hjust=0, vjust=1.2) +
 ggtitle("Efficient Frontier\nand Optimal Portfolio") + labs(x="Risk (standard deviation of portfolio variance)", y="Return") +
 theme(panel.background=element_rect(fill=eallighttan), text=element_text(color=ealdark),
 plot.title=element_text(size=24, color=ealred, hjust = 0.5))