FINAL PROJECT. STAT METHODS IN FINANCE

We choose six stocks: The Hershey Company (HSY), Ametek (AME), Chipotle Mexican Grill (CMG), Fossil (FOSL), Kansas City Southern Railway Company (KSU), and McKesson Corporation (MCK). We refer to them by their tickers in the following analysis. The sample data are the daily adjusted closing prices of these six stocks and the S&P 500 index from January 3, 2007, to November 11, 2013. The S&P 500 index is taken as the market price. Three-month Treasury yield curve rates over the same period are used as the risk-free returns, and all rates are divided by 253 to convert them to daily rates. The excess returns are the returns minus the T-bill rates. The sample data has 1728 observations in total.
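As a minimal sketch of the data preparation described above, the following Python snippet converts prices to daily log returns and subtracts a daily risk-free rate. The function names are hypothetical, and the snippet assumes that the Treasury rates are quoted as annual percentages; only the divisor 253 comes from the text.

```python
import math

TRADING_DAYS = 253  # divisor stated in the text


def log_returns(prices):
    """Daily log returns from a list of adjusted closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def daily_rate(annual_rate_percent):
    """Convert an annual rate in percent to a daily decimal rate (assumed quoting convention)."""
    return annual_rate_percent / 100.0 / TRADING_DAYS


def excess_returns(prices, annual_rates_percent):
    """Log returns minus the daily risk-free rate observed on the same day."""
    returns = log_returns(prices)
    # The first rate has no matching return, so it is skipped.
    risk_free = [daily_rate(r) for r in annual_rates_percent[1:]]
    return [r - f for r, f in zip(returns, risk_free)]


# Illustrative numbers only, not the project data.
example = excess_returns([100.0, 101.0, 102.01], [2.53, 2.53, 2.53])
```

With n prices this yields n - 1 excess returns, which is why a return series is one observation shorter than the price series it comes from.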

The remainder of the project is organized as follows. The exploratory analysis first reports summary statistics for the six stocks, then describes the time series plots, kernel density estimates, and normality tests, and finally chooses the distribution model that best fits our sample data. The portfolio theory and asset allocation part investigates the minimum variance portfolio and the tangency portfolio with and without short sales. The capital asset pricing model part focuses on finding the betas of the portfolios and of each stock and on testing whether a stock is mispriced. The value at risk part gives the threshold loss value for the portfolios and their expected shortfall at the 1% and 5% levels. Principal component analysis examines the components of the total variation of stock returns. The copula part analyzes the dependency among the six stocks and identifies the copula model that best fits the sample data. Finally, we experiment with resampling and efficient portfolio analysis to select the best shrinkage parameter for stabilizing the return estimates.

EXPLORATORY DATA ANALYSIS

STATISTICS REPORTING

	HSY_ac	AME_ac	CMG_ac	FOSL_ac	KSU_ac	MCK_ac
MEAN	-4.902 × 10^-4	-7.4017 × 10^-4	-0.0013	-9.9161 × 10^-4	-8.5507 × 10^-4	-6.8643 × 10^-4
MEDIAN	-3.2462 × 10^-4	-0.001	-0.0013	-8.3063 × 10^-4	-0.0013	-7.7845 × 10^-4
SD	0.0146	0.0198	0.0278	0.033	0.0278	0.0181
KURTOSIS	5.9755	5.9962	11.3501	28.988	5.0135	15.0344
SKEWNESS	-0.1992	0.1791	0.6735	1.3211	0.3361	0.2253

Skewness measures the degree of asymmetry, with symmetry implying zero skewness, positive skewness indicating a relatively long right tail compared to the left tail, and negative skewness indicating the opposite. Every normal distribution has a skewness coefficient of 0, so deviation of the sample skewness from 0 is an indication of nonnormality. The sample skewness results show that the log return distribution of HSY is slightly left-skewed, with a value of -0.1992, and that the log return distributions of the other five stocks are right-skewed, with the positive sample skewness values shown in the table above.

Kurtosis examines the extent to which probability is concentrated in the center and especially the tails of the distribution. Because kurtosis is particularly sensitive to tail weight, high kurtosis is nearly synonymous with having a heavy-tailed distribution. The kurtosis of a normal random variable is 3, and the excess kurtosis is the kurtosis minus 3. By examining the sample excess kurtosis values implied by the table above, we find that all six log return distributions have heavy tails: the sample excess kurtoses are well above 0, which indicates nonnormality. Heavy tails mean that extremely large returns, including extremely large negative returns, are more likely for each stock than a normal model would suggest.
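The two shape statistics discussed above can be sketched in a few lines of Python. This is an illustration rather than the code behind the table: it assumes the moment-based definitions with divisor n, and the function name is hypothetical.

```python
def skewness_and_kurtosis(x):
    """Return (skewness, kurtosis, excess kurtosis) from central moments with divisor n."""
    n = len(x)
    mean = sum(x) / n
    deviations = [v - mean for v in x]
    m2 = sum(d ** 2 for d in deviations) / n  # second central moment (variance)
    m3 = sum(d ** 3 for d in deviations) / n  # third central moment
    m4 = sum(d ** 4 for d in deviations) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness, kurtosis, kurtosis - 3.0


# A sample with one large positive value is right-skewed.
skew, kurt, excess = skewness_and_kurtosis([0.0, 0.0, 0.0, 0.0, 10.0])
```

A kurtosis above 3 (positive excess kurtosis), as in every column of the table, is the signature of tails heavier than the normal.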

Figure 1 shows time series plots of the daily prices of the six stocks. The plots do not indicate stationary time series: the prices exhibit an upward trend from 2009, with some fluctuations during the sample period. They also show that stock prices dropped significantly during the financial crisis. Therefore, we cannot conclude that the fluctuations are of constant size. The price of FOSL fluctuates more intensely around 2012, while HSY and MCK appreciated steadily throughout the period.

TIME SERIES PLOTS

Fig. 1. Time series plots of daily prices for HSY, AME, CMG, FOSL, KSU, and MCK.

DENSITY ESTIMATION & MODEL SELECTION

Fig. 3. Kernel density estimates of the log returns of the six stocks.

NORMALITY TEST: SHAPIRO TEST P-VALUE REPORT

Table 2

	HSY_ac	AME_ac	CMG_ac	FOSL_ac	KSU_ac	MCK_ac
P-Value	2.8356 × 10^-29	8.5938 × 10^-28	1.7972 × 10^-32	4.4683 × 10^-36	1.9184 × 10^-25	1.177 × 10^-35

A kernel density estimate is often used to suggest a parametric statistical model. In Figure 3, the kernel density plots do not look normal because of their heavy tails, which is consistent with the sample kurtosis values. To test whether the log returns of the stocks are normally distributed, we take as the null hypothesis that the sample comes from a normal distribution and as the alternative that it comes from a nonnormal distribution. The Shapiro-Wilk test is used for this hypothesis test, and the resulting p-values are shown in Table 2. For all six stocks, the Shapiro-Wilk test rejects the null hypothesis of normality at the 0.01 level, with extremely small p-values. A small p-value is interpreted as evidence that the sample is not from a normal distribution. Therefore, the normal distribution model is not appropriate for the sample data.

MODEL SELECTION

Table 3

HSY_ac DISTRIBUTION	aic	bic	number of parameters
T	-118.5314	-102.169	3
SKEWED T	-116.6199	-94.8033	4
GED	-74.8812	-58.5188	3
SKEWED GED	-73.13	-51.3134	4

AME_ac DISTRIBUTION	aic	bic	number of parameters
T	980.5509	996.9133	3
SKEWED T	981.7215	1003.5381	4
GED	995.8256	1012.1881	3
SKEWED GED	996.9075	1018.7241	4

CMG_ac DISTRIBUTION	aic	bic	number of parameters
T	2059.7382	2076.1006	3
SKEWED T	2060.9086	2082.7251	4
GED	2096.5439	2112.9063	3
SKEWED GED	2098.5379	2120.3545	4

FOSL_ac DISTRIBUTION	aic	bic	number of parameters
T	2569.4245	2585.7869	3
SKEWED T	2571.407	2593.2235	4
GED	2614.8917	2631.2541	3
SKEWED GED	2616.5095	2638.326	4

KSU_ac DISTRIBUTION	aic	bic	number of parameters
T	2216.8936	2233.2561	3
SKEWED T	2218.564	2240.3806	4
GED	2237.5476	2253.9101	3
SKEWED GED	2238.4245	2260.2411	4

MCK_ac DISTRIBUTION	aic	bic	number of parameters
T	442.044	458.4064	3
SKEWED T	459.7617	481.5782	4
GED	504.9828	521.3453	3
SKEWED GED	506.8025	528.619	4

CONCLUSION

Table 4

DISTRIBUTION	HSY_ac	AME_ac	CMG_ac	FOSL_ac	KSU_ac	MCK_ac
T DISTRIBUTION	BEST	BEST	BEST	BEST	BEST	BEST
SKEWED.T DIST.
GE DISTRIBUTION
SKEWED.GE DIST.

QQ-PLOT OF BEST FITTED DISTRIBUTION

Fig. 4. t-distribution QQ plots of HSY, AME, CMG, FOSL, KSU, and MCK.

We next focus on model selection to identify parametric statistical models that might be appropriate for the sample data of the six stocks. Four types of distribution are considered: the t-distribution, the skewed t-distribution, the generalized error distribution (GED), and the skewed generalized error distribution. For the log returns of each stock, we calculate AIC (Akaike's information criterion) and BIC (Bayesian information criterion) under each of the four distributions and choose the one with the minimum AIC or BIC value. AIC and BIC are two means of achieving a good tradeoff between fit and complexity. The results of the model selection are summarized in Table 3 and Table 4 (in Appendix). For all six stocks, the t-distribution has the smallest AIC and BIC and therefore provides a reasonably good fit to the log returns. Since the t-distribution has heavier tails than the normal distribution, it can capture the heavy tails of financial data.
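The two criteria can be written out explicitly. The sketch below uses the standard definitions AIC = -2 log L + 2k and BIC = -2 log L + k log n; it is an illustration, not the code used to produce Table 3. The sample size n = 1727 is an assumption (1728 prices give 1727 log returns), chosen because it reproduces the reported gap between AIC and BIC for HSY.

```python
import math


def aic(loglik, k):
    """Akaike's information criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return -2.0 * loglik + k * math.log(n)


def best_model(criterion_by_model):
    """Name of the model with the smallest criterion value."""
    return min(criterion_by_model, key=criterion_by_model.get)


# Log-likelihood implied by the reported AIC of the t fit to HSY (k = 3).
hsy_t_loglik = -(-118.5314 - 2 * 3) / 2.0
hsy_t_bic = bic(hsy_t_loglik, k=3, n=1727)  # n = 1727 is an assumption

# AIC values for HSY as reported in Table 3.
hsy_aic = {"T": -118.5314, "SKEWED T": -116.6199, "GED": -74.8812, "SKEWED GED": -73.13}
hsy_choice = best_model(hsy_aic)
```

Because BIC charges log n per parameter instead of 2, it penalizes the four-parameter skewed models more heavily than AIC does for a sample of this size.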

For large sample sizes, some deviation from linearity is to be expected in the extreme left and right tails, where the plots are more variable. Figure 4 contains quantile-quantile plots of the samples of size 1728 against t-distributions with 2.89, 2.92, 3.03, 3.09, 3.26, and 2.89 degrees of freedom for HSY, AME, CMG, FOSL, KSU, and MCK, respectively. For the log returns of each stock, the t-distribution with the corresponding degrees of freedom gives the best fit to the data: apart from a few outliers, the points lie close to the reference line and the tails show almost no remaining departure from it. It is worth noting that the historical sample data have more extreme outliers than a t-distribution. Heavy-tailed distributions with little or no skewness are common in finance, and thus the t-distribution is a reasonable model for the stock log returns in our sample data.

Portfolio Management

Portfolio without short sales

Weights of tangency portfolio : 0.164, 0.048, 0.356, 0.105, 0.034, 0.292

Risk and mean of tangency portfolio : 1.67, 0.12

Sharpe ratio of tangency portfolio : 0.069

Weights of minimum variance portfolio : 0.578, 0.124, 0.035, -3.16 × 10^-19, 3.596 × 10^-18, 0.263

Risk and mean of minimum variance portfolio : 1.294, 0.074

Sharpe ratio of minimum variance portfolio : 0.055

Percentile-method bootstrap confidence interval for the Sharpe ratio without short selling

5%	95%
0.09	0.16
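A percentile-method bootstrap interval of this kind can be sketched as follows. This is a simplified illustration under stated assumptions: it resamples a single series of portfolio excess returns with fixed weights, whereas the study in this report may re-estimate the tangency portfolio on each resample. The function names, the seed, and the demo data are hypothetical.

```python
import random
import statistics


def sharpe_ratio(excess_returns):
    """Mean excess return divided by its sample standard deviation."""
    return statistics.fmean(excess_returns) / statistics.stdev(excess_returns)


def bootstrap_percentile_ci(excess_returns, n_boot=300, lower=0.05, upper=0.95, seed=0):
    """Percentile-method interval: resample with replacement, sort, read off quantiles."""
    rng = random.Random(seed)
    n = len(excess_returns)
    ratios = sorted(
        sharpe_ratio(rng.choices(excess_returns, k=n)) for _ in range(n_boot)
    )
    lo = ratios[int(lower * (n_boot - 1))]
    hi = ratios[int(upper * (n_boot - 1))]
    return lo, hi


# Simulated excess returns for illustration only, not the project data.
demo_rng = random.Random(1)
demo_returns = [demo_rng.gauss(0.001, 0.015) for _ in range(500)]
demo_ci = bootstrap_percentile_ci(demo_returns)
```

The 5% and 95% order statistics of the resampled Sharpe ratios form the two endpoints, matching the column labels of the table above.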

Portfolio with short sales

Weights of tangency portfolio : 0.162, 0.047, 0.358, 0.106, 0.035, 0.292

Risk and mean of tangency portfolio : 1.675, 0.12

Sharpe ratio of tangency portfolio : 0.069

Weights of minimum variance portfolio : 0.586, 0.169, 0.051, -0.024, -0.052, 0.27

Risk and mean of minimum variance portfolio : 1.287, 0.072

Sharpe ratio of minimum variance portfolio : 0.053

Percentile-method bootstrap confidence interval for the Sharpe ratio with short selling

5%	95%
0.09	0.162

Choosing the best shrinkage parameter for a more robust portfolio and Sharpe ratio

In the preceding portfolio analysis, the expected return of each stock is estimated by its sample mean. However, this estimate tends to overstate the Sharpe ratio. A bootstrap study, which treats the price data as the true population and resamples from it, shows clearly that the Sharpe ratio is overestimated. Figure 9 is the boxplot of the Sharpe ratios with short selling computed from 300 resamples, and Figure 10 is the corresponding boxplot without short selling; in both, the line marks the actual Sharpe ratio. Comparing the two results, we see that constraining short selling helps stabilize the estimate. However, the estimate is still too volatile. One way to address this problem is to use a shrinkage parameter to stabilize the estimate. We conduct bootstrap studies of 8 shrinkage parameters to choose the best one.

The red line marks the original Sharpe ratio of the tangency portfolio.

Shrinkage parameter = .1

5%	95%
0.088	0.172

shrinkage parameter = .2

5%	95%
0.083	0.178

shrinkage parameter = .3

5%	95%
0.082	0.168

shrinkage parameter = .4

5%	95%
0.08	0.166

shrinkage parameter = .5

5%	95%
0.079	0.165

shrinkage parameter = .6

5%	95%
0.085	0.168

shrinkage parameter = .7

5%	95%
0.08	0.164

shrinkage parameter = .8

5%	95%
0.086	0.165

Comparing how well each value stabilizes the estimate and how it trades off bias against variance, we recommend α = 0.5. Even so, the Sharpe ratio still tends to be overestimated. To estimate returns more accurately, we could take advantage of factor models to predict returns.
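One common form of shrinkage for mean returns is sketched below. The report does not spell out its exact estimator, so this is an assumption: each sample mean is pulled toward the grand mean of all assets, with α as the weight on the grand mean. The function name is hypothetical.

```python
def shrink_means(mean_returns, alpha):
    """Shrink each asset's sample mean toward the grand mean.

    alpha = 0 keeps the sample means; alpha = 1 replaces them all by the grand mean.
    The shrinkage target and the role of alpha are assumptions for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    grand_mean = sum(mean_returns) / len(mean_returns)
    return [(1.0 - alpha) * m + alpha * grand_mean for m in mean_returns]


# Illustrative numbers only, with the recommended alpha = 0.5.
shrunk = shrink_means([1.0, 2.0, 3.0], alpha=0.5)
```

Shrinking narrows the spread of the estimated means, which introduces some bias but reduces the variance that drives the overestimated Sharpe ratio.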

Security Market Lines

The capital asset pricing model (CAPM) provides estimates of expected rates of return on individual stocks. Specifically, the risk premium of an asset is the product of its beta and the risk premium of the market portfolio. We use the following regression model to estimate α and β for each stock, the tangency portfolio, and the minimum variance portfolio: $R^*_{j,t} = \alpha_j + \beta_j R^*_{M,t} + \epsilon_{j,t}$, where $R^*_{j,t} = R_{j,t} - \mu_{f,t}$ is the excess return on the jth stock at time t, and $R^*_{M,t} = R_{M,t} - \mu_{f,t}$ is the excess return on the market portfolio. The regression results are summarized in Table 10 below.

Table 10. Formula: lm(formula = stockExRet ~ market)

Coefficients	HSY_AC	AME_AC	CMG_AC	FOSL_AC	KSU_AC	MCK_AC	Tangency Portfolio	Minimum Variance Portfolio
(Intercept)	0.0004139	0.0007768	-0.0007538	0.0009624	0.0004658	0.0004705	0.00009	0.0004
market	0.4598089	1.1768854	1.3087824	1.9838916	1.3976591	0.4945886	0.999	0.588
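The intercept and slope in Table 10 come from a simple linear regression of a stock's excess return on the market's excess return. A minimal Python sketch of the least-squares formulas is given below; it is an illustration with a hypothetical function name and made-up data, not the `lm` call used in the report.

```python
def capm_fit(stock_excess, market_excess):
    """Ordinary least squares estimates (alpha, beta) for stock = alpha + beta * market."""
    n = len(market_excess)
    mean_x = sum(market_excess) / n
    mean_y = sum(stock_excess) / n
    sxx = sum((x - mean_x) ** 2 for x in market_excess)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(market_excess, stock_excess)
    )
    beta = sxy / sxx                 # slope: sensitivity to the market
    alpha = mean_y - beta * mean_x   # intercept: average return not explained by beta
    return alpha, beta


# Noise-free illustrative data, so the fit recovers the inputs exactly.
market = [0.010, -0.020, 0.015, 0.000]
stock = [0.0004 + 1.2 * m for m in market]
alpha_hat, beta_hat = capm_fit(stock, market)
```

Under the CAPM the intercept should be zero, so an estimated α that is significantly different from zero is the usual indication that a stock is mispriced.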

Principal Components Analysis

Principal component analysis, often called PCA, finds structure in the covariance or correlation matrix and uses this structure to locate low-dimensional subspaces containing most of the variation in the data. The covariance matrix is used in our analysis because the variables are comparable and in the same units. From Table 12, one can see that the standard deviation of the first principal component is 4.427 and represents 55.6% of the total variance. The first principal component is the normed linear combination with the greatest variance. Also, the first three principal components have 83.4% of the variation, and this increases to 91.1% for the first four principal components and to 95.6% for the first five. The variances (the squares of the first row in Table 12) are plotted in Figure 7.

Fig. 7. Variances of the principal components.

pc.var = 19.599, 5.276, 4.539, 2.717, 1.606, 1.54

Fig. 8. The first three eigenvectors (PC).

Table 12. Standard Deviation and Proportion of Variance of Principal Components

Label	Comp.1	Comp.2	Comp.3	Comp.4	Comp.5	Comp.6
Standard deviation	4.427	2.297	2.131	1.648	1.267	1.241
Proportion of Variance	0.556	0.150	0.129	0.077	0.046	0.044
Cumulative Proportion	0.556	0.705	0.834	0.911	0.956	1.000
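The proportions in Table 12 follow directly from the component variances listed as pc.var. The short Python sketch below, with hypothetical function names, reproduces them and counts how many components are needed to reach a given share of the total variance.

```python
from itertools import accumulate

# Component variances reported as pc.var (squares of the standard deviations).
pc_var = [19.599, 5.276, 4.539, 2.717, 1.606, 1.54]


def variance_proportions(variances):
    """Proportion and cumulative proportion of total variance per component."""
    total = sum(variances)
    proportions = [v / total for v in variances]
    cumulative = list(accumulate(proportions))
    return proportions, cumulative


def components_needed(cumulative, threshold):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    for count, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return count
    return len(cumulative)


proportions, cumulative = variance_proportions(pc_var)
n_for_95 = components_needed(cumulative, 0.95)
```

Rounded to three decimals, the computed shares agree with the table, and five components are needed to pass the 95% mark.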

Based on the results of the principal component analysis, the first five principal components are needed to explain at least 95% of the variation in the stock returns. Since these five components account for nearly all of the variation, we can work solely with them and discard the last one. One possible reason that five components are needed is that the six stocks are not highly correlated with each other.

The first three eigenvectors, labeled “PC,” are plotted in Figure 8. The eigenvectors have interesting interpretations. The first eigenvector has only negative values, so returns in this direction are either positive for all of the stocks or negative for all of them. The second eigenvector is negative for FOSL (stock 4) and positive for the other stocks, so variation along this eigenvector has FOSL moving in the opposite direction from the other stocks. Therefore, it is not surprising that the second principal component has 15% of the variation. The third principal component is less easy to interpret, but its loading on CMG (stock 3) is higher than on the other stocks, which might indicate that there is something different about CMG.

Dependency Analysis between Assets Using Copulas

A copula is a multivariate probability distribution whose marginal distributions are all uniform. Copulas are used to describe the dependence between random variables. An appropriate copula model for the dependency among the six stocks in our portfolio can be selected by fitting candidates with the maximum likelihood method. Four parametric copulas are fit to the sample data: t, Gaussian, Gumbel, and Clayton, with corresponding AIC values of -3852, -3236, -2516, and -2604, respectively. The t-copula fits the data best since it minimizes AIC. The fitted t-copula has 5.41 degrees of freedom, and its correlation matrix is displayed in Table 13. The correlations, which range from 0.316 to 0.642, show that the six stocks in our portfolio are positively correlated, but not highly correlated.

Table 13. Correlation Matrix of the fitted t-copula

STOCKS	HSY_AC	AME_AC	CMG_AC	FOSL_AC	KSU_AC	MCK_AC
HSY_AC	1.000	0.430	0.332	0.369	0.389	0.382
AME_AC	0.430	1.000	0.496	0.576	0.642	0.480
CMG_AC	0.332	0.496	1.000	0.503	0.489	0.316
FOSL_AC	0.369	0.576	0.503	1.000	0.547	0.422
KSU_AC	0.389	0.642	0.489	0.547	1.000	0.424
MCK_AC	0.382	0.480	0.316	0.422	0.424	1.000

Value at Risk

Risk Analyses on $1 Million Portfolios

Table 11

Label	Tangency Portfolio without shorting	Min. Var Portfolio without shorting
Percentile	5% / 1%	5% / 1%
VaR (t dist)	$23,778 / $46,615	$18,405 / $35,586
ES (t dist)	$39,596 / $71,932	$30,251 / $54,285
VaR (Normal dist)	$26,275 / $37,657	$20,546 / $29,366
ES (Normal dist)	$33,254 / $43,317	$25,954 / $33,752
90% C.I. (Non-parametric Bootstrapping)	$24,722 ~ $27,933	$19,239 ~ $21,753
90% C.I. (Parametric Bootstrapping)	$25,199 ~ $27,247	$19,782 ~ $21,287

Table 11

Label	Tangency Portfolio with shorting	Min. Var Portfolio with shorting
Percentile	5% / 1%	5% / 1%
VaR (t dist)	$23,808 / $46,660	$18,241 / $34,556
ES (t dist)	$39,634 / $71,978	$29,383 / $51,621
VaR (Normal dist)	$26,350 / $37,755	$20,451 / $29,224
ES (Normal dist)	$33,348 / $43,430	$25,830 / $33,586
90% C.I. (Non-parametric Bootstrapping)	$24,739 ~ $27,886	$19,141 ~ $21,433
90% C.I. (Parametric Bootstrapping)	$25,215 ~ $27,410	$19,700 ~ $21,178

Risk has always been an important factor in finance. Value at Risk (VaR) and Expected Shortfall (ES) are two of the most commonly used measures for evaluating risk. VaR is a bound such that the loss over the horizon is less than this bound with probability equal to the confidence coefficient. In our example, VaR(0.05) = $23,778 means there is a 5% chance that a $1 million tangency portfolio loses more than $23,778. The expected loss given that the loss exceeds VaR is called the Expected Shortfall, or ES.

Here we consider 4 types of portfolios, each with $1 million invested and different asset weights. We fit 2 different distributions (t and normal) to gauge the effect of the distributional assumption. We also use 500 bootstrap resamples to calculate the 90% confidence interval for VaR(0.05) under parametric and non-parametric models.

Based on the results in Table 11, the tangency portfolio has greater VaR and ES than the minimum variance portfolio regardless of the underlying distribution. As expected, VaR and ES are smaller at the 5% level than at the 1% level. An interesting difference between the t-distribution and the normal is that at the 5% level VaR under the t is less than VaR under the normal, but the relationship reverses at the 1% level. This may be because the t-distribution has a fatter tail than the normal distribution. The 90% confidence intervals for VaR(0.05) contain the normal-based estimates, while the t-based estimates fall slightly below them.
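The normal-based entries of Table 11 can be approximated with the closed-form expressions for VaR and ES under a normal distribution. The sketch below is an illustration, not the original computation. It assumes that the reported risk and mean of the tangency portfolio without short sales (1.67 and 0.12) are daily figures in percent, and it uses these rounded values, so the results differ from the table by a few dollars.

```python
from statistics import NormalDist


def normal_var_es(mu, sigma, alpha, position=1_000_000):
    """VaR and ES of a position whose return is Normal(mu, sigma), at tail level alpha."""
    standard = NormalDist()
    z = standard.inv_cdf(1.0 - alpha)  # upper alpha-quantile of the standard normal
    var = position * (-mu + sigma * z)
    es = position * (-mu + sigma * standard.pdf(z) / alpha)
    return var, es


# Assumed daily mean 0.12% and risk 1.67%, rounded values from the report.
var_05, es_05 = normal_var_es(mu=0.0012, sigma=0.0167, alpha=0.05)
var_01, es_01 = normal_var_es(mu=0.0012, sigma=0.0167, alpha=0.01)
```

ES always exceeds VaR at the same level because it averages the losses beyond the VaR threshold, which matches the ordering of the rows in the table.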