Introduction and Definitions

Valuation of the stock market, even if far removed from the daily lives of most people, is nevertheless an important part of the functioning of our society and democracy. It is an interaction between investors that yields some ‘price’, and from there many other possibilities follow, for example whether the economy can hire extra workers or whether pay can be increased, leading to a higher standard of living. Understanding the mechanism of that process is therefore highly desirable. Fairly valued businesses contribute to a fairer society, stable employment and greater well-being for every citizen.

Relative valuation is a widespread method for valuing companies and businesses. Sound logic would demand that any company be assessed by gauging its own prospects and the details of its current situation, the so-called ‘intrinsic’ value, but in reality that is rarely done. More often, companies are compared to other companies and in that way are ‘relatively’ valued. Naturally, if the whole market is overvalued, this method overvalues every company. The most obvious example was the dotcom bubble, and many people have argued that the recent market is also overvalued.

One important parameter used to determine how much more or less a business is valued is the price-to-earnings ratio (PE). The idea is very simple: it is the price paid per unit of earnings, so the more we pay, the higher the valuation of the business, and the reverse. Other parameters are also in use, e.g. price to book (considering the accounting or ‘book’ value of assets) and enterprise value (EV, the sum of equity and debt) to earnings before interest, tax, depreciation, and amortization (EBITDA). Andrew Smithers argues for yet another approach called the ‘q-ratio’, which is the quotient of the value of the stock market and corporate net worth. Again, a high ratio indicates overvaluation and a low one the reverse. He also argues that the ratio has a natural mean (i.e. is mean-reverting), implying that whenever q is too high it is bound to come down. That was most prominently seen in the late 90s and the beginning of this century. Mr. Smithers also argues that the PE ratio, if properly adjusted for cyclicality and inflation, could yield results very similar to those of q.
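The multiples named above are simple quotients, which a short sketch can make concrete. The firm and all of its figures below are made up for illustration, and the class and function names are hypothetical rather than taken from the data set.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    """Hypothetical firm; all amounts in the same currency unit (e.g. $ billions)."""
    market_cap: float    # market value of equity (the 'price')
    net_income: float    # earnings
    book_equity: float   # accounting ('book') value
    debt: float
    ebitda: float


def price_to_earnings(firm: Firm) -> float:
    """Price paid per unit of earnings."""
    return firm.market_cap / firm.net_income


def price_to_book(firm: Firm) -> float:
    return firm.market_cap / firm.book_equity


def ev_to_ebitda(firm: Firm) -> float:
    # EV is taken as equity plus debt, as in the text.
    # Many practitioners also subtract cash; that refinement is omitted here.
    enterprise_value = firm.market_cap + firm.debt
    return enterprise_value / firm.ebitda


example = Firm(market_cap=200.0, net_income=10.0, book_equity=50.0,
               debt=40.0, ebitda=24.0)
print(price_to_earnings(example))  # 20.0
print(price_to_book(example))      # 4.0
print(ev_to_ebitda(example))       # 10.0
```

A higher number means more is paid per unit of earnings, book value or EBITDA, which is the sense in which a multiple signals a richer valuation.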

Prof. Robert Shiller, on the other hand, publishes an index with a specific methodology called the cyclically adjusted PE (CAPE). As the name suggests, it smooths the ordinary PE into a new ratio that removes excessive volatility and cyclicality and yields a fairer gauge of the market. The CAPE ratio is calculated over quite a long period, from the late 19th century to this day, which implicitly suggests that there is some ‘average’ value that is in a sense ‘normal’. Prof. Shiller stresses that his paper does not imply that, i.e. he states that there could be a different regime, or that the relation may be non-linear (he uses linear models); nevertheless he argues that the current values of the CAPE are significantly higher than the average of more than 100 years.
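The smoothing idea can be sketched in a few lines. The sketch assumes the usual description of the CAPE as the price divided by the average of ten years of inflation-adjusted earnings; the earnings series and the price are invented, and the function name is hypothetical.

```python
from statistics import mean


def cape(real_price, real_earnings_history):
    """Price divided by the average of trailing inflation-adjusted earnings."""
    if not real_earnings_history:
        raise ValueError("at least one year of earnings is required")
    return real_price / mean(real_earnings_history)


# Ten years of made-up real earnings per share, with a slump in years 4 and 5.
earnings = [4.0, 5.0, 6.0, 2.0, 3.0, 5.0, 6.0, 7.0, 6.0, 6.0]
price = 100.0

ordinary_pe = price / earnings[-1]   # uses only the latest year
smoothed_pe = cape(price, earnings)  # uses the ten-year average

print(round(ordinary_pe, 2))  # 16.67
print(smoothed_pe)            # 20.0
```

Because the denominator averages over a full cycle, a single unusually good or bad year moves the smoothed ratio far less than it moves the ordinary PE.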

Of course all these values are determined by the market, meaning that everything depends on how market participants interpret these measures. If investors believe that higher PE ratios are justified, they will stay that way. The idea of averages is an appealing one, not least because it is widely used in the natural sciences (e.g. physics). For example, the rate of radioactive decay of a particular element has, as a property, some average value determined by clear and objective characteristics. So, when thinking of a social or economic characteristic like the PE, we tend to assume the same logic as in nature. That may be warranted sometimes, but not always. For example, governments have a duty to provide economic growth and opportunities for each citizen, but that cannot be taken for granted. In order to achieve growth in the economy some parts of it must prosper and inevitably some will be less in favor. In a democratic society that issue may not always be resolved; it has been done successfully so far, and may be done somehow in the future, but it is by no means guaranteed. We have seen very low growth rates recently in Europe, Japan and even in the US. If governments manage to revive the economy we may see the high valuation multiples endure, but this remains to be seen.

I argue in this analysis that other factors also play a role in determining how much a company should be valued ‘on average’ (that is, relatively), and that by controlling for these factors one gains a better perception of market conditions. I confine my analysis to the PE ratio and try to show that looking at the multiple by itself may not yield any precise insight into how a company is valued. Being above or below some average is therefore not indicative of how ‘fairly’ a business is valued. Rather, many other factors play a role in the formation of any particular multiple. For the PE, it matters how fast the company will grow in the future, how much it pays its shareholders, how risky it is, and more.

The Data Analysis

Data collection

Data are gathered from Capital IQ, a financial information service provider (parent organization S&P, McGraw Hill). They cover mainly major companies around the world, i.e. those with very large market capitalizations (in US dollars). Financial institutions are excluded. Summing the market capitalization variable yields approximately 17.6 trillion dollars. The units of observation are individual companies, and for each firm a range of financial information is provided.

Variables

The variables consist of different financial measures for each of the companies. Most of them are numerical. For example, the essential numbers from the balance sheet, income and cash flow statements are gathered. In addition, there are other statistics such as measures of risk (beta or the standard deviation), analysts’ expected growth for different variables, and some information about corporate governance and ownership. There are 124 variables in total, not all of which are used in the analysis. New variables are created for the various valuation multiples and ratios, i.e. price/earnings, price/book, payout ratio, etc.

The type of study

The study is firmly observational. The companies are not chosen randomly; they are selected based on their market capitalization, i.e. small firms are deliberately excluded. Therefore many potential biases could be discerned here from a purely statistical vantage point, though in business many people completely ignore this facet and act as though they have ideal conditions.

Scope of inference–generalizability

The population of interest could be described as the relations between underlying factors and the valuation multiple of a company. From a theoretical point of view there should be strict rules that determine how a firm is valued. Since the sample is not random, we cannot generalize our findings to all companies in the market. And because we are dealing only with very large firms, the results may be generalized only to that subsegment, though the fact that different countries are involved could induce some extra bias owing to different reporting standards.

Scope of inference–causality

Causal conclusions require random assignment, and here that is not the case. We certainly cannot maintain, based on this analysis, that some factor ‘caused’ our multiple. Only a degree of association can validly be established, with the caveat that our sample represents the situation in the market at a particular time. It goes without saying that in different conditions the ‘association’ could, and almost certainly would, change. I must emphasize that many people overlook these ‘obvious’ elements of analysis before drawing conclusions.

Summary statistics–interpretation

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.899  15.890  21.730  34.550  29.140 911.700       6
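To see why a few extreme values pull the mean away from the median, the following sketch uses the 25 PE values from the appendix excerpt (not the full sample, so the figures differ from the summary above). Appending the sample maximum of 911.7 to the excerpt is purely an illustrative assumption.

```python
from statistics import mean, median

# PE values of the 25 firms shown in the appendix excerpt.
pe_excerpt = [
    12.16, 26.14, 10.64, 23.53, 9.94, 10.55, 12.52, 22.28, 17.94, 12.11,
    8.86, 18.46, 29.05, 10.81, 14.31, 21.6, 27.19, 9.65, 7.38, 9.69,
    15.84, 18.43, 8.91, 17.9, 36.03,
]

print(round(mean(pe_excerpt), 2), median(pe_excerpt))  # 16.48 14.31

# Illustration only: add the largest PE reported in the summary output.
with_outlier = pe_excerpt + [911.7]
print(round(mean(with_outlier), 2), round(median(with_outlier), 3))  # 50.91 15.075
```

One extreme observation roughly triples the mean of the excerpt, while the median barely moves.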

From the summary above it can be seen that the data are skewed: the mean is 34.55, whereas the median is 21.73, and the range is huge, around 910. All that suggests that the median gives a much clearer picture of the PE multiple. The skewness of the data could also somewhat distort the end results, since the data almost certainly depart from ‘normality’. That should be considered further. On the other hand, the ‘explanatory’ variables are important in themselves, namely the correlation between them. That can be seen here:

##           [,1]       [,2]       [,3]
## [1,] 1.0000000  0.1073924  0.1685057
## [2,] 0.1073924  1.0000000 -0.1388013
## [3,] 0.1685057 -0.1388013  1.0000000

In order for multiple regression to yield reliable results, the explanatory factors should be as weakly correlated with one another as possible. Reassuringly, our variables display that: the pairwise correlations lie between about .11 and .17 in absolute value, which in my view is not high and bodes well for the accuracy of the final conclusions.
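A correlation matrix of this kind can be computed as in the sketch below. It uses only the 25 rows of the appendix excerpt, so its numbers will not match the matrix above, which is based on the full sample. The column order (beta, growth, payout) and the dropping of rows with missing values are assumptions made for the illustration.

```python
import math

NAN = float("nan")

# (beta, growth, payout) for the firms in the appendix excerpt; NaN = missing.
rows = [
    (0.69, 0.03, 0.40), (1.07, 0.04, 1.11), (0.93, 0.07, 0.28),
    (0.80, 0.13, 0.13), (0.90, 0.05, 0.14), (1.46, 0.14, 0.34),
    (1.35, 0.15, 0.47), (0.87, 0.14, 0.30), (0.90, 0.12, 0.27),
    (0.97, 0.20, 0.48), (0.56, 0.04, 0.38), (0.76, 0.06, 0.89),
    (0.65, 0.09, 0.28), (1.46, 0.07, 0.25), (0.45, 0.07, 0.43),
    (0.75, 0.13, 0.34), (0.71, 0.12, NAN), (NAN, 0.12, 0.29),
    (0.80, 0.43, 0.43), (1.38, 0.05, 0.31), (0.55, 0.04, 0.49),
    (1.08, 0.10, 0.43), (1.58, 0.01, 0.31), (0.67, 0.10, 0.35),
    (1.32, 0.09, 0.85),
]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    return [[pearson(a, b) for b in columns] for a in columns]


# Keep only rows where all three variables are observed.
complete = [r for r in rows if not any(math.isnan(v) for v in r)]
columns = list(zip(*complete))  # one tuple per variable
matrix = correlation_matrix(columns)

for line in matrix:
    print(["%.3f" % value for value in line])
```

Off-diagonal entries close to zero indicate that the explanatory variables carry largely separate information.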

Visualizations

The histogram and scatter plot of the PE multiple confirm the findings from the summary: highly skewed data with a long right tail and a few significant outliers. All this could to some extent violate the assumptions of a diligent analysis, but the data are what they are. A scatter plot of each variable against the dependent one is also provided here:

Again, some significant outliers can be seen, and some positive relation can be discerned between the growth rate, the payout ratio and the PE multiple. The risk, measured by ‘beta’, on the other hand appears to be relatively constant regardless of the value of the PE. The significant outliers and the departure from ‘normality’ suggest that the residuals of the regression should be analysed. Based on the scatter plots, positive coefficients should be expected for the growth rate and the payout ratio. The plot of the risk measure also shows that, for some reason, the companies that are about as risky as the market as a whole (i.e. have a ‘beta’ approximately equal to one) have higher PE ratios.

Inference and/or modeling

Stating the hypothesis

The null hypothesis is that the variables growth, risk, and payout do not influence or determine the valuation multiple PE. With respect to the research question, that would mean that we could not control for extra variables to distinguish between different values of the PE. The alternative hypothesis is that at least some of the variables can indeed ‘explain’ our multiple.
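In the notation of the regression model given under ‘Interpretations of the regression output’, where \(\gamma_1\), \(\gamma_2\) and \(\gamma_3\) are the coefficients of beta, growth and payout, these hypotheses can be restated compactly. This is only a rewriting of the paragraph above, not an additional assumption:

\[H_0: \gamma_1 = \gamma_2 = \gamma_3 = 0 \qquad \text{versus} \qquad H_A: \gamma_j \neq 0 \text{ for at least one } j \in \{1, 2, 3\}\]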

Conditions for statistical inference

The first condition, a random sample, is not met, though perhaps big companies better represent what a company should be doing. In any case, we should expect some unreliability. The number of observations is n=184, which is probably large enough for the regression, bearing in mind, however, that there are thousands of listed companies and many more private or unlisted ones. As for the condition of independence, again it may not be rigorously fulfilled, but it can be argued that the observations are relatively independent, since the large companies are in different industries, and being large probably does not influence the other factors.

The method used

Multiple linear regression is used to disentangle the relations between the PE ratio and factors such as risk, potential growth, and payout ratio. That is a common, widely used approach, and it assumes a linear relation between the variables. The method estimates the association of each variable with the response while holding the others fixed. It should be noted that there is inference on the model as a whole and inference on individual coefficients. The selection of the ‘best’ model is also an important part of the analysis. I tried many different models and the one presented works best. For example, the standard deviation of returns could be used as the measure of risk, but I found that ‘beta’ works better. Other variables could also be included, or a weighted regression could be used with weights equal to the size of each company.
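For readers who want to see what such a fit involves, here is a minimal sketch of ordinary least squares via the normal equations, written without external libraries. It is not the code used for this analysis: the data are synthetic, generated from made-up coefficients so that the fit can be checked, and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear regressors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(regressors, response):
    """Return [intercept, slope_1, ...] minimising the squared residuals."""
    design = [[1.0] + list(row) for row in regressors]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic firms: (beta, growth in %, payout in %).
firms = [(0.7, 3.0, 40.0), (1.1, 4.0, 110.0), (0.9, 7.0, 28.0),
         (0.8, 13.0, 13.0), (1.5, 14.0, 34.0), (1.3, 15.0, 47.0)]
true_coef = [2.0, -1.0, 0.5, 0.25]  # made-up values, no noise added
pe = [true_coef[0] + sum(c * v for c, v in zip(true_coef[1:], f))
      for f in firms]

estimated = fit_ols(firms, pe)
print([round(c, 6) for c in estimated])
```

Because the synthetic response contains no noise, the fit recovers the made-up coefficients up to rounding. With real data the estimates come with standard errors, which is what the inference discussed in this section is about.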

Interpretations of the regression output

The theoretical model of the regression is: \[\text{PE} = \gamma_0 + \gamma_1\,\text{beta} + \gamma_2\,\text{growth} + \gamma_3\,\text{payout} + u\] The output:

##                   Estimate  Std. Error    t value     Pr(>|t|)
## (Intercept)     -3.2239330 3.933777711 -0.8195514 4.137303e-01
## beta            -2.8462458 3.868408513 -0.7357666 4.629847e-01
## I(100 * growth)  0.9853907 0.198102995  4.9741335 1.722502e-06
## I(100 * payout)  0.3302793 0.008073118 40.9109980 5.748259e-85

It should be noted that I multiplied the growth and payout variables by 100 in order to obtain smaller coefficients and ease interpretation: each coefficient then refers to a change of one percentage point.
First, the hypothesis that no variable is significant is rejected by the highly significant F-statistic. The individual coefficients of the variables ‘growth’ and ‘payout’ are also ‘significant’, i.e. we can reject the hypothesis that they are zero. The ‘beta’ coefficient, on the other hand, turned out not to be ‘significant’, i.e. we cannot reject the hypothesis that the true coefficient is zero and the estimate arose by pure luck. Nevertheless I decided to keep it in the model, since it is an important parameter and its interplay with the other variables may be beneficial. The p-values can be interpreted as follows: for ‘beta’, if the true coefficient were zero there would be a .463 probability of obtaining an estimate at least this far from zero by chance, which is quite high; normally a p-value below .05 is considered ‘significant’, i.e. the probability is low enough that we reject the null. That is the case with the ‘growth’ and ‘payout’ variables: the probabilities are 1.72e-06 and 5.75e-85, indeed vanishingly small. Therefore we conclude that these coefficients ‘can be believed’. The 95% two-sided confidence intervals can be seen in the following output:

##               [,1]      [,2]
## PE     -10.4878502 4.7953586
## growth   0.5940606 1.3767209
## payout   0.3143318 0.3462269

Once again it can be ascertained that 0 lies within the interval for ‘beta’ (the first row, labelled ‘PE’ in the output) but not within those for the other variables. These numbers can be interpreted as follows: we have 95% confidence that the ‘true’ values lie within those boundaries. Finally, the economic interpretation of the coefficients is that a one percentage point increase in the growth prospects of a firm is associated with a .99 increase in the PE ratio, holding the other variables fixed. Similarly, a one percentage point increase in the payout ratio is associated with a .33 increase in the valuation multiple. The risk coefficient, although insignificant, could be read as saying that a one-unit increase in ‘beta’ would decrease the PE by 2.85; that number could be the result of mere chance, but it nevertheless gives some intuition. Lastly, a quick look at the plot of the residuals: it is not very bad. There are some large outliers, but all in all the residuals look roughly ‘normally’ distributed and fairly random, which is to be desired.
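The t values and confidence intervals reported above follow directly from each estimate and its standard error, as the sketch below shows. It uses the normal distribution instead of the t distribution, which is an approximation I introduce to stay within the standard library, so the intervals and the p-value come out slightly different from the reported ones. The function names are hypothetical.

```python
from statistics import NormalDist

# (estimate, standard error) copied from the regression output.
coefficients = {
    "beta": (-2.8462458, 3.868408513),
    "growth": (0.9853907, 0.198102995),
    "payout": (0.3302793, 0.008073118),
}


def t_value(estimate, std_error):
    """Distance of the estimate from zero, in standard errors."""
    return estimate / std_error


def approx_interval(estimate, std_error, level=0.95):
    """Two-sided interval using the normal quantile (about 1.96 for 95%)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


def approx_two_sided_p(estimate, std_error):
    """Normal approximation to the two-sided p-value.

    It is rough far out in the tails, so it is applied only to 'beta' below.
    """
    return 2 * (1 - NormalDist().cdf(abs(t_value(estimate, std_error))))


for name, (estimate, std_error) in coefficients.items():
    low, high = approx_interval(estimate, std_error)
    print(f"{name}: t = {t_value(estimate, std_error):.4f}, "
          f"95% interval about ({low:.3f}, {high:.3f})")

p_beta = approx_two_sided_p(*coefficients["beta"])
print(f"beta: approximate two-sided p-value {p_beta:.3f}")
```

The interval for ‘beta’ straddles zero while those for ‘growth’ and ‘payout’ do not, which is the same conclusion as the one drawn from the p-values.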

Potential Criticism

Some readers could object that my findings are too broad, and that it is obvious that key characteristics should influence any multiple. That is true, but finding the correct way to quantify that influence is the key. My analysis is one way of thinking; others could of course take a different approach. It should also be noted that prof. Shiller’s index concerns time series data, whereas my data are cross-sectional. Nonetheless, I still think that the sentiment of the market at different times determines the valuation multiple, and that it is crucial to take that information into account.

Conclusion

Summary

Based on the data at hand for the 184 large companies, it can be concluded that valuation multiples do indeed reflect the underlying specifics of different businesses. Intuitively, higher expected future growth does on average yield a higher valuation, and the more a company pays its shareholders the higher it is valued. That makes sense. In this sample the risk associated with a company is not found to bother investors, which could perhaps be attributed to the specific market conditions at the time.

The average value over time of a particular multiple (in this case the PE) may not always be a good indicator of whether businesses are ‘fairly’ valued. Ingrained in any multiple are implicit factors which determine its ultimate value. Therefore, given these factors, we could expect different values for different indicators. Again, everything depends on whether the market really recognizes such factors: at some points it does, and sometimes it does not. The goal of every investor is to make a careful judgement about the particular market conditions at different times. Those conditions, needless to say, could change, leading perhaps to a different conclusion. This analysis pertains only to a specific point in time. Taking a snapshot at a different time could result in a different conclusion; nevertheless, it still supports the main idea that any multiple is driven by specific characteristics of a company, and that taking them into account gives us a more robust yardstick for measuring the real value of a business.

Discussion about what was learned

The model found seems to explain the data relatively well. The R-squared of .91 is really high for economic data, and the residuals also look fine. So we could use the coefficients thus found to control for the characteristics (risk, growth, payout) of other companies and determine whether they are relatively over- or undervalued compared with the market average. That way a somewhat better insight could be gleaned with the help of simple basic statistics.
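As a rough illustration of that use, the sketch below plugs the characteristics of two firms from the appendix excerpt into the fitted equation and compares the result with their actual PE. The coefficients are copied from the regression output; the function name and the choice of firms are mine, and the comparison ignores the uncertainty in the coefficients.

```python
# Coefficients copied from the regression output (growth and payout in
# percentage points, as in the fitted model).
COEF = {
    "intercept": -3.2239330,
    "beta": -2.8462458,
    "growth_pct": 0.9853907,
    "payout_pct": 0.3302793,
}


def fitted_pe(beta, growth, payout):
    """PE implied by the model; growth and payout are fractions (0.03 = 3%)."""
    return (COEF["intercept"]
            + COEF["beta"] * beta
            + COEF["growth_pct"] * 100 * growth
            + COEF["payout_pct"] * 100 * payout)


# (actual PE, beta, growth, payout) taken from the appendix excerpt.
firms = {
    "Wal-Mart Stores Inc.": (12.16, 0.69, 0.03, 0.40),
    "Microsoft Corporation": (36.03, 1.32, 0.09, 0.85),
}

for name, (actual, beta, growth, payout) in firms.items():
    model = fitted_pe(beta, growth, payout)
    print(f"{name}: actual {actual:.2f}, fitted {model:.2f}, "
          f"residual {actual - model:+.2f}")
```

A positive residual means the firm trades at a higher PE than its risk, growth and payout alone would suggest under this model, and a negative one the reverse. For both firms shown here the actual PE lies above the fitted value.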

Ideas of possible future research

There is indeed much scope for further research. More and different valuation multiples could be investigated. The data could be divided into groups: emerging/developed markets, or by sectors. Financial institutions could be studied as well, since they are excluded from this sample due to their different business model.

References

Diez D., Barr C., Cetinkaya-Rundel M., OpenIntro Statistics, Second Edition 2014
Damodaran A., Valuation: Packet 2 Relative valuation, asset based valuation, and private company valuation, September 2015
Campbell J.Y. and Shiller R.J., Valuation ratios and the long-run stock market outlook: An update, NBER working paper 8221 April 2001
Smithers A. and Wright S., Valuing Wall Street. Protecting wealth in turbulent markets, 2000
Fernandez P., Valuation using multiples. How do analysts reach their conclusions?, IESE Business School 2015
Asness C., Fighting the Fed model. The relationship between stock market yields, bond market yields and future returns, AQR Capital Management, 2002

Appendix

The Data Source

Excerpt of part of the transformed data:

##                                                    company    PE beta growth payout
## 1                          Wal-Mart Stores Inc. (NYSE:WMT) 12.16 0.69   0.03    0.4
## 2              China Petroleum & Chemical Corp. (SEHK:386) 26.14 1.07   0.04   1.11
## 3                      Toyota Motor Corporation (TSE:7203) 10.64 0.93   0.07   0.28
## 4                          McKesson Corporation (NYSE:MCK) 23.53  0.8   0.13   0.13
## 5              Samsung Electronics Co. Ltd. (KOSE:A005930)  9.94  0.9   0.05   0.14
## 6                                    Daimler AG (XTRA:DAI) 10.55 1.46   0.14   0.34
## 7                         General Motors Company (NYSE:GM) 12.52 1.35   0.15   0.47
## 8                        CVS Health Corporation (NYSE:CVS) 22.28 0.87   0.14    0.3
## 9               UnitedHealth Group Incorporated (NYSE:UNH) 17.94  0.9   0.12   0.27
## 10                                 Ford Motor Co. (NYSE:F) 12.11 0.97    0.2   0.48
## 11        Hon Hai Precision Industry Co., Ltd. (TSEC:2317)  8.86 0.56   0.04   0.38
## 12                   Verizon Communications Inc. (NYSE:VZ) 18.46 0.76   0.06   0.89
## 13            Costco Wholesale Corporation (NasdaqGS:COST) 29.05 0.65   0.09   0.28
## 14                                  Phillips 66 (NYSE:PSX) 10.81 1.46   0.07   0.25
## 15                         China Mobile Limited (SEHK:941) 14.31 0.45   0.07   0.43
## 16           Walgreens Boots Alliance, Inc. (NasdaqGS:WBA)  21.6 0.75   0.13   0.34
## 17         Express Scripts Holding Company (NasdaqGS:ESRX) 27.19 0.71   0.12    NaN
## 18                        Nissan Motor Co. Ltd. (TSE:7201)  9.65  NaN   0.12   0.29
## 19         Public Joint Stock Company Gazprom (MICEX:GAZP)  7.38  0.8   0.43   0.43
## 20    Bayerische Motoren Werke Aktiengesellschaft (DB:BMW)  9.69 1.38   0.05   0.31
## 21                                  Nestlé S.A. (SWX:NESN) 15.84 0.55   0.04   0.49
## 22                            The Boeing Company (NYSE:BA) 18.43 1.08    0.1   0.43
## 23 Open Joint Stock Company Rosneft Oil Company (LSE:ROSN)  8.91 1.58   0.01   0.31
## 24   Nippon Telegraph and Telephone Corporation (TSE:9432)  17.9 0.67    0.1   0.35
## 25                   Microsoft Corporation (NasdaqGS:MSFT) 36.03 1.32   0.09   0.85