I recently decided to start investing, and I wanted to write about the strategy I am devising here. Disclaimer: I do not professionally work in finance, and my background is instead in economics. This is largely an interest for me.
I am of the view, similar to other economists, that equity markets are generally ‘efficient’. That is, prices reflect current market information and, as a result, prices reflect the risk that the market observes. There has been much research supporting this viewpoint, showing that the financial wizards with their 5 computer screens full of charts and tables tend to do negligibly better than passively investing in the entire market.
While this is likely true, I wanted to set out choosing a portfolio based on a few general rules. I will highlight these rules through R, in which I use both the Quandl and BatchGetSymbols packages to obtain stock information.
In the code below, I load up the various packages used and access the stock fundamentals dataset from Quandl:
#packages needed
library(Quandl)
library(dplyr)
library(BatchGetSymbols)
library(Hmisc)
library(ggplot2)
#Quandl fundamentals dataset (you need to purchase your own key)
Quandl.api_key("ABCDEFGHIJKLMNOP") #Need API key to access quandl
stocks <- Quandl.datatable("SHARADAR/SF1", paginate = TRUE)
stocks <- kwan %>%
filter(dimension == "ARY")
A stock is a share of ownership within a company. The goal of a company is to make money. Hence, the first rule I decided I would use is to only choose stocks that have positive earnings (net income) over the last five years. This might sound odd to some - Why would you invest in a company that does not make money? Yet, when we look at some major companies gaining attention today: Amazon, Uber, etc.. Many of them did not make any money for long periods of time, and some of them continue to run in the red today.
My choice of five years of positive earnings is arbitrary, and there are plenty of great companies that might have had negative earnings in some of those years. For the sake of simplicity, I will look at that time frame.
The fundamentals dataset from Quandl that I am using has many pieces of information for each company:
colnames(stocks)
## [1] "X" "ticker" "dimension" "calendardate"
## [5] "datekey" "reportperiod" "lastupdated" "accoci"
## [9] "assets" "assetsavg" "assetsc" "assetsnc"
## [13] "assetturnover" "bvps" "capex" "cashneq"
## [17] "cashnequsd" "cor" "consolinc" "currentratio"
## [21] "de" "debt" "debtc" "debtnc"
## [25] "debtusd" "deferredrev" "depamor" "deposits"
## [29] "divyield" "dps" "ebit" "ebitda"
## [33] "ebitdamargin" "ebitdausd" "ebitusd" "ebt"
## [37] "eps" "epsdil" "epsusd" "equity"
## [41] "equityavg" "equityusd" "ev" "evebit"
## [45] "evebitda" "fcf" "fcfps" "fxusd"
## [49] "gp" "grossmargin" "intangibles" "intexp"
## [53] "invcap" "invcapavg" "inventory" "investments"
## [57] "investmentsc" "investmentsnc" "liabilities" "liabilitiesc"
## [61] "liabilitiesnc" "marketcap" "ncf" "ncfbus"
## [65] "ncfcommon" "ncfdebt" "ncfdiv" "ncff"
## [69] "ncfi" "ncfinv" "ncfo" "ncfx"
## [73] "netinc" "netinccmn" "netinccmnusd" "netincdis"
## [77] "netincnci" "netmargin" "opex" "opinc"
## [81] "payables" "payoutratio" "pb" "pe"
## [85] "pe1" "ppnenet" "prefdivis" "price"
## [89] "ps" "ps1" "receivables" "retearn"
## [93] "revenue" "revenueusd" "rnd" "roa"
## [97] "roe" "roic" "ros" "sbcomp"
## [101] "sgna" "sharefactor" "sharesbas" "shareswa"
## [105] "shareswadil" "sps" "tangibles" "taxassets"
## [109] "taxexp" "taxliabilities" "tbvps" "workingcapital"
Some of the companies shown have only been around for a short period of time, say 1-2 years. To make things easier, I only want to look at companies who have been on the public market for atleast the last five years:
counts <- data.frame(table(stocks$ticker))
#generates two column df of ticker symbol with count of frequencies it appears
colnames(counts)[1] <- "ticker" #need to match ticker column of stocks for join
stocks <- merge(stocks, counts, by = "ticker")
#now stocks has a column designating frequency of ticker
stocks <- stocks %>%
filter(Freq > 4) #this will filter to companies that have been public for atleast 5 years
Some of the stocks will have a frequency greater than 5 due to double counting. The dplyr distinct function helps simplify things:
stocks <- stocks %>%
group_by(ticker) %>%
distinct(calendardate, .keep_all = TRUE)
#We now have a dataframe including fundamentals on each company corresponding to their last five years of annual reports (each report on 12/31 of the corresponding year)
We now have a dataframe including fundamentals on each company corresponding to their last five years of annual reports (each report on 12/31 of the corresponding year). Next, I want to generate a column that will produce that change in earnings(net income) from one year to the next. I produce a small example table using the earnings and annual change in earnings for American Airlines stock over the last five years:
stocks <- stocks %>% #generating change in net income
group_by(ticker) %>%
mutate(dNetInc = netinc - lead(netinc)) %>% #using lead function since dates descending
ungroup
#Example - American Airlines
stocks %>% filter(ticker == "AAL") %>% select(ticker, netinc, dNetInc)
## # A tibble: 5 x 3
## ticker netinc dNetInc
## <fct> <dbl> <dbl>
## 1 AAL 1686000000 274000000
## 2 AAL 1412000000 -507000000
## 3 AAL 1919000000 -757000000
## 4 AAL 2676000000 -4934000000
## 5 AAL 7610000000 NA
We begin by filtering down to see how many companies had positive net income for all five of the past years:
stocks <- stocks %>%
filter(netinc >0) #will get rid of any rows where net income is negative
counts <- data.frame(table(stocks$ticker)) #need to get rid of companies with rows removed
colnames(counts)[1] <- "ticker" #need to match ticker column of stocks for join
stocks <- merge(stocks, counts, by = "ticker")
stocks <- stocks %>%
filter(Freq.y == 5) #since some companies with negative net income at some point remain(years with positive income), we use this to remove them entirely
We now have companies that have had positive net income for each year. Along with having positive net income, we also want companies that have had consistent earnings growth. To do this, we will filter out any companies which exhibit a negative value in the dNetInc column that we created (while ignoring NA values due to no change in income for the oldest year (2015)):
stocks <- stocks %>%
filter(dNetInc > 0) #only want years where companies show an increase in annual earnings
counts1 <- data.frame(table(stocks$ticker)) #Each company has "NA" for 2015. Therefore, a count of 4 years of growth means consistent growth
#Before moving on, let us see how many companies have consistent growth over all years (frequency of value 4)
describe(counts1)
## counts1
##
## 2 Variables 6632 Observations
## ---------------------------------------------------------------------------
## Var1
## n missing distinct
## 6632 0 6632
##
## lowest : A AA AAAP AACG AACH , highest: ZVO ZXAIY ZYME ZYNE ZYXI
## ---------------------------------------------------------------------------
## Freq
## n missing distinct Info Mean Gmd
## 6632 0 5 0.596 0.6763 1.066
##
## lowest : 0 1 2 3 4, highest: 0 1 2 3 4
##
## Value 0 1 2 3 4
## Frequency 4897 204 603 637 291
## Proportion 0.738 0.031 0.091 0.096 0.044
## ---------------------------------------------------------------------------
We can see that 291 companies fit the bill of having consistent growth in earnings over the last five years. This has narrowed our options down by quite a bit.
colnames(counts1)[1] <- "ticker" #need to match ticker column of stocks for join
stocks <- merge(stocks, counts1, by = "ticker")
stocks <- stocks %>%
filter(Freq > 3) #this will narrow down only to those stocks showing 4 years of positive income change (5 years technically, due to NA in fifth year)
We have now narrowed our list to companies with both (1) positive net income in all of the last five years, and (2) consistent earnings growth over last five years.
Further along in scoring which stocks appear most promising, I may be interested in knowing what earnings growth looks like as a %change. Therefore, I will also create a column presenting the average percent change in earnings over the last five years:
stocks <- stocks %>%
group_by(ticker) %>%
mutate(percentdNetInc = sum(dNetInc)/sum(netinc))
At this point, I want to focus on other important aspects of these companies, but ones that do not vary as much with time. With that being said, I will ignore time varying qualities, and subset the data to the most recent annual data:
stocks_recent <- stocks %>%
filter(calendardate == '2019-12-31')
Along with this, I am interested in current price data. To gain this, I use the BatchGetSymbols package.
prices <- BatchGetSymbols(tickers = stocks_recent$ticker, #retrieves prices based on tickers in stocks_recent dataframe
first.date = "2020-04-21",
last.date = "2020-04-22",
cache.folder = file.path(tempdir(), 'BGS_Cache') )
prices1 <- prices[["df.tickers"]] #retrieves price dataframe from list
prices1 <- prices1[c(4,8)] #we only want closing price and ticker
stocks_recent <- merge(prices1, stocks_recent, by = "ticker")
Let us now use the prices to generate a P/E (Price-to-earnings) ratio. This is calculated via taking the current share price and dividing it by earnings-per-share. Let us generate the PE’s of our 291 choices and see how the distribution looks:
stocks_recent$PE <- stocks_recent$price.close / stocks_recent$eps
ggplot(stocks_recent, aes(x = PE)) +
geom_histogram(bins = 60)
As you can see above, most price-to-earnings ratios in our list are around the 10-30 range, which are solidly priced. As a rule, I aim to stay away from anything priced with a PE over 30. Hence, I will trim down the list more:
stocks_recent <- stocks_recent %>%
filter(PE <= 30)
Next, I want to divide total debt by net income. This will tell me (assuming net income stays around where it currently is) how many years of earnings it would take to cover the company’s total debt burden.
#dividing total debt by net income to generate years it would take to pay off
stocks_recent$debt_to_inc <- stocks_recent$debt/stocks_recent$netinc
#Let's look at the median number of years
median(stocks_recent$debt_to_inc)
## [1] 4.792629
#plotting histogram
ggplot(stocks_recent, aes(x = debt_to_inc)) +
geom_histogram(bins = 60)
From this, we see that the median number of years of earnings that would cover total debt is ~4.8 years. Let us continue to chop down our list by splitting at this median amount:
stocks_recent <- stocks_recent %>%
filter(debt_to_inc <= median(debt_to_inc))
Lastly, I want to list off stocks presented based on their firm size (in terms of net income):
picks <- stocks_recent[order(-stocks_recent$netinc),]
print(picks %>%
select(ticker, price.close, PE, percentdNetInc))
## ticker price.close PE percentdNetInc
## 78 NVO 64.10 3.906155 0.02662822
## 120 UNH 274.19 18.844674 0.18493088
## 97 RY 59.12 6.733485 0.06232349
## 53 HD 200.61 19.495627 0.10867779
## 39 FCAU 7.73 1.827423 0.40505024
## 100 SCHW 35.94 13.360594 0.19704907
## 108 STZ 153.02 8.389255 0.31116384
## 109 TJX 46.05 16.992620 0.08849615
## 2 ATHM 76.15 2.821415 0.23754784
## 1 AMTD 37.49 9.419598 0.25857275
## 74 MTB 104.63 7.603924 0.13818971
## 96 ROST 82.70 17.861770 0.11176300
## 30 DHI 38.02 8.760369 0.17343859
## 6 AZO 978.85 15.110373 0.08344178
## 83 ORLY 358.30 19.828444 0.09409121
## 13 CBRE 39.97 10.463351 0.20371829
## 103 SIVB 165.49 7.556621 0.26574465
## 73 MNST 59.04 28.941177 0.15439483
## 86 PAYX 64.02 22.229166 0.10149060
## 93 RJF 61.28 8.371585 0.17402201
## 69 MAS 38.46 11.870370 0.21537319
## 91 PSXP 41.15 9.247191 0.29375252
## 79 NVR 2805.44 11.625875 0.18783719
## 17 CHKP 101.79 18.574818 0.04404533
## 16 CDW 98.85 19.458661 0.14339120
## 36 FAF 40.12 6.408946 0.21526447
## 119 ULTA 201.57 16.508600 0.16567389
## 105 SNA 110.54 8.779984 0.08670030
## 22 CPRT 66.15 25.739300 0.22214895
## 68 LPLA 50.80 7.492625 0.27346827
## 34 ESNT 23.43 4.125000 0.24509345
## 122 WAL 30.89 6.355967 0.20057701
## 9 BR 107.36 25.807693 0.12627080
## 85 PACW 16.75 4.294872 0.10281096
## 41 FCNCA 329.05 8.015834 0.17555039
## 63 LII 172.95 16.487130 0.16437241
## 56 ICLR 151.86 22.169343 0.10839799
## 33 EME 61.08 10.567474 0.15018049
## 28 CW 91.02 12.641666 0.16449868
## 61 LECO 72.66 15.361523 0.16142161
## 12 CATY 22.96 6.578796 0.13082587
## 113 TTC 60.83 23.669261 0.06929898
## 89 POOL 184.76 28.121765 0.15933083
## 67 LOPE 75.32 13.896679 0.15209656
## 66 LOGI 45.92 29.435896 0.31377163
## 51 GNRC 94.15 23.019560 0.23283158
## 126 WSO 145.45 22.342549 0.08298661
## 72 MMS 62.76 16.825737 0.09778151
## 8 BOH 59.80 10.697674 0.08034148
## 24 CSFL 15.48 8.234043 0.38764748
## 40 FCN 129.60 22.003397 0.26866600
## 4 AWI 76.62 17.413637 0.18230035
## 110 TNET 44.58 14.664474 0.28023519
## 26 CVBF 19.33 13.060811 0.19212969
## 55 IBTX 22.77 5.105381 0.34131698
## 44 FIBK 28.79 10.137324 0.17337588
## 117 UFPI 35.37 12.154639 0.18044810
## 62 LGIH 48.28 6.270130 0.24084744
## 115 TXRH 44.64 18.072874 0.13376659
## 20 CNXM 11.87 4.864754 0.19826405
## 14 CBU 59.67 18.303680 0.13142294
## 52 GWB 18.57 6.337884 0.09859298
## 57 INDB 66.43 13.206759 0.22237928
## 43 FFIN 24.66 20.213115 0.11918534
## 49 FRME 25.66 8.018750 0.19786707
## 90 PPBI 17.79 6.790077 0.35016086
## 114 TTEK 74.71 25.851211 0.24053113
## 5 AX 17.17 6.868000 0.12929541
## 77 NSP 39.70 10.672043 0.25568259
## 102 SFBS 30.30 10.860215 0.18600598
## 101 SEDG 86.69 28.330066 0.28757139
## 82 OLLI 57.94 25.982062 0.22716456
## 7 BANF 33.53 8.118644 0.16445192
## 107 SSD 58.00 19.333333 0.14920784
## 80 NWLI 176.70 4.747448 0.07226656
## 123 WGO 35.69 10.053521 0.21326904
## 95 ROLL 120.23 27.831019 0.14362814
## 10 BUSE 16.16 8.595745 0.20345783
## 98 SBCF 18.11 9.432292 0.32173084
## 48 FOXF 40.15 16.522635 0.26606247
## 37 FBNC 22.64 7.303225 0.25513312
## 106 SRCE 31.20 8.739496 0.11483257
## 64 LKFN 36.37 10.697059 0.14692710
## 21 CNXN 38.80 12.435897 0.14132198
## 125 WMS 33.90 27.560977 0.36265425
## 15 CBZ 21.67 16.669231 0.16435890
## 27 CVCO 121.50 16.071429 0.22787611
## 54 HFWA 18.21 9.896739 0.14935204
## 94 ROCK 42.80 21.293532 0.18496902
## 25 CTBI 31.02 8.521978 0.08134515
## 58 IPAR 44.46 23.156249 0.15776300
## 76 NCBS 51.19 8.964974 0.29338919
## 59 JOUT 58.10 11.216216 0.28987495
## 42 FDEF 14.24 5.718876 0.14641103
## 3 AVAV 54.67 27.334999 0.50075884
## 19 CMCL 11.70 3.062827 0.46254841
## 75 NC 25.99 4.575704 0.13134764
## 71 MLR 26.20 7.638484 0.19979274
## 87 PETS 32.87 17.864130 0.16989507
## 46 FMNB 11.12 8.620155 0.24825936
## 31 EBTC 22.39 7.720689 0.17833539
## 124 WINA 150.15 17.939067 0.09490040
## 70 MCBC 7.23 7.691489 0.21175263
## 11 CASS 34.75 16.469194 0.06677936
## 104 SMBC 22.08 7.031847 0.18989693
## 112 TSBK 16.65 5.761246 0.24173865
## 118 UFPT 41.20 15.488722 0.23725142
## 29 CZFS 51.50 9.296029 0.12445598
## 60 LCNB 11.93 8.284722 0.12561855
## 45 FMAO 21.70 13.072290 0.13963800
## 50 GCBC 21.75 10.609756 0.19782099
## 35 EVBN 23.39 6.740634 0.17595595
## 121 UTGN 27.00 5.454545 0.44324847
## 18 CIX 14.35 11.124031 0.12512959
## 88 PLBC 17.00 5.647841 0.21462572
## 116 UBFO 6.44 7.155556 0.18494272
## 47 FNRN 8.05 7.000000 0.17698713
## 84 OVLY 13.07 8.487013 0.18587716
## 99 SBFG 12.91 7.549708 0.10018408
## 92 PVBC 8.56 14.266667 0.20798558
## 23 CSBB 35.00 9.210526 0.13046190
## 38 FCAP 48.50 15.645161 0.15132375
## 111 TRNS 26.32 26.585859 0.14364666
## 32 EDUC 5.04 6.146341 0.34487622
## 65 LOAN 3.64 7.744681 0.15071955
## 81 OBCI 5.18 14.000000 0.27636785
While this list is still of a relatively large size, it serves as a nice starting point for evaluating those stocks that might be a good choice in more detail. In regards to further analysis, I plan to look into how to generate a portfolio in which (returns are uncorrelated) and including a momentum statistic to add in rating which stocks to choose.