I recently decided to start investing, and I wanted to write about the strategy I am devising here. Disclaimer: I do not professionally work in finance, and my background is instead in economics. This is largely an interest for me.

I am of the view, similar to other economists, that equity markets are generally ‘efficient’. That is, prices reflect current market information and, as a result, prices reflect the risk that the market observes. There has been much research supporting this viewpoint, showing that the financial wizards with their 5 computer screens full of charts and tables tend to do negligibly better than passively investing in the entire market.

While this is likely true, I wanted to set out choosing a portfolio based on a few general rules. I will highlight these rules through R, in which I use both the Quandl and BatchGetSymbols packages to obtain stock information.

In the code below, I load up the various packages used and access the stock fundamentals dataset from Quandl:

#packages needed
library(Quandl)
library(dplyr)
library(BatchGetSymbols)
library(Hmisc)
library(ggplot2)

#Quandl fundamentals dataset (you need to purchase your own key)
Quandl.api_key("ABCDEFGHIJKLMNOP") #Need API key to access quandl
stocks <- Quandl.datatable("SHARADAR/SF1", paginate = TRUE)
stocks <- kwan %>%
  filter(dimension == "ARY")

1: Positive Earnings

A stock is a share of ownership within a company. The goal of a company is to make money. Hence, the first rule I decided I would use is to only choose stocks that have positive earnings (net income) over the last five years. This might sound odd to some - Why would you invest in a company that does not make money? Yet, when we look at some major companies gaining attention today: Amazon, Uber, etc.. Many of them did not make any money for long periods of time, and some of them continue to run in the red today.

My choice of five years of positive earnings is arbitrary, and there are plenty of great companies that might have had negative earnings in some of those years. For the sake of simplicity, I will look at that time frame.

The fundamentals dataset from Quandl that I am using has many pieces of information for each company:

colnames(stocks)
##   [1] "X"              "ticker"         "dimension"      "calendardate"  
##   [5] "datekey"        "reportperiod"   "lastupdated"    "accoci"        
##   [9] "assets"         "assetsavg"      "assetsc"        "assetsnc"      
##  [13] "assetturnover"  "bvps"           "capex"          "cashneq"       
##  [17] "cashnequsd"     "cor"            "consolinc"      "currentratio"  
##  [21] "de"             "debt"           "debtc"          "debtnc"        
##  [25] "debtusd"        "deferredrev"    "depamor"        "deposits"      
##  [29] "divyield"       "dps"            "ebit"           "ebitda"        
##  [33] "ebitdamargin"   "ebitdausd"      "ebitusd"        "ebt"           
##  [37] "eps"            "epsdil"         "epsusd"         "equity"        
##  [41] "equityavg"      "equityusd"      "ev"             "evebit"        
##  [45] "evebitda"       "fcf"            "fcfps"          "fxusd"         
##  [49] "gp"             "grossmargin"    "intangibles"    "intexp"        
##  [53] "invcap"         "invcapavg"      "inventory"      "investments"   
##  [57] "investmentsc"   "investmentsnc"  "liabilities"    "liabilitiesc"  
##  [61] "liabilitiesnc"  "marketcap"      "ncf"            "ncfbus"        
##  [65] "ncfcommon"      "ncfdebt"        "ncfdiv"         "ncff"          
##  [69] "ncfi"           "ncfinv"         "ncfo"           "ncfx"          
##  [73] "netinc"         "netinccmn"      "netinccmnusd"   "netincdis"     
##  [77] "netincnci"      "netmargin"      "opex"           "opinc"         
##  [81] "payables"       "payoutratio"    "pb"             "pe"            
##  [85] "pe1"            "ppnenet"        "prefdivis"      "price"         
##  [89] "ps"             "ps1"            "receivables"    "retearn"       
##  [93] "revenue"        "revenueusd"     "rnd"            "roa"           
##  [97] "roe"            "roic"           "ros"            "sbcomp"        
## [101] "sgna"           "sharefactor"    "sharesbas"      "shareswa"      
## [105] "shareswadil"    "sps"            "tangibles"      "taxassets"     
## [109] "taxexp"         "taxliabilities" "tbvps"          "workingcapital"

Some of the companies shown have only been around for a short period of time, say 1-2 years. To make things easier, I only want to look at companies who have been on the public market for atleast the last five years:

counts <- data.frame(table(stocks$ticker)) 
#generates two column df of ticker symbol with count of frequencies it appears
colnames(counts)[1] <- "ticker"   #need to match ticker column of stocks for join
stocks <- merge(stocks, counts, by = "ticker")
#now stocks has a column designating frequency of ticker
stocks <- stocks %>%
  filter(Freq > 4)   #this will filter to companies that have been public for atleast 5 years

Some of the stocks will have a frequency greater than 5 due to double counting. The dplyr distinct function helps simplify things:

stocks <- stocks %>%
  group_by(ticker) %>%
  distinct(calendardate, .keep_all = TRUE) 
#We now have a dataframe including fundamentals on each company corresponding to their last five years of annual reports (each report on 12/31 of the corresponding year)

We now have a dataframe including fundamentals on each company corresponding to their last five years of annual reports (each report on 12/31 of the corresponding year). Next, I want to generate a column that will produce that change in earnings(net income) from one year to the next. I produce a small example table using the earnings and annual change in earnings for American Airlines stock over the last five years:

stocks <- stocks %>%   #generating change in net income 
  group_by(ticker) %>%
  mutate(dNetInc = netinc - lead(netinc)) %>%  #using lead function since dates descending
  ungroup

#Example - American Airlines
stocks %>% filter(ticker == "AAL") %>% select(ticker, netinc, dNetInc)
## # A tibble: 5 x 3
##   ticker     netinc     dNetInc
##   <fct>       <dbl>       <dbl>
## 1 AAL    1686000000   274000000
## 2 AAL    1412000000  -507000000
## 3 AAL    1919000000  -757000000
## 4 AAL    2676000000 -4934000000
## 5 AAL    7610000000          NA

We begin by filtering down to see how many companies had positive net income for all five of the past years:

stocks <- stocks %>%
  filter(netinc >0)  #will get rid of any rows where net income is negative
counts <- data.frame(table(stocks$ticker)) #need to get rid of companies with rows removed
colnames(counts)[1] <- "ticker"   #need to match ticker column of stocks for join
stocks <- merge(stocks, counts, by = "ticker") 
stocks <- stocks %>%
  filter(Freq.y == 5)  #since some companies with negative net income at some point remain(years with positive income), we use this to remove them entirely

2: Earnings Growth

We now have companies that have had positive net income for each year. Along with having positive net income, we also want companies that have had consistent earnings growth. To do this, we will filter out any companies which exhibit a negative value in the dNetInc column that we created (while ignoring NA values due to no change in income for the oldest year (2015)):

stocks <- stocks %>%
  filter(dNetInc > 0)  #only want years where companies show an increase in annual earnings
counts1 <- data.frame(table(stocks$ticker)) #Each company has "NA" for 2015. Therefore, a count of 4 years of growth means consistent growth
#Before moving on, let us see how many companies have consistent growth over all years (frequency of value 4)
describe(counts1)
## counts1 
## 
##  2  Variables      6632  Observations
## ---------------------------------------------------------------------------
## Var1 
##        n  missing distinct 
##     6632        0     6632 
## 
## lowest : A     AA    AAAP  AACG  AACH , highest: ZVO   ZXAIY ZYME  ZYNE  ZYXI 
## ---------------------------------------------------------------------------
## Freq 
##        n  missing distinct     Info     Mean      Gmd 
##     6632        0        5    0.596   0.6763    1.066 
## 
## lowest : 0 1 2 3 4, highest: 0 1 2 3 4
##                                         
## Value          0     1     2     3     4
## Frequency   4897   204   603   637   291
## Proportion 0.738 0.031 0.091 0.096 0.044
## ---------------------------------------------------------------------------

We can see that 291 companies fit the bill of having consistent growth in earnings over the last five years. This has narrowed our options down by quite a bit.

colnames(counts1)[1] <- "ticker"   #need to match ticker column of stocks for join
stocks <- merge(stocks, counts1, by = "ticker") 
stocks <- stocks %>%
  filter(Freq > 3)  #this will narrow down only to those stocks showing 4 years of positive income change (5 years technically, due to NA in fifth year)

We have now narrowed our list to companies with both (1) positive net income in all of the last five years, and (2) consistent earnings growth over last five years.

Further along in scoring which stocks appear most promising, I may be interested in knowing what earnings growth looks like as a %change. Therefore, I will also create a column presenting the average percent change in earnings over the last five years:

stocks <- stocks %>%
  group_by(ticker) %>%
  mutate(percentdNetInc = sum(dNetInc)/sum(netinc))

3: Low Price-to-Earnings Ratios

At this point, I want to focus on other important aspects of these companies, but ones that do not vary as much with time. With that being said, I will ignore time varying qualities, and subset the data to the most recent annual data:

stocks_recent <- stocks %>%
  filter(calendardate == '2019-12-31')

Along with this, I am interested in current price data. To gain this, I use the BatchGetSymbols package.

prices <- BatchGetSymbols(tickers = stocks_recent$ticker, #retrieves prices based on tickers in stocks_recent dataframe
                        first.date = "2020-04-21",
                        last.date = "2020-04-22", 
                        cache.folder = file.path(tempdir(), 'BGS_Cache') )
prices1 <- prices[["df.tickers"]]  #retrieves price dataframe from list
prices1 <- prices1[c(4,8)] #we only want closing price and ticker
stocks_recent <- merge(prices1, stocks_recent, by = "ticker")

Let us now use the prices to generate a P/E (Price-to-earnings) ratio. This is calculated via taking the current share price and dividing it by earnings-per-share. Let us generate the PE’s of our 291 choices and see how the distribution looks:

stocks_recent$PE <- stocks_recent$price.close / stocks_recent$eps
ggplot(stocks_recent, aes(x = PE)) +
  geom_histogram(bins = 60)

As you can see above, most price-to-earnings ratios in our list are around the 10-30 range, which are solidly priced. As a rule, I aim to stay away from anything priced with a PE over 30. Hence, I will trim down the list more:

stocks_recent <- stocks_recent %>%
  filter(PE <= 30)

4: Earnings to Debt Ratio

Next, I want to divide total debt by net income. This will tell me (assuming net income stays around where it currently is) how many years of earnings it would take to cover the company’s total debt burden.

#dividing total debt by net income to generate years it would take to pay off 
stocks_recent$debt_to_inc <- stocks_recent$debt/stocks_recent$netinc
#Let's look at the median number of years
median(stocks_recent$debt_to_inc)
## [1] 4.792629
#plotting histogram
ggplot(stocks_recent, aes(x = debt_to_inc)) +
  geom_histogram(bins = 60)

From this, we see that the median number of years of earnings that would cover total debt is ~4.8 years. Let us continue to chop down our list by splitting at this median amount:

stocks_recent <- stocks_recent %>%
  filter(debt_to_inc <= median(debt_to_inc))

4: Large Size

Lastly, I want to list off stocks presented based on their firm size (in terms of net income):

picks <- stocks_recent[order(-stocks_recent$netinc),]
print(picks %>%
    select(ticker, price.close, PE, percentdNetInc))
##     ticker price.close        PE percentdNetInc
## 78     NVO       64.10  3.906155     0.02662822
## 120    UNH      274.19 18.844674     0.18493088
## 97      RY       59.12  6.733485     0.06232349
## 53      HD      200.61 19.495627     0.10867779
## 39    FCAU        7.73  1.827423     0.40505024
## 100   SCHW       35.94 13.360594     0.19704907
## 108    STZ      153.02  8.389255     0.31116384
## 109    TJX       46.05 16.992620     0.08849615
## 2     ATHM       76.15  2.821415     0.23754784
## 1     AMTD       37.49  9.419598     0.25857275
## 74     MTB      104.63  7.603924     0.13818971
## 96    ROST       82.70 17.861770     0.11176300
## 30     DHI       38.02  8.760369     0.17343859
## 6      AZO      978.85 15.110373     0.08344178
## 83    ORLY      358.30 19.828444     0.09409121
## 13    CBRE       39.97 10.463351     0.20371829
## 103   SIVB      165.49  7.556621     0.26574465
## 73    MNST       59.04 28.941177     0.15439483
## 86    PAYX       64.02 22.229166     0.10149060
## 93     RJF       61.28  8.371585     0.17402201
## 69     MAS       38.46 11.870370     0.21537319
## 91    PSXP       41.15  9.247191     0.29375252
## 79     NVR     2805.44 11.625875     0.18783719
## 17    CHKP      101.79 18.574818     0.04404533
## 16     CDW       98.85 19.458661     0.14339120
## 36     FAF       40.12  6.408946     0.21526447
## 119   ULTA      201.57 16.508600     0.16567389
## 105    SNA      110.54  8.779984     0.08670030
## 22    CPRT       66.15 25.739300     0.22214895
## 68    LPLA       50.80  7.492625     0.27346827
## 34    ESNT       23.43  4.125000     0.24509345
## 122    WAL       30.89  6.355967     0.20057701
## 9       BR      107.36 25.807693     0.12627080
## 85    PACW       16.75  4.294872     0.10281096
## 41   FCNCA      329.05  8.015834     0.17555039
## 63     LII      172.95 16.487130     0.16437241
## 56    ICLR      151.86 22.169343     0.10839799
## 33     EME       61.08 10.567474     0.15018049
## 28      CW       91.02 12.641666     0.16449868
## 61    LECO       72.66 15.361523     0.16142161
## 12    CATY       22.96  6.578796     0.13082587
## 113    TTC       60.83 23.669261     0.06929898
## 89    POOL      184.76 28.121765     0.15933083
## 67    LOPE       75.32 13.896679     0.15209656
## 66    LOGI       45.92 29.435896     0.31377163
## 51    GNRC       94.15 23.019560     0.23283158
## 126    WSO      145.45 22.342549     0.08298661
## 72     MMS       62.76 16.825737     0.09778151
## 8      BOH       59.80 10.697674     0.08034148
## 24    CSFL       15.48  8.234043     0.38764748
## 40     FCN      129.60 22.003397     0.26866600
## 4      AWI       76.62 17.413637     0.18230035
## 110   TNET       44.58 14.664474     0.28023519
## 26    CVBF       19.33 13.060811     0.19212969
## 55    IBTX       22.77  5.105381     0.34131698
## 44    FIBK       28.79 10.137324     0.17337588
## 117   UFPI       35.37 12.154639     0.18044810
## 62    LGIH       48.28  6.270130     0.24084744
## 115   TXRH       44.64 18.072874     0.13376659
## 20    CNXM       11.87  4.864754     0.19826405
## 14     CBU       59.67 18.303680     0.13142294
## 52     GWB       18.57  6.337884     0.09859298
## 57    INDB       66.43 13.206759     0.22237928
## 43    FFIN       24.66 20.213115     0.11918534
## 49    FRME       25.66  8.018750     0.19786707
## 90    PPBI       17.79  6.790077     0.35016086
## 114   TTEK       74.71 25.851211     0.24053113
## 5       AX       17.17  6.868000     0.12929541
## 77     NSP       39.70 10.672043     0.25568259
## 102   SFBS       30.30 10.860215     0.18600598
## 101   SEDG       86.69 28.330066     0.28757139
## 82    OLLI       57.94 25.982062     0.22716456
## 7     BANF       33.53  8.118644     0.16445192
## 107    SSD       58.00 19.333333     0.14920784
## 80    NWLI      176.70  4.747448     0.07226656
## 123    WGO       35.69 10.053521     0.21326904
## 95    ROLL      120.23 27.831019     0.14362814
## 10    BUSE       16.16  8.595745     0.20345783
## 98    SBCF       18.11  9.432292     0.32173084
## 48    FOXF       40.15 16.522635     0.26606247
## 37    FBNC       22.64  7.303225     0.25513312
## 106   SRCE       31.20  8.739496     0.11483257
## 64    LKFN       36.37 10.697059     0.14692710
## 21    CNXN       38.80 12.435897     0.14132198
## 125    WMS       33.90 27.560977     0.36265425
## 15     CBZ       21.67 16.669231     0.16435890
## 27    CVCO      121.50 16.071429     0.22787611
## 54    HFWA       18.21  9.896739     0.14935204
## 94    ROCK       42.80 21.293532     0.18496902
## 25    CTBI       31.02  8.521978     0.08134515
## 58    IPAR       44.46 23.156249     0.15776300
## 76    NCBS       51.19  8.964974     0.29338919
## 59    JOUT       58.10 11.216216     0.28987495
## 42    FDEF       14.24  5.718876     0.14641103
## 3     AVAV       54.67 27.334999     0.50075884
## 19    CMCL       11.70  3.062827     0.46254841
## 75      NC       25.99  4.575704     0.13134764
## 71     MLR       26.20  7.638484     0.19979274
## 87    PETS       32.87 17.864130     0.16989507
## 46    FMNB       11.12  8.620155     0.24825936
## 31    EBTC       22.39  7.720689     0.17833539
## 124   WINA      150.15 17.939067     0.09490040
## 70    MCBC        7.23  7.691489     0.21175263
## 11    CASS       34.75 16.469194     0.06677936
## 104   SMBC       22.08  7.031847     0.18989693
## 112   TSBK       16.65  5.761246     0.24173865
## 118   UFPT       41.20 15.488722     0.23725142
## 29    CZFS       51.50  9.296029     0.12445598
## 60    LCNB       11.93  8.284722     0.12561855
## 45    FMAO       21.70 13.072290     0.13963800
## 50    GCBC       21.75 10.609756     0.19782099
## 35    EVBN       23.39  6.740634     0.17595595
## 121   UTGN       27.00  5.454545     0.44324847
## 18     CIX       14.35 11.124031     0.12512959
## 88    PLBC       17.00  5.647841     0.21462572
## 116   UBFO        6.44  7.155556     0.18494272
## 47    FNRN        8.05  7.000000     0.17698713
## 84    OVLY       13.07  8.487013     0.18587716
## 99    SBFG       12.91  7.549708     0.10018408
## 92    PVBC        8.56 14.266667     0.20798558
## 23    CSBB       35.00  9.210526     0.13046190
## 38    FCAP       48.50 15.645161     0.15132375
## 111   TRNS       26.32 26.585859     0.14364666
## 32    EDUC        5.04  6.146341     0.34487622
## 65    LOAN        3.64  7.744681     0.15071955
## 81    OBCI        5.18 14.000000     0.27636785

While this list is still of a relatively large size, it serves as a nice starting point for evaluating those stocks that might be a good choice in more detail. In regards to further analysis, I plan to look into how to generate a portfolio in which (returns are uncorrelated) and including a momentum statistic to add in rating which stocks to choose.