This is an INDIVIDUAL workshop. In this workshop we review the market regression model to calculate aplha and beta, and also we learn how to run the model many times using a loop. We use the results of alpha and beta to select stocks.
1 The Linear regression model
The simple linear regression model is used to understand the linear relationship between two variables assuming that one variable, the independent variable (IV), can be used as a predictor of the other variable, the dependent variable (DV). In this part we illustrate a simple regression model with the Market Model.
The Market Model states that the expected return of a stock is given by its alpha coefficient (b0) plus its market beta coefficient (b1) multiplied times the market return. In mathematical terms:
E[R_i] = α + β(R_M)
We can express the same equation using B0 as alpha, and B1 as market beta:
E[R_i] = β_0 + β_1(R_M)
We can estimate the alpha and market beta coefficient by running a simple linear regression model specifying that the market return is the independent variable and the stock return is the dependent variable. It is strongly recommended to use continuously compounded returns instead of simple returns to estimate the market regression model. The market regression model can be expressed as:
r_{(i,t)} = b_0 + b_1*r_{(M,t)} + ε_t
Where:
ε_t is the error at time t. Thanks to the Central Limit Theorem, this error behaves like a Normal distributed random variable ∼ N(0, σ_ε); the error term ε_t is expected to have mean=0 and a specific standard deviation σ_ε (also called volatility).
r_{(i,t)} is the return of the stock i at time t.
r_{(M,t)} is the market return at time t.
b_0 and b_1 are called regression coefficients.
2 Running a market regression model with real data
2.1 Data collection
We first load the quantmod package and download monthly price data for Tesla and the S&P500 market index. We also merge both datasets into one:
#Merge both xts-zoo objects into one dataset, but selecting only adjusted prices:adjprices<-Ad(merge(TSLA,GSPC))
2.2 Return calculation
We calculate continuously returns for both, Tesla and the S&P500:
returns <-diff(log(adjprices)) #I dropped the na's:returns <-na.omit(returns)#I renamed the columns:colnames(returns) <-c("TSLA", "GSPC")
2.3 Visualize the relationship
Do a scatter plot putting the S&P500 returns as the independent variable (X) and the stock return as the dependent variable (Y). We also add a line that better represents the relationship between the stock returns and the market returns.Type:
# As you see, I indicated that the Market returns goes in the X axis and # Tesla returns in the Y axis. # In the market model, the independent variable is the market returns, while# the dependent variable is the stock return
Sometimes graphs can be deceiving. Always check the Y sale and the X scale. In this case, the X goes from -0.10 to 0.10, while the Y scale goes from -0.20 to 0.40. Then, the real slope of the line should be steeper.
We can change the X scale so that both Y and X axis have similar ranges:
# I indicate that X axis goes from -0.7 to 0.7plot.default(x=returns$GSPC,y=returns$TSLA, xlim=c(-0.7,0.7))abline(lm(returns$TSLA ~ returns$GSPC),col='blue')
Now we see that the the market and stock returns have a similar scale. With this we can better appreciate their linear relationship.
WHAT DOES THE PLOT TELL YOU? BRIEFLY EXPLAIN
2.4 Running the market regression model
We can run the market regression model with the lm() function. The first parameter of the function is the DEPENDENT VARIABLE (in this case, the stock return), and the second parameter must be the INDEPENDENT VARIABLE, also named the EXPLANATORY VARIABLE (in this case, the market return).
What you will get is called The Market Regression Model. You are trying to examine how the market returns can explain stock returns from Jan 2017 to Dec 2020.
Assign your market model to an object named “model1”:
Call:
lm(formula = TSLA ~ GSPC, data = returns)
Residuals:
Min 1Q Median 3Q Max
-0.35579 -0.07596 -0.00766 0.10567 0.40883
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03627 0.02386 1.520 0.136
GSPC 2.16271 0.42682 5.067 8.13e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1584 on 43 degrees of freedom
Multiple R-squared: 0.3739, Adjusted R-squared: 0.3593
F-statistic: 25.67 on 1 and 43 DF, p-value: 8.126e-06
YOU HAVE TO INTERPRET THIS MODEL. MAKE SURE YOU RESPOND THE FOLLOWING THE QUESTIONS:
- IS THE STOCK SIGNIFICANTLY OFFERING RETURNS OVER THE MARKET?
SINCE BETA0 IS POSITIVE WE CAN SAY THAT, ON AVERAGE, TSLA IS OFFERING RETURNS OVER THE MARKET. HOWEVER, IT IS NOT SIGNIFICANTLY OFFERING RETURNS OVER THE MARKET AT THE 95% CONFIDENCE LEVEL SINCE ITS PVALUE IS GREATER THAN 0.05 (IT IS 0.136).
- IS THE STOCK SIGNIFICANTLY RISKIER THAN THE MARKET?
BETA1 IS 2.1627124, SO IT IS RISKIER THAN THE MARKET ON AVERAGE. TO KNOW WHETHER IT IS SIGNIFICANTLY RISKIER THAN THE MARKET WE NEED TO GET THE 95% CONFIDENCE INTERVAL (CI) AND CHECK WHETHER THIS INTERVAL IS GREATER THAN 1. IF THE MINIMUM OF THIS 95% CI IS GREATER THAN 1, THEN WE CAN SAY THAT THE STOCK IS SIGNIFICANTLY RISKIER THAN THE MARKET.
THEN, THE MINIMUM OF BETA 1 IS 2.1627124 MINUS 2 TIMES ITS STANDARD ERROR 0.4268225. THEN, THE MINIMUM 95%CI OF BETA1 IS 1.3090674. THEN, SINCE WE GOT A MINIMUM OF BETA1 THAT IS GREATER THAN 1, WE CAN SAY THAT TESLA IS SIGNIFICANTLY RISKIER THAN THE MARKET.
We can save the regression summary in another variable and display the regression coefficients:
s <-summary(model1)s$coefficients
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03626792 0.02386133 1.519945 1.358444e-01
GSPC 2.16271240 0.42682249 5.067007 8.125628e-06
The beta0 coefficient is in the first row (the intercept), while the beta1 coefficient is in the second raw.
The regression coefficients will be in the first column of this matrix. The second column is the standard error of the coefficients, which are the standard deviation of the coefficients. The third column is the t-value of each coefficient, and finally their p-values in the fourth column.
We can get beta0, beta1, standard errors, t-values and p-values and store in other variables:
The reference [2,1] of the matrix s$coefficients means that I want to get the element of the matrix that is in the row #2 and the column #1. In this position we can find the beta1 of the model.
Then, any R matrix you can make reference as matrix[#row,#column] to get the value of any cell according to its position in row and column numbers.
We can also get the coefficients applying the function coef to the regression object:
coefs <-coef(model1)coefs
(Intercept) GSPC
0.03626792 2.16271240
b0 = coefs[1]b1 = coefs[2]
3 Automating the calculation of market models for many stocks
3.1 Automating downloading and return calculation
If we want to analyze the market risk of many stocks, we need to calculate the market model for many stocks. The challenge here will be to automate the process of estimating market models for any list of tickers.
Let’s start with an easy example of 2 stocks:
Here we will learn how to automate the downloading of many stock prices and calculate returns starting with a definition of a set of tickers.
We define a vector with the tickers for Oracle, Netflix, and the S&P500, and download the monthly stock prices:
# Set the ticker ^GSPC as GSPC (previously defined name)# The seSymbolLookup function helps to avoid having problems with tickers# with special characters such as ^GSPCsetSymbolLookup(GSPC =list(name="^GSPC"))#Now I can keep using the ticker without the ^character, which sometimes # cause problems# I define a list of tickers including the market index GSPC:tickers <-c("ORCL","NFLX", "GSPC")# I get monthly data for the 3 tickers:getSymbols(tickers, periodicity ="monthly",from ="2019-01-01", to ="2022-10-31")
[1] "ORCL" "NFLX" "GSPC"
We can clone a dataset with the get function:
mktindex <-get("GSPC")
The get function receives a name of a dataset and then create a new dataset.
Now mktindex is a clone of the GSPC dataset. Why this can be useful?
As programmers, sometimes we do not know which stocks (tickers) we will be using in our program. Then, it is a good idea to find a way to clone a dataset that has a specific name, which can be any ticker name.
We can clone 2 datasets as data1 and data2 for the stocks:
data1 =get("ORCL")data2 =get("NFLX")
To avoid writing one get function for each ticker, we can use the lapply function, which applies a function to a list of elements of a vector.
We apply the get function to the tickers vector:
datasets_list <-lapply(tickers, get)
Now we have the 3 datasets inside a list; each element of the list is an xts dataset with the stock price data.
Another faster way to do this is using the do.call function, which runs a function like merge with more than 1 argument:
prices <-do.call(merge, datasets_list)
I do this to generalize our program for any number of tickers! The do.call function is similar to the lapply function. For the do.call function, the first argument is the function, and the second argument is the list of datasets.
Now I select adjusted prices and name the columns with the ticker names:
# We select only Adjusted prices:prices <-Ad(prices)# We change the name of the columns to be equal to the tickers vector:names(prices)<-tickers
With this code, you can change the list of tickers from 2 to any set of tickers, and you will end up with a clean dataset of only adjusted prices for all the tickers.
Another much efficient way to do the previous data management is using tidyverse. Let’s see a chunk that performs all the previous processes from getSymbols to merge:
Call:
lm(formula = returns$NFLX ~ returns$GSPC)
Residuals:
Min 1Q Median 3Q Max
-0.54727 -0.04276 0.00577 0.06175 0.19953
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01343 0.01860 -0.722 0.474206
returns$GSPC 1.26228 0.33270 3.794 0.000459 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1235 on 43 degrees of freedom
Multiple R-squared: 0.2508, Adjusted R-squared: 0.2334
F-statistic: 14.4 on 1 and 43 DF, p-value: 0.0004587
We can also run these models using the reference of the columns:
# Since ORCL is in the 1st column, and the GSPC index in the 3rd column:model1 <-lm(returns[,1] ~ returns[,3])# Since NFLX is in the 2nd column, and the GSPC index in the 3rd column:model2 <-lm(returns[,2] ~ returns[,3])summary(model1)
Call:
lm(formula = returns[, 1] ~ returns[, 3])
Residuals:
Min 1Q Median 3Q Max
-0.100983 -0.034276 -0.005166 0.033795 0.166705
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.003327 0.008236 0.404 0.688
returns[, 3] 0.983316 0.147328 6.674 3.82e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.05469 on 43 degrees of freedom
Multiple R-squared: 0.5088, Adjusted R-squared: 0.4974
F-statistic: 44.55 on 1 and 43 DF, p-value: 3.819e-08
summary(model2)
Call:
lm(formula = returns[, 2] ~ returns[, 3])
Residuals:
Min 1Q Median 3Q Max
-0.54727 -0.04276 0.00577 0.06175 0.19953
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01343 0.01860 -0.722 0.474206
returns[, 3] 1.26228 0.33270 3.794 0.000459 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1235 on 43 degrees of freedom
Multiple R-squared: 0.2508, Adjusted R-squared: 0.2334
F-statistic: 14.4 on 1 and 43 DF, p-value: 0.0004587
The only thing that changes in the lm models is the column name used for the dependent variable (1 or 2); the rest of the parameters is exactly the same.
How can we run these models using a loop?
A loop helps us to execute a set of commands many times. Let’s see this example:
for (i in1:2) { model <-lm(returns[,i] ~ returns[,3])print(summary(model))}
Call:
lm(formula = returns[, i] ~ returns[, 3])
Residuals:
Min 1Q Median 3Q Max
-0.100983 -0.034276 -0.005166 0.033795 0.166705
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.003327 0.008236 0.404 0.688
returns[, 3] 0.983316 0.147328 6.674 3.82e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.05469 on 43 degrees of freedom
Multiple R-squared: 0.5088, Adjusted R-squared: 0.4974
F-statistic: 44.55 on 1 and 43 DF, p-value: 3.819e-08
Call:
lm(formula = returns[, i] ~ returns[, 3])
Residuals:
Min 1Q Median 3Q Max
-0.54727 -0.04276 0.00577 0.06175 0.19953
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01343 0.01860 -0.722 0.474206
returns[, 3] 1.26228 0.33270 3.794 0.000459 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1235 on 43 degrees of freedom
Multiple R-squared: 0.2508, Adjusted R-squared: 0.2334
F-statistic: 14.4 on 1 and 43 DF, p-value: 0.0004587
4 Use a Loop to generate information and store it in a Matrix.
An analyst made an analysis and tells you that financial leverage has a negative quadratic effect on the expected financial performance. This is a typical inverted U-shaped relationship. In other words, financial leverage has a positive effect on performance up to a certain level of leverage; after this level the relationship becomes negative: the more financial leverage, the less the firm performance. The equation is the following:
Performance=20-(leverage-4)^{2}
Leverage is measure from level 0 (no leverage) up to the maximum leverage =10. Calculate the expected firm performance for leverage levels 0 to 10 and store the results in a matrix:
# I initialize the matrix as empty:matrix_results =c()# I loop over i that will take values from 0 to 1 jumping by 1:for (i inseq(from=0,to=10, by=1)) {# I calculate the performance according to the leverage level i: performance =20- (i -4)^2# I put the leverage and performance in a vector: result =c(i,performance)# I append/ attach the result vector to the matrix# rbind stands for row bind, so I attach the result vector as a row below to whatever the matrix has # at the moment of the iteration matrix_results =rbind(matrix_results,result)}# I change the matrix to a data frame; data frames have advantages compared to matricesmatrix_results =as.data.frame(matrix_results)# I set the names for the columns:names(matrix_results) =c("leverage","performance")matrix_results
The problem here is that we run both models to create the linear regressions, but we only displayed the models, but did not save the beta information for each stock. Adjust the loop of section 4.2 to store the data in a matrix that contains B0, and B1 coefficients, standard errors and PValues for each company.
Answer the following questions: HOW CAN WE STORE THIS INFORMATION FOR BOTH STOCKS? WHICH STRUCTURE CAN YOU USE?
HOW CAN YOU SELECT THOSE STOCKS THAT HAVE A SIGNIFICANT BETA0, GREATER THA ZERO?
6 CHALLENGE 1 SOLUTION
I first write the STEPS of the algorithm:
For each stock I need to store beta0, beta1, and their standard errors and pvalues. I start initializing an empty matrix betas where I will be storing the beta results for each stock. Then, I write a for loop from i=1 to 2, and in each iteration I do the following:
Run the market regression model for stock i
Save the summary of the regression along with the coefficients matrix
From the coefficients matrix extract b0,b1,stderrb0,stderrb1,pvalueb0, pvalueb1
Save these values in a temporal vector
Accumulate this vector into the beta matrix
After the loop ends, rename the rows with the ticker names, and the columns with the corresponding naems beta0, stdb0, etc.
Finally, I identify whether any of these stocks are offering returns over the market by selecting the stocks with beta0 that are positive and significant.
# I initialize an empty vector/matrix betas: betas =c()for (i in1:2) {# I run the market regression model for stock i: model <-lm(returns[,i] ~ returns$GSPC)#print(summary(model))# I store the summary of the model in a temporal object: smodel =summary(model)# From the coefficients matrix I extract the coefficients along with their standard errors and pvalues: b0 = smodel$coefficients[1,1] b1 = smodel$coefficients[2,1] seb0 = smodel$coefficients[1,2] seb1 =smodel$coefficients[2,2] pb0 = smodel$coefficients[1,4] pb1 = smodel$coefficients[2,4]# I put the values together in a temporal vector: vectorbetas =c(b0,b1,seb0,seb1,pb0,pb1)# I bind this vector with the betas cumulative matrix: betas =rbind(betas,vectorbetas) }# Finally I rename columns and rows of the betas matrix:rownames(betas) = tickers[1:2]colnames(betas) =c("b0","b1","seb0","seb1","pvalueb0","pvalueb1")# I create a data frame from the beta matrix:betasdf =data.frame(betas)betasdf
Looking at the content of the data frame, none of these 2 stocks have a significant and positive beta0 since their pvalues are greater than 0.05. If I had not only 2 stocks, but many stocks, I can write a query with dplyr to check which stocks have significant and positive beta0:
In this case, the stocks_over_market data frame ended up empty since there is no stocks with positive and significant beta0
7 CHALLENGE 2
Based on the preliminary selection of 100 stocks on the Challenge of workshop 1:
Create a list with the ticker of each company
Using a Loop, generate the market model for each company and store the data
Based on the data of each company select a set of 10 companies based on any of these criteria:
CRITERIA A: Stocks that are SIGNIFICANTLY offering returns over the market
CRITERIA B: Stocks that are SIGNIFICANTLY less risky than the market according to the market model regressions.
8 CHALLENGE 2 SOLUTION
I will use the data management code of Workshop 1 to prepare the data for the logistic model:
# Load the datasets uspanel =read.csv("dataus2024.csv")usfirms =read.csv("firmsus2024.csv")
I calculate the period EBIT for all stocks-quarters:
# I check the cases with sgae=NA and revenue>0: uspanel |>filter(q=="2024q2", !is.na(revenue), is.na(sgae)) |>select(firm,q,revenue,cogs,sgae)
firm q revenue cogs sgae
1 AFL 2024q2 10575000 0 NA
2 AIG 2024q2 13323000 0 NA
3 AJX 2024q2 -28436 54044 NA
4 ALL 2024q2 30973000 0 NA
5 AMSF 2024q2 156319 0 NA
6 AMTB 2024q2 324942 273607 NA
7 ARR 2024q2 271405 305239 NA
8 ASB 2024q2 1183806 932729 NA
9 AUBN 2024q2 20596 16641 NA
10 AXP 2024q2 36204000 26732000 NA
11 BAC 2024q2 96600000 78951000 NA
12 BCBP 2024q2 97605 80773 NA
13 BFH 2024q2 2418000 1438000 NA
14 BFIN 2024q2 37737 32798 NA
15 BK 2024q2 19542000 16664000 NA
16 BOH 2024q2 508258 410187 NA
17 BOKF 2024q2 1738434 1404510 NA
18 BWB 2024q2 122860 100648 NA
19 BXMT 2024q2 269165 63041 NA
20 C 2024q2 86453000 72758000 NA
21 CCBG 2024q2 133289 98574 NA
22 CFB 2024q2 253627 203081 NA
23 CFFI 2024q2 81836 65557 NA
24 CFR 2024q2 1404352 1038203 NA
25 CINF 2024q2 5479000 0 NA
26 CMA 2024q2 2517000 2067000 NA
27 CNA 2024q2 6963000 0 NA
28 CNOB 2024q2 267861 212534 NA
29 COF 2024q2 26331000 17506000 NA
30 CPF 2024q2 173607 129861 NA
31 CRBG 2024q2 9546000 0 NA
32 DCOM 2024q2 342702 281600 NA
33 DFS 2024q2 11656000 6946000 NA
34 DX 2024q2 199844 168029 NA
35 EFSC 2024q2 447019 328628 NA
36 EIG 2024q2 440100 0 NA
37 EQH 2024q2 5740000 0 NA
38 ERIE 2024q2 357926 0 NA
39 FAF 2024q2 3036900 0 NA
40 FBIZ 2024q2 127875 100864 NA
41 FCBC 2024q2 91419 57560 NA
42 FFIN 2024q2 363818 228605 NA
43 FHN 2024q2 2546000 1927000 NA
44 FIBK 2024q2 733400 564000 NA
45 FLIC 2024q2 89680 79505 NA
46 FNM_old 2024q2 75100000 64543000 NA
47 FRE_old 2024q2 59507000 52018000 NA
48 FRME 2024q2 530253 404995 NA
49 GBCI 2024q2 615429 513073 NA
50 GNW 2024q2 3633000 0 NA
51 GS 2024q2 63089000 53336000 NA
52 HBAN 2024q2 5814000 4511000 NA
53 HBCP 2024q2 96888 73966 NA
54 HBT 2024q2 140021 92846 NA
55 HIG 2024q2 12905000 0 NA
56 HOMB 2024q2 728791 452487 NA
57 HTGC 2024q2 217855 85039 NA
58 HTH 2024q2 791680 715973 NA
59 IBOC 2024q2 513523 243673 NA
60 ISTR 2024q2 77010 67871 NA
61 JPM 2024q2 142257000 96593000 NA
62 KEY 2024q2 5394000 4580000 NA
63 KNSL 2024q2 757344 0 NA
64 KREF 2024q2 315704 255682 NA
65 L 2024q2 8498000 0 NA
66 LADR 2024q2 258691 198479 NA
67 LMND 2024q2 241100 0 NA
68 LNC 2024q2 9269000 0 NA
69 MBWM 2024q2 176152 120850 NA
70 MCB 2024q2 241239 191005 NA
71 MCY 2024q2 2579080 0 NA
72 MET 2024q2 33880000 0 NA
73 MKL 2024q2 8168498 0 NA
74 MTG 2024q2 599638 0 NA
75 MYFW 2024q2 90701 83301 NA
76 NAVI 2024q2 2252000 2090000 NA
77 NBHC 2024q2 295902 222484 NA
78 NLY 2024q2 2775496 2308167 NA
79 NMFC 2024q2 185144 108916 NA
80 NMIH 2024q2 318375 0 NA
81 NRIM 2024q2 90138 68049 NA
82 NTRS 2024q2 8263400 6799700 NA
83 NYMT 2024q2 81792 185701 NA
84 OCFC 2024q2 344299 269833 NA
85 OMF 2024q2 2746000 1444000 NA
86 ONB 2024q2 1424437 1060081 NA
87 ORC 2024q2 128037 113240 NA
88 ORI 2024q2 3887600 0 NA
89 PEBO 2024q2 307846 222333 NA
90 PFG 2024q2 8364100 0 NA
91 PMT 2024q2 488771 427781 NA
92 PNC 2024q2 17058000 13193000 NA
93 PPBI 2024q2 465481 340164 NA
94 PRK 2024q2 310538 214501 NA
95 PRU 2024q2 38392000 0 NA
96 RDN 2024q2 640565 0 NA
97 RGA 2024q2 11215000 0 NA
98 RLI 2024q2 861273 0 NA
99 SBSI 2024q2 228225 172634 NA
100 SFST 2024q2 105096 97479 NA
101 SIGI 2024q2 2360964 0 NA
102 SLM 2024q2 1620997 866094 NA
103 SNV 2024q2 1573989 1354650 NA
104 SOV_old 2024q2 8692481 7090498 NA
105 SRCE 2024q2 283014 190749 NA
106 SSB 2024q2 1195162 852221 NA
107 STBA 2024q2 282654 197576 NA
108 STEL 2024q2 312314 239700 NA
109 STT 2024q2 10728000 9218000 NA
110 SYF 2024q2 10850000 4723000 NA
111 TCBI 2024q2 931189 798657 NA
112 TFIN 2024q2 241128 222726 NA
113 THG 2024q2 3087800 0 NA
114 TIPT 2024q2 1044894 0 NA
115 TRTX 2024q2 184198 142430 NA
116 TRV 2024q2 22511000 0 NA
117 TWO 2024q2 686827 400731 NA
118 UBFO 2024q2 32421 20325 NA
119 UMBF 2024q2 1362510 1077676 NA
120 UNM 2024q2 6433700 0 NA
121 UVE 2024q2 748173 0 NA
122 UVSP 2024q2 245016 194729 NA
123 VBTX 2024q2 393074 312049 NA
124 VEL 2024q2 236625 192296 NA
125 VOYA 2024q2 4084000 0 NA
126 WAFD 2024q2 1042814 849924 NA
127 WAL 2024q2 2447600 1915600 NA
128 WFC 2024q2 63126000 49205000 NA
129 WRB 2024q2 6570805 0 NA
130 WSBC 2024q2 460311 367352 NA
131 WSFS 2024q2 694314 481972 NA
132 WTBA 2024q2 96783 83220 NA
uspanel = uspanel |>mutate(sgae =ifelse(!is.na(revenue) &is.na(sgae),0,sgae),# for some reason there are some cases of firms with revenue and cogs, but with sgae=NA;# this should be the case for service firms, so I will replace sgae=0 when revenue>0;# if I do not replace sgae=0 for these cases, then YTDebit cannot be calculated! YTDebit = revenue - cogs - sgae, ebitp =ifelse(fiscalq==1,YTDebit, YTDebit -lag(YTDebit)))
I create a column for the industry in the uspanel dataset by pulling the industry from the usfirms dataset:
# I create a temporal usfirms1 dataset with only 2 columns. I rename the empresa column with firm to have the same name for the ticker in both files:usfirms1 = usfirms |>mutate(firm = empresa) |>select(firm,naics1, naics2)# I merge the usfirms1 with the uspanel to pull the industry:uspanel =left_join(uspanel,usfirms1,by='firm')
I calculate annual stock return for all firms-quarters:
uspanel = uspanel |># I sort by firm-quarter:arrange(firm,q) |># I group by firm to make sure that the lag function works only on the # quarters of the firm : group_by(firm) |># I create market value, operating earnings per share deflated by stock price, and stock annual return:mutate(marketvalue = originalprice * sharesoutstanding,oepsp = (YTDebit / sharesoutstanding) / originalprice,# The annual return is equal to the current stock price minus its# stock price of 4 quarters ago:r =log(adjprice) - dplyr::lag(log(adjprice),4) ) |># I do ungroup to undo the group_byungroup()
For each quarter-industry I calculate the median return and the binary variable rabove that indicates whether the stock returns beats the industry returns:
I calculate another variable equal to the rabove, but 1 quarter in the future (1 lag in the future):
uspanel = uspanel |>group_by(firm) |>mutate(F1rabove =lead(rabove,1))# The lead function gets the future value of the variable# I check how many rows/cases have 1 and 0's:table(uspanel$F1rabove)
0 1
115114 114211
I run the logit model to examine whether oepspw is related to the probability of a stock to beat its industry return 1 quarter in the future:
Call:
glm(formula = F1rabove ~ oepspw, family = "binomial", data = uspanel,
na.action = na.omit)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.020277 0.004256 -4.764 1.9e-06 ***
oepspw 0.884870 0.017116 51.699 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 311904 on 224995 degrees of freedom
Residual deviance: 308217 on 224994 degrees of freedom
(332185 observations deleted due to missingness)
AIC: 308221
Number of Fisher Scoring iterations: 4
coefficients =coef(logitm1)# The coefficients for the odd ratio interpretation are:exp(0.1*coefficients)
(Intercept) oepspw
0.9979744 1.0925201
9 Prediction with the logit model
We can use the last model to predict the probability whether the firm will beat the return of the industry 1 year in the future. For this prediction, it might be a good idea to select only the firms in the last quarter of the dataset (2024Q2), so we can predict which firm will beat the industry in 2025:
The type=c(“response”) parameter indicates to get the prediction in probability. If I do not specify this option, the prediction is calculated in the logarithm of ODDS RATIO.
9.1 Selection of best stocks based on results of the logit model
We can sort the firms according to the predicted probability of beating the benchmark, and then select the top 100 with the highest probability:
top110<- uspanel2024 |>merge(usfirms1,by="firm") |># merge with firms1 to pull firm name and industryarrange(desc(pred)) |># sort by probability, from highest to lowest#slice_head(prop=0.05) |> # select the top 5% firms head(n=110)# Show the # of firms selected:nrow(top110)
[1] 110
I extract the 100 firms (tickers) column and put it into a char vector:
tickers =as.vector(top110$firm)
I selected the top 110 just in case there are some tickers that do not exist in Yahoo.
Now I write a for loop to get the historical prices of the 100 tickers. Here is a difficult problem to solve related to data validation. It is possible that some of the tickers are not in Yahoo Finance. If I use getSymbols with the ticker vector and it is one ticker that does not exist in Yahoo, then the getSymbols will stop running. Then, I can write a loop to run the getSymbols for each ticker and detect when a ticker is not in Yahoo.
The tryCatch function is used to run a set of R lines of code in the try mode, so if there is an error in any R code, R DOES NOT STOP RUNNING AND I can cach the error and do a specific action:
# I do a loop to run getSymbols and in each iteration I can detect which one has an error because# it does not exist in Yahoo Finance:# I initialize an empty list where I will be storing the tickers that exist in Yahoo Finance:finaltickerlist =c()# I initilize a counter i=0 and tottickers=0. # The counter i will be the counter of ticker #, while the counter tottickers # will count the tickers that are found in Yahoo to know when to stop the loop after finding 100 tickers in Yahoo:i=0tottickers =0library(quantmod)#for (t in tickers) {# I will use a while loop instead of a for to make sure that I get exactly 100 tickers:while (tottickers<100& i<110) {# I increment i in 1: i = i +1# I save the ticker i in t: t = tickers[i]tryCatch( {getSymbols(t, from="2020-01-01", to="2024-10-31",periodicity="monthly")# Since the ticker was found in Yahoo Finance, then: tottickers=tottickers +1# I add the ticker found in Yahoo into the finaltickerlist vector: finaltickerlist =c(finaltickerlist,t)cat("The ticker ", t, " was found in Yahoo \n")if (tottickers==1) {# If I am in the ticker 1 I only get the Adjusted price, but I do not merge# the historical prices with the prices dataset: temp =get(t)# prices will be the final big dataset with all prices for all stocks: prices =Ad(temp) }else { # if tottickers>1, so this means that I already created the price dataset, # so I need to do a merge: temp=get(t) prices =merge(prices,Ad(temp)) } }, error =function(e) {cat("THE TICKER ",t," WAS NOT FOUND IN YAHOO FINANCE \n") } )}
The ticker AJG was found in Yahoo
The ticker APA was found in Yahoo
The ticker LNC was found in Yahoo
The ticker TIPT was found in Yahoo
The ticker UVE was found in Yahoo
The ticker GNW was found in Yahoo
THE TICKER OCN WAS NOT FOUND IN YAHOO FINANCE
The ticker CYH was found in Yahoo
The ticker PRU was found in Yahoo
The ticker MCY was found in Yahoo
The ticker RGA was found in Yahoo
The ticker ALL was found in Yahoo
The ticker THG was found in Yahoo
The ticker MET was found in Yahoo
The ticker UNM was found in Yahoo
The ticker ATUS was found in Yahoo
THE TICKER VIA__old WAS NOT FOUND IN YAHOO FINANCE
The ticker VOYA was found in Yahoo
The ticker CNA was found in Yahoo
The ticker FAF was found in Yahoo
The ticker CRBG was found in Yahoo
The ticker L was found in Yahoo
The ticker ORI was found in Yahoo
The ticker TRV was found in Yahoo
The ticker PFG was found in Yahoo
The ticker BFH was found in Yahoo
The ticker HIG was found in Yahoo
The ticker EQH was found in Yahoo
The ticker SIGI was found in Yahoo
The ticker EIG was found in Yahoo
The ticker UNIT was found in Yahoo
The ticker MKL was found in Yahoo
The ticker WRB was found in Yahoo
The ticker VNCE was found in Yahoo
The ticker SYF was found in Yahoo
The ticker THC was found in Yahoo
The ticker CINF was found in Yahoo
The ticker DTIL was found in Yahoo
The ticker ILPT was found in Yahoo
The ticker WLFC was found in Yahoo
The ticker AIG was found in Yahoo
The ticker VGZ was found in Yahoo
The ticker RM was found in Yahoo
The ticker BIPC was found in Yahoo
The ticker OMF was found in Yahoo
The ticker TWO was found in Yahoo
The ticker AFL was found in Yahoo
The ticker LMND was found in Yahoo
The ticker OI was found in Yahoo
The ticker AAL was found in Yahoo
The ticker AMSF was found in Yahoo
The ticker SLM was found in Yahoo
The ticker COF was found in Yahoo
The ticker ENVA was found in Yahoo
The ticker SABR was found in Yahoo
The ticker EZPW was found in Yahoo
The ticker CHTR was found in Yahoo
The ticker DIT was found in Yahoo
The ticker DFS was found in Yahoo
The ticker HZO was found in Yahoo
The ticker GMS was found in Yahoo
The ticker RDN was found in Yahoo
The ticker KEQU was found in Yahoo
The ticker GM was found in Yahoo
The ticker JACK was found in Yahoo
The ticker HOV was found in Yahoo
The ticker DAKT was found in Yahoo
The ticker RLI was found in Yahoo
The ticker SR was found in Yahoo
The ticker AMWD was found in Yahoo
The ticker GBX was found in Yahoo
The ticker JBL was found in Yahoo
The ticker UAL was found in Yahoo
The ticker AL was found in Yahoo
The ticker MIND was found in Yahoo
The ticker DLHC was found in Yahoo
The ticker INN was found in Yahoo
The ticker GPI was found in Yahoo
The ticker PDCO was found in Yahoo
The ticker CIVI was found in Yahoo
The ticker NMIH was found in Yahoo
The ticker CABO was found in Yahoo
The ticker CZR was found in Yahoo
The ticker C was found in Yahoo
The ticker SJM was found in Yahoo
The ticker PHM was found in Yahoo
The ticker VLGEA was found in Yahoo
The ticker TNL was found in Yahoo
The ticker OPY was found in Yahoo
The ticker FDP was found in Yahoo
The ticker MCB was found in Yahoo
The ticker MHO was found in Yahoo
The ticker VGR was found in Yahoo
The ticker TRN was found in Yahoo
The ticker DLX was found in Yahoo
The ticker MTG was found in Yahoo
The ticker CALM was found in Yahoo
The ticker SNCY was found in Yahoo
The ticker MTN was found in Yahoo
The ticker NAVI was found in Yahoo
The ticker COOP was found in Yahoo
The ticker CFFI was found in Yahoo
I check whether I ended up with a price dataset of 100 columns, one for each ticker:
ncol(prices)
[1] 100
I create a dataset for returns of these 100 stocks:
returns =diff(log(prices))# finaltickerlist has the tickers that were found in Yahoo Finance:colnames(returns) = finaltickerlist
Now I get the price data of the market index and calculate its monthly returns:
I now run a loop to estimate all market regression models and store them in a data frame:
# I initialize an empty vector/matrix betas: betas =c()for (i in1:ncol(returns)) {# I run the market regression model for stock i: model <-lm(returns[,i] ~ sp500ret$GSPC)#print(summary(model))# I store the summary of the model in a temporal object: smodel =summary(model)# From the coefficients matrix I extract the coefficients along with their standard errors and pvalues: b0 = smodel$coefficients[1,1] b1 = smodel$coefficients[2,1] seb0 = smodel$coefficients[1,2] seb1 =smodel$coefficients[2,2] pb0 = smodel$coefficients[1,4] pb1 = smodel$coefficients[2,4]# I save the number of valid observations (months) used in the regression: N = model$df.residual +2# I save N to later drop those tickers with very few history available since their beta estimations will not be reliable# I put the values together in a temporal vector: vectorbetas =c(b0,b1,seb0,seb1,pb0,pb1, N)# I bind this vector with the betas cumulative matrix: betas =rbind(betas,vectorbetas) }# Finally I rename columns and rows of the betas matrix:# I use the finaltickerlist since it has the tickers that where FOUND IN YAHOO FINANCE:rownames(betas) = finaltickerlist[1:ncol(returns)]colnames(betas) =c("b0","b1","seb0","seb1","pb0","pb1","N")# I create a data frame from the beta matrix:betasdf =data.frame(betas)
betasdf
b0
b1
seb0
seb1
pb0
pb1
N
AJG
0.0113380
0.7393096
0.0071630
0.1330144
0.1191870
0.0000008
57
APA
-0.0345240
3.3370945
0.0360089
0.6686753
0.3418732
0.0000064
57
LNC
-0.0223796
1.8342301
0.0159093
0.2954323
0.1651429
0.0000001
57
TIPT
0.0087913
1.1767830
0.0165935
0.3081359
0.5983811
0.0003425
57
UVE
-0.0089182
0.9499332
0.0154648
0.2871777
0.5665114
0.0016617
57
GNW
-0.0001226
0.8838090
0.0169328
0.3144380
0.9942496
0.0068344
57
CYH
-0.0185765
1.7556430
0.0313874
0.5828554
0.5563784
0.0039148
57
PRU
-0.0035397
1.2985852
0.0081189
0.1507666
0.6645560
0.0000000
57
MCY
0.0006371
0.8561270
0.0099851
0.1854206
0.9493542
0.0000238
57
RGA
-0.0006142
0.9260017
0.0115070
0.2136816
0.9576226
0.0000629
57
ALL
0.0052743
0.4791604
0.0087641
0.1627479
0.5497784
0.0047364
57
THG
-0.0041453
0.7352850
0.0080893
0.1502162
0.6103921
0.0000090
57
MET
0.0005412
1.0361110
0.0099504
0.1847764
0.9568183
0.0000007
57
UNM
0.0107548
0.8126791
0.0128720
0.2390300
0.4070415
0.0012612
57
ATUS
-0.0584363
1.5950116
0.0214332
0.3980079
0.0085722
0.0001862
57
VOYA
-0.0038743
1.0277517
0.0078811
0.1463499
0.6249618
0.0000000
57
CNA
0.0007392
0.6721973
0.0073793
0.1370322
0.9205692
0.0000087
57
FAF
-0.0103280
1.3776770
0.0067747
0.1258053
0.1331174
0.0000000
57
CRBG
0.0114968
0.6665792
0.0187263
0.4615248
0.5455533
0.1627472
24
L
-0.0005005
0.8390366
0.0067960
0.1261994
0.9415607
0.0000000
57
ORI
0.0042975
0.9366377
0.0082018
0.1523056
0.6024027
0.0000001
57
TRV
0.0063405
0.6503174
0.0074961
0.1392006
0.4013040
0.0000197
57
PFG
-0.0010812
1.2059120
0.0091295
0.1695326
0.9061577
0.0000000
57
BFH
-0.0287154
2.1501943
0.0216137
0.4013606
0.1894753
0.0000017
57
HIG
0.0032758
0.9601577
0.0093990
0.1745376
0.7287737
0.0000010
57
EQH
-0.0006974
1.4085885
0.0098480
0.1828744
0.9438045
0.0000000
57
SIGI
0.0009735
0.5749528
0.0084151
0.1562667
0.9083283
0.0005334
57
EIG
0.0033086
0.2181556
0.0106267
0.1973353
0.7567107
0.2737523
57
UNIT
-0.0115029
1.4583343
0.0230401
0.4278492
0.6195911
0.0012288
57
MKL
-0.0029650
0.7759603
0.0075975
0.1410843
0.6978552
0.0000010
57
WRB
0.0048429
0.6560227
0.0082789
0.1537367
0.5609585
0.0000787
57
VNCE
-0.0558057
1.8868295
0.0346208
0.6428990
0.1127056
0.0048607
57
SYF
-0.0054996
1.7062420
0.0127636
0.2370161
0.6682390
0.0000000
57
THC
0.0064087
2.1458508
0.0155155
0.2881181
0.6811760
0.0000000
57
CINF
0.0003360
0.7061352
0.0105423
0.1957681
0.9746898
0.0006687
57
DTIL
-0.0716242
1.2937344
0.0287758
0.5343592
0.0158637
0.0188011
57
ILPT
-0.0476199
1.7844570
0.0226104
0.4198693
0.0397732
0.0000833
57
WLFC
0.0080878
1.1833150
0.0222588
0.4133396
0.7177353
0.0059303
57
AIG
-0.0025594
1.2047644
0.0124726
0.2316120
0.8381706
0.0000030
57
VGZ
-0.0169125
1.6111618
0.0187612
0.3483901
0.3712742
0.0000232
57
RM
-0.0133970
1.6511920
0.0163004
0.3026945
0.4146951
0.0000012
57
BIPC
-0.0060592
1.3036751
0.0118454
0.2457618
0.6111498
0.0000024
54
OMF
-0.0023353
1.5450776
0.0139289
0.2586556
0.8674655
0.0000002
57
TWO
-0.0439982
2.4489830
0.0205205
0.3810600
0.0364597
0.0000000
57
AFL
0.0048980
0.9596290
0.0078087
0.1450049
0.5330902
0.0000000
57
LMND
-0.0343659
1.6666532
0.0302652
0.6291802
0.2618073
0.0108976
50
OI
-0.0143482
1.2178123
0.0149605
0.2778133
0.3417204
0.0000531
57
AAL
-0.0272285
1.5097844
0.0155172
0.2881509
0.0848761
0.0000026
57
AMSF
-0.0010850
0.3538947
0.0100081
0.1858480
0.9140621
0.0621182
57
SLM
0.0015965
1.2379177
0.0121828
0.2262315
0.8962208
0.0000011
57
COF
-0.0049734
1.5078013
0.0122869
0.2281643
0.6872175
0.0000000
57
ENVA
0.0075924
1.4216987
0.0150296
0.2790961
0.6154638
0.0000044
57
SABR
-0.0531091
2.0126269
0.0260138
0.4830690
0.0460022
0.0001102
57
EZPW
0.0009183
0.9842858
0.0141193
0.2621924
0.9483806
0.0004213
57
CHTR
-0.0188152
1.0789512
0.0120383
0.2235479
0.1238023
0.0000115
57
DIT
0.0048551
0.4953523
0.0161000
0.2989720
0.7641270
0.1032439
57
DFS
-0.0010414
1.4975458
0.0131692
0.2445494
0.9372593
0.0000001
57
HZO
-0.0125688
1.9216400
0.0165536
0.3073959
0.4509275
0.0000001
57
GMS
0.0042707
1.7003253
0.0112299
0.2085371
0.7051886
0.0000000
57
RDN
-0.0024941
1.1346986
0.0112744
0.2093626
0.8257416
0.0000014
57
KEQU
0.0088525
0.7396444
0.0179389
0.3331211
0.6236399
0.0305344
57
GM
-0.0068359
1.4796292
0.0122657
0.2277702
0.5795693
0.0000000
57
JACK
-0.0270817
1.9923884
0.0161809
0.3004750
0.0998722
0.0000000
57
HOV
0.0050693
2.8768282
0.0299148
0.5555090
0.8660591
0.0000033
57
DAKT
0.0027270
1.1335075
0.0165543
0.3074088
0.8697611
0.0005202
57
RLI
0.0079262
0.3608798
0.0094364
0.1752318
0.4045696
0.0441982
57
SR
-0.0068911
0.5478912
0.0073282
0.1360831
0.3511486
0.0001751
57
AMWD
-0.0210336
1.7696686
0.0142693
0.2649776
0.1461706
0.0000000
57
GBX
0.0042405
1.4256692
0.0156000
0.2896870
0.7867707
0.0000082
57
JBL
0.0081466
1.2517571
0.0098525
0.1829589
0.4118946
0.0000000
57
UAL
-0.0147082
1.5492610
0.0183997
0.3416773
0.4275133
0.0000317
57
AL
-0.0144037
1.6639299
0.0120150
0.2231146
0.2357412
0.0000000
57
MIND
-0.0944896
1.7749395
0.0479676
0.8907463
0.0538982
0.0512723
57
DLHC
-0.0006415
1.1622485
0.0181187
0.3364592
0.9718843
0.0010695
57
INN
-0.0299584
2.1157988
0.0155649
0.2890368
0.0594413
0.0000000
57
GPI
0.0083219
1.4810674
0.0143332
0.2661637
0.5638814
0.0000008
57
PDCO
-0.0082330
1.0626405
0.0117684
0.2185363
0.4871346
0.0000101
57
CIVI
0.0081122
1.4536967
0.0187859
0.3488485
0.6675553
0.0001099
57
NMIH
-0.0099721
1.3334456
0.0147531
0.2739616
0.5019157
0.0000099
57
CABO
-0.0356232
0.8512585
0.0120768
0.2242626
0.0046638
0.0003689
57
CZR
-0.0397440
3.2698129
0.0243790
0.4527115
0.1087619
0.0000000
57
C
-0.0140461
1.4632023
0.0104668
0.1943659
0.1851203
0.0000000
57
SJM
0.0019157
0.2254668
0.0077158
0.1432808
0.8048457
0.1213152
57
PHM
0.0020400
1.7541107
0.0124673
0.2315148
0.8706258
0.0000000
57
VLGEA
0.0048138
0.2857615
0.0090340
0.1677588
0.5962813
0.0941369
57
TNL
-0.0147136
1.7793744
0.0129833
0.2410970
0.2620176
0.0000000
57
OPY
0.0034775
1.1347066
0.0107546
0.1997094
0.7476556
0.0000005
57
FDP
-0.0014145
0.3716580
0.0119085
0.2211380
0.9058802
0.0985002
57
MCB
-0.0102722
1.1753253
0.0186966
0.3471917
0.5849451
0.0013182
57
MHO
-0.0024442
2.3977187
0.0165943
0.3081516
0.8834399
0.0000000
57
VGR
0.0040314
1.0809234
0.0111707
0.2074369
0.7195642
0.0000029
57
TRN
-0.0011655
1.3427481
0.0109939
0.2041545
0.9159580
0.0000000
57
DLX
-0.0274098
1.4926078
0.0126830
0.2355206
0.0350550
0.0000000
57
MTG
-0.0016569
1.4118011
0.0116711
0.2167298
0.8876271
0.0000000
57
CALM
0.0198805
-0.1541520
0.0108815
0.2020658
0.0731274
0.4487946
57
SNCY
-0.0363092
1.4843406
0.0200047
0.4175285
0.0770229
0.0009883
42
MTN
-0.0161188
1.1945487
0.0090680
0.1683895
0.0810054
0.0000000
57
NAVI
-0.0115599
1.5118823
0.0109989
0.2042466
0.2978506
0.0000000
57
COOP
0.0197037
1.4805047
0.0141093
0.2620058
0.1681725
0.0000006
57
CFFI
0.0031484
0.3621739
0.0115736
0.2149178
0.7866149
0.0976215
57
Based on the data of each company select a set of 10 companies based on any of these criteria:
CRITERIA A: Stocks that are SIGNIFICANTLY offering returns over the market
CRITERIA B: Stocks that are SIGNIFICANTLY less risky than the market according to the market model regressions.
There is only 0 stock(s) that have a positive and significant beta0.
Filtering the stocks according to CRITERIA B:
# I first add the 95% minimum and the maximum values of the C.I. for beta1:betasdf <- betasdf |>mutate(minb1= b1 -2*stderr_b1,maxb1= b1 +2*stderr_b1)# I filter those that might be significantly less risky than the market:less_risky_market <- betasdf |>filter(maxb1<1) |>arrange(b1)less_risky_market
There is only 1 stock(s) that are significantly less risky than the market.
Applying these 2 criteria I only got 2 stocks. If I want to get the best 10 stocks with these criteria, I can relax both criteria and do a filter by steps. Let’s do the following:
I will first use beta0 information to select the best 20 stocks. I can sort all stocks by the minimum of the 95% CI of beta0 (by the maximum to the minimum value):
# I first add the 95% minimum and the maximum values of the C.I. for beta1:betasdf <- betasdf |>mutate(minb0= b0 -2*stderr_b0,maxb0= b0 +2*stderr_b0)
# I do the first filter selecting 20 stocks based on the min of b0. # I will add the restriction that the estimation of the market regression model had more than 30 months, to ensure that the beta estimations are more stable:filter1 <- betasdf |>filter(N>30) |>arrange(desc(minb0)) |>head(20)
filter1
b0
b1
seb0
seb1
pb0
pb1
N
minb1
maxb1
minb0
maxb0
CALM
0.0198805
-0.1541520
0.0108815
0.2020658
0.0731274
0.4487946
57
-1.0077970
0.699493
-0.0278422
0.0676031
COOP
0.0197037
1.4805047
0.0141093
0.2620058
0.1681725
0.0000006
57
0.6268597
2.334150
-0.0280189
0.0674264
AJG
0.0113380
0.7393096
0.0071630
0.1330144
0.1191870
0.0000008
57
-0.1143354
1.592955
-0.0363847
0.0590607
UNM
0.0107548
0.8126791
0.0128720
0.2390300
0.4070415
0.0012612
57
-0.0409659
1.666324
-0.0369679
0.0584775
KEQU
0.0088525
0.7396444
0.0179389
0.3331211
0.6236399
0.0305344
57
-0.1140005
1.593289
-0.0388702
0.0565751
TIPT
0.0087913
1.1767830
0.0165935
0.3081359
0.5983811
0.0003425
57
0.3231380
2.030428
-0.0389314
0.0565139
GPI
0.0083219
1.4810674
0.0143332
0.2661637
0.5638814
0.0000008
57
0.6274224
2.334712
-0.0394008
0.0560445
JBL
0.0081466
1.2517571
0.0098525
0.1829589
0.4118946
0.0000000
57
0.3981121
2.105402
-0.0395761
0.0558692
CIVI
0.0081122
1.4536967
0.0187859
0.3488485
0.6675553
0.0001099
57
0.6000517
2.307342
-0.0396104
0.0558349
WLFC
0.0080878
1.1833150
0.0222588
0.4133396
0.7177353
0.0059303
57
0.3296700
2.036960
-0.0396349
0.0558104
RLI
0.0079262
0.3608798
0.0094364
0.1752318
0.4045696
0.0441982
57
-0.4927652
1.214525
-0.0397965
0.0556488
ENVA
0.0075924
1.4216987
0.0150296
0.2790961
0.6154638
0.0000044
57
0.5680537
2.275344
-0.0401303
0.0553151
THC
0.0064087
2.1458508
0.0155155
0.2881181
0.6811760
0.0000000
57
1.2922058
2.999496
-0.0413140
0.0541313
TRV
0.0063405
0.6503174
0.0074961
0.1392006
0.4013040
0.0000197
57
-0.2033276
1.503962
-0.0413821
0.0540632
ALL
0.0052743
0.4791604
0.0087641
0.1627479
0.5497784
0.0047364
57
-0.3744846
1.332805
-0.0424484
0.0529970
HOV
0.0050693
2.8768282
0.0299148
0.5555090
0.8660591
0.0000033
57
2.0231832
3.730473
-0.0426534
0.0527919
AFL
0.0048980
0.9596290
0.0078087
0.1450049
0.5330902
0.0000000
57
0.1059840
1.813274
-0.0428247
0.0526207
DIT
0.0048551
0.4953523
0.0161000
0.2989720
0.7641270
0.1032439
57
-0.3582927
1.348997
-0.0428676
0.0525777
WRB
0.0048429
0.6560227
0.0082789
0.1537367
0.5609585
0.0000787
57
-0.1976223
1.509668
-0.0428797
0.0525656
VLGEA
0.0048138
0.2857615
0.0090340
0.1677588
0.5962813
0.0941369
57
-0.5678835
1.139406
-0.0429089
0.0525365
After this first screening using beta0 information, I now use the information of beta1 to further select only 10 stocks.
For beta1 I will consider conservative stocks that are not too risky compared to the market, assuming that I will create a portfolio in the sort-term:
filter2 <- filter1 |>arrange(maxb1) |>head(10)
filter2
b0
b1
seb0
seb1
pb0
pb1
N
minb1
maxb1
minb0
maxb0
CALM
0.0198805
-0.1541520
0.0108815
0.2020658
0.0731274
0.4487946
57
-1.0077970
0.699493
-0.0278422
0.0676031
VLGEA
0.0048138
0.2857615
0.0090340
0.1677588
0.5962813
0.0941369
57
-0.5678835
1.139406
-0.0429089
0.0525365
RLI
0.0079262
0.3608798
0.0094364
0.1752318
0.4045696
0.0441982
57
-0.4927652
1.214525
-0.0397965
0.0556488
ALL
0.0052743
0.4791604
0.0087641
0.1627479
0.5497784
0.0047364
57
-0.3744846
1.332805
-0.0424484
0.0529970
DIT
0.0048551
0.4953523
0.0161000
0.2989720
0.7641270
0.1032439
57
-0.3582927
1.348997
-0.0428676
0.0525777
TRV
0.0063405
0.6503174
0.0074961
0.1392006
0.4013040
0.0000197
57
-0.2033276
1.503962
-0.0413821
0.0540632
WRB
0.0048429
0.6560227
0.0082789
0.1537367
0.5609585
0.0000787
57
-0.1976223
1.509668
-0.0428797
0.0525656
AJG
0.0113380
0.7393096
0.0071630
0.1330144
0.1191870
0.0000008
57
-0.1143354
1.592955
-0.0363847
0.0590607
KEQU
0.0088525
0.7396444
0.0179389
0.3331211
0.6236399
0.0305344
57
-0.1140005
1.593289
-0.0388702
0.0565751
UNM
0.0107548
0.8126791
0.0128720
0.2390300
0.4070415
0.0012612
57
-0.0409659
1.666324
-0.0369679
0.0584775
These will be my selected stocks based on their beta0 and beta1 information. These stocks can be candidates to include them in a winning portfolio.
I can show the name of these firms and their industry: