Whenever the unfortunate idea of investing in the stock market forms in my head, I start by playing with the Google Finance stock screener.
This time, instead of burning my money, I thought my time would be better spent building a stock screener of my own.
What follows is a simple tutorial on how to do just that. It is the first in a series of posts on stocks and stock screening.
library(rvest)
library(magrittr)
library(stringr)
library(dplyr)
library(ggplot2)
get_symbols <- function(letter){
  url <- sprintf(paste0("https://en.wikipedia.org/wiki/",
                        "Companies_listed_on_the_Toronto_Stock_Exchange_(%s)"),
                 letter)
  html <- read_html(url)
  df <- html %>% html_nodes("table") %>% extract2(2) %>% html_table()
  colnames(df) <- c("stock", "symbol")
  df$link <- paste0("http://web.tmxmoney.com/quote.php?qm_symbol=",
                    df$symbol)
  df
}
#loop over letters to get all stocks
all_stocks <- lapply(toupper(letters), get_symbols)
#put all results in a dataframe
stocks_df <- do.call(rbind, all_stocks)
head(stocks_df)
Source: http://web.tmxmoney.com
stock_info_basic <- function(symbol){
  print(symbol)  # show progress while scraping
  url <- paste0("http://web.tmxmoney.com/quote.php?qm_symbol=", symbol)
  df <- NULL
  # if the page cannot be read or a field is missing, try() catches the
  # error and df stays NULL
  try({
    html <- read_html(url)
    outer_table <- html %>% html_nodes(".quote-tabs-content table")
    info_tables <- outer_table %>% html_nodes("table") %>% html_table()
    df <- data.frame(
      symbol=symbol,
      beta=(info_tables[[1]] %>% filter(X1=="Beta:"))[,2],
      dividend=(info_tables[[3]] %>% filter(X1=="Dividend:"))[,2] %>%
        str_extract("\\d+\\.\\d+") %>% as.numeric(),
      div_freq=(info_tables[[3]] %>% filter(X1=="Div. Frequency:"))[,2],
      PE=(info_tables[[3]] %>% filter(X1=="P/E Ratio:"))[,2],
      EPS=(info_tables[[3]] %>% filter(X1=="EPS:"))[,2],
      yield=(info_tables[[4]] %>% filter(X1=="Yield:"))[,2],
      market_cap=(info_tables[[4]] %>% filter(X1=="Market Cap:"))[,2] %>%
        str_replace_all(",", "") %>% as.numeric(),
      PB=(info_tables[[4]] %>% filter(X1=="P/B Ratio:"))[,2]
    )
  })
  df
}
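Notice that sometimes we don’t find the information for a given stock symbol. In that case, try() catches the error and the function returns NULL, which rbind() simply ignores when the results are combined.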
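To see why a failed lookup does not break the combining step, here is a minimal offline sketch. The function `fake_stock_info` and the symbols are made up for illustration (they stand in for `stock_info_basic` and real tickers); the point is only that a NULL result is dropped by `do.call(rbind, ...)`.

```r
# Hypothetical stand-in for stock_info_basic(): it fails for the symbol "BAD".
fake_stock_info <- function(symbol) {
  df <- NULL
  try({
    if (symbol == "BAD") stop("no data for this symbol")
    df <- data.frame(symbol = symbol, beta = 1.0, stringsAsFactors = FALSE)
  }, silent = TRUE)
  df  # NULL when the lookup failed
}

results <- lapply(c("AAA", "BAD", "CCC"), fake_stock_info)

# rbind() skips the NULL entry, so only the successful lookups remain
combined <- do.call(rbind, results)
print(combined)
```

The failed symbol just disappears from the result, so it may be worth comparing the number of rows against the number of symbols requested to see how many lookups were lost.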
stocks_info_basic_df <- do.call(rbind, lapply(stocks_df$symbol, stock_info_basic))
# join explicitly on the symbol column (avoids the "Joining, by" message)
stocks_info_basic_df <- stocks_df %>% inner_join(stocks_info_basic_df, by="symbol")
head(stocks_info_basic_df)
## symbol beta dividend div_freq PE EPS yield market_cap PB
## 1 AW.UN 0.628 0.125 Monthly 20.1 1.16 5.199 349990111 3.597
## 2 FAP 0.485 0.040 Monthly NA -0.06 10.000 250630157 0.939
## 3 AAB 1.917 NA N/A NA -0.05 NA 16717597 0.625
## 4 ABT 0.653 0.080 Quarterly 17.9 0.38 4.720 263363045 -4.878
## 5 ADN 0.347 0.250 Quarterly 21.4 0.83 5.634 296979084 1.112
## 6 AEF.A 0.058 NA N/A NA -1.69 NA 390425000 40.417
And because no R-related post would be complete without at least a simple chart:
plot_data <- subset(stocks_info_basic_df, PE>1 & PE<100 & PB>0.25 & PB<10)
label_data <- subset(plot_data, PE<15 & 1/PB>1.5)
ggplot(data=plot_data, aes(x=PE, y=1/PB)) +
  geom_point(aes(size=market_cap, col=factor(is.na(dividend))), alpha=0.3) +
  scale_size(range=c(3,10)) +
  labs(title="TSX Stocks", x="Price/Earnings", y="BookValue/Price",
       col="Dividend", size="Market Cap") +
  geom_text(data=label_data, aes(label=symbol), check_overlap=TRUE)
In the above plot, I’ve removed outliers (keeping only stocks with a P/E between 1 and 100 and a P/B between 0.25 and 10) and labelled the stocks in the north-west corner, i.e. those with a low Price/Earnings ratio (below 15) and a high BookValue/Price ratio (above 1.5).
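The two filtering steps behind the plot can be checked on a toy data frame. The symbols and numbers below are invented purely for illustration; only the thresholds come from the plotting code.

```r
# Made-up data: one cheap stock, one ordinary stock, one P/E outlier,
# and one stock with a missing P/E.
toy <- data.frame(
  symbol = c("AAA", "BBB", "CCC", "DDD"),
  PE     = c(8, 12, 250, NA),
  PB     = c(0.5, 2.0, 1.0, 0.4),
  stringsAsFactors = FALSE
)

# Step 1: drop outliers; subset() also drops rows where the condition is NA
plot_data <- subset(toy, PE > 1 & PE < 100 & PB > 0.25 & PB < 10)

# Step 2: keep only the "north-west corner" for labelling
label_data <- subset(plot_data, PE < 15 & 1 / PB > 1.5)

print(plot_data$symbol)
print(label_data$symbol)
```

Here "CCC" is removed as an outlier, "DDD" is dropped because its P/E is missing, and only "AAA" (P/E of 8, BookValue/Price of 2) ends up labelled.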
In the next installment, I’ll be adding lots of interesting financial information to the stock database (revenue, income, expenses, etc.). You guessed it… we will actually build a full-fledged stock screener. Stay tuned!