Executive Summary

We interested in building a Annual dataset of f compounded rates of return for each stock: perm, ticker, CUSIP, year, Market Cap, rate of return. The data was extracted from WRDS cRSP dataset. The Variables extracted are following: PERMNO: CRSP Company Number. date: TICKER: The combination of ticker, exchange, and date uniquely identifies a security.

PERMCO: CRSP Permanent Company Number. PERMCO is a unique permanent identifier assigned by CRSP to all companies with issues on a CRSP file.

CUSIP: The latest eight-character CUSIP identifier for the security through the end of the file.

PRC: Price. The closing price or the negative bid/ask average for a trading day.

RET: Holding Period Return.

SHROUT: Number of Shares Outstanding. The number of publicly held shares, recorded in thousands.

The data range is from December 31, 1925 to December 31, 2019, containing 4606907 records with 31087 unique stocks.

Methodology

  1. Annual Return Under compounded rate of returns assumption, cumulative return as computed as following:

\[\bar{R}^C = \prod^{T}_{i = 1}(1+r_i)-1\]

  1. Market Cap

\[\text{Market Cap} = \text{Stock Price} \times \text{Shares Outstanding} \]

Tables and Figures:

Here the example taken are:

AAPL: Apple Inc.

ZOOM: Zoom Video Communications, Inc. 

AMZN: Amazon.com, Inc.

IBM: International Business Machines Corporation

MSFT: Microsoft Corporation

The first table shows the average annual return and market cap of each stock, as well as its standard deviation.

TICKER Average Return Return Standard Deviation Average Market Cap standard Deviation of Market Cap
AAPL 0.2457951 0.6896768 149436929.45 249527917.59
AMZN 0.4884565 1.1091943 128717196.41 196371286.56
IBM 0.0357643 0.3159650 88884174.47 62146653.21
MSFT 0.0899685 0.3235945 239073890.63 201260979.32
ZOOM 1.1032041 3.4564208 36054.78 30779.21

Computer code

data<- read.csv("Stock Data.csv")
data$date <- ymd(data$date)
data$year <- year(data$date)
rtn<- xts(data$RET,order.by = data$date)
#Calculate Annualized Return
ann.rtn <- data %>% group_by(TICKER,year) %>% summarize(annual_return = prod (1 + return, na.rm = T)-1)
#Calculate Market Cap
mkt.cap <- data %>% group_by(TICKER,year) %>% summarize(market_cap = PRC*SHROUT, na.rm = T)
#Creating Data frame to store msg
N<-nrow(dataset)
Newdata <- matrix(numeric(7*N),nrow=N)
Newdata[,1] <- as.numeric(data$PERMNO)
Newdata[,2] <- data$year
Newdata[,3] <- data$TICKER
Newdata[,4] <- as.numeric(data$PERMCO)
Newdata[,5] <- data$CUSIP
Newdata[,6] <- ann.rtn
Newdata[,7] <- mkt.cap
Newdata <- as.data.frame(Newdata)
Newdata <- Newdata %>% filter(PERMCO != 0 )
Newdata <- na.omit(Newdata)
#Summary Statistics
sum<- Newdata %>% 
  filter(TICKER == "AAPL"|TICKER =="ZOOM"|TICKER =="AMZN"|TICKER =="IBM"|TICKER =="MSFT") %>% 
  group_by(TICKER) %>% 
  summarise(avg.rtn = mean(as.numeric(Annual_Return)),
            sd.rtn = sd(as.numeric(Annual_Return)),
            avg.mkt.cap = mean(as.numeric(`Market Cap`)),
            sd.mkt.cap = sd(as.numeric(`Market Cap`)),
  )
colnames(sum) <- c("TICKER","Average Return", "Return Standard Deviation", "Average Market Cap", "standard Deviation of Market Cap") 
knitr::kable(sum)


#Annual Return Visualized
Newdata %>% filter(TICKER == "AAPL"|TICKER =="ZOOM"|TICKER =="AMZN"|TICKER =="IBM"|TICKER =="MSFT")%>%
ggplot(aes(x=as.numeric(date),y=as.numeric(Annual_Return),color= TICKER))+ geom_line() +theme_classic()+labs(x= "Date", y= "Annual Return", title = "Example stocks Annual Return Over the years")

#Market Cap Visualized
Newdata %>% filter(TICKER == "AAPL"|TICKER =="ZOOM"|TICKER =="AMZN"|TICKER =="IBM"|TICKER =="MSFT")%>%
ggplot(aes(x=as.numeric(date),y=as.numeric(`Market Cap`),color= TICKER))+ geom_line() +theme_classic()+labs(x= "Date", y= "Market Cap", title = "Example stocks Market Cap Over the years")

Reference

  1. WRDS database https://wrds-web.wharton.upenn.edu/wrds//ds/crsp/stock_a/dsf.cfm