Abstract
This is a data management process to merge the market return into a panel firm data. This type of merge is called 1-to-many merge.You have to import the datasets. We will download these datasets from a web site. Here is the code:
# To avoid scientific notation:
options(scipen=999)
#Read Excel
library(readxl)
# Download the panel dataset from an online excel file:
download.file("http://www.apradie.com/datos/us2020q.xlsx",
"us2020q.xlsx", mode="wb")
# The first parameter is the link and the second is a name for the
# local file in the folder in your computer
#Download the file with a list of all US firms:
download.file("http://www.apradie.com/datos/usfirms.xlsx",
"usfirms.xlsx", mode="wb")
Now we import (into our R environment) the panel quarterly data of all US firms:
<- read_excel("us2020q.xlsx",sheet = "data") uspanel
Now we import the data dictionary for the variables of this dataset:
<- read_excel("us2020q.xlsx",sheet = "data dictionary") dictionary
Now we import that list of all public US firms:
<- read_excel("usfirms.xlsx") usfirms
Now download the S&P500 from Yahoo Finance:
library(quantmod)
getSymbols("^GSPC", from="2010-01-01", to= "2020-12-31",
periodicity="monthly", src="yahoo")
## [1] "^GSPC"
We need to aggregate (collapse) the dataset from monthly to quarterly, keeping the LAST market index per quarter. We can use the to.quarterly function from quantmod:
<- to.quarterly(GSPC,indexAt='startof') QSP500
This function creates an OHLC xts dataset for each quarter. We can have a look of the content:
head(QSP500)
## GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
## 2010-01-01 1116.56 1180.69 1044.50 1169.43 279192470000 1169.43
## 2010-04-01 1171.23 1219.80 1028.33 1030.71 354511440000 1030.71
## 2010-07-01 1031.10 1157.16 1010.91 1141.20 260105810000 1141.20
## 2010-10-01 1143.49 1262.60 1131.87 1257.64 257671870000 1257.64
## 2011-01-01 1257.62 1344.07 1249.05 1325.83 240896240000 1325.83
## 2011-04-01 1329.48 1370.58 1258.07 1320.64 245196520000 1320.64
We see that the Open, High, Low, Close and Adjusted prices for each quarter were calculated. We only need the Adjusted price to calculate the market returns, so we select only the Close column:
= QSP500$GSPC.Adjusted
QSP500 # Change the name of the column:
colnames(QSP500) <- "SP500index"
Now we can calculate cc return of the market
$SPreturns <- diff(log(QSP500$SP500index),lag=4) QSP500
Now we are almost ready to merge this quarterly data to the panel dataset using merge.
We need to have both datasets with a common column. In this case, the common column is quarter. Both datasets must be of the same class, in this case, data frame class.
The QMXX has the quarter as index, but not as part of a column. We create a data frame with the quarter column equal to its index:
# Create a dataframe with a quarter column for the QMXX
<-data.frame(q=index(QSP500),coredata(QSP500))
QSP500.df# I extracted the quarter from the index
# coredata is a function that extract the data of an object
# without extracting formatting
The common column must be of the same type, so we make sure that the column quarter of dataset is a Date type:
Now we do a MANY-TO-ONE merge of both datasets:
$q<-as.Date(uspanel$q)
uspanel
<-merge(uspanel,QSP500.df,by="q") uspanel