According to the FDIC, “every national bank, state member bank, insured state nonmember bank, and savings association is required to file Consolidated Reports of Condition and Income (a”Call Report”“) as of the close of business on the last day of each calendar quarter.” Nonetheless, this is reported on the subsidiary level. On the other hand, public reports submitted to the SEC is disclosed on the bank holding company level. Consistent with the Call Report, the FR Y-9C is a consolidated report that covers the bank holding company along with its subsidiaries.
In one of my early papers, I faced the following problem: I have data of bank failures on the commercial level from the FDIC, but, the financial reports of the public entities take place on the parent level (the holding company). At the beginning, I was linking subsidiary with the parent by matching names manually. However, I found a way to automate such on a larger scale. The key was to look at the commercial bank level identifier from the Call report and identify the id of the high holder. After doing so, all I had to look at the FR Y-9C report and link each subsidiary to the parent. While this seems obvious now, it was kind of dim when I started working on the project then.
In this article, I demonstrate how to do so using public data from
the Federal Reserve Bank of Chicago and R coding. The main
contribution is to provide a link between the parents and their
subsidiaries. The resulting data can provide a number of analysis for
the researcher to conduct. For those interested in bank failures, this
data can be very useful. The bank failure information by the FDIC is
reported on the subsidiary level, whereas stock data and SEC filings
take place on the parent company level. Additionally, the proposed
dataset allows users to assess the complexity of the bank holding
companies, by providing an additional perspective on the subsidiary
structure of the holding company.
The datasets used in this article are public and can be downloaded
from this link
for the commercial banks level and this link
for the bank holding company level. In either case, the variables names
follow suit and can be identified using the website data dictionary.
After downloading the data, I place the commercial and parent zipped
files into two folders, named COM and BHC,
respectively.
To get started, I load the usual suspects plyr and
lubridate. Additionally, I refer to Hmisc and
SASxport packages in order to deal with the SAS formatted
files for the Call data.
library(plyr)
library(lubridate)
library(Hmisc)
library(haven)
rm(list = ls())
data.dir <- getwd()
From my own experience, I found it very useful to create a list of variables of interest and, hence, work with a smaller subset, especially when the regulatory data have thousands of variables.
var_list_file <- "/home/simaan/Dropbox/FNA/Data/var_list.csv"
var.ds <- read.csv(var_list_file,stringsAsFactors = FALSE)
var.ds
In this case, I focus only on identifying information. Moreover, I use the same variable list for both the Call and the FR Y-9C data.
I start with the Call reports and focus on the last four quarters of data.
com.dir <- paste(data.dir,"COM",sep = "/")
zip.files <- paste(com.dir,list.files(com.dir),sep = "/")
zip.l <- sapply(zip.files,unzip)
zip.l
## /home/simaan/Dropbox/RMarkdown/Identify_BHC_Sub/COM/call1703-zip.zip
## "./call1703.xpt"
## /home/simaan/Dropbox/RMarkdown/Identify_BHC_Sub/COM/call1706-zip.zip
## "./call1706.xpt"
## /home/simaan/Dropbox/RMarkdown/Identify_BHC_Sub/COM/call1709-zip.zip
## "./call1709.xpt"
## /home/simaan/Dropbox/RMarkdown/Identify_BHC_Sub/COM/call1712-zip.zip
## "./call1712.xpt"
The above command unzips the listed files for R to read
them. Note that each zipped file corresponds to a given YYMM quarter.
Next, we select the variables of interest while reading each file
separately:
var.sel <- var.ds$var
# read into a list
com.l <- lapply(zip.l, read_xpt )
com.l <- lapply(com.l, function(ds.i) ds.i[,names(ds.i) %in% var.sel] )
# stack altogether
com.ds <- ldply(com.l,data.frame)
com.ds$.id <- NULL
dim(com.ds)
## [1] 24209 13
head(com.ds)
After reading the above data, I replace each variable’s name with its
label using the var.ds object. In particular, I do the
following
var.ds <- merge(var.ds,data.frame(var = names(com.ds)), by = "var")
com.ds <- com.ds[,var.ds$var]
names(com.ds) <- var.ds$name
summary(com.ds)
## lei id country name
## Length:24209 Min. : 37 Length:24209 Length:24209
## Class :character 1st Qu.: 328357 Class :character Class :character
## Mode :character Median : 652977 Mode :character Mode :character
## Mean :1010749
## 3rd Qu.: 966731
## Max. :5143788
##
## name_legal fed_district_code org_type charter_type
## Length:24209 Min. : 0.000 Min. : 0.000 Min. : 0.0
## Class :character 1st Qu.: 6.000 1st Qu.: 1.000 1st Qu.:200.0
## Mode :character Median : 7.000 Median : 1.000 Median :200.0
## Mean : 7.158 Mean : 1.302 Mean :208.5
## 3rd Qu.:10.000 3rd Qu.: 1.000 3rd Qu.:200.0
## Max. :12.000 Max. :11.000 Max. :400.0
## NA's :3 NA's :3 NA's :3
## city state fed_regulator2 parent_id
## Length:24209 Length:24209 Length:24209 Min. : 0
## Class :character Class :character Class :character 1st Qu.:1056804
## Mode :character Mode :character Mode :character Median :1203853
## Mean :1700863
## 3rd Qu.:2744883
## Max. :5170805
## NA's :3
## date
## Min. :20170331
## 1st Qu.:20170331
## Median :20170630
## Mean :20170777
## 3rd Qu.:20170930
## Max. :20171231
##
It looks like that some the parent’s id are given zero. If that’s the case, I assign them the same value of the subsidiary id.
com.ds[com.ds$parent_id %in% 0,"parent_id"] <- com.ds[com.ds$parent_id %in% 0,"id"]
com.ds <- na.omit(com.ds)
In addition, I drop all missing values from the com.ds
data frame.
The com.ds provides identification of all banks on the
subsidiary level with a reference to the parent. In order to link the
subsidiaries along with their parents, we move to the holding company
data. Following approach as above, I read the parent BHCs data with
respect to the description provided here:
# unzip
bhc.dir <- paste(data.dir,"BHC",sep = "/")
zip.files <- paste(bhc.dir,list.files(bhc.dir),sep = "/")
zip.files <- tail(zip.files,4)
zip.l <- sapply(zip.files,unzip)
# load the selected variables
var.ds <- read.csv(var_list_file ,stringsAsFactors = F)
var.sel <- var.ds$var
# read each file
read.bhc <- function(x) read.table(x, sep="^", comment.char="", header=TRUE, quote="", na.strings="--------", as.is=TRUE)
bhc.l <- lapply(zip.l,read.bhc)
bhc.l <- lapply(bhc.l,function(x) x[,names(x) %in% var.sel] )
# put list in data.frame
bhc.ds <- ldply(bhc.l,data.frame)
bhc.ds$.id <- NULL
# change var names to the given labels
var.ds <- read.csv(var_list_file ,stringsAsFactors = F)
var.ds <- merge(var.ds,data.frame(var = names(bhc.ds)), by = "var")
bhc.ds <- bhc.ds[,var.ds$var]
names(bhc.ds) <- var.ds$name
summary(bhc.ds)
## lei fed_regulator id country
## Length:10852 Length:10852 Min. :1020180 Length:10852
## Class :character Class :character 1st Qu.:1132140 Class :character
## Mode :character Mode :character Median :2090779 Mode :character
## Mean :2283760
## 3rd Qu.:3253678
## Max. :5170805
## NA's :4
## name name_legal fed_district_code org_type
## Length:10852 Length:10852 Min. : 0.00 Min. : 1.000
## Class :character Class :character 1st Qu.: 6.00 1st Qu.: 1.000
## Mode :character Mode :character Median : 8.00 Median : 1.000
## Mean : 7.46 Mean : 1.424
## 3rd Qu.:10.00 3rd Qu.: 1.000
## Max. :12.00 Max. :99.000
## NA's :4 NA's :4
## charter_type city state parent_id
## Min. :500.0 Length:10852 Length:10852 Min. : 0
## 1st Qu.:500.0 Class :character Class :character 1st Qu.: 0
## Median :500.0 Mode :character Mode :character Median : 0
## Mean :500.7 Mean : 213353
## 3rd Qu.:500.0 3rd Qu.: 0
## Max. :720.0 Max. :5141195
## NA's :4 NA's :4
## date
## Min. :20170331
## 1st Qu.:20170630
## Median :20170630
## Mean :20170884
## 3rd Qu.:20171231
## Max. :20171231
## NA's :4
It looks like most parent ids are zeroes. If that’s the case, then I assign the parent id number the same value of the id column, i.e.
bhc.ds[bhc.ds$parent_id %in% 0,"parent_id"] <- bhc.ds[bhc.ds$parent_id %in% 0,"id"]
bhc.ds <- na.omit(bhc.ds)
Given the datasets com.ds and bhc.ds, I
merge them in the following fashion:
names(com.ds) <- paste(names(com.ds),"L1",sep = "_")
names(com.ds)[names(com.ds) == "date_L1"] <- "date"
com.ds$common_id_L1 <- com.ds$parent_id_L1
bhc.ds$common_id_L1 <- bhc.ds$id
com.bhc.ds <- merge(com.ds,bhc.ds, by = c("common_id_L1","date") )
head(com.bhc.ds)
Let’s compute the number of the subsidiaries for this child-parent dataset
no.sub1 <- ddply(com.bhc.ds,c("common_id_L1","date"), function(ds.i) length(unique(ds.i$id_L1)) )
names(no.sub1)[3] <- "number_subs_1"
summary(no.sub1)
## common_id_L1 date number_subs_1
## Min. :1020180 Min. :20170331 Min. : 1.000
## 1st Qu.:1129186 1st Qu.:20170630 1st Qu.: 1.000
## Median :1944204 Median :20170630 Median : 1.000
## Mean :2210358 Mean :20170888 Mean : 1.122
## 3rd Qu.:3157150 3rd Qu.:20171231 3rd Qu.: 1.000
## Max. :5170805 Max. :20171231 Max. :17.000
We observe that, mainly, there is a one-to-one relationship, with the exception for a number of cases. For instance, we note that in the last quarter of 2017, the Wintrust Financial Corporation had 17 subsidiaries:
id17 <- unique(no.sub1[no.sub1$number_subs_1 == 17,"common_id_L1"])
WT.list <- com.bhc.ds[com.bhc.ds$common_id_L1 == id17 & com.bhc.ds$date == 20171231,"name_L1"]
cat(sort(as.character(WT.list)),sep = "\n")
## BARRINGTON B&TC NA
## BEVERLY B&TC NA
## CHICAGO TC NA
## CRYSTAL LAKE B&TC NA
## FIFC EDGE INTL CORP
## HINSDALE B&TC
## LAKE FOREST B&TC NA
## LIBERTYVILLE B&TC
## NORTHBROOK B&TC
## OLD PLANK TRAIL CMNTY BK NA
## SCHAUMBURG B&TC NA
## ST CHARLES B&TC
## STATE BK OF THE LAKES
## TOWN BK
## VILLAGE B&TC
## WHEATON B&TC
## WINTRUST BK
Relatively, it is considered a large BHC with more than $20 billion in assets. To confirm such, I look into the 10-K annual filing by the corporation to confirm whether this the case. The report points out that:
“We now own fifteen banks, including eight Illinois-chartered banks: Hinsdale Bank and Trust Company (“Hinsdale Bank”), Wintrust Bank, Libertyville Bank and Trust Company (“Libertyville Bank”), Northbrook Bank & Trust Company (“Northbrook Bank”), Village Bank & Trust (“Village Bank”), Wheaton Bank & Trust Company (“Wheaton Bank”), State Bank of the Lakes and St. Charles Bank & Trust Company (“St. Charles Bank”). In addition, we have one Wisconsin-chartered bank, Town Bank, and six nationally chartered banks: Lake Forest Bank, Barrington Bank, Crystal Lake Bank & Trust Company, N.A. (“Crystal Lake Bank”), Schaumburg Bank & Trust Company, N.A. (“Schaumburg Bank”), Beverly Bank & Trust Company, N.A. (“Beverly Bank”) and Old Plank Trail Community Bank, N.A. (“Old Plank Trail Bank”). As of December 31, 2017, we had 157 banking locations.”
On the other hand, if we look at JP Morgan, for instance, we observe that it has two main subsidiaries at the end of 2017. This is in line with the 10-K report from the SEC (see link). Nevertheless, it is also appears that each subsidiary has a number of other subsidiaries in the annual report, which our data does not seem to convey such complexity in the structure:
JP.list <- com.bhc.ds[com.bhc.ds$common_id_L1 == 1039502 & com.bhc.ds$date == 20171231,"name_legal_L1"]
cat(sort(as.character(JP.list)),sep = "\n")
## J.P. MORGAN INTERNATIONAL FINANCE LIMITED
## JPMORGAN CHASE BANK, NATIONAL ASSOCIATION
To finalize, I consider the date of the report, the name and id of both the subsidiary and the parent, as well as the LEI code for each one. I also add the number of subsidiaries for each BHC in the sample
ds.final <- com.bhc.ds[,c("date","name_legal_L1","lei_L1","id_L1","name_legal","lei","id")]
no_sub <- ddply(ds.final,c("id","date"),function(x) data.frame(no_sub = nrow(x)) )
no_sub <- no_sub[!is.na(no_sub$id),]
ds.final <- merge(ds.final,no_sub,by = c("id","date"),all = T)
Note that the LEI on the subsidiary level has multiple zeroes, whereas this does not seem to be the case for the parent.
In total, we have
length(unique(ds.final$id))
## [1] 4291
BHCs. However, only a small proportions are publicly traded. To identify these banks, I refer to the Banking Research Datasets from the Federal Reserve Bank of New York :
frb_crsp_file <- "https://www.newyorkfed.org/medialibrary/media/research/banking_research/data/crsp_20161231.csv?la=en"
crsp <- read.csv(frb_crsp_file,stringsAsFactors = F)
ds.final <- ds.final[ds.final$id %in% crsp$entity,]
In total, we identify 419 publicly traded companies according to 2017
length(unique(ds.final$id))
## [1] 419
Recall that commercial banks file the Call report on the lower level, whereas SEC filings and stock data take place at the parent level (see this link for a further discussion about the link data). Hence, for researchers interested in stock data, the final dataset would constitute a much smaller sample size.
This article provides a key link between bank subsidiaries and their holding companies. The key to establish so is utilizing regulatory filing data that allow users to link the legal id number of subsidiaries to the parent entities. By doing so, users can have a broader perspective on banks along with their complexity. The final proposed merged data can be utilized to achieve such purpose.