https://rpubs.com/staszkiewicz/cw_2_EN_R
Data: https://www.dropbox.com/scl/fo/sajrtchhqeftivhd4zwqu/h?rlkey=u37l184e2khoxfufulsyq00kc&dl=0
File: Bank.csv
We have four basic objects in R - Vector - c() - Matrix - matrix() - Data frame - data.frame() - List - list()
We define a vector with the function c() e.g. c(“a”,12, “#45:”)
wektor<- c("a",12,"#45:")
wektor
## [1] "a" "12" "#45:"
A vector can be either numeric or character, or mixed. The elements
of the vector are referred to by the index vector[1]
denotes the first word of the vector
wektor[1]
## [1] "a"
Several modes of addressing
# addressing consecutive elements, e.g. from second to third
wektor[2:3]
## [1] "12" "#45:"
# last element - using the length`length()` function.
wektor[length(wektor)]
## [1] "#45:"
# last element - using the tail function
tail(wektor, 1)
## [1] "#45:"
# adesing all but the first two we use the minus sign
wektor[-c(1,2)]
## [1] "#45:"
# or beyond the third element
wektor[-3]
## [1] "a" "12"
Addressing a vector by referring to a name:
name the vector
names(vector)<-c("ala", "ma", "cat")
.
refer to vector[name], e.g. vector[“ma”]
names(wektor)<-c("ala", "ma", "kota")
wektor["ma"]
## ma
## "12"
Please note that the name is given by “” or ’’ .
For a vector, the use of double square brackets [[]] returns the same value as for []. In addition, the pluck() function allows you to return an element from a vector.
# Direclty
wektor[[1]]
## [1] "a"
library(dplyr) # to get infix operator `%>%` - pipline
##
## Dołączanie pakietu: 'dplyr'
## Następujące obiekty zostały zakryte z 'package:stats':
##
## filter, lag
## Następujące obiekty zostały zakryte z 'package:base':
##
## intersect, setdiff, setequal, union
library(purrr) # to get the pluck() function
wektor %>% pluck(3)
## [1] "#45:"
A matrix is defined by the function matrix()
description
help(matrix). Let’s define a 3x3 matrix from 1 to 9. It requires
specifying the data set, the number of columns, or beliefs, and how the
data is to be entered by columns or by beliefs.
macierz <- matrix (1:9, nrow=3, byrow = TRUE)
The matrix is a two-dimensional stucture, it cannot concatenate alphanumeric and numeric values an attempt to define it in this way results in the conversion of all values to alphanumeric values.
test<- matrix(c(1,2,3,"ala",5,4), ncol=2)
test
## [,1] [,2]
## [1,] "1" "ala"
## [2,] "2" "5"
## [3,] "3" "4"
An element of a matrix is referenced by row number, and column number
test[2,1]
## [1] "2"
# to nie to samo co
test[1,2]
## [1] "ala"
To refer to a given column or row we leave a free space after or before the comma, for example, the first column in the matrix.
macierz
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
# entire first row
macierz[1,]
## [1] 1 2 3
# entire second columne
macierz[,2]
## [1] 2 5 8
We can also name the columns and rows, e.g. kol1, kol2, kol3
colnames(macierz)<-c("col1","Col2","col3")
macierz
## col1 Col2 col3
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
and address as with a vector by name, e.g., all of column 2. Note: the name must be in “” or ’’, R is sensitive to lowercase and uppercase letters
macierz[,"Col2"]
## [1] 2 5 8
Note the use of [[]] in an array is a little different. Well, it returns the next element counting by rows.
macierz
## col1 Col2 col3
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
macierz[[7]] # returns the seventh word after the lines, i.e. the first line of the third column
## [1] 3
macierz[2][1] # returns the first column from the second row
## [1] 4
macierz[2,1] # Which is equivalent to the comma notation, but note in this case we get the column name
## col1
## 4
Other addressing schemes in the matrix are analogous to the vector.
The matrix stores only one data type either numeric or alphanumeric.
In a situation where you want to have different data types in the
columns, e.g. the balance total one column and the name of the company
the matrix will return everything as an alphanumeric value
company <-c(‘coca’,‘abb’,‘fb’) total_balance <- c(100,200,80)
firma <-c('coca','abb','fb')
class(firma) # shows data class
## [1] "character"
suma_bilansowa <- c(100,200,80)
class(suma_bilansowa)
## [1] "numeric"
dane<- matrix(c(firma, suma_bilansowa),ncol=3,byrow=TRUE)
class(dane)
## [1] "matrix" "array"
dane
## [,1] [,2] [,3]
## [1,] "coca" "abb" "fb"
## [2,] "100" "200" "80"
We will not have such a problem in the case of a data frame functio
data.frame()
.
df<- data.frame(firma, suma_bilansowa)
class(df)
## [1] "data.frame"
df
## firma suma_bilansowa
## 1 coca 100
## 2 abb 200
## 3 fb 80
Indexing a data frame by [] or [[]] or call is analogous to an array.
Except that in the case of a data frame, we can also use the dollar sign
$
which refers to a column that is actually clocked as
either a numeric, factor, or alphanumeric name variable.
# all frame
df
## firma suma_bilansowa
## 1 coca 100
## 2 abb 200
## 3 fb 80
#Only firm names
df$firma
## [1] "coca" "abb" "fb"
df$suma_bilansowa
## [1] 100 200 80
#Check of the variable data class
class(df$firma)
## [1] "character"
class(df$suma_bilansowa)
## [1] "numeric"
# Total balance assets value for Abb
df[2,2]
## [1] 200
#Assin names of the coys to the names of cases (rows)
rownames(df) <- df$firma
df
## firma suma_bilansowa
## coca coca 100
## abb abb 200
## fb fb 80
# totoal assets of fb
df[3,2]
## [1] 80
# all data (row) for fb
df['fb',]
## firma suma_bilansowa
## fb fb 80
# call bilnas sum by referring to names ! note "" or `` because names are not numbers
df['fb','suma_bilansowa']
## [1] 80
The array can store one type of data, the frame different types in different columns (variables). But both the matrix and the data frame store data in columns of the same length. These data are in fact vectors. From time to time we need a structure that will store different objects of different lengths, such as an array and a vector and a data frame. This is what a list is for. Let’s define our list as an object containing, vector, matrix and data frame.
lista<-list(wektor, macierz, df)
lista
## [[1]]
## ala ma kota
## "a" "12" "#45:"
##
## [[2]]
## col1 Col2 col3
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
##
## [[3]]
## firma suma_bilansowa
## coca coca 100
## abb abb 200
## fb fb 80
Please note that to address a list we need [[]]. Well, [[]] refers to a basic object, such as a vector, matrix, or data frame. In our case, list[[1]] - is our vector and then we need to index the vector, e.g. list[[1]][2] i.e. in the first object of the list - the vector - show the second element.
# lista
lista
## [[1]]
## ala ma kota
## "a" "12" "#45:"
##
## [[2]]
## col1 Col2 col3
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
##
## [[3]]
## firma suma_bilansowa
## coca coca 100
## abb abb 200
## fb fb 80
# fist object
lista[[1]]
## ala ma kota
## "a" "12" "#45:"
# first object second element
lista[[1]][2]
## ma
## "12"
When referencing a data frame in our list, we need to specify objects two-dimensionally - first the head object and then the address in that object (vector, data frame or array)
# ramka dancyh
lista[[3]]
## firma suma_bilansowa
## coca coca 100
## abb abb 200
## fb fb 80
# suma bilansowa dla coca coli
lista[[3]][1,2]
## [1] 100
A lot of calculation results R stores in the form of a list, to see
its strucre we can “ungroup” in the Envriment panel the arrow next to
the object or use from the console function str()
.
str(lista)
## List of 3
## $ : Named chr [1:3] "a" "12" "#45:"
## ..- attr(*, "names")= chr [1:3] "ala" "ma" "kota"
## $ : int [1:3, 1:3] 1 4 7 2 5 8 3 6 9
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:3] "col1" "Col2" "col3"
## $ :'data.frame': 3 obs. of 2 variables:
## ..$ firma : chr [1:3] "coca" "abb" "fb"
## ..$ suma_bilansowa: num [1:3] 100 200 80
This function shows the object type and the type of internal data, and if they exist, the name scheme.
The task is to analyse the structure of auditor in the US market in
the banking industry. We will use primary data from the article
Staszkiewicz, P., & Karkowska, R. (2021). Audit fee and banks’
communication sentiment. Economic Research-Ekonomska Istraživanja. https://doi.org/10.1080/1331677X.2021.1985567
Please note that the data can used after class for non-commercial
purposes only, after indication of the original source.
# from the basic functions of the system we load a classic csv file but in such a way
# that we choose from a window the location of the file on our own computer
# that's why we nest the "file.choose()" command
bank <- read.csv(file.choose())
Let’s see the structure of the database you can do it from the “Environment” panel by clicking on the arrow next to the database name “bank” or by entering the command
str(bank)
## 'data.frame': 5356 obs. of 197 variables:
## $ CIK_Code : int 7789 7789 7789 7789 7789 7789 7789 7789 7789 7789 ...
## $ Year_Ended : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
## $ Companyx : chr "ASSOCIATED BANC-CORP" "ASSOCIATED BANC-CORP" "ASSOCIATED BANC-CORP" "ASSOCIATED BANC-CORP" ...
## $ Tickerx : chr "ASB" "ASB" "ASB" "ASB" ...
## $ Marketx : chr "NYSE" "NYSE" "NYSE" "NYSE" ...
## $ IRS_Number : int 391098068 391098068 391098068 391098068 391098068 391098068 391098068 391098068 391098068 391098068 ...
## $ Bus_Street_1x : chr "433 MAIN STREET" "433 MAIN STREET" "433 MAIN STREET" "433 MAIN STREET" ...
## $ Bus_Street_2x : chr NA NA NA NA ...
## $ Cityx : chr "GREEN BAY" "GREEN BAY" "GREEN BAY" "GREEN BAY" ...
## $ Countyx : chr "Brown" "Brown" "Brown" "Brown" ...
## $ State_Codex : chr "WI" "WI" "WI" "WI" ...
## $ State_Namex : chr "WISCONSIN" "WISCONSIN" "WISCONSIN" "WISCONSIN" ...
## $ Region : chr "US Midwest" "US Midwest" "US Midwest" "US Midwest" ...
## $ Zipx : chr "54301" "54301" "54301" "54301" ...
## $ Bus_Phonex : chr "(920) 491-7500" "(920) 491-7500" "(920) 491-7500" "(920) 491-7500" ...
## $ Incorporation_State_Code : chr "WI" "WI" "WI" "WI" ...
## $ Parent_CIKx : int NA NA NA NA NA NA NA NA NA NA ...
## $ Parent_Namex : chr "" "" "" "" ...
## $ SIC_Codex : int 6022 6022 6022 6022 6022 6022 6022 6022 6022 6022 ...
## $ SIC_Descriptionx : chr "State Commercial Banks" "State Commercial Banks" "State Commercial Banks" "State Commercial Banks" ...
## $ NAICS_Codex : int 522110 522110 522110 522110 522110 522110 522110 522110 522110 522110 ...
## $ NAICS_Descriptionx : chr "Commercial Banking" "Commercial Banking" "Commercial Banking" "Commercial Banking" ...
## $ Filer_Actx : int 34 34 34 34 34 34 34 34 34 34 ...
## $ Auditorx : chr "KPMG LLP" "KPMG LLP" "KPMG LLP" "KPMG LLP" ...
## $ Auditor_Keyx : int 4 4 4 4 4 4 4 4 4 4 ...
## $ Audit_Fees_USDx : int 288000 405700 403100 380120 842700 725000 748400 748825 813700 894200 ...
## $ Audit_Related_FeesUSD : int 0 161600 152300 162200 172200 151100 188400 185000 271600 265700 ...
## $ Benefit_Plan_Related_Fees_USD : int 0 0 0 0 0 0 0 0 0 0 ...
## $ FISDI_Fees_USD : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Tax_Related_Fees_USD : int 0 39675 30547 94830 92091 0 0 0 0 0 ...
## $ Tax_Related_Fees__ComplianceUSD: num NA NA NA NA NA NA NA NA NA NA ...
## $ Tax_Related_Fees__NonComplianc : num NA NA NA NA NA NA NA NA NA NA ...
## $ OtherMisc_Fees_USD : int 452048 0 0 0 0 0 0 0 0 0 ...
## $ Total_Non_Audit_Fees_USD : int 452048 201275 182847 257030 264291 151100 188400 185000 271600 265700 ...
## $ Total_Fees_USDx : int 740048 606975 585947 637150 1106990 876100 936800 933825 1085300 1159900 ...
## $ Currency : chr "USD" "USD" "USD" "USD" ...
## $ Fees_Include_Subsidiaries : chr "" "" "" "" ...
## $ Fees_Included_In_Parent_Filings: logi NA NA NA NA NA NA ...
## $ Restatement : int 0 1 0 0 0 0 0 0 0 0 ...
## $ Year_Ended_Datex : chr "12/31/2000" "12/31/2001" "12/31/2002" "12/31/2003" ...
## $ Year_Ended_Month_Ideal : int 12 12 12 12 12 12 12 12 12 12 ...
## $ Fiscal_Year_Ends_Currently_Repo: int 1231 1231 1231 1231 1231 1231 1231 1231 1231 1231 ...
## $ Sourcex : chr "DEF 14A" "DEF 14A" "DEF 14A" "DEF 14A" ...
## $ Source_Datex : int 36972 37700 37700 38061 38427 38775 39154 39518 39895 40253 ...
## $ Stock_Price_USDx : num 19375 0.4 1.53 1.12 5.85 ...
## $ Stock_Price_Datex : int 36891 37256 37621 37986 38352 38686 39051 NA NA NA ...
## $ Market_Cap_USDx : num 14538258 3476132 13333578 10493438 75146070 ...
## $ Market_Cap_USDx_N : num 14538258 3476132 13333578 10493438 75146070 ...
## $ Finacls_Datex : chr "2000-12-31" "2001-12-31" "2002-12-31" "2003-12-31" ...
## $ Revenue_USDx : num 1.12e+09 1.08e+09 1.01e+09 9.74e+08 9.77e+08 ...
## $ Revenue_USDx_N : num 1.12e+09 1.08e+09 1.01e+09 9.74e+08 9.77e+08 ...
## $ Earnings_USDx : num 1.68e+08 1.80e+08 2.11e+08 2.29e+08 2.58e+08 ...
## $ Earnings_USDx_N : num 1.68e+08 1.80e+08 2.11e+08 2.29e+08 2.58e+08 ...
## $ Book_Value_USDx : num 9.69e+08 1.07e+09 1.21e+09 1.24e+09 1.22e+09 ...
## $ Book_Value_USDx_N : num 9.69e+08 1.07e+09 1.21e+09 1.24e+09 1.22e+09 ...
## $ Assets_USDx : num 1.31e+10 1.36e+10 1.50e+10 1.52e+10 2.05e+10 ...
## $ Assets_USDx_N : num 1.31e+10 1.36e+10 1.50e+10 1.52e+10 2.05e+10 ...
## $ State_Region : chr "US Midwest" "US Midwest" "US Midwest" "US Midwest" ...
## $ Audit_Opinion_Key : int 65938 55697 4468 37294 82504 99313 119602 146007 168700 188432 ...
## $ Auditor_City : chr "Chicago" "Chicago" "Chicago" "Chicago" ...
## $ Auditor_State_Code : chr "IL" "IL" "IL" "IL" ...
## $ Auditor_State_Name : chr "ILLINOIS" "ILLINOIS" "ILLINOIS" "ILLINOIS" ...
## $ Auditor_State_Region : chr "US Midwest" "US Midwest" "US Midwest" "US Midwest" ...
## $ Year_Ended_Ideal : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
## $ Month_Ended_Ideal : int 12 12 12 12 12 12 12 12 12 12 ...
## $ Signature_Date : chr "1/18/2001" "1/17/2002" "1/16/2003" "3/8/2004" ...
## $ Sourcey : chr "10-K" "10-K" "10-K" "10-K" ...
## $ Source_Datey : chr "3/22/2001" "3/21/2002" "3/20/2003" "3/15/2004" ...
## $ Going_Concern : chr "No" "No" "No" "No" ...
## $ Going_Concern_Issue_Key_List : chr "" "" "" "" ...
## $ Going_Concern_Issue_Phrase_List: chr "" "" "" "" ...
## $ Accounting_Basis : chr "GAAP - US" "GAAP - US" "GAAP - US" "GAAP - US" ...
## $ Is_Integrated_Audit : chr "No" "No" "No" "No" ...
## $ Is_Additiol_Opinion : chr "No" "No" "No" "No" ...
## $ Additiol_Sigture_Date_1 : chr "" "" "" "" ...
## $ Additiol_Sigture_Date_2 : chr "" "" "" "" ...
## $ Additiol_Sigture_Date_3 : logi NA NA NA NA NA NA ...
## $ Additiol_Sigture_Date_4 : logi NA NA NA NA NA NA ...
## $ Additiol_Sigture_Date_5 : logi NA NA NA NA NA NA ...
## $ Filer_Status : chr NA NA "Accelerated Filer" "Accelerated Filer" ...
## $ Fees_Fiscal_Year_Ended : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
## $ Audit_Fees_USDy : int 288000 405700 403100 380120 842700 725000 748400 748825 813700 894200 ...
## $ NonAudit_Fees_USD : int 452048 201275 182847 257030 264291 151100 188400 185000 271600 265700 ...
## $ Total_Fees_USDy : int 740048 606975 585947 637150 1106990 876100 936800 933825 1085300 1159900 ...
## $ Stock_Price_USDy : num 19375 0.4 1.53 1.12 5.85 ...
## $ Stock_Price_Datey : chr "12/31/2000" "12/31/2001" "12/31/2002" "12/31/2003" ...
## $ Market_Cap_USDy : num 14538258 3476132 13333578 10493438 75146070 ...
## $ Fincials_Datey : chr "2000-12-31" "2001-12-31" "2002-12-31" "2003-12-31" ...
## $ Revenue_USDy : num 1.12e+09 1.08e+09 1.01e+09 9.74e+08 9.77e+08 ...
## $ Earnings_USDy : num 1.68e+08 1.80e+08 2.11e+08 2.29e+08 2.58e+08 ...
## $ Book_Value_USDy : num 9.69e+08 1.07e+09 1.21e+09 1.24e+09 1.22e+09 ...
## $ Assets_USDy : num 1.31e+10 1.36e+10 1.50e+10 1.52e+10 2.05e+10 ...
## $ Country_of_Headquarters : chr "United States of America" "United States of America" "United States of America" "United States of America" ...
## $ Company_Common_Name : chr "Associated Banc-Corp" "Associated Banc-Corp" "Associated Banc-Corp" "Associated Banc-Corp" ...
## $ Total_Assets : num 1.31e+10 1.36e+10 1.50e+10 1.52e+10 2.05e+10 ...
## $ Total_Equity : num 9.69e+08 1.07e+09 1.27e+09 1.35e+09 2.02e+09 ...
## $ ROA : num 1.31 1.34 1.47 1.51 1.44 ...
## $ Net_Interest_Margin_Total_ : num 3.36 3.62 3.95 3.84 3.8 3.64 3.62 3.6 3.65 NA ...
## $ Fee_Revenue_ : num 26.3 28.3 27.9 29.8 25.5 ...
## [list output truncated]
Please note that before the variable name we have a dollar sign which allows us to refer to a variable in a given object in code r. e.g. bank$ROA refers to a column with information about the ratio of result to assets. As you can see above this solution is quite inconvenient why For visualization we rather use View function, which gives similar insight as in excel sheet.
View(bank)
We have about 5000 observations and 200 variables, in Excel processing such data will be cumbersome and difficult to audit, because Excel, apart from the “audit” function doesn’t leave an audit trail. Unlike Knit.
Which bank paid the most for the audit
#first we'll look at what was the highest salary in the base
max(bank$Total_Fees_USDx)
## [1] 128000000
# the $ sign allows you to select a variable from the bank object
Given a value, we generate a name using the subset function and its parameter select (of course this can be done in many ways)
naj<-subset(bank,Total_Fees_USDx >= 128000000, select= Companyx)
naj
## Companyx
## 292 BANK OF AMERICA CORP /DE/
Try to generate all the information that relates to the largest transaction
naj_all<-subset(bank,Total_Fees_USDx >= 128000000)
View(naj_all)
Now let’s see how many auditors (audit firms) served the banking market. For simplicity we will enter the result into the FA object and sorted in descending order, so that we know at once who has what market share in the market.
FA<-table(bank$Auditorx)
View(sort(FA,decreasing = TRUE))
And now we will make a third column that will show us the share percentage, namely
as.data.frame(FA) # sorting the vector was difficult we converted it to a frame
## Var1 Freq
## 1 Alexander Thompson Arnold PLLC 6
## 2 AM Peisch & Company LLP 1
## 3 Anderson & Associates LLP (WA) 2
## 4 Anness Gerlach & Williams Inc 6
## 5 Arnett & Foster PLLC 15
## 6 Arnett Carbis Toothman LLP 2
## 7 Arnett Foster Toothman PLLC 1
## 8 Arthur Andersen LLP 42
## 9 Baker Newman & Noyes LLC 16
## 10 Baker Tilly Virchow Krause LLP 41
## 11 Barfield Murphy Shank & Smith LLC/PC 2
## 12 BDO Seidman LLP 21
## 13 BDO USA LLP 122
## 14 Beard Miller Company LLP 135
## 15 Berry Dunn McNeil & Parker LLC (BDMP) 60
## 16 BKD LLP 287
## 17 Bonadio & Co LLP (Bonadio Group) 5
## 18 Brown Edwards & Company LLP 16
## 19 Burr Pilger Mayer Inc 3
## 20 Carr Riggs & Ingram (CRI) LLC 14
## 21 Castaing Hussey & Lolan LLC 17
## 22 Cherry Bekaert & Holland LLP (Inactive) 10
## 23 Cherry Bekaert LLP (Active) 4
## 24 Clifton D Bodiford CPA 6
## 25 Clifton Gunderson LLP 33
## 26 CliftonLarsonAllen LLP 20
## 27 Cobitz Vandenberg & Fennessy 15
## 28 CPA Associates PA (FL) 1
## 29 Crowe Chizek & Company LLP 245
## 30 Crowe Chizek LLP 6
## 31 Crowe Horwath LLP 395
## 32 Cummings Ristau & Associates PC 7
## 33 Davis Kinard & Co PC 7
## 34 Delap LLP 8
## 35 Deloitte & Touche LLP 7
## 36 Demetrius & Company LLC 10
## 37 Dixon Francis Davis & Company 2
## 38 Dixon Hughes Goodman LLP 71
## 39 Dixon Hughes PLLC 38
## 40 Dixon Odom PLLC 5
## 41 Draffin & Tucker LLP 3
## 42 Eggleston Smith PC 3
## 43 Elliott Davis Decosimo LLC/PLLC 35
## 44 Elliott Davis LLC/PLLC (Inactive) 104
## 45 Ernst & Young LLP 495
## 46 Francis & Company CPAs (GA) 1
## 47 Fust Charles Chambers LLP 2
## 48 Grant Thornton LLP 125
## 49 Grimsley White & Company 2
## 50 Hacker Johnson & Smith PA 18
## 51 Hannis T Bourgeois LLP 3
## 52 Hazlett Lewis & Bieter (HLB) PLLC 7
## 53 Hill Barth & King (HBK) LLC 3
## 54 Horne LLP 28
## 55 Hutchinson & Bloodgood LLP 12
## 56 JD Cloud & Co LLP 14
## 57 JH Williams & Co LLP 23
## 58 Jones & Kirkpatrick PC 5
## 59 JonesBaggett LLP 1
## 60 JW Hunt & Company LLP 15
## 61 Kirkpatrick Phillips & Miller CPAs PC 5
## 62 Knight Vale & Gregory Inc PS/PLLC 1
## 63 KPMG LLP 1001
## 64 Kraft Miles & Tatum LLC 3
## 65 KraftCPAs PLLC 7
## 66 LaPorte APAC 5
## 67 LaPorte Sehrt Romig & Hand PC 2
## 68 Larrowe & Company PLC 2
## 69 Larsson Woodyard & Henson LLP 2
## 70 Maggart & Associates PC 13
## 71 Mauldin & Jenkins LLC 41
## 72 Mazars USA LLP 1
## 73 McGladrey & Pullen LLP 142
## 74 McGladrey LLP 41
## 75 McNair McLemore Middlebrooks & Co LLC/LLP 16
## 76 Michael R Ferraro CPA 6
## 77 Michael Trokey & Company PC 5
## 78 Miller Ellin & Co LLP 1
## 79 Monroe Shine & Co Inc 27
## 80 Moss Adams LLP 129
## 81 Olive LLP 8
## 82 Packer Thomas CPAs 3
## 83 Parente Randolph LLC 11
## 84 Parente Randolph PC 8
## 85 ParenteBeard LLC 123
## 86 Perry-Smith LLP 47
## 87 Piltz Williams LaRosa & Co 6
## 88 Plante & Moran PLLC 79
## 89 Porter Keadle Moore LLC 87
## 90 Postlethwaite & Netterville APAC 4
## 91 PricewaterhouseCoopers LLP 159
## 92 Prociak & Associates LLC 1
## 93 Pugh & Company PC (Inactive) 5
## 94 Radics & Co LLC 3
## 95 Rayburn Bates & Fitzgerald PC 11
## 96 Rayburn Betts & Bates PC 4
## 97 Rayburn Fitzgerald PC 1
## 98 Rehmann Robson LLC 11
## 99 Rowles & Company LLP 28
## 100 RSM US LLP 33
## 101 SB Hoover & Company LLP 3
## 102 Shatswell MacLeod & Company PC 46
## 103 Smith Elliott Kearns & Company LLC 43
## 104 Sperry Cuono Holgate & Churchill CPA 1
## 105 Squar Milner Peterson Miranda & Williamson LLP 4
## 106 SR Snodgrass PC/AC 169
## 107 Stambaugh Ness PC 4
## 108 Stegman & Company PA 54
## 109 Tanner LLC/LC 3
## 110 TE Lott & Company 15
## 111 TGM Group LLC 8
## 112 Thigpen Jones Seaton & Co PC 13
## 113 Thompson Greenspon & Co PC (Inactive) 3
## 114 TJS Deemer Dana LLP 4
## 115 Trice Geary & Myers LLC 9
## 116 UHY LLP 7
## 117 Urbach Kahn & Werlin LLP 4
## 118 Vavrinek Trine Day & Company LLP 43
## 119 Virchow Krause & Company LLP 6
## 120 Whittlesey & Hadley PC 4
## 121 Wipfli LLP 27
## 122 Wipfli Ullrich Bertelson LLP 2
## 123 Witt Mares Eggleston Smith PLC 1
## 124 Wolf & Company PC (MA) 59
## 125 Yount Hyde & Barbour PC 139
FA1<-as.data.frame(FA)
sum(FA1$Freq) #we summarize all audits
## [1] 5356
FA1$RelFreq<-FA1$Freq/sum(FA1$Freq) # generate the third column with the market share
FA1[order(FA1$RelFreq, decreasing=TRUE),]# and sorting downward
## Var1 Freq RelFreq
## 63 KPMG LLP 1001 0.1868932039
## 45 Ernst & Young LLP 495 0.0924197162
## 31 Crowe Horwath LLP 395 0.0737490665
## 16 BKD LLP 287 0.0535847647
## 29 Crowe Chizek & Company LLP 245 0.0457430919
## 106 SR Snodgrass PC/AC 169 0.0315533981
## 91 PricewaterhouseCoopers LLP 159 0.0296863331
## 73 McGladrey & Pullen LLP 142 0.0265123226
## 125 Yount Hyde & Barbour PC 139 0.0259522031
## 14 Beard Miller Company LLP 135 0.0252053771
## 80 Moss Adams LLP 129 0.0240851382
## 48 Grant Thornton LLP 125 0.0233383122
## 85 ParenteBeard LLC 123 0.0229648992
## 13 BDO USA LLP 122 0.0227781927
## 44 Elliott Davis LLC/PLLC (Inactive) 104 0.0194174757
## 89 Porter Keadle Moore LLC 87 0.0162434653
## 88 Plante & Moran PLLC 79 0.0147498133
## 38 Dixon Hughes Goodman LLP 71 0.0132561613
## 15 Berry Dunn McNeil & Parker LLC (BDMP) 60 0.0112023898
## 124 Wolf & Company PC (MA) 59 0.0110156833
## 108 Stegman & Company PA 54 0.0100821509
## 86 Perry-Smith LLP 47 0.0087752054
## 102 Shatswell MacLeod & Company PC 46 0.0085884989
## 103 Smith Elliott Kearns & Company LLC 43 0.0080283794
## 118 Vavrinek Trine Day & Company LLP 43 0.0080283794
## 8 Arthur Andersen LLP 42 0.0078416729
## 10 Baker Tilly Virchow Krause LLP 41 0.0076549664
## 71 Mauldin & Jenkins LLC 41 0.0076549664
## 74 McGladrey LLP 41 0.0076549664
## 39 Dixon Hughes PLLC 38 0.0070948469
## 43 Elliott Davis Decosimo LLC/PLLC 35 0.0065347274
## 25 Clifton Gunderson LLP 33 0.0061613144
## 100 RSM US LLP 33 0.0061613144
## 54 Horne LLP 28 0.0052277819
## 99 Rowles & Company LLP 28 0.0052277819
## 79 Monroe Shine & Co Inc 27 0.0050410754
## 121 Wipfli LLP 27 0.0050410754
## 57 JH Williams & Co LLP 23 0.0042942494
## 12 BDO Seidman LLP 21 0.0039208364
## 26 CliftonLarsonAllen LLP 20 0.0037341299
## 50 Hacker Johnson & Smith PA 18 0.0033607170
## 21 Castaing Hussey & Lolan LLC 17 0.0031740105
## 9 Baker Newman & Noyes LLC 16 0.0029873040
## 18 Brown Edwards & Company LLP 16 0.0029873040
## 75 McNair McLemore Middlebrooks & Co LLC/LLP 16 0.0029873040
## 5 Arnett & Foster PLLC 15 0.0028005975
## 27 Cobitz Vandenberg & Fennessy 15 0.0028005975
## 60 JW Hunt & Company LLP 15 0.0028005975
## 110 TE Lott & Company 15 0.0028005975
## 20 Carr Riggs & Ingram (CRI) LLC 14 0.0026138910
## 56 JD Cloud & Co LLP 14 0.0026138910
## 70 Maggart & Associates PC 13 0.0024271845
## 112 Thigpen Jones Seaton & Co PC 13 0.0024271845
## 55 Hutchinson & Bloodgood LLP 12 0.0022404780
## 83 Parente Randolph LLC 11 0.0020537715
## 95 Rayburn Bates & Fitzgerald PC 11 0.0020537715
## 98 Rehmann Robson LLC 11 0.0020537715
## 22 Cherry Bekaert & Holland LLP (Inactive) 10 0.0018670650
## 36 Demetrius & Company LLC 10 0.0018670650
## 115 Trice Geary & Myers LLC 9 0.0016803585
## 34 Delap LLP 8 0.0014936520
## 81 Olive LLP 8 0.0014936520
## 84 Parente Randolph PC 8 0.0014936520
## 111 TGM Group LLC 8 0.0014936520
## 32 Cummings Ristau & Associates PC 7 0.0013069455
## 33 Davis Kinard & Co PC 7 0.0013069455
## 35 Deloitte & Touche LLP 7 0.0013069455
## 52 Hazlett Lewis & Bieter (HLB) PLLC 7 0.0013069455
## 65 KraftCPAs PLLC 7 0.0013069455
## 116 UHY LLP 7 0.0013069455
## 1 Alexander Thompson Arnold PLLC 6 0.0011202390
## 4 Anness Gerlach & Williams Inc 6 0.0011202390
## 24 Clifton D Bodiford CPA 6 0.0011202390
## 30 Crowe Chizek LLP 6 0.0011202390
## 76 Michael R Ferraro CPA 6 0.0011202390
## 87 Piltz Williams LaRosa & Co 6 0.0011202390
## 119 Virchow Krause & Company LLP 6 0.0011202390
## 17 Bonadio & Co LLP (Bonadio Group) 5 0.0009335325
## 40 Dixon Odom PLLC 5 0.0009335325
## 58 Jones & Kirkpatrick PC 5 0.0009335325
## 61 Kirkpatrick Phillips & Miller CPAs PC 5 0.0009335325
## 66 LaPorte APAC 5 0.0009335325
## 77 Michael Trokey & Company PC 5 0.0009335325
## 93 Pugh & Company PC (Inactive) 5 0.0009335325
## 23 Cherry Bekaert LLP (Active) 4 0.0007468260
## 90 Postlethwaite & Netterville APAC 4 0.0007468260
## 96 Rayburn Betts & Bates PC 4 0.0007468260
## 105 Squar Milner Peterson Miranda & Williamson LLP 4 0.0007468260
## 107 Stambaugh Ness PC 4 0.0007468260
## 114 TJS Deemer Dana LLP 4 0.0007468260
## 117 Urbach Kahn & Werlin LLP 4 0.0007468260
## 120 Whittlesey & Hadley PC 4 0.0007468260
## 19 Burr Pilger Mayer Inc 3 0.0005601195
## 41 Draffin & Tucker LLP 3 0.0005601195
## 42 Eggleston Smith PC 3 0.0005601195
## 51 Hannis T Bourgeois LLP 3 0.0005601195
## 53 Hill Barth & King (HBK) LLC 3 0.0005601195
## 64 Kraft Miles & Tatum LLC 3 0.0005601195
## 82 Packer Thomas CPAs 3 0.0005601195
## 94 Radics & Co LLC 3 0.0005601195
## 101 SB Hoover & Company LLP 3 0.0005601195
## 109 Tanner LLC/LC 3 0.0005601195
## 113 Thompson Greenspon & Co PC (Inactive) 3 0.0005601195
## 3 Anderson & Associates LLP (WA) 2 0.0003734130
## 6 Arnett Carbis Toothman LLP 2 0.0003734130
## 11 Barfield Murphy Shank & Smith LLC/PC 2 0.0003734130
## 37 Dixon Francis Davis & Company 2 0.0003734130
## 47 Fust Charles Chambers LLP 2 0.0003734130
## 49 Grimsley White & Company 2 0.0003734130
## 67 LaPorte Sehrt Romig & Hand PC 2 0.0003734130
## 68 Larrowe & Company PLC 2 0.0003734130
## 69 Larsson Woodyard & Henson LLP 2 0.0003734130
## 122 Wipfli Ullrich Bertelson LLP 2 0.0003734130
## 2 AM Peisch & Company LLP 1 0.0001867065
## 7 Arnett Foster Toothman PLLC 1 0.0001867065
## 28 CPA Associates PA (FL) 1 0.0001867065
## 46 Francis & Company CPAs (GA) 1 0.0001867065
## 59 JonesBaggett LLP 1 0.0001867065
## 62 Knight Vale & Gregory Inc PS/PLLC 1 0.0001867065
## 72 Mazars USA LLP 1 0.0001867065
## 78 Miller Ellin & Co LLP 1 0.0001867065
## 92 Prociak & Associates LLC 1 0.0001867065
## 97 Rayburn Fitzgerald PC 1 0.0001867065
## 104 Sperry Cuono Holgate & Churchill CPA 1 0.0001867065
## 123 Witt Mares Eggleston Smith PLC 1 0.0001867065
Let us try to assess whether there are entities in the market whose audit firm’s remuneration is atypical.
Solution, we are looking for outliers, we will use the simplest mechanism of analysis i.e. a thumbnail plot
boxplot(bank$Total_Fees_USDx)
Not much is clear from this picture then let’s relate it to the total
assets of a given enitity
boxplot(bank$Total_Fees_USDx/bank$Assets_USDx_N)
R as an environment has very good graphics capabilities and here is one
example:
c<- bank$Total_Fees_USDx/bank$Assets_USDx_N
c1<-c[c<max(c, na.rm= TRUE)]
boxplot(c1,
main = "Fee to total assets",
xlab = "",
ylab = "",
col = "blue",
border = "brown",
horizontal = TRUE,
notch = TRUE
)
We got rid of the largest observation for clarity, and arranged the
image horizontally. More help(boxplot) options.
Finding this one entity is relatively easy.
Please list the 6 entities with their CIK number, financial statement year, and company name that paid the most for auditing relative to total assets.
head(bank[order(bank$Total_Fees_USDx/bank$Assets_USDx_N,decreasing=TRUE),1:3])
## CIK_Code Year_Ended Companyx
## 4777 1357231 2006 Solera National Bancorp, Inc.
## 1907 811211 2000 UNIVERSITY BANCORP INC /DE/
## 4778 1357231 2007 Solera National Bancorp, Inc.
## 4530 1276130 2004 TREATY OAK BANCORP INC
## 4951 1428802 2008 Auburn Bancorp, Inc.
## 2851 933136 2014 Mr. Cooper Group Inc.
Case studies from the Polish Kit (assignment book): 4.25 Information Supermarket 5.3 Traffic 5.4 Stranger Dance 7.3 - analysis at the level of bachelor’s, master’s, doctoral thesis
Additionally 1. the relationship between non-revision pay and opinion type 2. audit fees and performance management 3. audit firm specialization in the market
It is worth noting that analysis does not necessarily have to be advanced to generate the information needed in business.
For a doctoral-level research assignment, please try to replicate a study from the article.