Credit Card Transactions
| Variable | Description |
|---|---|
| loan_amnt | The listed amount of the loan applied for by the borrower. If at some point in time, the credit department reduces the loan amount, then it will be reflected in this value. |
| funded_amnt | The total amount committed to that loan at that point in time. |
| disbursement_method | The method by which the borrower receives their loan. Possible values are: CASH, DIRECT_PAY. |
| term | The number of payments on the loan. Values are in months and can be either 36 or 60. |
| int_rate | Interest rate on the loan. |
| installment | The monthly payment owed by the borrower if the loan originates. |
| grade | Lending Club assigned loan grade. |
| emp_title | The job title supplied by the borrower when applying for the loan. |
| emp_length | Employment length in years. Possible values are between 0 and 10 where 0 means less than one year and 10 means ten or more years. |
| home_ownership | The home ownership status provided by the borrower during registration or obtained from the credit report. Possible values are: RENT, OWN, MORTGAGE, OTHER. |
| annual_inc | The self-reported annual income provided by the borrower during registration. |
| annual_inc_joint | The self-reported joint annual income provided by the borrower during registration. |
| loan_status | Current status of the loan: fully-paid, current, charged-off |
| pymnt_plan | Indicates if a payment plan has been put in place for the loan |
| purpose | A category provided by the borrower for the loan request. |
| title | The loan title provided by the borrower |
| addr_state | The state provided by the borrower in the loan application |
| dti | A ratio calculated as the borrower’s total monthly debt payments on all debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income. |
| dti_joint | Joint debt to income ratio. |
| delinq_2yrs | The number of 30+ days past-due incidences of delinquency in the borrower’s credit file for the past 2 years. |
| delinq_amnt | The past-due amount owed for the accounts on which the borrower is now delinquent. |
| fico_range_low | The lower boundary range the borrower’s FICO at loan origination belongs to. |
| fico_range_high | The upper boundary range the borrower’s FICO at loan origination belongs to. |
| inq_last_6mths | The number of inquiries in past 6 months (excluding auto and mortgage inquiries). |
| open_acc | The number of open credit lines in the borrower’s credit file. |
| pub_rec | Number of derogatory public records. |
| revol_bal | Total credit revolving balance. |
| revol_util | Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit. |
| total_acc | The total number of credit lines currently in the borrower’s credit file. |
| total_rev_hi_lim | Total revolving high credit/credit limit. |
| total_rec_late_fee | Late fees received to date. |
| collections_12_mths_ex_med | Number of collections in 12 months excluding medical collections. |
| application_type | Indicates whether the loan is an individual application or a joint application with two co-borrowers. |
| max_bal_bc | Maximum current balance owed on all revolving accounts. |
| inq_fi | Number of personal finance inquiries. |
| avg_cur_bal | Average current balance of all accounts. |
| tax_liens | Number of tax liens. |
| hardship_flag | Flags whether or not the borrower is on a hardship plan. |
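As a worked illustration of the `dti` definition, the sketch below computes the ratio for one hypothetical borrower. The figures and the names `monthly_debt` and `annual_income` are invented for illustration, and expressing the result in percent is an assumption based on the magnitude of the `dti` values reported for this data set.

```r
# Hypothetical borrower; these numbers are illustrative, not taken from the data.
monthly_debt  <- 900     # total monthly debt payments, excluding mortgage and the LC loan
annual_income <- 60000   # self-reported annual income

# dti = monthly debt payments / monthly income, expressed in percent (assumed scale)
monthly_income <- annual_income / 12
dti <- 100 * monthly_debt / monthly_income
dti
```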
summary(accepted_loans)
##    loan_amnt      funded_amnt        term              int_rate
##  Min.   :  500   Min.   :  500   Length:2260701     Min.   : 5.31
##  1st Qu.: 8000   1st Qu.: 8000   Class :character   1st Qu.: 9.49
##  Median :12900   Median :12875   Mode  :character   Median :12.62
##  Mean   :15047   Mean   :15042                      Mean   :13.09
##  3rd Qu.:20000   3rd Qu.:20000                      3rd Qu.:15.99
##  Max.   :40000   Max.   :40000                      Max.   :30.99
##  NA's   :33      NA's   :33                         NA's   :33
##   installment         grade            emp_title          emp_length
##  Min.   :   4.93   Length:2260701     Length:2260701     Length:2260701
##  1st Qu.: 251.65   Class :character   Class :character   Class :character
##  Median : 377.99   Mode  :character   Mode  :character   Mode  :character
##  Mean   : 445.81
##  3rd Qu.: 593.32
##  Max.   :1719.83
##  NA's   :33
##  home_ownership       annual_inc        annual_inc_joint  loan_status
##  Length:2260701     Min.   :        0   Min.   :   5694   Length:2260701
##  Class :character   1st Qu.:    46000   1st Qu.:  83400   Class :character
##  Mode  :character   Median :    65000   Median : 110000   Mode  :character
##                     Mean   :    77992   Mean   : 123625
##                     3rd Qu.:    93000   3rd Qu.: 147995
##                     Max.   :110000000   Max.   :7874821
##                     NA's   :37          NA's   :2139991
##   pymnt_plan          purpose             title            addr_state
##  Length:2260701     Length:2260701     Length:2260701     Length:2260701
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##
##       dti           dti_joint        delinq_2yrs       delinq_amnt
##  Min.   : -1.00   Min.   : 0.0     Min.   : 0.0000   Min.   :     0.00
##  1st Qu.: 11.89   1st Qu.:13.5     1st Qu.: 0.0000   1st Qu.:     0.00
##  Median : 17.84   Median :18.8     Median : 0.0000   Median :     0.00
##  Mean   : 18.82   Mean   :19.3     Mean   : 0.3069   Mean   :    12.37
##  3rd Qu.: 24.49   3rd Qu.:24.6     3rd Qu.: 0.0000   3rd Qu.:     0.00
##  Max.   :999.00   Max.   :69.5     Max.   :58.0000   Max.   :249925.00
##  NA's   :1744     NA's   :2139995  NA's   :62        NA's   :62
##  fico_range_low  fico_range_high  inq_last_6mths       open_acc
##  Min.   :610.0   Min.   :614.0    Min.   : 0.0000   Min.   :  0.00
##  1st Qu.:675.0   1st Qu.:679.0    1st Qu.: 0.0000   1st Qu.:  8.00
##  Median :690.0   Median :694.0    Median : 0.0000   Median : 11.00
##  Mean   :698.6   Mean   :702.6    Mean   : 0.5768   Mean   : 11.61
##  3rd Qu.:715.0   3rd Qu.:719.0    3rd Qu.: 1.0000   3rd Qu.: 14.00
##  Max.   :845.0   Max.   :850.0    Max.   :33.0000   Max.   :101.00
##  NA's   :33      NA's   :33       NA's   :63        NA's   :62
##     pub_rec          revol_bal         revol_util       total_acc
##  Min.   : 0.0000   Min.   :      0   Min.   :  0.00   Min.   :  1.00
##  1st Qu.: 0.0000   1st Qu.:   5950   1st Qu.: 31.50   1st Qu.: 15.00
##  Median : 0.0000   Median :  11324   Median : 50.30   Median : 22.00
##  Mean   : 0.1975   Mean   :  16658   Mean   : 50.34   Mean   : 24.16
##  3rd Qu.: 0.0000   3rd Qu.:  20246   3rd Qu.: 69.40   3rd Qu.: 31.00
##  Max.   :86.0000   Max.   :2904836   Max.   :892.30   Max.   :176.00
##  NA's   :62        NA's   :33        NA's   :1835     NA's   :62
##  total_rev_hi_lim  total_rec_late_fee  collections_12_mths_ex_med
##  Min.   :      0   Min.   :   0.000    Min.   : 0.00000
##  1st Qu.:  14700   1st Qu.:   0.000    1st Qu.: 0.00000
##  Median :  25400   Median :   0.000    Median : 0.00000
##  Mean   :  34574   Mean   :   1.518    Mean   : 0.01815
##  3rd Qu.:  43200   3rd Qu.:   0.000    3rd Qu.: 0.00000
##  Max.   :9999999   Max.   :1484.340    Max.   :20.00000
##  NA's   :70309     NA's   :33          NA's   :178
##  application_type     max_bal_bc          inq_fi         avg_cur_bal
##  Length:2260701     Min.   :      0   Min.   : 0      Min.   :     0
##  Class :character   1st Qu.:   2284   1st Qu.: 0      1st Qu.:  3080
##  Mode  :character   Median :   4413   Median : 1      Median :  7335
##                     Mean   :   5806   Mean   : 1      Mean   : 13548
##                     3rd Qu.:   7598   3rd Qu.: 1      3rd Qu.: 18783
##                     Max.   :1170668   Max.   :48      Max.   :958084
##                     NA's   :866162    NA's   :866162  NA's   :70379
##    tax_liens        hardship_flag      disbursement_method
##  Min.   : 0.00000   Length:2260701     Length:2260701
##  1st Qu.: 0.00000   Class :character   Class :character
##  Median : 0.00000   Mode  :character   Mode  :character
##  Mean   : 0.04677
##  3rd Qu.: 0.00000
##  Max.   :85.00000
##  NA's   :138
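The summary shows columns that are almost entirely missing (`annual_inc_joint` has 2,139,991 NAs out of 2,260,701 rows) and values that look like placeholders (`dti` of -1 and 999). A minimal sketch of how such columns and rows could be flagged is given below. The data frame `toy_loans` is made up, and the 0 to 100 plausibility range for `dti` is an assumption, not a rule taken from the data provider.

```r
# Toy data frame standing in for accepted_loans (values are made up).
toy_loans <- data.frame(
  loan_amnt        = c(5000, 12000, NA, 20000),
  annual_inc_joint = c(NA, NA, NA, 110000),
  dti              = c(11.9, -1, 17.8, 999)
)

# Share of missing values per column, sorted from most to least missing
na_share <- sort(colMeans(is.na(toy_loans)), decreasing = TRUE)
na_share

# Row indices with dti outside a plausible range; the 0-100 bounds are an assumption
suspect_dti <- which(toy_loans$dti < 0 | toy_loans$dti > 100)
suspect_dti
```

Running the same two steps on `accepted_loans` would give a quick list of columns to drop or impute and of rows to inspect before modelling.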
ggplot(accepted_loans, aes(grade, fill = grade)) +
  geom_bar(stat = "count", color = "white", size = 0.25)
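# The formula goes inside lm(). Writing loan_amnt ~ lm(...) stores a formula object
# rather than a fitted model, which stargazer rejects with
# "% Error: Unrecognized object type."
factors_affecting_approved_balance_1 <- lm(loan_amnt ~ funded_amnt + annual_inc, data = accepted_loans)
stargazer(factors_affecting_approved_balance_1, type = "text")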
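`stargazer` expects a fitted model object, such as the one returned by `lm()`, and a bare formula is not one. The sketch below shows the difference on synthetic data. The data frame `toy` and its values are invented for illustration and say nothing about the real coefficients.

```r
# Synthetic stand-in for accepted_loans; values are randomly generated.
set.seed(1)
toy <- data.frame(
  funded_amnt = runif(50, 500, 40000),
  annual_inc  = runif(50, 20000, 150000)
)
toy$loan_amnt <- toy$funded_amnt + rnorm(50, sd = 100)

# A fitted model: the formula is the first argument of lm()
fit <- lm(loan_amnt ~ funded_amnt + annual_inc, data = toy)
class(fit)          # "lm"

# A formula on its own is only a description of a model, not a fit
not_a_model <- loan_amnt ~ funded_amnt + annual_inc
class(not_a_model)  # "formula"

# Coefficients are available only from the fitted object
coef(fit)
```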
# Data Exploration
# 1. Geographical Distribution of Loans
# Number of loans funded in each state
a <- data.table(table(accepted_loans$addr_state))
setnames(a, c("region", "count"))
# Convert state abbreviations to the lowercase full names used by map_data("state")
a$region <- tolower(state.name[match(a$region, state.abb)])
all_states <- map_data("state")
total <- merge(all_states, a, by = "region")
ggplot(total, aes(x = long, y = lat, map_id = region)) +
  geom_map(aes(fill = count), map = all_states) +
  labs(title = "Loan counts in respective states", x = "", y = "") +
  scale_fill_gradientn("", colours = terrain.colors(10), guide = "legend") +
  theme_bw()
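The join between loan counts and map polygons relies on turning two-letter codes into lowercase state names. A small sketch of that lookup follows, using base R's built-in `state.abb` and `state.name` vectors. The input vector `abbr` is hypothetical. Because the built-in vectors cover only the 50 states, a code such as "DC" finds no match and would drop out of the merge unless it is handled separately.

```r
# Hypothetical sample of state codes as they might appear in addr_state
abbr <- c("CA", "NY", "TX", "DC")

# Position of each code in the built-in vector of 50 state abbreviations
idx <- match(abbr, state.abb)

# Lowercase full names for the matched codes
full <- tolower(state.name[idx])
full[!is.na(idx)]

# Codes with no match in state.abb; these rows would be lost in the merge
unmatched <- abbr[is.na(idx)]
unmatched
```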
accepted_loans %>%
  ggplot(aes(x = forcats::fct_infreq(grade), fill = grade)) +
  geom_bar(show.legend = FALSE) +
  geom_text(stat = "count",
            aes(label = paste0(round(after_stat(prop * 100), digits = 1), "%"), group = 1),
            vjust = -0.4,
            size = 4) +
  labs(x = "Grade",
       y = "Count",
       title = "Applicant Distribution Across Grades") +
  theme(
    plot.title = element_text(size = 20),
    axis.text.x = element_text(size = 16),
    axis.text.y = element_text(size = 16))
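The percentage labels on the bars are each grade's share of all loans, and the bars are ordered from most to least frequent. The same numbers can be checked without plotting, as in the sketch below. The vector `grades` is a hypothetical sample, not the real grade column.

```r
# Hypothetical sample of loan grades
grades <- c("A", "B", "B", "C", "C", "C", "D")

# Counts per grade, then each grade's share of the total in percent
counts <- table(grades)
shares <- round(100 * prop.table(counts), 1)

# Most frequent grade first, mirroring the ordering that fct_infreq() applies
ordered_counts <- sort(counts, decreasing = TRUE)
names(ordered_counts)

# Labels in the same "xx.x%" form as the plot
paste0(shares, "%")
```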
accepted_loans %>%
  ggplot(aes(x = grade, fill = purpose)) +
  geom_bar(stat = "count", position = "fill") +
  # Labels follow the aesthetics, not the flipped positions: x is grade, y is the share
  labs(x = "Grade",
       y = "Proportion",
       title = "Applicants with Different Purposes") +
  coord_flip() +
  theme(
    plot.title = element_text(size = 20),
    axis.text.x = element_text(size = 16, angle = 45, hjust = 1),
    axis.text.y = element_text(size = 16))
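With `position = 'fill'`, every bar is rescaled to sum to 1, so the plot shows the mix of purposes within each grade rather than raw counts. A sketch of the equivalent table is given below. The data frame `toy` and its purpose labels are invented for illustration.

```r
# Hypothetical grade/purpose pairs
toy <- data.frame(
  grade   = c("A", "A", "A", "B", "B", "C"),
  purpose = c("car", "credit_card", "credit_card",
              "car", "debt_consolidation", "debt_consolidation")
)

# Share of each purpose within a grade: margin = 1 normalises each row
within_grade <- prop.table(table(toy$grade, toy$purpose), margin = 1)
within_grade

# Every grade sums to 1, which is what the filled bars display
rowSums(within_grade)
```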
# Compare the distribution of interest rates across home ownership categories
accepted_loans %>%
  ggplot(aes(x = int_rate, color = home_ownership)) +
  geom_density(adjust = 2) +
  theme_wsj()
## Warning: Removed 33 rows containing non-finite values (stat_density).
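The warning matches the 33 missing `int_rate` values reported in the summary: rows without a finite interest rate are dropped before the density is estimated. The sketch below reproduces the two ingredients with base R's `density()`, which is used here as a stand-in for what `geom_density()` computes. The vector `rates` is hypothetical.

```r
# Hypothetical interest rates with one missing value
rates <- c(5.31, 9.49, 12.62, 15.99, 30.99, NA)

# Number of non-finite values that would be dropped
n_dropped <- sum(!is.finite(rates))
n_dropped

# Kernel density on the remaining values; adjust = 2 doubles the default bandwidth
d <- density(rates, adjust = 2, na.rm = TRUE)
d$n   # sample size after removing missing values
d$bw  # bandwidth actually used
```

Passing `na.rm = TRUE` to `geom_density()` silences the warning without changing the curve, since the rows are removed either way.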