Introduction

This analysis uses LoanData.csv to examine loans that were late or in default. The techniques used here are based on the topics and concepts covered in chapter 7.

library(readr)
LoanData <- read_csv("LoanData.csv")
## Parsed with column specification:
## cols(
##   Status = col_character(),
##   Credit.Grade = col_character(),
##   Amount = col_double(),
##   Age = col_double(),
##   Borrower.Rate = col_double(),
##   Debt.To.Income.Ratio = col_double()
## )
head(LoanData)
## # A tibble: 6 x 6
##   Status  Credit.Grade Amount   Age Borrower.Rate Debt.To.Income.Ratio
##   <chr>   <chr>         <dbl> <dbl>         <dbl>                <dbl>
## 1 Current C              5000     4        0.15                  0.04 
## 2 Current HR             1900     6        0.265                 0.02 
## 3 Current HR             1000     3        0.15                  0.02 
## 4 Current HR             1000     5        0.290                 0.02 
## 5 Current AA             2550     8        0.0795                0.033
## 6 Current NC             1500     2        0.26                  0.03
LoanData<-LoanData[,1:6]
table(LoanData$Status)
## 
## Current Default    Late 
##    5186      75     350
table(LoanData$Credit.Grade)
## 
##    A   AA    B    C    D    E   HR   NC 
##  424  451  553  843  927 1129 1217   67
# Flag late or defaulted loans as 1 and current loans as 0
LoanData$BadLoanType = ifelse(LoanData$Status %in% c('Default', 'Late'), 1, 0)
head(LoanData)
## # A tibble: 6 x 7
##   Status  Credit.Grade Amount   Age Borrower.Rate Debt.To.Income.Ra~ BadLoanType
##   <chr>   <chr>         <dbl> <dbl>         <dbl>              <dbl>       <dbl>
## 1 Current C              5000     4        0.15                0.04            0
## 2 Current HR             1900     6        0.265               0.02            0
## 3 Current HR             1000     3        0.15                0.02            0
## 4 Current HR             1000     5        0.290               0.02            0
## 5 Current AA             2550     8        0.0795              0.033           0
## 6 Current NC             1500     2        0.26                0.03            0
table(LoanData$BadLoanType,LoanData$Credit.Grade)
##    
##        A   AA    B    C    D    E   HR   NC
##   0  413  446  530  812  881 1018 1033   53
##   1   11    5   23   31   46  111  184   14
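The raw counts above are easier to compare as rates. A minimal sketch, assuming the counts are retyped by hand from the table (the names `counts` and `bad_rate` are made up for illustration), converts each column to proportions:

```r
# Counts retyped from the cross-tabulation above.
# Rows are BadLoanType: 0 = current, 1 = late or in default.
grades <- c("A", "AA", "B", "C", "D", "E", "HR", "NC")
counts <- rbind(
  "0" = c(413, 446, 530, 812, 881, 1018, 1033, 53),
  "1" = c(11, 5, 23, 31, 46, 111, 184, 14)
)
colnames(counts) <- grades

# margin = 2 divides each column by its column total,
# so row "1" holds the share of bad loans within each grade.
bad_rate <- prop.table(counts, margin = 2)["1", ]
round(bad_rate, 3)
```

With the full data frame loaded, `prop.table(table(LoanData$BadLoanType, LoanData$Credit.Grade), margin = 2)` should give the same rates without retyping anything.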
table(LoanData$BadLoanType)
## 
##    0    1 
## 5186  425
BadLoans = mean(LoanData$BadLoanType)  # share of loans that are late or in default
BadLoans 
## [1] 0.07574407
m1=glm(BadLoanType~Credit.Grade,family=binomial,data=LoanData)
summary(m1)
## 
## Call:
## glm(formula = BadLoanType ~ Credit.Grade, family = binomial, 
##     data = LoanData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6847  -0.4550  -0.3190  -0.2737   3.0007  
## 
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)     -3.6256     0.3055 -11.868  < 2e-16 ***
## Credit.GradeAA  -0.8653     0.5437  -1.592   0.1115    
## Credit.GradeB    0.4882     0.3724   1.311   0.1899    
## Credit.GradeC    0.3600     0.3561   1.011   0.3120    
## Credit.GradeD    0.6731     0.3409   1.975   0.0483 *  
## Credit.GradeE    1.4095     0.3214   4.385 1.16e-05 ***
## Credit.GradeHR   1.9003     0.3158   6.017 1.77e-09 ***
## Credit.GradeNC   2.2943     0.4285   5.354 8.60e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3010.3  on 5610  degrees of freedom
## Residual deviance: 2808.2  on 5603  degrees of freedom
## AIC: 2824.2
## 
## Number of Fisher Scoring iterations: 7
exp(m1$coef[2])
## Credit.GradeAA 
##      0.4209132
exp(m1$coef[3])
## Credit.GradeB 
##      1.629331
exp(m1$coef[4])
## Credit.GradeC 
##      1.433386
exp(m1$coef[5])
## Credit.GradeD 
##      1.960376
exp(m1$coef[6])
## Credit.GradeE 
##      4.093856
exp(m1$coef[7])
## Credit.GradeHR 
##       6.687671
exp(m1$coef[8])
## Credit.GradeNC 
##       9.917667
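Because m1 has a single categorical predictor, each exponentiated coefficient is the odds of a bad loan in that grade divided by the odds in the baseline grade, A (the first factor level alphabetically). The sketch below assumes the counts from the earlier cross-tabulation are retyped by hand; `bad`, `good`, `fit`, and the other names are illustrative. It refits the same model on grouped counts and returns all odds ratios in one call, then checks them against ratios computed directly from the counts:

```r
grade <- factor(c("A", "AA", "B", "C", "D", "E", "HR", "NC"))
bad   <- c(11, 5, 23, 31, 46, 111, 184, 14)        # late or in default
good  <- c(413, 446, 530, 812, 881, 1018, 1033, 53) # current

# Grouped-binomial form of the model: cbind(successes, failures) ~ predictor.
fit <- glm(cbind(bad, good) ~ grade, family = binomial)

# Odds ratios relative to grade A, all at once (dropping the intercept).
odds_ratios <- exp(coef(fit))[-1]
round(odds_ratios, 3)

# The same ratios computed by hand from the counts.
odds   <- bad / good
manual <- odds[-1] / odds[1]
round(manual, 3)
```

On the full data, `exp(coef(m1))` returns the same values as the separate `exp()` calls above.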
m2=glm(BadLoanType~Amount+Borrower.Rate,family=binomial,data=LoanData)
summary(m2)
## 
## Call:
## glm(formula = BadLoanType ~ Amount + Borrower.Rate, family = binomial, 
##     data = LoanData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4187  -0.4539  -0.3058  -0.2105   3.2723  
## 
## Coefficients:
##                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   -5.374e+00  2.374e-01 -22.641   <2e-16 ***
## Amount        -1.163e-06  1.356e-05  -0.086    0.932    
## Borrower.Rate  1.317e+01  8.912e-01  14.779   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3010.3  on 5610  degrees of freedom
## Residual deviance: 2737.9  on 5608  degrees of freedom
## AIC: 2743.9
## 
## Number of Fisher Scoring iterations: 6
exp(m2$coef[2])
##    Amount 
## 0.9999988
exp(m2$coef[3])
## Borrower.Rate 
##      525262.8
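The odds ratio of 525262.8 corresponds to a one-unit increase in Borrower.Rate, that is, a jump from 0 to 1 (100 percentage points). A sketch, assuming the rounded coefficients printed in the summary above are copied by hand, rescales the effect to a one-percentage-point step and evaluates the fitted probability at a few rates. The rates 0.10, 0.20, and 0.30 and the loan amount of 5000 are arbitrary illustrative choices, not part of the original analysis.

```r
# Rounded estimates copied from summary(m2).
b0     <- -5.374      # intercept
b_amt  <- -1.163e-06  # Amount
b_rate <- 13.17       # Borrower.Rate

# Odds ratio for a 0.01 (one percentage point) increase in the rate.
or_per_point <- exp(b_rate * 0.01)
or_per_point

# Fitted probability of a bad loan for a loan of 5000 at three example rates.
# plogis() is the inverse logit, 1 / (1 + exp(-x)).
rates <- c(0.10, 0.20, 0.30)
p_bad <- plogis(b0 + b_amt * 5000 + b_rate * rates)
round(p_bad, 3)
```

Reporting the per-point odds ratio alongside the per-unit one keeps the size of the effect on the scale at which rates actually vary.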
plot(LoanData$Amount,LoanData$BadLoanType)

plot(LoanData$Borrower.Rate,LoanData$BadLoanType)

Conclusion

Of the 5,611 loans, 7.57% were late or in default. The plots for amount and borrower rate look very similar, and both suggest that the amount owed and the borrowing rate depend on the type of loan: bad loans, meaning those that were late or in default, occurred when large amounts were owed or borrowing rates were high. Credit grade was ultimately related to the borrowing rate and the amount of a particular loan, since the credit grade differed depending on their values.

The model for credit grade showed that Grades E, HR, and NC were significant, as their p-values of 1.16e-05, 1.77e-09, and 8.60e-08 were all ≤ 0.001. Exponentiating their coefficients gives odds ratios, relative to the baseline Grade A, of 4.093856, 6.687671, and 9.917667. These are large compared with the other grades, whose odds ratios were all under 2.

The model for borrower rate and amount showed that borrower rate was significant, as its p-value of <2e-16 was ≤ 0.001. Its odds ratio was very high at 525262.8, although this value describes a one-unit increase in the rate (from 0 to 1, or 100 percentage points), whereas the rates shown above differ by only fractions of that. By comparison, the odds ratio for amount was just under 1 (0.9999988) and amount was not significant (p = 0.932), so the model does not confirm the apparent effect of amount seen in the plot. Overall, the analysis suggests that borrower rate is related to credit grade and influenced Grades E, HR, and N