Loading the needed packages

library(tidyverse)
library(tidymodels)
library(janitor)
library(themis)
library(vip)
library(rpart.plot)

Loading the dataset

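# Read the German credit data and convert column names to snake_case
risk <- read_csv("~/Desktop/R/csv/german_credit.csv") %>% clean_names()
# Keep the first 1000 rows as the working data set
df <- risk %>% slice_head(n = 1000)
# Take the last 10 rows as "new" data and drop the target column.
# Note: these rows overlap with df unless the file has at least 1010 rows.
new_data <- risk %>% slice_tail(n = 10) %>% select(!creditability)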

EDA

Plotting the nominal (target) variable

Plotting the numerical (predictor) variables
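
A minimal sketch of both plots, assuming a data frame shaped like df with a factor target called creditability and a few numeric predictors. The toy data and the predictor names duration and credit_amount are made up for illustration and are not taken from german_credit.csv.

```r
library(tidyverse)

# Hypothetical stand-in for df (column names other than creditability are assumptions)
set.seed(1)
toy <- tibble(
  creditability = factor(sample(c("Yes", "No"), 200, replace = TRUE),
                         levels = c("Yes", "No")),
  duration      = rpois(200, 20),
  credit_amount = round(rlnorm(200, meanlog = 8, sdlog = 0.7))
)

# Bar chart of the nominal target: one bar per class
target_plot <- ggplot(toy, aes(x = creditability, fill = creditability)) +
  geom_bar(show.legend = FALSE) +
  labs(x = "Creditability", y = "Count")

# Histograms of the numeric predictors, one panel per variable
numeric_plot <- toy %>%
  pivot_longer(where(is.numeric), names_to = "variable", values_to = "value") %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ variable, scales = "free")

target_plot
numeric_plot
```

Free scales in facet_wrap() keep a wide-ranging variable such as an amount from flattening the panels of the smaller ones.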

MODELLING

1. Logistic Regression model

Predicting on new data

## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 No             0.154    0.846 
##  2 No             0.162    0.838 
##  3 Yes            0.955    0.0452
##  4 No             0.332    0.668 
##  5 No             0.0986   0.901 
##  6 Yes            0.608    0.392 
##  7 No             0.484    0.516 
##  8 Yes            0.926    0.0739
##  9 No             0.441    0.559 
## 10 Yes            0.646    0.354
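
A table with this shape (.pred_class, .pred_Yes, .pred_No) is what you get by binding class and probability predictions from a fitted tidymodels object. The sketch below assumes a workflow with a formula preprocessor and the glm engine; the fitting code is not shown in this document, so the toy data, the predictor names duration and amount, and the objects toy_train, toy_new, log_fit and log_pred are all hypothetical.

```r
library(tidymodels)

# Hypothetical toy data with a two-level target whose first level is "Yes"
set.seed(123)
n <- 300
toy <- tibble(
  duration = runif(n, 6, 60),
  amount   = runif(n, 500, 10000)
) %>%
  mutate(
    p_yes = plogis(2 - 0.06 * duration - 0.0001 * amount),
    creditability = factor(if_else(runif(n) < p_yes, "Yes", "No"),
                           levels = c("Yes", "No"))
  ) %>%
  select(-p_yes)

# Keep the rows used for prediction out of the training set
toy_train <- toy %>% slice_head(n = 290)
toy_new   <- toy %>% slice_tail(n = 10) %>% select(!creditability)

# Logistic regression via the glm engine (an assumed choice of engine)
log_fit <- workflow() %>%
  add_formula(creditability ~ .) %>%
  add_model(logistic_reg() %>% set_engine("glm")) %>%
  fit(data = toy_train)

# Class predictions plus one probability column per factor level
log_pred <- bind_cols(
  predict(log_fit, new_data = toy_new),
  predict(log_fit, new_data = toy_new, type = "prob")
)
log_pred
```

The probability columns follow the order of the factor levels, which is why .pred_Yes comes before .pred_No when "Yes" is the first level.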

2. Random Forest model

Predicting on new data

## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 Yes             0.527    0.473
##  2 No              0.405    0.595
##  3 Yes             0.596    0.404
##  4 Yes             0.617    0.383
##  5 No              0.270    0.730
##  6 No              0.487    0.513
##  7 Yes             0.606    0.394
##  8 Yes             0.643    0.357
##  9 Yes             0.616    0.384
## 10 Yes             0.560    0.440
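
The same prediction pattern works for a random forest once the model specification is swapped. This is a rough sketch, assuming the ranger engine (which needs the ranger package installed) and 200 trees; neither choice is confirmed by this document, and the toy data and object names are hypothetical.

```r
library(tidymodels)

# Hypothetical toy data with a two-level target whose first level is "Yes"
set.seed(123)
n <- 300
toy <- tibble(
  duration = runif(n, 6, 60),
  amount   = runif(n, 500, 10000)
) %>%
  mutate(
    p_yes = plogis(2 - 0.06 * duration - 0.0001 * amount),
    creditability = factor(if_else(runif(n) < p_yes, "Yes", "No"),
                           levels = c("Yes", "No"))
  ) %>%
  select(-p_yes)

toy_train <- toy %>% slice_head(n = 290)
toy_new   <- toy %>% slice_tail(n = 10) %>% select(!creditability)

# Random forest specification; engine and number of trees are assumptions
rf_spec <- rand_forest(trees = 200) %>%
  set_engine("ranger") %>%
  set_mode("classification")

# Seed again so the forest itself is reproducible
set.seed(456)
rf_fit <- rf_spec %>% fit(creditability ~ ., data = toy_train)

# Class predictions plus per-class probabilities
rf_pred <- bind_cols(
  predict(rf_fit, new_data = toy_new),
  predict(rf_fit, new_data = toy_new, type = "prob")
)
rf_pred
```

Forest probabilities are averages over many trees, so they tend to sit closer to 0.5 than those of a single tree or a logistic regression.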

3. Decision Trees model

Predicting on new data

## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 No              0.35     0.65 
##  2 No              0.12     0.88 
##  3 No              0.444    0.556
##  4 Yes             0.786    0.214
##  5 No              0.12     0.88 
##  6 No              0.35     0.65 
##  7 Yes             1        0    
##  8 Yes             0.847    0.153
##  9 No              0.136    0.864
## 10 No              0.444    0.556
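
Several rows above share exactly the same probabilities (0.35, 0.12, 0.444). That is expected from a single tree: every observation that lands in the same leaf receives that leaf's class proportions. A minimal sketch of the effect, assuming the rpart engine; the toy data and object names are hypothetical.

```r
library(tidymodels)

# Hypothetical toy data with a two-level target whose first level is "Yes"
set.seed(123)
n <- 300
toy <- tibble(
  duration = runif(n, 6, 60),
  amount   = runif(n, 500, 10000)
) %>%
  mutate(
    p_yes = plogis(2 - 0.06 * duration - 0.0001 * amount),
    creditability = factor(if_else(runif(n) < p_yes, "Yes", "No"),
                           levels = c("Yes", "No"))
  ) %>%
  select(-p_yes)

toy_train <- toy %>% slice_head(n = 290)
toy_new   <- toy %>% slice_tail(n = 10) %>% select(!creditability)

# Single classification tree; the rpart engine is an assumption
tree_fit <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification") %>%
  fit(creditability ~ ., data = toy_train)

tree_pred <- bind_cols(
  predict(tree_fit, new_data = toy_new),
  predict(tree_fit, new_data = toy_new, type = "prob")
)

# Count the leaves of the underlying rpart tree
n_leaves <- sum(extract_fit_engine(tree_fit)$frame$var == "<leaf>")

# There can never be more distinct probabilities than leaves
n_distinct(tree_pred$.pred_Yes) <= n_leaves
tree_pred
```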

Credit risk analysis using ML models!

NickD - email: nickydyakov@gmail.com

2022-11-09

Loading the needed packages

Loading the dataset

EDA

Plotting the nominal (target) variable

Plotting the numerical (predictor) variables

MODELLING

1. Logistic Regression model

Predicting on new data

2. Random Forest model

Predicting on new data

3. Decision Trees model

Predicting on new data

4. Comparing results

5. Confusion matrices

6. ROC curves

7. Variable Importance

8. Plotting the Decision Tree