Loading the needed packages
library(tidyverse)
library(tidymodels)
library(janitor)
library(themis)
library(vip)
library(rpart.plot)
Loading the dataset
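# Read the German credit data and convert the column names to snake_case
risk <- read_csv("~/Desktop/R/csv/german_credit.csv") %>% clean_names()
# Keep the first 1,000 rows for exploration and modelling
df <- risk %>% slice_head(n = 1000)
# Take the last 10 rows, minus the target, as "new" data (these overlap df if the file has only 1,000 rows)
new_data <- risk %>% slice_tail(n = 10) %>% select(!creditability)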
EDA
Plotting the nominal (target) variable
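The chunk that produced this plot is not shown, so the following is only a minimal sketch of one way to chart the target. It assumes `df` from the loading step; the object name `p_target` and the axis labels are illustrative choices, not taken from the original.

```r
library(tidyverse)

# Assumption: `df` exists as created in the loading step.
# Converting to a factor works whether the target is coded 1/0 or Yes/No.
p_target <- df %>%
  mutate(creditability = factor(creditability)) %>%
  count(creditability) %>%
  ggplot(aes(x = creditability, y = n, fill = creditability)) +
  geom_col(show.legend = FALSE) +
  labs(x = "Creditability", y = "Number of applicants")

p_target
```

A bar chart of counts makes any class imbalance in the target visible before modelling.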

Plotting the numerical (predictor) variables
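As a hedged sketch (the original plotting code is not available), the numeric predictors could be drawn as faceted histograms. The column selection is generic because only `creditability` is named in the source; `p_numeric` and the bin count are hypothetical choices.

```r
library(tidyverse)

# Assumption: `df` exists as created in the loading step.
# Select every numeric column except the target, then reshape to long form
# so that each variable gets its own facet.
p_numeric <- df %>%
  select(where(is.numeric), -any_of("creditability")) %>%
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ variable, scales = "free") +
  labs(x = NULL, y = "Count")

p_numeric
```

Free scales are used because the predictors are likely to be measured on very different ranges.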

MODELLING
1. Logistic Regression model
Predicting on new data
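The fitting code is not shown in this report, so here is a minimal sketch of how predictions in the format below might be produced. It assumes `creditability` is coded 1/0 and recodes it to a Yes/No factor to match the `.pred_Yes` and `.pred_No` columns; the recipe steps and the names `model_df`, `log_rec`, `log_spec`, `log_fit` and `log_preds` are illustrative. Any resampling the original may have done with `themis` is not reproduced, so the numbers will not necessarily match.

```r
library(tidymodels)

# Assumption: `df` and `new_data` exist as created in the loading step,
# and the target is coded 1 (good credit) / 0 (bad credit).
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

# Hypothetical preprocessing: drop constant columns, then centre and scale
log_rec <- recipe(creditability ~ ., data = model_df) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric_predictors())

log_spec <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

log_fit <- workflow() %>%
  add_recipe(log_rec) %>%
  add_model(log_spec) %>%
  fit(data = model_df)

# Hard class predictions plus class probabilities for the 10 held-back rows
log_preds <- bind_cols(
  predict(log_fit, new_data = new_data),
  predict(log_fit, new_data = new_data, type = "prob")
)

log_preds
```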
## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 No             0.154    0.846
##  2 No             0.162    0.838
##  3 Yes            0.955    0.0452
##  4 No             0.332    0.668
##  5 No             0.0986   0.901
##  6 Yes            0.608    0.392
##  7 No             0.484    0.516
##  8 Yes            0.926    0.0739
##  9 No             0.441    0.559
## 10 Yes            0.646    0.354
2. Random Forest model
Predicting on new data
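A comparable sketch for the random forest follows. The `ranger` engine, the number of trees and the seed are assumptions (the report does not say which engine or settings were used), as are the names `rf_spec`, `rf_fit` and `rf_preds`.

```r
library(tidymodels)

# Assumption: `df` and `new_data` exist as created in the loading step,
# and the target is coded 1/0; it is recoded to Yes/No here.
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

# Assumed engine and hyperparameters; requires the ranger package
rf_spec <- rand_forest(trees = 500) %>%
  set_engine("ranger") %>%
  set_mode("classification")

set.seed(123)  # random forests are stochastic, so fix the seed
rf_fit <- workflow() %>%
  add_formula(creditability ~ .) %>%
  add_model(rf_spec) %>%
  fit(data = model_df)

rf_preds <- bind_cols(
  predict(rf_fit, new_data = new_data),
  predict(rf_fit, new_data = new_data, type = "prob")
)

rf_preds
```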
## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 Yes            0.527    0.473
##  2 No             0.405    0.595
##  3 Yes            0.596    0.404
##  4 Yes            0.617    0.383
##  5 No             0.270    0.730
##  6 No             0.487    0.513
##  7 Yes            0.606    0.394
##  8 Yes            0.643    0.357
##  9 Yes            0.616    0.384
## 10 Yes            0.560    0.440
3. Decision Trees model
Predicting on new data
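For the decision tree, one possible sketch uses the `rpart` engine, which seems plausible given that `rpart.plot` is loaded at the top, though the original settings are unknown. Default tree hyperparameters are assumed, and `tree_spec`, `tree_fit` and `tree_preds` are illustrative names.

```r
library(tidymodels)

# Assumption: `df` and `new_data` exist as created in the loading step,
# and the target is coded 1/0; it is recoded to Yes/No here.
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

# Default rpart settings; the original depth and complexity are not known
tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")

tree_fit <- workflow() %>%
  add_formula(creditability ~ .) %>%
  add_model(tree_spec) %>%
  fit(data = model_df)

tree_preds <- bind_cols(
  predict(tree_fit, new_data = new_data),
  predict(tree_fit, new_data = new_data, type = "prob")
)

tree_preds
```

A single tree assigns every row in the same leaf the same probability, which is why repeated values such as 0.35 and 0.12 appear in output of this kind.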
## # A tibble: 10 × 3
##    .pred_class .pred_Yes .pred_No
##    <fct>           <dbl>    <dbl>
##  1 No             0.35     0.65
##  2 No             0.12     0.88
##  3 No             0.444    0.556
##  4 Yes            0.786    0.214
##  5 No             0.12     0.88
##  6 No             0.35     0.65
##  7 Yes            1        0
##  8 Yes            0.847    0.153
##  9 No             0.136    0.864
## 10 No             0.444    0.556
4. Comparing results
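The comparison itself has not survived in this report. One hedged way to compare the three model types is to hold out a test set and compute the same metrics for each. The 80/20 split, the seed, the engines and the names `specs` and `comparison` are all assumptions made for illustration, not the original procedure.

```r
library(tidymodels)

# Assumption: `df` exists as created in the loading step, target coded 1/0.
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

set.seed(123)
split <- initial_split(model_df, prop = 0.8, strata = creditability)
train <- training(split)
test  <- testing(split)

# Assumed engines and default hyperparameters
specs <- list(
  logistic_regression = logistic_reg() %>% set_engine("glm"),
  random_forest = rand_forest(trees = 500) %>%
    set_engine("ranger") %>% set_mode("classification"),
  decision_tree = decision_tree() %>%
    set_engine("rpart") %>% set_mode("classification")
)

# Fit each model on the training set and score it on the test set
comparison <- imap(specs, function(spec, name) {
  fitted <- workflow() %>%
    add_formula(creditability ~ .) %>%
    add_model(spec) %>%
    fit(data = train)

  augment(fitted, new_data = test) %>%
    metrics(truth = creditability, estimate = .pred_class, .pred_Yes) %>%
    mutate(model = name)
}) %>%
  bind_rows()

# One row per model, one column per metric
comparison %>%
  select(model, .metric, .estimate) %>%
  pivot_wider(names_from = .metric, values_from = .estimate)
```

Scoring every model on the same held-out rows keeps the comparison fair; cross-validation with `vfold_cv()` would give more stable estimates at the cost of extra run time.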


7. Variable Importance
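The importance plot is missing here, and the report does not say which model it was based on. As a sketch only, the `vip` package loaded at the top can plot impurity-based importance from a random forest. The `ranger` engine, its `importance = "impurity"` setting and the name `rf_imp_fit` are assumptions.

```r
library(tidymodels)
library(vip)

# Assumption: `df` exists as created in the loading step, target coded 1/0.
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

set.seed(123)
# ranger only stores importance scores when asked to at fit time
rf_imp_fit <- rand_forest(trees = 500) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("classification") %>%
  fit(creditability ~ ., data = model_df)

# Plot the ten most important predictors from the underlying ranger object
p_vip <- rf_imp_fit %>%
  extract_fit_engine() %>%
  vip(num_features = 10)

p_vip
```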


8. Plotting the Decision Tree
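The tree diagram has not survived either. A minimal sketch, assuming an `rpart` tree with default settings (the original tree may have been tuned differently), pulls the underlying engine object out of the parsnip fit and passes it to `rpart.plot()`. The name `tree_plot_fit` is illustrative.

```r
library(tidymodels)
library(rpart.plot)

# Assumption: `df` exists as created in the loading step, target coded 1/0.
model_df <- df %>%
  mutate(creditability = factor(if_else(creditability == 1, "Yes", "No"),
                                levels = c("Yes", "No")))

tree_plot_fit <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification") %>%
  fit(creditability ~ ., data = model_df)

# rpart.plot() needs the raw rpart object, not the parsnip wrapper;
# roundint = FALSE avoids a warning about missing model data
tree_plot_fit %>%
  extract_fit_engine() %>%
  rpart.plot(roundint = FALSE)
```

Each node in the resulting diagram shows the predicted class, the class probability and the share of observations that reach it.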
