Clear the environment

rm(list = ls(all=TRUE))

Goal

  • Based on various financial ratios, predict whether a company will go bankrupt in the subsequent years.

Agenda

  • Get the data

  • Data Pre-processing

  • Build a model

  • Predictions

  • Communication

Libraries used

library(ROSE)
## Loaded ROSE 0.0-3
library(corrplot)
## corrplot 0.84 loaded
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
library(C50)
library(rpart)
library(rpart.plot)
library(DMwR)
## Loading required package: grid
library(class)
library(mice)
## 
## Attaching package: 'mice'
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
library(vegan)
## Loading required package: permute
## This is vegan 2.5-2
## 
## Attaching package: 'vegan'
## The following object is masked from 'package:caret':
## 
##     tolerance
library(randomForest)
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
## 
##     margin
library(inTrees)
library(e1071)

Reading & Understanding the Data

Read the Data

setwd("C:/Users/brbhatta/Desktop/INSOFE/Cute3")
bank_data <- read.csv("train.csv")

Understand the data

  • Use the str(), summary(), head() and tail() functions to get the dimensions and types of attributes in the dataset

  • The dataset has 36553 observations and 66 variables (64 financial ratios plus the ID and target columns)

str(bank_data)
## 'data.frame':    36553 obs. of  66 variables:
##  $ ID    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Attr1 : num  0.13537 0.00586 0.1106 0.06391 0.13168 ...
##  $ Attr2 : num  0.452 0.399 0.161 1.407 0.66 ...
##  $ Attr3 : num  0.312 0.198 0.479 -0.296 0.441 ...
##  $ Attr4 : num  2.047 1.939 7.571 0.529 2.71 ...
##  $ Attr5 : num  10.23 9.58 263.9 -46.29 -23.6 ...
##  $ Attr6 : num  0.168 0 0 -0.714 -0.2 ...
##  $ Attr7 : num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr8 : num  1.213 1.509 5.205 -0.331 0.514 ...
##  $ Attr9 : num  2.255 0.979 0.684 0.985 2.136 ...
##  $ Attr10: num  0.548 0.601 0.839 -0.465 0.34 ...
##  $ Attr11: num  0.1833 0.0295 0.1388 0.0791 0.1861 ...
##  $ Attr12: num  0.5632 0.0344 1.8983 0.1258 0.5112 ...
##  $ Attr13: num  0.0892 0.0364 0.2362 0.0303 0.1671 ...
##  $ Attr14: num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr15: num  820 4088 364 5433 675 ...
##  $ Attr16: num  0.4453 0.0893 1.0032 0.0672 0.5405 ...
##  $ Attr17: num  2.213 2.509 6.205 0.711 1.514 ...
##  $ Attr18: num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr19: num  0.07433 0.00739 0.20214 0.02532 0.06176 ...
##  $ Attr20: num  40.2 58.7 38.6 12.5 80.3 ...
##  $ Attr21: num  NA 0.979 1.835 0.449 1.06 ...
##  $ Attr22: num  0.18115 0.00602 0.11669 0.13787 0.18272 ...
##  $ Attr23: num  0.06002 0.00599 0.16159 0.02047 0.06166 ...
##  $ Attr24: num  0.1676 0.0578 0.2354 -0.6893 0.2086 ...
##  $ Attr25: num  0.5115 0.5474 0.8349 -0.4655 0.0482 ...
##  $ Attr26: num  0.3739 0.0858 0.8309 0.0564 0.5401 ...
##  $ Attr27: num  11.554 0.271 249.3 0.435 3.371 ...
##  $ Attr28: num  0.798 0.334 1.069 -0.443 1.467 ...
##  $ Attr29: num  3.58 5.64 4.1 4.66 2.23 ...
##  $ Attr30: num  0.156 0.407 -0.163 0.45 0.306 ...
##  $ Attr31: num  0.07999 0.00739 0.20283 0.02532 0.07901 ...
##  $ Attr32: num  52.3 107.5 46.3 72.3 49.6 ...
##  $ Attr33: num  6.98 3.69 7.89 5.04 7.36 ...
##  $ Attr34: num  4.598 1.95 3.568 0.098 2.878 ...
##  $ Attr35: num  0.178 0.264 0.109 0.138 0.235 ...
##  $ Attr36: num  2.255 0.979 0.684 3.501 2.136 ...
##  $ Attr37: num  3.59 1.33 5.58 0.29 0.57 ...
##  $ Attr38: num  0.649 0.789 0.925 0.313 0.742 ...
##  $ Attr39: num  0.0787 0.2697 0.1599 0.0442 0.1102 ...
##  $ Attr40: num  0.35559 0.1049 3.7693 0.00798 0.34263 ...
##  $ Attr41: num  0.0701 0.3865 0.0384 0.3017 0.054 ...
##  $ Attr42: num  0.08032 0.00615 0.17048 0.04415 0.08556 ...
##  $ Attr43: num  81.5 144 147.8 38.3 104.4 ...
##  $ Attr44: num  41.3 85.3 109.1 25.8 24.1 ...
##  $ Attr45: num  0.5455 0.0373 1.527 0.599 0.2803 ...
##  $ Attr46: num  1.213 1.192 6.577 0.359 0.889 ...
##  $ Attr47: num  43.6 73.9 46 12.3 90.2 ...
##  $ Attr48: num  0.1475 -0.0223 0.0934 0.1224 -0.0422 ...
##  $ Attr49: num  0.0654 -0.0228 0.1364 0.0392 -0.0198 ...
##  $ Attr50: num  1.348 1.024 3.424 0.236 1.059 ...
##  $ Attr51: num  0.2977 0.2105 0.0729 0.6285 0.2581 ...
##  $ Attr52: num  0.143 0.271 0.127 0.198 0.136 ...
##  $ Attr53: num  1.403 1.016 1.872 -0.697 1.129 ...
##  $ Attr54: num  1.661 1.334 2.063 0.469 2.467 ...
##  $ Attr55: num  1189.7 1.94 6092.3 -13606 75.68 ...
##  $ Attr56: num  0.0787 0.2697 0.1599 -0.0153 0.1102 ...
##  $ Attr57: num  0.247 0 0.132 -0.137 0.388 ...
##  $ Attr58: num  0.926 0.793 0.807 1.015 0.939 ...
##  $ Attr59: num  0.184 0.313 0.102 -1.673 1.184 ...
##  $ Attr60: num  9.09 6.22 9.45 29.27 4.55 ...
##  $ Attr61: num  8.83 4.28 3.34 14.14 15.14 ...
##  $ Attr62: num  48.2 78.5 38.9 73.5 44.1 ...
##  $ Attr63: num  7.58 4.65 9.39 4.97 8.28 ...
##  $ Attr64: num  5.77 1.65 1.53 4.68 7.1 ...
##  $ target: int  0 0 0 0 0 0 0 0 0 0 ...
summary(bank_data)
##        ID            Attr1                Attr2          
##  Min.   :    1   Min.   :-256.89000   Min.   :-430.8700  
##  1st Qu.: 9139   1st Qu.:   0.00334   1st Qu.:   0.2696  
##  Median :18277   Median :   0.04960   Median :   0.4726  
##  Mean   :18277   Mean   :   0.05146   Mean   :   0.5777  
##  3rd Qu.:27415   3rd Qu.:   0.13008   3rd Qu.:   0.6892  
##  Max.   :36553   Max.   :  94.28000   Max.   : 480.9600  
##                  NA's   :8            NA's   :8          
##      Attr3               Attr4              Attr5          
##  Min.   :-479.9600   Min.   :   -0.05   Min.   :-11903000  
##  1st Qu.:   0.0221   1st Qu.:    1.05   1st Qu.:      -49  
##  Median :   0.1975   Median :    1.57   Median :       -1  
##  Mean   :   0.1274   Mean   :    6.84   Mean   :     -482  
##  3rd Qu.:   0.4048   3rd Qu.:    2.79   3rd Qu.:       51  
##  Max.   :  28.3360   Max.   :53433.00   Max.   :  1034100  
##  NA's   :8           NA's   :111        NA's   :76         
##      Attr6               Attr7               Attr8         
##  Min.   :-508.4100   Min.   :-517.4800   Min.   : -141.41  
##  1st Qu.:   0.0000   1st Qu.:   0.0057   1st Qu.:    0.43  
##  Median :   0.0000   Median :   0.0595   Median :    1.07  
##  Mean   :  -0.0309   Mean   :   0.1151   Mean   :   12.91  
##  3rd Qu.:   0.0869   3rd Qu.:   0.1517   3rd Qu.:    2.61  
##  Max.   : 543.2500   Max.   : 649.2300   Max.   :53432.00  
##  NA's   :8           NA's   :8           NA's   :79        
##      Attr9              Attr10              Attr11         
##  Min.   :  -3.496   Min.   :-479.9100   Min.   :-189.4500  
##  1st Qu.:   1.019   1st Qu.:   0.2948   1st Qu.:   0.0153  
##  Median :   1.200   Median :   0.5056   Median :   0.0754  
##  Mean   :   2.675   Mean   :   0.6496   Mean   :   0.1549  
##  3rd Qu.:   2.063   3rd Qu.:   0.7084   3rd Qu.:   0.1678  
##  Max.   :9742.300   Max.   :1099.5000   Max.   : 681.5400  
##  NA's   :7          NA's   :8           NA's   :37         
##      Attr12              Attr13              Attr14         
##  Min.   :-6331.800   Min.   :-1317.600   Min.   :-517.4800  
##  1st Qu.:    0.015   1st Qu.:    0.024   1st Qu.:   0.0057  
##  Median :    0.171   Median :    0.068   Median :   0.0596  
##  Mean   :    1.182   Mean   :    0.956   Mean   :   0.1152  
##  3rd Qu.:    0.586   3rd Qu.:    0.135   3rd Qu.:   0.1517  
##  Max.   : 8259.400   Max.   :13315.000   Max.   : 649.2300  
##  NA's   :111         NA's   :113         NA's   :8          
##      Attr15             Attr16              Attr17        
##  Min.   :-5611900   Min.   :-6331.800   Min.   :   -0.41  
##  1st Qu.:     220   1st Qu.:    0.073   1st Qu.:    1.45  
##  Median :     845   Median :    0.245   Median :    2.11  
##  Mean   :    2405   Mean   :    1.519   Mean   :   14.08  
##  3rd Qu.:    2222   3rd Qu.:    0.665   3rd Qu.:    3.69  
##  Max.   :10236000   Max.   : 8259.400   Max.   :53433.00  
##  NA's   :28         NA's   :80          NA's   :79        
##      Attr18              Attr19              Attr20       
##  Min.   :-517.4800   Min.   :-1325.600   Min.   :    -29  
##  1st Qu.:   0.0057   1st Qu.:    0.004   1st Qu.:     15  
##  Median :   0.0596   Median :    0.036   Median :     35  
##  Mean   :   0.1212   Mean   :    0.192   Mean   :    278  
##  3rd Qu.:   0.1517   3rd Qu.:    0.091   3rd Qu.:     64  
##  Max.   : 649.2300   Max.   : 9230.500   Max.   :7809200  
##  NA's   :8           NA's   :114         NA's   :113      
##      Attr21              Attr22              Attr23         
##  Min.   :-1325.000   Min.   :-216.8000   Min.   :-1325.600  
##  1st Qu.:    0.908   1st Qu.:   0.0000   1st Qu.:    0.002  
##  Median :    1.045   Median :   0.0622   Median :    0.030  
##  Mean   :    4.035   Mean   :   0.1356   Mean   :    0.179  
##  3rd Qu.:    1.204   3rd Qu.:   0.1507   3rd Qu.:    0.078  
##  Max.   :29907.000   Max.   : 681.5400   Max.   : 9230.500  
##  NA's   :4983        NA's   :8           NA's   :113        
##      Attr24              Attr25              Attr26         
##  Min.   :-314.3700   Min.   :-500.9300   Min.   :-6331.800  
##  1st Qu.:   0.0212   1st Qu.:   0.1478   1st Qu.:    0.066  
##  Median :   0.1550   Median :   0.3840   Median :    0.221  
##  Mean   :   0.3000   Mean   :   0.3875   Mean   :    1.360  
##  3rd Qu.:   0.3559   3rd Qu.:   0.6098   3rd Qu.:    0.598  
##  Max.   : 831.6600   Max.   :1353.3000   Max.   : 8262.300  
##  NA's   :794         NA's   :8           NA's   :80         
##      Attr27            Attr28              Attr29       
##  Min.   :-259010   Min.   :-3829.900   Min.   :-0.8861  
##  1st Qu.:      0   1st Qu.:    0.040   1st Qu.: 3.4859  
##  Median :      1   Median :    0.468   Median : 4.0033  
##  Mean   :   1188   Mean   :    5.640   Mean   : 3.9963  
##  3rd Qu.:      5   3rd Qu.:    1.511   3rd Qu.: 4.5125  
##  Max.   :4208800   Max.   :21701.000   Max.   : 9.6983  
##  NA's   :2343      NA's   :694         NA's   :8        
##      Attr30              Attr31              Attr32        
##  Min.   : -4940.00   Min.   :-1325.600   Min.   :   -9296  
##  1st Qu.:     0.08   1st Qu.:    0.007   1st Qu.:      46  
##  Median :     0.22   Median :    0.043   Median :      78  
##  Mean   :     8.78   Mean   :    0.215   Mean   :     926  
##  3rd Qu.:     0.41   3rd Qu.:    0.102   3rd Qu.:     128  
##  Max.   :152860.00   Max.   : 9244.300   Max.   :17364000  
##  NA's   :113         NA's   :113         NA's   :312       
##      Attr33              Attr34              Attr35         
##  Min.   :   -7.235   Min.   :-1696.000   Min.   :-169.4700  
##  1st Qu.:    2.821   1st Qu.:    0.312   1st Qu.:   0.0058  
##  Median :    4.621   Median :    1.978   Median :   0.0606  
##  Mean   :    8.743   Mean   :    5.521   Mean   :   0.1286  
##  3rd Qu.:    7.816   3rd Qu.:    4.561   3rd Qu.:   0.1509  
##  Max.   :21944.000   Max.   :21944.000   Max.   : 626.9200  
##  NA's   :111         NA's   :79          NA's   :8          
##      Attr36             Attr37             Attr38         
##  Min.   :  -0.001   Min.   :  -525.5   Min.   :-479.9100  
##  1st Qu.:   1.104   1st Qu.:     1.1   1st Qu.:   0.4192  
##  Median :   1.646   Median :     3.1   Median :   0.6130  
##  Mean   :   2.907   Mean   :   110.8   Mean   :   0.7483  
##  3rd Qu.:   2.425   3rd Qu.:    11.5   3rd Qu.:   0.7722  
##  Max.   :9742.300   Max.   :398920.0   Max.   :1099.5000  
##  NA's   :8          NA's   :16093      NA's   :8          
##      Attr39              Attr40             Attr41         
##  Min.   :-7522.000   Min.   :-101.270   Min.   : -1234.40  
##  1st Qu.:    0.004   1st Qu.:   0.052   1st Qu.:     0.03  
##  Median :    0.037   Median :   0.176   Median :     0.09  
##  Mean   :   -0.328   Mean   :   2.268   Mean   :     8.96  
##  3rd Qu.:    0.092   3rd Qu.:   0.653   3rd Qu.:     0.20  
##  Max.   : 2156.500   Max.   :8007.100   Max.   :288770.00  
##  NA's   :113         NA's   :111        NA's   :650        
##      Attr42               Attr43             Attr44        
##  Min.   :-1395.8000   Min.   : -115870   Min.   : -115870  
##  1st Qu.:    0.0000   1st Qu.:      67   1st Qu.:      35  
##  Median :    0.0379   Median :     100   Median :      55  
##  Mean   :   -0.1348   Mean   :    1153   Mean   :     875  
##  3rd Qu.:    0.0921   3rd Qu.:     141   3rd Qu.:      80  
##  Max.   : 2156.8000   Max.   :30393000   Max.   :22584000  
##  NA's   :113          NA's   :113        NA's   :113       
##      Attr45              Attr46             Attr47       
##  Min.   :-256230.0   Min.   : -101.26   Min.   :    -53  
##  1st Qu.:      0.0   1st Qu.:    0.61   1st Qu.:     16  
##  Median :      0.3   Median :    1.03   Median :     38  
##  Mean   :     16.4   Mean   :    5.94   Mean   :    281  
##  3rd Qu.:      1.0   3rd Qu.:    1.92   3rd Qu.:     71  
##  Max.   : 366030.0   Max.   :53433.00   Max.   :6084200  
##  NA's   :1829        NA's   :112        NA's   :252      
##      Attr48              Attr49              Attr50        
##  Min.   :-218.4200   Min.   :-9001.000   Min.   :   -0.05  
##  1st Qu.:  -0.0387   1st Qu.:   -0.027   1st Qu.:    0.78  
##  Median :   0.0184   Median :    0.011   Median :    1.22  
##  Mean   :   0.0507   Mean   :   -0.527   Mean   :    6.44  
##  3rd Qu.:   0.1082   3rd Qu.:    0.062   3rd Qu.:    2.21  
##  Max.   : 623.8500   Max.   :  107.680   Max.   :53433.00  
##  NA's   :9           NA's   :113         NA's   :79        
##      Attr51             Attr52             Attr53         
##  Min.   :  0.0000   Min.   :  -25.47   Min.   : -3828.90  
##  1st Qu.:  0.1899   1st Qu.:    0.13   1st Qu.:     0.69  
##  Median :  0.3405   Median :    0.21   Median :     1.21  
##  Mean   :  0.4719   Mean   :    4.63   Mean   :    26.45  
##  3rd Qu.:  0.5356   3rd Qu.:    0.35   3rd Qu.:     2.24  
##  Max.   :480.9600   Max.   :88433.00   Max.   :180440.00  
##  NA's   :8          NA's   :256        NA's   :694        
##      Attr54              Attr55             Attr56          
##  Min.   : -3828.90   Min.   :-1805200   Min.   :-1108300.0  
##  1st Qu.:     0.96   1st Qu.:      30   1st Qu.:       0.0  
##  Median :     1.38   Median :    1078   Median :       0.1  
##  Mean   :    27.18   Mean   :    7870   Mean   :     -31.1  
##  3rd Qu.:     2.39   3rd Qu.:    4945   3rd Qu.:       0.1  
##  Max.   :180440.00   Max.   : 6123700   Max.   :     112.0  
##  NA's   :694                            NA's   :113         
##      Attr57               Attr58              Attr59         
##  Min.   :-1667.3000   Min.   :   -198.7   Min.   : -327.970  
##  1st Qu.:    0.0147   1st Qu.:      0.9   1st Qu.:    0.000  
##  Median :    0.1205   Median :      1.0   Median :    0.006  
##  Mean   :   -0.0179   Mean   :     35.0   Mean   :    1.481  
##  3rd Qu.:    0.2863   3rd Qu.:      1.0   3rd Qu.:    0.235  
##  Max.   :  552.6400   Max.   :1108300.0   Max.   :23853.000  
##  NA's   :7            NA's   :76          NA's   :7          
##      Attr60            Attr61              Attr62        
##  Min.   :    -12   Min.   :   -12.66   Min.   :  -14965  
##  1st Qu.:      6   1st Qu.:     4.51   1st Qu.:      42  
##  Median :     10   Median :     6.63   Median :      71  
##  Mean   :    482   Mean   :    17.68   Mean   :    1784  
##  3rd Qu.:     20   3rd Qu.:    10.38   3rd Qu.:     117  
##  Max.   :4818700   Max.   :108000.00   Max.   :25016000  
##  NA's   :1833      NA's   :86          NA's   :113       
##      Attr63              Attr64              target       
##  Min.   :   -0.368   Min.   :    -3.73   Min.   :0.00000  
##  1st Qu.:    3.099   1st Qu.:     2.19   1st Qu.:0.00000  
##  Median :    5.079   Median :     4.31   Median :0.00000  
##  Mean   :    9.417   Mean   :    77.35   Mean   :0.04829  
##  3rd Qu.:    8.607   3rd Qu.:     9.79   3rd Qu.:0.00000  
##  Max.   :23454.000   Max.   :294770.00   Max.   :1.00000  
##  NA's   :111         NA's   :694
head(bank_data)
##   ID     Attr1    Attr2    Attr3   Attr4    Attr5    Attr6     Attr7
## 1  1 0.1353700 0.451850  0.31162  2.0469  10.2340  0.16768 0.1676300
## 2  2 0.0058613 0.398580  0.19768  1.9390   9.5771  0.00000 0.0072373
## 3  3 0.1106000 0.161170  0.47894  7.5711 263.9000  0.00000 0.1383600
## 4  4 0.0639110 1.407300 -0.29595  0.5291 -46.2870 -0.71420 0.0790710
## 5  5 0.1316800 0.660310  0.44121  2.7098 -23.5960 -0.20007 0.1319100
## 6  6 0.2541100 0.022149  0.69694 33.2270  86.6400  0.00000 0.2541100
##      Attr8   Attr9   Attr10   Attr11   Attr12   Attr13    Attr14   Attr15
## 1  1.21310 2.25540  0.54815 0.183310  0.56316 0.089220 0.1676300  819.600
## 2  1.50890 0.97880  0.60142 0.029484  0.03438 0.036362 0.0072373 4087.600
## 3  5.20450 0.68447  0.83883 0.138830  1.89830 0.236220 0.1383600  363.850
## 4 -0.33076 0.98490 -0.46548 0.079071  0.12581 0.030274 0.0790710 5433.400
## 5  0.51445 2.13570  0.33969 0.186110  0.51117 0.167100 0.1319100  675.350
## 6 44.14900 1.97160  0.97785 0.264670 11.75000 0.150150 0.2541100   27.309
##      Attr16   Attr17    Attr18    Attr19 Attr20  Attr21    Attr22
## 1  0.445340  2.21310 0.1676300 0.0743250 40.156      NA 0.1811500
## 2  0.089295  2.50890 0.0072373 0.0073941 58.670 0.97850 0.0060182
## 3  1.003200  6.20450 0.1383600 0.2021400 38.625 1.83520 0.1166900
## 4  0.067177  0.71058 0.0790710 0.0253210 12.470 0.44909 0.1378700
## 5  0.540460  1.51440 0.1319100 0.0617630 80.288 1.06000 0.1827200
## 6 13.366000 45.14900 0.2541100 0.1288800 53.382 1.33860 0.2590200
##      Attr23    Attr24    Attr25    Attr26    Attr27   Attr28 Attr29
## 1 0.0600190  0.167630  0.511480  0.373930  11.55400  0.79756 3.5818
## 2 0.0059882  0.057817  0.547400  0.085842   0.27052  0.33402 5.6431
## 3 0.1615900  0.235380  0.834900  0.830940 249.30000  1.06860 4.1045
## 4 0.0204660 -0.689290 -0.465480  0.056405   0.43483 -0.44339 4.6625
## 5 0.0616560  0.208580  0.048201  0.540110   3.37080  1.46710 2.2343
## 6 0.1288800  0.603570  0.835400 13.366000  24.51900  2.47640 3.6387
##      Attr30    Attr31   Attr32  Attr33    Attr34  Attr35  Attr36  Attr37
## 1  0.155790 0.0799860  52.2890  6.9805  4.598500 0.17756 2.25540 3.58820
## 2  0.407210 0.0073941 107.4900  3.6924  1.950100 0.26400 0.97880 1.33380
## 3 -0.162880 0.2028300  46.2630  7.8896  3.567900 0.10943 0.68447 5.58440
## 4  0.450030 0.0253210  72.3500  5.0450  0.097967 0.13787 3.50080 0.28998
## 5  0.305820 0.0790120  49.5620  7.3646  2.878100 0.23529 2.13570 0.57046
## 6 -0.019578 0.1289300   4.5636 79.9800 78.090000 0.24202 1.97160      NA
##    Attr38   Attr39    Attr40    Attr41    Attr42  Attr43  Attr44   Attr45
## 1 0.64880 0.078728 0.3555900 0.0701390 0.0803160  81.473  41.317 0.545540
## 2 0.78949 0.269720 0.1049000 0.3865400 0.0061486 143.980  85.309 0.037254
## 3 0.92467 0.159870 3.7693000 0.0383720 0.1704800 147.760 109.140 1.527000
## 4 0.31335 0.044150 0.0079827 0.3017400 0.0441500  38.281  25.811 0.599050
## 5 0.74195 0.110170 0.3426300 0.0539890 0.0855550 104.400  24.108 0.280300
## 6 0.97785 0.122750 2.9208000 0.0024532 0.1313700 121.330  67.949 0.881220
##     Attr46 Attr47    Attr48    Attr49   Attr50   Attr51   Attr52   Attr53
## 1  1.21330 43.588  0.147550  0.065421  1.34840 0.297670 0.143260  1.40290
## 2  1.19170 73.881 -0.022335 -0.022819  1.02410 0.210510 0.270830  1.01620
## 3  6.57730 45.975  0.093362  0.136400  3.42380 0.072886 0.126750  1.87160
## 4  0.35935 12.282  0.122400  0.039196  0.23629 0.628480 0.198220 -0.69738
## 5  0.88928 90.228 -0.042243 -0.019780  1.05900 0.258050 0.135790  1.12950
## 6 19.89300 60.852  0.217090  0.110110 32.44200 0.021626 0.012503  3.47450
##    Attr54     Attr55    Attr56   Attr57  Attr58   Attr59  Attr60  Attr61
## 1 1.66060   1189.700  0.078728  0.24695 0.92586  0.18362  9.0895  8.8342
## 2 1.33400      1.939  0.269720  0.00000 0.79303  0.31271  6.2213  4.2785
## 3 2.06320   6092.300  0.159870  0.13185 0.80748  0.10234  9.4499  3.3443
## 4 0.46946 -13606.000 -0.015327 -0.13730 1.01530 -1.67320 29.2710 14.1410
## 5 2.46710     75.681  0.110170  0.38764 0.93881  1.18420  4.5462 15.1400
## 6 3.47450   3033.300  0.122750  0.25986 0.87268  0.00000  6.8375  5.3717
##    Attr62  Attr63 Attr64 target
## 1 48.1720  7.5770 5.7725      0
## 2 78.4990  4.6497 1.6539      0
## 3 38.8670  9.3910 1.5272      0
## 4 73.4580  4.9688 4.6785      0
## 5 44.1010  8.2764 7.1014      0
## 6  4.0034 91.1710 7.0057      0
tail(bank_data)
##          ID     Attr1   Attr2     Attr3   Attr4    Attr5      Attr6
## 36548 36548 -0.117490 1.10800 -0.224310 0.79755 -201.640 -0.0593250
## 36549 36549  0.425980 0.18731  0.361740 3.00860   50.257  1.2929000
## 36550 36550 -0.016238 0.43902  0.069466 1.16780 -191.590  0.0045487
## 36551 36551  0.073750 0.18138  0.219690 3.50200   66.436  0.1694200
## 36552 36552 -0.871080 1.16990 -0.187060 0.84010  -62.657 -0.0281450
## 36553 36553  0.063897 0.84939  0.285080 1.55680   -3.031 -0.1169200
##           Attr7     Attr8   Attr9   Attr10    Attr11    Attr12    Attr13
## 36548 -0.117490 -0.097464 0.92515 -0.10799 -0.117490 -0.106040 -0.062038
## 36549  0.425980  3.967200 1.38350  0.74311  0.425980  2.365200  0.358000
## 36550 -0.021788  1.108500 0.94926  0.48663 -0.021788 -0.052624  0.052201
## 36551  0.092655  4.513200 1.09190  0.81862  0.092655  1.055300  0.247430
## 36552 -0.871080 -0.145210 1.89250 -0.16988 -0.643470 -0.744590 -0.460280
## 36553  0.083468  0.177310 4.83790  0.15060  0.114700  0.163030  0.022181
##          Attr14   Attr15    Attr16  Attr17    Attr18    Attr19  Attr20
## 36548 -0.117490 -4634.60 -0.078756 0.90254 -0.117490 -0.083527 155.200
## 36549  0.425980   143.04  2.551700 5.33860  0.425980  0.319050  62.803
## 36550 -0.021788  5365.50  0.068027 2.27780 -0.021788 -0.038084 226.770
## 36551  0.092655   272.72  1.338400 5.51320  0.092655  0.094437  14.523
## 36552 -0.871080  -490.20 -0.744590 0.85479 -0.871080 -0.460280  57.237
## 36553  0.083468  2889.20  0.126330 1.17730  0.083468  0.017253  23.837
##        Attr21     Attr22    Attr23     Attr24   Attr25    Attr26   Attr27
## 36548 0.82328 -0.1192400 -0.083527 -0.0593250 -0.10799 -0.078756 -0.78427
## 36549 0.94648  0.4258900  0.319050  1.2929000  0.74311  2.551700  4.41330
## 36550 1.25910 -0.0086335 -0.028383  0.0073332  0.48663  0.080669 -0.14325
## 36551 1.06550  0.1015600  0.075168  0.2137100  0.81862  1.234200  1.13030
## 36552 1.07610 -0.6434700 -0.460280 -1.6029000 -0.89923 -0.744590 -2.82700
## 36553 0.90215  0.0951480  0.013208 -0.0285290  0.10235  0.103290  3.04660
##          Attr28 Attr29   Attr30    Attr31  Attr32  Attr33    Attr34
## 36548  -1.92840 4.3581 0.712570 -0.083527 266.000  1.3722 -0.107620
## 36549   0.78955 4.6393 0.055119  0.319050  68.120  5.3582  2.273700
## 36550   0.13450 4.7117 0.764470 -0.038084 250.750  1.4556 -0.019665
## 36551   0.31723 4.4366 0.101630  0.094437  35.667 10.2330  0.559910
## 36552 -10.88900 1.8321 0.593950 -0.460150 153.610  2.3762  2.376200
## 36553   1.40590 3.7510 0.138870  0.023709  39.432  9.3064  5.609500
##           Attr35  Attr36  Attr37   Attr38    Attr39   Attr40    Attr41
## 36548 -0.1192400 1.42300      NA -0.10799 -0.084773 0.106480 -0.409240
## 36549  0.4258900 1.40480 43.2640  0.75033  0.318990 0.656740  0.012886
## 36550 -0.0086335 0.61816  5.1275  0.51161 -0.015090 0.021591  0.335510
## 36551  0.1015600 1.02290  2.8687  0.91220  0.103510 1.414600  0.023695
## 36552 -0.8873900 1.89250      NA -0.16988 -0.468900 0.039169 -0.060603
## 36553  0.0988740 4.87930  1.6105  0.44934  0.020437 0.364520  0.237950
##          Attr42  Attr43  Attr44    Attr45  Attr46  Attr47    Attr48
## 36548 -0.084773 198.700  43.500 -0.196440 0.25777 143.580 -0.149460
## 36549  0.318990 115.790  52.991  1.854300 1.73300  86.890  0.373890
## 36550 -0.015090 302.770  75.998 -0.045684 0.30930 215.260 -0.060287
## 36551  0.103510  68.186  53.663  1.889200 3.05740  15.858 -0.048547
## 36552 -0.340010 180.720 123.480 -2.935200 0.58643  38.966 -0.643470
## 36553  0.019667  46.055  22.218  0.202240 0.98164  24.204  0.071309
##         Attr49  Attr50   Attr51   Attr52   Attr53   Attr54    Attr55
## 36548 -0.10626 0.79755 1.108000 0.728760 -0.92837 -0.92837 -5116.500
## 36549  0.28004 2.89270 0.180100 0.186630  1.62200  1.63770 15766.000
## 36550 -0.10538 1.10130 0.414040 0.686980  0.94219  0.99054  3576.300
## 36551 -0.04948 1.69530 0.087803 0.097719  1.18210  1.31720  6003.900
## 36552 -0.34001 0.84010 1.169900 0.420840 -9.88950 -9.88950   -12.708
## 36553  0.01474 0.93839 0.511970 0.107450  0.74273  2.21600  1607.000
##          Attr56    Attr57  Attr58   Attr59  Attr60  Attr61  Attr62  Attr63
## 36548 -0.080906  1.088000 1.08090 0.000000  2.3518  8.3909 287.520  1.2695
## 36549  0.277220  0.573240 0.72278 0.009708  5.8118  6.8880  49.236  7.4133
## 36550 -0.053449 -0.033368 1.05340 0.051323  1.6096  4.8027 264.150  1.3818
## 36551  0.084195  0.090091 0.91581 0.114310 25.1320  6.8017  32.664 11.1740
## 36552 -0.468900  5.127500 1.44510 0.000000  6.3770  2.9560 225.630  1.6177
## 36553  0.020425  0.424280 0.97650 1.983700 15.3120 16.4280  38.627  9.4494
##         Attr64 target
## 36548  12.0920      0
## 36549   2.9141      0
## 36550   1.1077      0
## 36551   1.4168      0
## 36552 110.1700      0
## 36553  23.8590      1

Data Description

Attr1   net profit / total assets
Attr2   total liabilities / total assets
Attr3   working capital / total assets
Attr4   current assets / short-term liabilities
Attr5   [(cash + short-term securities + receivables - short-term liabilities) / (operating expenses - depreciation)] * 365
Attr6   retained earnings / total assets
Attr7   EBIT / total assets
Attr8   book value of equity / total liabilities
Attr9   sales / total assets
Attr10  equity / total assets
Attr11  (gross profit + extraordinary items + financial expenses) / total assets
Attr12  gross profit / short-term liabilities
Attr13  (gross profit + depreciation) / sales
Attr14  (gross profit + interest) / total assets
Attr15  (total liabilities * 365) / (gross profit + depreciation)
Attr16  (gross profit + depreciation) / total liabilities
Attr17  total assets / total liabilities
Attr18  gross profit / total assets
Attr19  gross profit / sales
Attr20  (inventory * 365) / sales
Attr21  sales (n) / sales (n-1)
Attr22  profit on operating activities / total assets
Attr23  net profit / sales
Attr24  gross profit (in 3 years) / total assets
Attr25  (equity - share capital) / total assets
Attr26  (net profit + depreciation) / total liabilities
Attr27  profit on operating activities / financial expenses
Attr28  working capital / fixed assets
Attr29  logarithm of total assets
Attr30  (total liabilities - cash) / sales
Attr31  (gross profit + interest) / sales
Attr32  (current liabilities * 365) / cost of products sold
Attr33  operating expenses / short-term liabilities
Attr34  operating expenses / total liabilities
Attr35  profit on sales / total assets
Attr36  total sales / total assets
Attr37  (current assets - inventories) / long-term liabilities
Attr38  constant capital / total assets
Attr39  profit on sales / sales
Attr40  (current assets - inventory - receivables) / short-term liabilities
Attr41  total liabilities / ((profit on operating activities + depreciation) * (12/365))
Attr42  profit on operating activities / sales
Attr43  rotation receivables + inventory turnover in days
Attr44  (receivables * 365) / sales
Attr45  net profit / inventory
Attr46  (current assets - inventory) / short-term liabilities
Attr47  (inventory * 365) / cost of products sold
Attr48  EBITDA (profit on operating activities - depreciation) / total assets
Attr49  EBITDA (profit on operating activities - depreciation) / sales
Attr50  current assets / total liabilities
Attr51  short-term liabilities / total assets
Attr52  (short-term liabilities * 365) / cost of products sold
Attr53  equity / fixed assets
Attr54  constant capital / fixed assets
Attr55  working capital
Attr56  (sales - cost of products sold) / sales
Attr57  (current assets - inventory - short-term liabilities) / (sales - gross profit - depreciation)
Attr58  total costs / total sales
Attr59  long-term liabilities / equity
Attr60  sales / inventory
Attr61  sales / receivables
Attr62  (short-term liabilities * 365) / sales
Attr63  sales / short-term liabilities
Attr64  sales / fixed assets

Data Pre-processing

Verify the data types assigned to the variables in the dataset

str(bank_data)
## 'data.frame':    36553 obs. of  66 variables:
##  $ ID    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Attr1 : num  0.13537 0.00586 0.1106 0.06391 0.13168 ...
##  $ Attr2 : num  0.452 0.399 0.161 1.407 0.66 ...
##  $ Attr3 : num  0.312 0.198 0.479 -0.296 0.441 ...
##  $ Attr4 : num  2.047 1.939 7.571 0.529 2.71 ...
##  $ Attr5 : num  10.23 9.58 263.9 -46.29 -23.6 ...
##  $ Attr6 : num  0.168 0 0 -0.714 -0.2 ...
##  $ Attr7 : num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr8 : num  1.213 1.509 5.205 -0.331 0.514 ...
##  $ Attr9 : num  2.255 0.979 0.684 0.985 2.136 ...
##  $ Attr10: num  0.548 0.601 0.839 -0.465 0.34 ...
##  $ Attr11: num  0.1833 0.0295 0.1388 0.0791 0.1861 ...
##  $ Attr12: num  0.5632 0.0344 1.8983 0.1258 0.5112 ...
##  $ Attr13: num  0.0892 0.0364 0.2362 0.0303 0.1671 ...
##  $ Attr14: num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr15: num  820 4088 364 5433 675 ...
##  $ Attr16: num  0.4453 0.0893 1.0032 0.0672 0.5405 ...
##  $ Attr17: num  2.213 2.509 6.205 0.711 1.514 ...
##  $ Attr18: num  0.16763 0.00724 0.13836 0.07907 0.13191 ...
##  $ Attr19: num  0.07433 0.00739 0.20214 0.02532 0.06176 ...
##  $ Attr20: num  40.2 58.7 38.6 12.5 80.3 ...
##  $ Attr21: num  NA 0.979 1.835 0.449 1.06 ...
##  $ Attr22: num  0.18115 0.00602 0.11669 0.13787 0.18272 ...
##  $ Attr23: num  0.06002 0.00599 0.16159 0.02047 0.06166 ...
##  $ Attr24: num  0.1676 0.0578 0.2354 -0.6893 0.2086 ...
##  $ Attr25: num  0.5115 0.5474 0.8349 -0.4655 0.0482 ...
##  $ Attr26: num  0.3739 0.0858 0.8309 0.0564 0.5401 ...
##  $ Attr27: num  11.554 0.271 249.3 0.435 3.371 ...
##  $ Attr28: num  0.798 0.334 1.069 -0.443 1.467 ...
##  $ Attr29: num  3.58 5.64 4.1 4.66 2.23 ...
##  $ Attr30: num  0.156 0.407 -0.163 0.45 0.306 ...
##  $ Attr31: num  0.07999 0.00739 0.20283 0.02532 0.07901 ...
##  $ Attr32: num  52.3 107.5 46.3 72.3 49.6 ...
##  $ Attr33: num  6.98 3.69 7.89 5.04 7.36 ...
##  $ Attr34: num  4.598 1.95 3.568 0.098 2.878 ...
##  $ Attr35: num  0.178 0.264 0.109 0.138 0.235 ...
##  $ Attr36: num  2.255 0.979 0.684 3.501 2.136 ...
##  $ Attr37: num  3.59 1.33 5.58 0.29 0.57 ...
##  $ Attr38: num  0.649 0.789 0.925 0.313 0.742 ...
##  $ Attr39: num  0.0787 0.2697 0.1599 0.0442 0.1102 ...
##  $ Attr40: num  0.35559 0.1049 3.7693 0.00798 0.34263 ...
##  $ Attr41: num  0.0701 0.3865 0.0384 0.3017 0.054 ...
##  $ Attr42: num  0.08032 0.00615 0.17048 0.04415 0.08556 ...
##  $ Attr43: num  81.5 144 147.8 38.3 104.4 ...
##  $ Attr44: num  41.3 85.3 109.1 25.8 24.1 ...
##  $ Attr45: num  0.5455 0.0373 1.527 0.599 0.2803 ...
##  $ Attr46: num  1.213 1.192 6.577 0.359 0.889 ...
##  $ Attr47: num  43.6 73.9 46 12.3 90.2 ...
##  $ Attr48: num  0.1475 -0.0223 0.0934 0.1224 -0.0422 ...
##  $ Attr49: num  0.0654 -0.0228 0.1364 0.0392 -0.0198 ...
##  $ Attr50: num  1.348 1.024 3.424 0.236 1.059 ...
##  $ Attr51: num  0.2977 0.2105 0.0729 0.6285 0.2581 ...
##  $ Attr52: num  0.143 0.271 0.127 0.198 0.136 ...
##  $ Attr53: num  1.403 1.016 1.872 -0.697 1.129 ...
##  $ Attr54: num  1.661 1.334 2.063 0.469 2.467 ...
##  $ Attr55: num  1189.7 1.94 6092.3 -13606 75.68 ...
##  $ Attr56: num  0.0787 0.2697 0.1599 -0.0153 0.1102 ...
##  $ Attr57: num  0.247 0 0.132 -0.137 0.388 ...
##  $ Attr58: num  0.926 0.793 0.807 1.015 0.939 ...
##  $ Attr59: num  0.184 0.313 0.102 -1.673 1.184 ...
##  $ Attr60: num  9.09 6.22 9.45 29.27 4.55 ...
##  $ Attr61: num  8.83 4.28 3.34 14.14 15.14 ...
##  $ Attr62: num  48.2 78.5 38.9 73.5 44.1 ...
##  $ Attr63: num  7.58 4.65 9.39 4.97 8.28 ...
##  $ Attr64: num  5.77 1.65 1.53 4.68 7.1 ...
##  $ target: int  0 0 0 0 0 0 0 0 0 0 ...

Plot the data to understand relationships between features

par(mfrow = c(2,2))

plot(bank_data[,"Attr9"], bank_data[,"Attr10"], xlab = "sales / total assets", ylab = "equity / total assets", type = "p", main = "sales and equity")
plot(bank_data[,"Attr18"], bank_data[,"Attr24"], xlab = "gross profit / total assets", ylab = "gross profit (in 3 years) / total assets", type = "p", main = "gross profit now and in 3 years")
plot(bank_data[,"Attr22"], bank_data[,"Attr7"], xlab = "profit on operating activities / total assets", ylab = "EBIT / total assets", type = "p", main = "EBIT and profit on operating activities")
plot(bank_data[,"Attr2"], bank_data[,"Attr3"], xlab = "total liabilities / total assets", ylab = "working capital / total assets", type = "p", main = "working capital and total liabilities")
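
A numeric complement to the scatterplots (a minimal sketch; use = "complete.obs" skips the rows with NAs still present at this stage):

# Correlations for two of the plotted pairs, ignoring incomplete rows
cor(bank_data$Attr18, bank_data$Attr24, use = "complete.obs") # gross profit now vs in 3 years
cor(bank_data$Attr22, bank_data$Attr7, use = "complete.obs")  # operating profit vs EBIT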

Feature Engineering

Subtracting current gross profit (Attr18) from gross profit in 3 years (Attr24)

bank_data1 <- cbind(bank_data, gross_profit_growth = bank_data$Attr24 - bank_data$Attr18)

Check for missing values

sum(is.na(bank_data))
## [1] 35187
bank_data=centralImputation(bank_data)
sum(is.na(bank_data))
## [1] 0
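
centralImputation() replaces each NA with the column's central value: the median for numeric columns and the mode for factors. A minimal sketch of the numeric case (all predictors here are numeric; impute_median is an illustrative helper, not part of DMwR):

# Hypothetical helper equivalent to centralImputation() for numeric columns:
# replace each NA with the median of the observed values in that column
impute_median <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      df[[col]][is.na(df[[col]])] <- median(df[[col]], na.rm = TRUE)
    }
  }
  df
}
# bank_data <- impute_median(bank_data)  # would give the same result here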

Check for class imbalance

prop.table(table(bank_data$target))
## 
##          0          1 
## 0.95171395 0.04828605
bank_data_rose <- ROSE(target~ ., data=bank_data, seed=111)$data
prop.table(table(bank_data_rose$target))
## 
##        0        1 
## 0.500807 0.499193
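
ROSE draws a synthetic, roughly balanced sample from smoothed density estimates around both classes. Since DMwR is already attached, SMOTE() would be a possible alternative; a sketch, assuming a factor target (perc.over and perc.under control over- and under-sampling):

tmp <- bank_data
tmp$target <- as.factor(tmp$target)  # SMOTE requires a factor outcome
bank_data_smote <- SMOTE(target ~ ., data = tmp, perc.over = 200, perc.under = 200)
prop.table(table(bank_data_smote$target))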

Find the correlation between the features

cat_var <- "target"
num_var <- setdiff(names(bank_data), c(cat_var, "ID"))  # drop the ID index column from the correlation matrix
corrplot(cor(bank_data_rose[, num_var]), method = "circle", tl.cex = 0.5, tl.col = "black", order = "hclust", diag = FALSE)

corrplot(cor(bank_data_rose[, num_var]), method = "shade", type = "full")

Split the Data into train and test sets

  • Use stratified sampling to split the data into train/test sets (70/30)

  • Use the createDataPartition() function from the caret package to do stratified sampling

# Set the seed after attaching the caret package

set.seed(111)

# The first argument is the outcome variable to stratify on; the second is the proportion to sample

# Remember to include list = F; otherwise the function returns a list, which cannot be used to subset a data frame

trainIndex <- createDataPartition(bank_data$target, p = .7, list = F)

train_data <- bank_data[trainIndex, ]

test_data <- bank_data[-trainIndex, ]
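
A quick sanity check (a minimal sketch) that the stratified split preserved the class proportions:

prop.table(table(train_data$target))
prop.table(table(test_data$target))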

Build a Decision Tree

Model the tree

  • We will be using Quinlan’s C5.0 decision tree algorithm implementation from the C50 package to build our decision tree
train_data$target<-as.factor(train_data$target)
test_data$target<-as.factor(test_data$target)
str(train_data$target)
##  Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
c5_tree <- C5.0(target ~ . , train_data)

# Use the rules = T argument if you want to extract rules later from the model

c5_rules <- C5.0(target ~ . , train_data, rules = T)

Variable Importance in trees

  • Find the importance of each variable in the dataset using the C5imp() function

  • The default metric "usage" gives the percentage of training samples that pass through a split on that attribute. The variable used for splitting at the root node therefore always scores 100, while variables used near the leaves can score close to 0 because little data remains to classify

C5imp(c5_tree, metric = "usage")
##        Overall
## Attr41  100.00
## Attr27   98.93
## Attr46   93.18
## Attr24   72.54
## Attr34   27.54
## Attr5    23.10
## Attr56   11.09
## Attr58    9.90
## Attr35    9.74
## Attr8     6.68
## Attr60    6.53
## Attr42    5.26
## Attr9     4.40
## Attr6     2.86
## Attr15    2.78
## Attr25    1.73
## Attr44    1.15
## Attr23    0.50
## Attr22    0.42
## Attr13    0.35
## Attr39    0.28
## Attr19    0.25
## Attr48    0.19
## Attr38    0.16
## Attr53    0.16
## Attr64    0.13
## Attr29    0.05
## ID        0.00
## Attr1     0.00
## Attr2     0.00
## Attr3     0.00
## Attr4     0.00
## Attr7     0.00
## Attr10    0.00
## Attr11    0.00
## Attr12    0.00
## Attr14    0.00
## Attr16    0.00
## Attr17    0.00
## Attr18    0.00
## Attr20    0.00
## Attr21    0.00
## Attr26    0.00
## Attr28    0.00
## Attr30    0.00
## Attr31    0.00
## Attr32    0.00
## Attr33    0.00
## Attr36    0.00
## Attr37    0.00
## Attr40    0.00
## Attr43    0.00
## Attr45    0.00
## Attr47    0.00
## Attr49    0.00
## Attr50    0.00
## Attr51    0.00
## Attr52    0.00
## Attr54    0.00
## Attr55    0.00
## Attr57    0.00
## Attr59    0.00
## Attr61    0.00
## Attr62    0.00
## Attr63    0.00

Rules from trees

  • Understand the summary of the returned C5.0 rules-based model
summary(c5_rules)
## 
## Call:
## C5.0.formula(formula = target ~ ., data = train_data, rules = T)
## 
## 
## C5.0 [Release 2.07 GPL Edition]      Fri Jun 22 21:28:43 2018
## -------------------------------
## 
## Class specified by attribute `outcome'
## 
## Read 25588 cases (66 attributes) from undefined.data
## 
## Rules:
## 
## Rule 1: (1868, lift 1.0)
##  Attr27 <= 0.87682
##  Attr34 <= 0.078963
##  Attr46 > 0.5449
##  ->  class 0  [0.999]
## 
## Rule 2: (174, lift 1.0)
##  Attr5 > -22.458
##  Attr5 <= 82.901
##  Attr41 > -0.000156
##  Attr58 <= 0.34016
##  ->  class 0  [0.994]
## 
## Rule 3: (592/7, lift 1.0)
##  Attr34 <= 0.22583
##  Attr41 > -0.000156
##  Attr46 > 0.5449
##  Attr56 > 0.068701
##  ->  class 0  [0.987]
## 
## Rule 4: (8685/121, lift 1.0)
##  Attr27 <= 0.96568
##  Attr41 > -0.000156
##  ->  class 0  [0.986]
## 
## Rule 5: (623/9, lift 1.0)
##  Attr5 <= -194.98
##  Attr27 > 0.96568
##  Attr34 > 0.14597
##  Attr58 > 0.77596
##  ->  class 0  [0.984]
## 
## Rule 6: (2151/43, lift 1.0)
##  Attr5 > 176.99
##  ->  class 0  [0.980]
## 
## Rule 7: (7239/176, lift 1.0)
##  Attr42 > 0.082447
##  ->  class 0  [0.976]
## 
## Rule 8: (18750/545, lift 1.0)
##  Attr34 > 0.14597
##  Attr41 > -0.000156
##  Attr46 > 0.18877
##  ->  class 0  [0.971]
## 
## Rule 9: (11760/342, lift 1.0)
##  Attr8 > 0.33112
##  Attr27 > 0.96568
##  Attr34 > 0.20488
##  ->  class 0  [0.971]
## 
## Rule 10: (3918/360, lift 1.0)
##  Attr27 <= -0.0305
##  ->  class 0  [0.908]
## 
## Rule 11: (89, lift 20.7)
##  Attr6 > 0.033369
##  Attr27 > 1.0896
##  Attr27 <= 1.09005
##  Attr34 <= 1.577
##  Attr35 > 0.063439
##  ->  class 1  [0.989]
## 
## Rule 12: (72/1, lift 20.4)
##  Attr5 > -22.458
##  Attr5 <= 82.901
##  Attr41 > -0.000156
##  Attr44 > 23.394
##  Attr46 <= 0.5449
##  Attr58 > 0.34016
##  ->  class 1  [0.973]
## 
## Rule 13: (31, lift 20.3)
##  Attr5 > -50.6
##  Attr5 <= 23.747
##  Attr34 > 0.001229
##  Attr42 <= -0.023691
##  Attr44 > 25.918
##  Attr46 <= 0.40497
##  ->  class 1  [0.970]
## 
## Rule 14: (104/3, lift 20.1)
##  Attr27 > -0.0305
##  Attr34 <= -0.015941
##  Attr41 <= -0.000156
##  ->  class 1  [0.962]
## 
## Rule 15: (24, lift 20.1)
##  Attr5 > -194.98
##  Attr8 <= 0.33112
##  Attr27 > 0.96568
##  Attr44 > 23.023
##  Attr46 <= 0.18877
##  Attr58 > 0.77596
##  ->  class 1  [0.962]
## 
## Rule 16: (19, lift 19.9)
##  Attr5 > -31.004
##  Attr5 <= 23.747
##  Attr27 <= -0.0305
##  Attr34 > 0.001229
##  Attr41 <= -0.000156
##  Attr46 <= 0.59928
##  Attr58 <= 1.0202
##  ->  class 1  [0.952]
## 
## Rule 17: (15, lift 19.7)
##  Attr8 <= 0.33112
##  Attr39 <= 0.086457
##  Attr41 > -0.000156
##  Attr42 > 0.020436
##  Attr46 <= 0.5449
##  Attr58 <= 0.77596
##  ->  class 1  [0.941]
## 
## Rule 18: (32/1, lift 19.7)
##  Attr5 > -22.458
##  Attr5 <= 82.901
##  Attr27 > 0.67238
##  Attr46 <= 0.5449
##  Attr53 <= 0.69144
##  ->  class 1  [0.941]
## 
## Rule 19: (70/4, lift 19.5)
##  Attr5 > -986.05
##  Attr5 <= 176.99
##  Attr27 <= -0.0305
##  Attr34 > 0.0070654
##  Attr35 <= -0.0034142
##  Attr56 > 0.065998
##  ->  class 1  [0.931]
## 
## Rule 20: (77/6, lift 19.1)
##  Attr5 > -22.458
##  Attr5 <= 82.901
##  Attr41 > -0.000156
##  Attr44 > 23.394
##  Attr46 <= 0.5449
##  ->  class 1  [0.911]
## 
## Rule 21: (29/3, lift 18.2)
##  Attr5 <= 23.747
##  Attr23 <= -0.021107
##  Attr34 > 0.001229
##  Attr41 <= -0.000156
##  Attr56 <= 0.065998
##  Attr58 <= 1.0202
##  Attr64 > 1.2237
##  ->  class 1  [0.871]
## 
## Rule 22: (5, lift 17.9)
##  Attr27 > -0.0305
##  Attr27 <= -0.0034433
##  Attr34 > 0.71601
##  Attr41 <= -0.000156
##  ->  class 1  [0.857]
## 
## Rule 23: (7/1, lift 16.3)
##  Attr8 <= 0.33112
##  Attr29 <= 3.2812
##  Attr34 > 0.14597
##  Attr42 > 0.020436
##  Attr46 <= 0.5449
##  Attr58 <= 0.77596
##  ->  class 1  [0.778]
## 
## Rule 24: (34/8, lift 15.7)
##  Attr5 <= 23.747
##  Attr23 <= -0.021107
##  Attr34 > 0.001229
##  Attr41 <= -0.000156
##  Attr56 <= 0.065998
##  Attr58 <= 1.0202
##  ->  class 1  [0.750]
## 
## Rule 25: (24/8, lift 13.7)
##  Attr8 <= 0.33112
##  Attr34 > 0.14597
##  Attr42 > 0.020436
##  Attr46 <= 0.5449
##  Attr58 <= 0.77596
##  ->  class 1  [0.654]
## 
## Rule 26: (441/164, lift 13.1)
##  Attr24 <= 0.067406
##  Attr27 > 0.87682
##  Attr34 <= 0.22583
##  ->  class 1  [0.628]
## 
## Rule 27: (214/95, lift 11.6)
##  Attr5 > -22.458
##  Attr5 <= 82.901
##  Attr46 <= 0.5449
##  ->  class 1  [0.556]
## 
## Default class: 0
## 
## 
## Evaluation on training data (25588 cases):
## 
##          Rules     
##    ----------------
##      No      Errors
## 
##      27  773( 3.0%)   <<
## 
## 
##     (a)   (b)    <-classified as
##    ----  ----
##   24336    29    (a): class 0
##     744   479    (b): class 1
## 
## 
##  Attribute usage:
## 
##   93.09% Attr27
##   87.30% Attr41
##   84.40% Attr34
##   82.03% Attr46
##   46.16% Attr8
##   28.45% Attr42
##   12.80% Attr5
##    3.75% Attr58
##    2.72% Attr56
##    1.72% Attr24
##    0.62% Attr35
##    0.48% Attr44
##    0.35% Attr6
##    0.13% Attr23
##    0.13% Attr53
##    0.11% Attr64
##    0.06% Attr39
##    0.03% Attr29
## 
## 
## Time: 6.7 secs
  • From the summary output above, you can clearly read the rules and their associated metrics such as lift and support

  • This is great for explicability, and the rules can also be used to surface interesting relationships in the data, even if your final model is not a decision tree

Plotting the tree

  • Call the plot function on the tree object to visualize the tree
plot(c5_tree)

Evaluating the model

Predictions on the test data

  • We'll evaluate the decision tree using standard error metrics on both the train and the test data
preds <- predict(c5_tree, train_data)
preds1 <- predict(c5_tree, test_data)
  • Error metrics for classification can be accessed through the “confusionMatrix()” function from the caret package
conf_train=confusionMatrix(preds, train_data$target, positive = "0")
conf_test=confusionMatrix(preds1, test_data$target, positive = "0")
conf_train
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 24342   632
##          1    23   591
##                                           
##                Accuracy : 0.9744          
##                  95% CI : (0.9724, 0.9763)
##     No Information Rate : 0.9522          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.6317          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9991          
##             Specificity : 0.4832          
##          Pos Pred Value : 0.9747          
##          Neg Pred Value : 0.9625          
##              Prevalence : 0.9522          
##          Detection Rate : 0.9513          
##    Detection Prevalence : 0.9760          
##       Balanced Accuracy : 0.7411          
##                                           
##        'Positive' Class : 0               
## 
conf_test
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10384   301
##          1    39   241
##                                           
##                Accuracy : 0.969           
##                  95% CI : (0.9656, 0.9722)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.572           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9963          
##             Specificity : 0.4446          
##          Pos Pred Value : 0.9718          
##          Neg Pred Value : 0.8607          
##              Prevalence : 0.9506          
##          Detection Rate : 0.9470          
##    Detection Prevalence : 0.9745          
##       Balanced Accuracy : 0.7205          
##                                           
##        'Positive' Class : 0               
## 

Finding the F1 score, since both high precision and high recall matter for this problem; a minimal sketch follows
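
A minimal sketch, computing precision, recall and F1 for class "1" (bankrupt) from the caret confusion matrix on the test data:

cm <- conf_test$table                       # rows = predictions, columns = reference
precision <- cm["1", "1"] / sum(cm["1", ])  # TP / (TP + FP)
recall <- cm["1", "1"] / sum(cm[, "1"])     # TP / (TP + FN)
f1 <- 2 * precision * recall / (precision + recall)
f1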

CART Trees

  • Classification and regression trees (CART) use the Gini index in place of the gain ratio (based on information gain) used by ID3-family algorithms such as C4.5 and C5.0

Goal

  • The goal of this activity is to predict whether a company goes bankrupt using a classification and regression tree (CART)
reg_tree <- rpart(target ~ ., train_data, method = "class")

printcp(reg_tree)
## 
## Classification tree:
## rpart(formula = target ~ ., data = train_data, method = "class")
## 
## Variables actually used in tree construction:
## [1] Attr24 Attr27 Attr34 Attr35 Attr56
## 
## Root node error: 1223/25588 = 0.047796
## 
## n= 25588 
## 
##         CP nsplit rel error  xerror     xstd
## 1 0.027187      0   1.00000 1.00000 0.027903
## 2 0.024530      4   0.89125 0.92723 0.026918
## 3 0.010000      8   0.79313 0.80948 0.025225
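
The cp table above can drive pruning; a minimal sketch that prunes at the complexity parameter with the lowest cross-validated error (xerror):

best_cp <- reg_tree$cptable[which.min(reg_tree$cptable[, "xerror"]), "CP"]
pruned_tree <- prune(reg_tree, cp = best_cp)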

Tree Explicability

  • The variable importance can be accessed via the variable.importance element of the reg_tree object
reg_tree$variable.importance
##      Attr34      Attr56      Attr39      Attr41      Attr35      Attr22 
## 266.0787251 175.2489122  86.2233824  71.8981654  69.5123896  65.9275392 
##      Attr55      Attr58      Attr27      Attr42      Attr33      Attr29 
##  63.1595004  60.1780071  59.7002362  56.7521977  55.7751874  48.6854482 
##       Attr9      Attr36      Attr15      Attr14      Attr18       Attr7 
##  42.9363953  31.9695663  30.2639273  24.7714078  24.7714078  24.7634139 
##      Attr24      Attr19      Attr52      Attr30      Attr32      Attr12 
##  21.1439300  19.7373439  12.0567802  11.0083645  10.9527372   6.9107476 
##       Attr1      Attr48      Attr11      Attr47      Attr10      Attr25 
##   6.5390301   5.7424230   5.4991000   0.9371612   0.4685806   0.4685806
  • We can plot the tree using the rpart.plot() function from the rpart.plot package
rpart.plot(reg_tree)

Evaluating the model

Predictions on the test data

  • We'll evaluate the CART tree with the same error metrics on the test data; first, inspect the detailed summary of splits (a prediction sketch follows after the summary output)
summary(reg_tree) # detailed summary of splits
## Call:
## rpart(formula = target ~ ., data = train_data, method = "class")
##   n= 25588 
## 
##           CP nsplit rel error    xerror       xstd
## 1 0.02718724      0 1.0000000 1.0000000 0.02790306
## 2 0.02452984      4 0.8912510 0.9272281 0.02691763
## 3 0.01000000      8 0.7931316 0.8094849 0.02522452
## 
## Variable importance
## Attr34 Attr56 Attr39 Attr41 Attr35 Attr22 Attr55 Attr58 Attr27 Attr42 
##     20     13      6      5      5      5      5      4      4      4 
## Attr33 Attr29  Attr9 Attr36 Attr15 Attr14 Attr18  Attr7 Attr24 Attr19 
##      4      4      3      2      2      2      2      2      2      1 
## Attr52 Attr30 Attr32 Attr12 
##      1      1      1      1 
## 
## Node number 1: 25588 observations,    complexity param=0.02718724
##   predicted class=0  expected loss=0.04779584  P(node) =1
##     class counts: 24365  1223
##    probabilities: 0.952 0.048 
##   left son=2 (22195 obs) right son=3 (3393 obs)
##   Primary splits:
##       Attr35 < -0.0393365   to the right, improve=69.51239, (0 missing)
##       Attr39 < -0.035456    to the right, improve=69.15036, (0 missing)
##       Attr41 < -0.0128205   to the right, improve=67.27618, (0 missing)
##       Attr26 < 0.032943     to the right, improve=63.03163, (0 missing)
##       Attr24 < 0.053813     to the right, improve=62.76884, (0 missing)
##   Surrogate splits:
##       Attr39 < -0.0238625   to the right, agree=0.974, adj=0.808, (0 split)
##       Attr22 < -0.0393375   to the right, agree=0.967, adj=0.749, (0 split)
##       Attr56 < -0.021217    to the right, agree=0.956, adj=0.665, (0 split)
##       Attr42 < -0.019477    to the right, agree=0.950, adj=0.625, (0 split)
##       Attr41 < -0.000131595 to the right, agree=0.944, adj=0.580, (0 split)
## 
## Node number 2: 22195 observations,    complexity param=0.02452984
##   predicted class=0  expected loss=0.0333859  P(node) =0.8673988
##     class counts: 21454   741
##    probabilities: 0.967 0.033 
##   left son=4 (16905 obs) right son=5 (5290 obs)
##   Primary splits:
##       Attr24 < 0.053813     to the right, improve=21.14393, (0 missing)
##       Attr46 < 0.470925     to the right, improve=18.38534, (0 missing)
##       Attr26 < 0.124525     to the right, improve=16.31054, (0 missing)
##       Attr16 < 0.138375     to the right, improve=15.81470, (0 missing)
##       Attr38 < 0.495755     to the right, improve=11.83234, (0 missing)
##   Surrogate splits:
##       Attr14 < 0.018395     to the right, agree=0.843, adj=0.343, (0 split)
##       Attr18 < 0.018395     to the right, agree=0.843, adj=0.343, (0 split)
##       Attr7  < 0.018395     to the right, agree=0.843, adj=0.342, (0 split)
##       Attr12 < 0.038155     to the right, agree=0.840, adj=0.327, (0 split)
##       Attr1  < 0.0124095    to the right, agree=0.835, adj=0.309, (0 split)
## 
## Node number 3: 3393 observations,    complexity param=0.02718724
##   predicted class=0  expected loss=0.1420572  P(node) =0.1326012
##     class counts:  2911   482
##    probabilities: 0.858 0.142 
##   left son=6 (3283 obs) right son=7 (110 obs)
##   Primary splits:
##       Attr56 < 0.0681095    to the left,  improve=51.54386, (0 missing)
##       Attr58 < 0.87463      to the right, improve=27.36437, (0 missing)
##       Attr27 < 0.988425     to the left,  improve=26.46364, (0 missing)
##       Attr46 < 0.38793      to the right, improve=25.38936, (0 missing)
##       Attr40 < 0.069154     to the right, improve=18.24269, (0 missing)
##   Surrogate splits:
##       Attr9  < -0.0161855   to the right, agree=0.968, adj=0.018, (0 split)
##       Attr47 < -1.953635    to the right, agree=0.968, adj=0.018, (0 split)
##       Attr10 < 1.3329       to the left,  agree=0.968, adj=0.009, (0 split)
##       Attr25 < 1.33203      to the left,  agree=0.968, adj=0.009, (0 split)
##       Attr32 < 16740.5      to the left,  agree=0.968, adj=0.009, (0 split)
## 
## Node number 4: 16905 observations
##   predicted class=0  expected loss=0.02117717  P(node) =0.6606612
##     class counts: 16547   358
##    probabilities: 0.979 0.021 
## 
## Node number 5: 5290 observations,    complexity param=0.02452984
##   predicted class=0  expected loss=0.07240076  P(node) =0.2067375
##     class counts:  4907   383
##    probabilities: 0.928 0.072 
##   left son=10 (3354 obs) right son=11 (1936 obs)
##   Primary splits:
##       Attr27 < 1.0641       to the left,  improve=31.404890, (0 missing)
##       Attr6  < 0.000283725  to the left,  improve= 8.278168, (0 missing)
##       Attr46 < 0.22588      to the right, improve= 7.193995, (0 missing)
##       Attr24 < -0.08189     to the left,  improve= 6.509649, (0 missing)
##       Attr22 < 0.0100565    to the left,  improve= 4.900454, (0 missing)
##   Surrogate splits:
##       Attr48 < 0.011722     to the left,  agree=0.701, adj=0.183, (0 split)
##       Attr11 < 0.0393115    to the left,  agree=0.698, adj=0.175, (0 split)
##       Attr7  < 0.0237045    to the left,  agree=0.697, adj=0.173, (0 split)
##       Attr14 < 0.0237045    to the left,  agree=0.697, adj=0.173, (0 split)
##       Attr18 < 0.0237045    to the left,  agree=0.697, adj=0.173, (0 split)
## 
## Node number 6: 3283 observations,    complexity param=0.02718724
##   predicted class=0  expected loss=0.1261042  P(node) =0.1283023
##     class counts:  2869   414
##    probabilities: 0.874 0.126 
##   left son=12 (2666 obs) right son=13 (617 obs)
##   Primary splits:
##       Attr27 < 0.988425     to the left,  improve=28.29535, (0 missing)
##       Attr46 < 0.63169      to the right, improve=20.93660, (0 missing)
##       Attr40 < 0.0547355    to the right, improve=16.41326, (0 missing)
##       Attr35 < -0.321925    to the right, improve=12.23396, (0 missing)
##       Attr3  < -0.508195    to the right, improve=11.71989, (0 missing)
##   Surrogate splits:
##       Attr22 < 0.00320755   to the left,  agree=0.904, adj=0.489, (0 split)
##       Attr42 < 0.0014794    to the left,  agree=0.900, adj=0.470, (0 split)
##       Attr7  < 0.00135305   to the left,  agree=0.892, adj=0.428, (0 split)
##       Attr14 < 0.00135305   to the left,  agree=0.892, adj=0.428, (0 split)
##       Attr18 < 0.00135305   to the left,  agree=0.892, adj=0.428, (0 split)
## 
## Node number 7: 110 observations
##   predicted class=1  expected loss=0.3818182  P(node) =0.00429889
##     class counts:    42    68
##    probabilities: 0.382 0.618 
## 
## Node number 10: 3354 observations
##   predicted class=0  expected loss=0.03100775  P(node) =0.1310771
##     class counts:  3250   104
##    probabilities: 0.969 0.031 
## 
## Node number 11: 1936 observations,    complexity param=0.02452984
##   predicted class=0  expected loss=0.1441116  P(node) =0.07566047
##     class counts:  1657   279
##    probabilities: 0.856 0.144 
##   left son=22 (1697 obs) right son=23 (239 obs)
##   Primary splits:
##       Attr34 < 0.159565     to the right, improve=125.28570, (0 missing)
##       Attr6  < 0.000283725  to the left,  improve= 57.38552, (0 missing)
##       Attr27 < 1.090225     to the right, improve= 48.32148, (0 missing)
##       Attr9  < 1.1032       to the right, improve= 34.39527, (0 missing)
##       Attr29 < 4.00925      to the left,  improve= 31.25859, (0 missing)
##   Surrogate splits:
##       Attr33 < 0.25645      to the right, agree=0.902, adj=0.205, (0 split)
##       Attr56 < 0.967725     to the left,  agree=0.890, adj=0.113, (0 split)
##       Attr52 < 3.9009       to the left,  agree=0.888, adj=0.096, (0 split)
##       Attr30 < 3.4617       to the left,  agree=0.887, adj=0.088, (0 split)
##       Attr32 < 1423.8       to the left,  agree=0.887, adj=0.084, (0 split)
## 
## Node number 12: 2666 observations
##   predicted class=0  expected loss=0.09452363  P(node) =0.1041895
##     class counts:  2414   252
##    probabilities: 0.905 0.095 
## 
## Node number 13: 617 observations,    complexity param=0.02718724
##   predicted class=0  expected loss=0.2625608  P(node) =0.02411287
##     class counts:   455   162
##    probabilities: 0.737 0.263 
##   left son=26 (510 obs) right son=27 (107 obs)
##   Primary splits:
##       Attr34 < 0.034667     to the right, improve=140.79310, (0 missing)
##       Attr42 < -0.017354    to the right, improve= 67.08098, (0 missing)
##       Attr19 < -0.028688    to the right, improve= 64.23393, (0 missing)
##       Attr31 < -0.028688    to the right, improve= 64.23393, (0 missing)
##       Attr23 < -0.022592    to the right, improve= 63.92920, (0 missing)
##   Surrogate splits:
##       Attr55 < -2890.9      to the right, agree=0.904, adj=0.449, (0 split)
##       Attr29 < 4.3103       to the left,  agree=0.887, adj=0.346, (0 split)
##       Attr41 < -0.10932     to the right, agree=0.865, adj=0.224, (0 split)
##       Attr15 < -1189.6      to the right, agree=0.864, adj=0.215, (0 split)
##       Attr19 < -0.11616     to the right, agree=0.851, adj=0.140, (0 split)
## 
## Node number 22: 1697 observations
##   predicted class=0  expected loss=0.07660577  P(node) =0.06632015
##     class counts:  1567   130
##    probabilities: 0.923 0.077 
## 
## Node number 23: 239 observations,    complexity param=0.02452984
##   predicted class=1  expected loss=0.376569  P(node) =0.009340316
##     class counts:    90   149
##    probabilities: 0.377 0.623 
##   left son=46 (101 obs) right son=47 (138 obs)
##   Primary splits:
##       Attr56 < 0.05258875   to the right, improve=63.31228, (0 missing)
##       Attr58 < 0.9511       to the left,  improve=62.93555, (0 missing)
##       Attr33 < 0.96864      to the left,  improve=43.04750, (0 missing)
##       Attr36 < 0.887245     to the left,  improve=39.93445, (0 missing)
##       Attr27 < 1.092825     to the right, improve=36.18642, (0 missing)
##   Surrogate splits:
##       Attr58 < 0.9511       to the left,  agree=0.979, adj=0.950, (0 split)
##       Attr9  < 1.0591       to the right, agree=0.858, adj=0.663, (0 split)
##       Attr36 < 0.8505       to the left,  agree=0.791, adj=0.505, (0 split)
##       Attr33 < 2.01405      to the left,  agree=0.778, adj=0.475, (0 split)
##       Attr39 < 0.0570895    to the right, agree=0.778, adj=0.475, (0 split)
## 
## Node number 26: 510 observations
##   predicted class=0  expected loss=0.1078431  P(node) =0.01993122
##     class counts:   455    55
##    probabilities: 0.892 0.108 
## 
## Node number 27: 107 observations
##   predicted class=1  expected loss=0  P(node) =0.004181648
##     class counts:     0   107
##    probabilities: 0.000 1.000 
## 
## Node number 46: 101 observations
##   predicted class=0  expected loss=0.1980198  P(node) =0.003947163
##     class counts:    81    20
##    probabilities: 0.802 0.198 
## 
## Node number 47: 138 observations
##   predicted class=1  expected loss=0.06521739  P(node) =0.005393153
##     class counts:     9   129
##    probabilities: 0.065 0.935
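
For a more readable view of these splits, the fitted tree can also be drawn with the rpart.plot package (a minimal sketch, assuming reg_tree is the rpart object printed above):

# Draw the tree; extra = 104 shows the class, per-class probabilities
# and the percentage of observations falling into each node
rpart.plot(reg_tree, type = 2, extra = 104, tweak = 1.2)
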
# Class predictions from the decision tree on the train and test splits
pred2 = predict(reg_tree, train_data, type = "class")
pred3 = predict(reg_tree, test_data, type = "class")

table(pred2, train_data$target)
##      
## pred2     0     1
##     0 24314   919
##     1    51   304
table(pred3, test_data$target)
##      
## pred3     0     1
##     0 10397   401
##     1    26   141

Error metrics for classification can be accessed through the “confusionMatrix()” function from the caret package

conf_train=confusionMatrix(pred2, train_data$target, positive = "0")
conf_test=confusionMatrix(pred3, test_data$target, positive = "0")
conf_train
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 24314   919
##          1    51   304
##                                           
##                Accuracy : 0.9621          
##                  95% CI : (0.9597, 0.9644)
##     No Information Rate : 0.9522          
##     P-Value [Acc > NIR] : 9.278e-15       
##                                           
##                   Kappa : 0.3718          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9979          
##             Specificity : 0.2486          
##          Pos Pred Value : 0.9636          
##          Neg Pred Value : 0.8563          
##              Prevalence : 0.9522          
##          Detection Rate : 0.9502          
##    Detection Prevalence : 0.9861          
##       Balanced Accuracy : 0.6232          
##                                           
##        'Positive' Class : 0               
## 
conf_test
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10397   401
##          1    26   141
##                                           
##                Accuracy : 0.9611          
##                  95% CI : (0.9573, 0.9646)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : 8.803e-08       
##                                           
##                   Kappa : 0.3834          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9975          
##             Specificity : 0.2601          
##          Pos Pred Value : 0.9629          
##          Neg Pred Value : 0.8443          
##              Prevalence : 0.9506          
##          Detection Rate : 0.9482          
##    Detection Prevalence : 0.9848          
##       Balanced Accuracy : 0.6288          
##                                           
##        'Positive' Class : 0               
## 

Compute the F1 score, since both high precision and high recall matter for this imbalanced problem.
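
A minimal sketch of an F1 computation; f1_score is a hypothetical helper (not from any package), and the bankrupt class "1" is treated as the class of interest. Alternatively, caret's confusionMatrix() reports precision, recall and F1 directly with mode = "prec_recall".

# F1 from a confusion table (rows = predicted, columns = reference)
f1_score <- function(cm, positive = "1") {
  tp        <- cm[positive, positive]
  precision <- tp / sum(cm[positive, ])  # TP / (TP + FP)
  recall    <- tp / sum(cm[, positive])  # TP / (TP + FN)
  2 * precision * recall / (precision + recall)
}

f1_score(conf_train$table)  # decision tree, train split
f1_score(conf_test$table)   # decision tree, test split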

Build KNN model

# Build the kNN classifier (class::knn); try k in {1, 3, 5, 7}; here k = 3
Neigh <- 3
pred <- knn(train_data[, num_var], test_data[, num_var],
            train_data$target, k = Neigh)
a <- table(pred, test_data$target)
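
Since kNN is distance-based, attributes on large scales (e.g. Attr15, Attr32) can dominate the Euclidean distance. If the numeric columns were not already standardized upstream, a sketch of scaling both splits with the training statistics:

# Standardize using the training means/sds, then re-run kNN
train_sc <- scale(train_data[, num_var])
test_sc  <- scale(test_data[, num_var],
                  center = attr(train_sc, "scaled:center"),
                  scale  = attr(train_sc, "scaled:scale"))
pred_sc  <- knn(train_sc, test_sc, train_data$target, k = Neigh)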

Error metrics for classification can be accessed through the “confusionMatrix()” function from the caret package

conf_test=confusionMatrix(pred, test_data$target, positive = "0")
conf_test
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10318   528
##          1   105    14
##                                           
##                Accuracy : 0.9423          
##                  95% CI : (0.9377, 0.9466)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : 1               
##                                           
##                   Kappa : 0.025           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.98993         
##             Specificity : 0.02583         
##          Pos Pred Value : 0.95132         
##          Neg Pred Value : 0.11765         
##              Prevalence : 0.95057         
##          Detection Rate : 0.94099         
##    Detection Prevalence : 0.98915         
##       Balanced Accuracy : 0.50788         
##                                           
##        'Positive' Class : 0               
## 

Compute the F1 score for kNN as well, since high precision and recall remain the priority.
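
Reusing the f1_score sketch from the decision-tree section on the kNN result:

f1_score(conf_test$table)  # conf_test was re-assigned to the kNN matrix above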

Model Building using Random Forest and tuning

# Build the classification model using randomForest;
# keep.forest = TRUE retains the trees for later prediction
model = randomForest(target ~ ., data = train_data,
                     keep.forest = TRUE, ntree = 50)

# Print and understand the model
print(model)
## 
## Call:
##  randomForest(formula = target ~ ., data = train_data, keep.forest = TRUE,      ntree = 50) 
##                Type of random forest: classification
##                      Number of trees: 50
## No. of variables tried at each split: 8
## 
##         OOB estimate of  error rate: 4.13%
## Confusion matrix:
##       0   1 class.error
## 0 24280  85 0.003488611
## 1   971 252 0.793949305

Important attributes

model$importance  
##        MeanDecreaseGini
## ID             40.18912
## Attr1          26.56679
## Attr2          22.61744
## Attr3          29.02045
## Attr4          28.61218
## Attr5          38.50798
## Attr6          45.20829
## Attr7          26.39948
## Attr8          23.70999
## Attr9          49.74943
## Attr10         23.58335
## Attr11         28.47532
## Attr12         23.86368
## Attr13         28.45474
## Attr14         24.83425
## Attr15         32.23372
## Attr16         34.57927
## Attr17         23.54452
## Attr18         21.70739
## Attr19         23.32987
## Attr20         31.25680
## Attr21         29.68573
## Attr22         34.05966
## Attr23         23.97626
## Attr24         61.71602
## Attr25         35.18235
## Attr26         33.61562
## Attr27        138.38957
## Attr28         24.29480
## Attr29         46.60052
## Attr30         28.36964
## Attr31         23.87696
## Attr32         24.83620
## Attr33         31.88914
## Attr34         83.99513
## Attr35         50.06304
## Attr36         27.82057
## Attr37         24.70612
## Attr38         30.13073
## Attr39         48.89836
## Attr40         44.46032
## Attr41         40.83784
## Attr42         41.63294
## Attr43         25.77022
## Attr44         43.71854
## Attr45         28.90882
## Attr46         78.30439
## Attr47         31.79035
## Attr48         30.48163
## Attr49         28.96734
## Attr50         28.56525
## Attr51         26.85638
## Attr52         26.87720
## Attr53         26.97061
## Attr54         27.52999
## Attr55         37.75451
## Attr56         62.71097
## Attr57         28.31858
## Attr58         63.23325
## Attr59         19.47262
## Attr60         34.29337
## Attr61         41.03418
## Attr62         24.06294
## Attr63         25.16699
## Attr64         26.52542
round(importance(model), 2)   
##        MeanDecreaseGini
## ID                40.19
## Attr1             26.57
## Attr2             22.62
## Attr3             29.02
## Attr4             28.61
## Attr5             38.51
## Attr6             45.21
## Attr7             26.40
## Attr8             23.71
## Attr9             49.75
## Attr10            23.58
## Attr11            28.48
## Attr12            23.86
## Attr13            28.45
## Attr14            24.83
## Attr15            32.23
## Attr16            34.58
## Attr17            23.54
## Attr18            21.71
## Attr19            23.33
## Attr20            31.26
## Attr21            29.69
## Attr22            34.06
## Attr23            23.98
## Attr24            61.72
## Attr25            35.18
## Attr26            33.62
## Attr27           138.39
## Attr28            24.29
## Attr29            46.60
## Attr30            28.37
## Attr31            23.88
## Attr32            24.84
## Attr33            31.89
## Attr34            84.00
## Attr35            50.06
## Attr36            27.82
## Attr37            24.71
## Attr38            30.13
## Attr39            48.90
## Attr40            44.46
## Attr41            40.84
## Attr42            41.63
## Attr43            25.77
## Attr44            43.72
## Attr45            28.91
## Attr46            78.30
## Attr47            31.79
## Attr48            30.48
## Attr49            28.97
## Attr50            28.57
## Attr51            26.86
## Attr52            26.88
## Attr53            26.97
## Attr54            27.53
## Attr55            37.75
## Attr56            62.71
## Attr57            28.32
## Attr58            63.23
## Attr59            19.47
## Attr60            34.29
## Attr61            41.03
## Attr62            24.06
## Attr63            25.17
## Attr64            26.53

Extract and store important variables obtained from the random forest model

rf_Imp_Attr = data.frame(model$importance)
rf_Imp_Attr = data.frame(row.names(rf_Imp_Attr),rf_Imp_Attr[,1])
colnames(rf_Imp_Attr) = c('Attributes', 'Importance')
rf_Imp_Attr = rf_Imp_Attr[order(rf_Imp_Attr$Importance, decreasing = TRUE),]
rf_Imp_Attr
##    Attributes Importance
## 28     Attr27  138.38957
## 35     Attr34   83.99513
## 47     Attr46   78.30439
## 59     Attr58   63.23325
## 57     Attr56   62.71097
## 25     Attr24   61.71602
## 36     Attr35   50.06304
## 10      Attr9   49.74943
## 40     Attr39   48.89836
## 30     Attr29   46.60052
## 7       Attr6   45.20829
## 41     Attr40   44.46032
## 45     Attr44   43.71854
## 43     Attr42   41.63294
## 62     Attr61   41.03418
## 42     Attr41   40.83784
## 1          ID   40.18912
## 6       Attr5   38.50798
## 56     Attr55   37.75451
## 26     Attr25   35.18235
## 17     Attr16   34.57927
## 61     Attr60   34.29337
## 23     Attr22   34.05966
## 27     Attr26   33.61562
## 16     Attr15   32.23372
## 34     Attr33   31.88914
## 48     Attr47   31.79035
## 21     Attr20   31.25680
## 49     Attr48   30.48163
## 39     Attr38   30.13073
## 22     Attr21   29.68573
## 4       Attr3   29.02045
## 50     Attr49   28.96734
## 46     Attr45   28.90882
## 5       Attr4   28.61218
## 51     Attr50   28.56525
## 12     Attr11   28.47532
## 14     Attr13   28.45474
## 31     Attr30   28.36964
## 58     Attr57   28.31858
## 37     Attr36   27.82057
## 55     Attr54   27.52999
## 54     Attr53   26.97061
## 53     Attr52   26.87720
## 52     Attr51   26.85638
## 2       Attr1   26.56679
## 65     Attr64   26.52542
## 8       Attr7   26.39948
## 44     Attr43   25.77022
## 64     Attr63   25.16699
## 33     Attr32   24.83620
## 15     Attr14   24.83425
## 38     Attr37   24.70612
## 29     Attr28   24.29480
## 63     Attr62   24.06294
## 24     Attr23   23.97626
## 32     Attr31   23.87696
## 13     Attr12   23.86368
## 9       Attr8   23.70999
## 11     Attr10   23.58335
## 18     Attr17   23.54452
## 20     Attr19   23.32987
## 3       Attr2   22.61744
## 19     Attr18   21.70739
## 60     Attr59   19.47262

Plot the variable importance (varImpPlot() draws the same rankings graphically). Note that ID shows non-trivial importance, which hints at identifier leakage; dropping ID before training would be safer.

varImpPlot(model)

Predict on Train data

pred_Train_rd = predict(model, 
                     train_data[,setdiff(names(train_data), "target")],
                     type="response", 
                     norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Train = table("actual"= train_data$target, "predicted" = pred_Train_rd);
accu_Train= sum(diag(cm_Train))/sum(cm_Train)
#rm(pred_Train_rd, cm_Train)

Predict on Test Data

pred_Test_rd = predict(model, test_data[,setdiff(names(test_data),
                                              "target")],
                    type="response", 
                    norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Test = table("actual"=test_data$target, "predicted"=pred_Test_rd);
accu_Test= sum(diag(cm_Test))/sum(cm_Test)
#rm(pred_Test, cm_Test)

Check the accuracy on train and test; the near-perfect train accuracy against roughly 0.96 on test suggests the forest overfits the training split.

accu_Train
## [1] 0.9996874
accu_Test
## [1] 0.9605107

Build a random forest using the top 9 important attributes.

top_Imp_Attr = as.character(rf_Imp_Attr$Attributes[1:9])
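
As a sanity check, these are the nine attributes selected (read off the sorted importance table above):

top_Imp_Attr
## [1] "Attr27" "Attr34" "Attr46" "Attr58" "Attr56" "Attr24" "Attr35" "Attr9"  "Attr39"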

Build the classification model using randomForest

model_Imp = randomForest(target~.,
                         data=train_data[,c(top_Imp_Attr,"target")], 
                         keep.forest=TRUE,ntree=50) 

Important attributes

model_Imp$importance
##        MeanDecreaseGini
## Attr27         328.5925
## Attr34         385.3800
## Attr46         251.1152
## Attr58         252.9222
## Attr56         241.1253
## Attr24         228.2003
## Attr35         203.9537
## Attr9          236.9829
## Attr39         196.9732

Predict on Train data

pred_Train_rd_attr = predict(model_Imp, train_data[,top_Imp_Attr],
                     type="response", norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Train = table("actual" = train_data$target, 
                 "predicted" = pred_Train_rd_attr);
accu_Train_Imp = sum(diag(cm_Train))/sum(cm_Train)
#rm(pred_Train, cm_Train)

Predict on Test Data

pred_Test_rd_attr = predict(model_Imp, test_data[,top_Imp_Attr],
                    type="response", norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Test = table("actual" = test_data$target, 
                "predicted" = pred_Test_rd_attr);
accu_Test_Imp = sum(diag(cm_Test))/sum(cm_Test)
#rm(pred_Test, cm_Test)

Tune the random forest: tuneRF() searches over mtry, the number of variables tried at each split

top_Imp_Attr = as.character(rf_Imp_Attr$Attributes[1:9])
set.seed(123)
x <- train_data[,!(names(train_data) %in% c("target"))]
y <- train_data[,(names(train_data) %in% c("target"))]
str(y)
##  Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
tunedmodel <-tuneRF(x, y, ntreeTry = 50, trace=TRUE, plot=TRUE, doBest = TRUE)
## mtry = 8  OOB error = 4.01% 
## Searching left ...
## mtry = 4     OOB error = 4.61% 
## -0.1489776 0.05 
## Searching right ...
## mtry = 16    OOB error = 3.6% 
## 0.104187 0.05 
## mtry = 32    OOB error = 3.22% 
## 0.1043478 0.05 
## mtry = 64    OOB error = 3.12% 
## 0.03033981 0.05
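
With doBest = TRUE, tuneRF() refits and returns a forest at the winning mtry (64 here, with the lowest OOB error of 3.12%); a quick inspection sketch:

tunedmodel$mtry    # mtry selected by the search
print(tunedmodel)  # OOB confusion matrix of the refit forest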

Predict on Train data

pred_Train_rd_tune = predict(tunedmodel, train_data,
                     type="response", norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Train = table("actual" = train_data$target, 
                 "predicted" = pred_Train_rd_tune);
accu_Train = sum(diag(cm_Train))/sum(cm_Train)
#rm(pred_Train, cm_Train)

Predict on Test Data

pred_Test_rd_tune = predict(tunedmodel, test_data,
                    type="response", norm.votes=TRUE)

Build confusion matrix and find accuracy

cm_Test = table("actual" = test_data$target, 
                "predicted" = pred_Test_rd_tune);
accu_Test = sum(diag(cm_Test))/sum(cm_Test)
#rm(pred_Test, cm_Test)

Get the accuracy on train and test

accu_Train
## [1] 1
accu_Test
## [1] 0.9690834

Error metrics for classification can be accessed through the “confusionMatrix()” function from the caret package

conf_train_rd=confusionMatrix(pred_Train_rd, train_data$target, positive = "0")
conf_test_rd=confusionMatrix(pred_Test_rd, test_data$target, positive = "0")
conf_train_rd_attr=confusionMatrix(pred_Train_rd_attr, train_data$target, positive = "0")
conf_test_rd_attr=confusionMatrix(pred_Test_rd_attr, test_data$target, positive = "0")
conf_train_rd_tune=confusionMatrix(pred_Train_rd_tune, train_data$target, positive = "0")
conf_test_rd_tune=confusionMatrix(pred_Test_rd_tune, test_data$target, positive = "0")
conf_train_rd
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 24365     8
##          1     0  1215
##                                           
##                Accuracy : 0.9997          
##                  95% CI : (0.9994, 0.9999)
##     No Information Rate : 0.9522          
##     P-Value [Acc > NIR] : < 2e-16         
##                                           
##                   Kappa : 0.9966          
##  Mcnemar's Test P-Value : 0.01333         
##                                           
##             Sensitivity : 1.0000          
##             Specificity : 0.9935          
##          Pos Pred Value : 0.9997          
##          Neg Pred Value : 1.0000          
##              Prevalence : 0.9522          
##          Detection Rate : 0.9522          
##    Detection Prevalence : 0.9525          
##       Balanced Accuracy : 0.9967          
##                                           
##        'Positive' Class : 0               
## 
conf_test_rd
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10393   403
##          1    30   139
##                                           
##                Accuracy : 0.9605          
##                  95% CI : (0.9567, 0.9641)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : 3.93e-07        
##                                           
##                   Kappa : 0.3763          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9971          
##             Specificity : 0.2565          
##          Pos Pred Value : 0.9627          
##          Neg Pred Value : 0.8225          
##              Prevalence : 0.9506          
##          Detection Rate : 0.9478          
##    Detection Prevalence : 0.9846          
##       Balanced Accuracy : 0.6268          
##                                           
##        'Positive' Class : 0               
## 
conf_train_rd_attr
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 24365     4
##          1     0  1219
##                                      
##                Accuracy : 0.9998     
##                  95% CI : (0.9996, 1)
##     No Information Rate : 0.9522     
##     P-Value [Acc > NIR] : <2e-16     
##                                      
##                   Kappa : 0.9983     
##  Mcnemar's Test P-Value : 0.1336     
##                                      
##             Sensitivity : 1.0000     
##             Specificity : 0.9967     
##          Pos Pred Value : 0.9998     
##          Neg Pred Value : 1.0000     
##              Prevalence : 0.9522     
##          Detection Rate : 0.9522     
##    Detection Prevalence : 0.9524     
##       Balanced Accuracy : 0.9984     
##                                      
##        'Positive' Class : 0          
## 
conf_test_rd_attr
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10394   352
##          1    29   190
##                                           
##                Accuracy : 0.9653          
##                  95% CI : (0.9617, 0.9686)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : 4.86e-14        
##                                           
##                   Kappa : 0.4847          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9972          
##             Specificity : 0.3506          
##          Pos Pred Value : 0.9672          
##          Neg Pred Value : 0.8676          
##              Prevalence : 0.9506          
##          Detection Rate : 0.9479          
##    Detection Prevalence : 0.9800          
##       Balanced Accuracy : 0.6739          
##                                           
##        'Positive' Class : 0               
## 
conf_train_rd_tune
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 24365     0
##          1     0  1223
##                                      
##                Accuracy : 1          
##                  95% CI : (0.9999, 1)
##     No Information Rate : 0.9522     
##     P-Value [Acc > NIR] : < 2.2e-16  
##                                      
##                   Kappa : 1          
##  Mcnemar's Test P-Value : NA         
##                                      
##             Sensitivity : 1.0000     
##             Specificity : 1.0000     
##          Pos Pred Value : 1.0000     
##          Neg Pred Value : 1.0000     
##              Prevalence : 0.9522     
##          Detection Rate : 0.9522     
##    Detection Prevalence : 0.9522     
##       Balanced Accuracy : 1.0000     
##                                      
##        'Positive' Class : 0          
## 
conf_test_rd_tune
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction     0     1
##          0 10396   312
##          1    27   230
##                                           
##                Accuracy : 0.9691          
##                  95% CI : (0.9657, 0.9722)
##     No Information Rate : 0.9506          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5618          
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9974          
##             Specificity : 0.4244          
##          Pos Pred Value : 0.9709          
##          Neg Pred Value : 0.8949          
##              Prevalence : 0.9506          
##          Detection Rate : 0.9481          
##    Detection Prevalence : 0.9766          
##       Balanced Accuracy : 0.7109          
##                                           
##        'Positive' Class : 0               
## 

Finally, compute the F1 score for the random-forest variants; high precision and recall remain the key criteria.
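
Test-set F1 for each forest, reusing the f1_score sketch defined earlier:

f1_score(conf_test_rd$table)       # full forest, all attributes
f1_score(conf_test_rd_attr$table)  # forest on the top-9 attributes
f1_score(conf_test_rd_tune$table)  # tuned forest (best test accuracy)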

Write the final prediction to broto_submission_bankruptcy.csv

# Submit the tuned forest's predictions (best test accuracy among the models)
index_value <- data.frame(index = test_data$ID, prediction = pred_Test_rd_tune)
write.csv(index_value, "broto_submission_bankruptcy.csv", na = "", row.names = FALSE)