Auto Finance Limited

Sameer Mathur

Importing and Reviewing the Data

REVIEW OF THE DATA

List of Data Columns

DEFAULT

  • Defaulter Flag

      1: Customer has delayed paying at least once

      0: Otherwise

  • Defaulter Type

      0: Never Delayed Paying (Good Customer)

      1: At least one delay, but always paid before 90 days (OK Customer)

      2: At least one delay and did not pay even after 90 days (Bad Customer)

DEMOGRAPHIC VARIABLES

  • Gender

  • AGE

  • Education: QUALHSC, QUAL_PG

  • Income: Monthly Income in Thousands (MTHINCTH), Owns a Fridge (FRICODE), Owns a Washing Machine (WASHCODE)

  • Profession: PROFBUS

  • No. of Dependents: NOOFDEPE

  • Region

LOAN-RELATED VARIABLES

  • Loan Tenor (TENORYR)

  • Fraction of Loan in Down Payment (DWNPMFR)

  • Gave Post-Dated Cheques (FULLPDC)

  • Branch

Importing Data

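library(data.table)
# read the CSV file into a data.table
autoF.dt <- fread("AutoFinanaceData.csv")
# attach the columns so they can be referenced by name (e.g. AGE);
# attach() works on a copy, so later column changes may not be visible through these names
attach(autoF.dt)
# dimensions of the dataset (rows, columns)
dim(autoF.dt)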
[1] 28906    21

Data Structure

# structure of the data table
str(autoF.dt)
Classes 'data.table' and 'data.frame':  28906 obs. of  21 variables:
 $ Agmt No       : chr  "AP18100057" "AP18100140" "AP18100198" "AP18100217" ...
 $ ContractStatus: chr  "Closed" "Closed" "Closed" "Closed" ...
 $ StartDate     : chr  "19-01-01" "10-05-01" "05-08-01" "03-09-01" ...
 $ AGE           : int  26 28 32 31 36 33 41 47 43 27 ...
 $ NOOFDEPE      : int  2 2 2 0 2 2 2 0 0 0 ...
 $ MTHINCTH      : num  4.5 5.59 8.8 5 12 ...
 $ SALDATFR      : num  1 1 1 1 1 1 1 1 0.97 1 ...
 $ TENORYR       : num  1.5 2 1 1 1 2 1 2 1.5 2 ...
 $ DWNPMFR       : num  0.27 0.25 0.51 0.66 0.17 0.18 0.37 0.42 0.27 0.47 ...
 $ PROFBUS       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ QUALHSC       : int  0 0 0 0 0 0 1 0 0 0 ...
 $ QUAL_PG       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ SEXCODE       : int  1 1 1 1 1 1 1 1 1 1 ...
 $ FULLPDC       : int  1 1 1 1 1 0 0 1 1 1 ...
 $ FRICODE       : int  0 1 1 1 1 0 0 0 0 0 ...
 $ WASHCODE      : int  0 0 1 1 0 0 0 0 0 0 ...
 $ Region        : chr  "AP2" "AP2" "AP2" "AP2" ...
 $ Branch        : chr  "Vizag" "Vizag" "Vizag" "Vizag" ...
 $ DefaulterFlag : int  0 0 0 0 0 0 0 0 0 0 ...
 $ DefaulterType : int  0 0 0 0 0 0 0 0 0 0 ...
 $ DATASET       : chr  "" "BUILD" "BUILD" "BUILD" ...
 - attr(*, ".internal.selfref")=<externalptr> 

Convert categorical variables to `factor`

# convert 'PROFBUS' to a factor
autoF.dt[, PROFBUS := factor(PROFBUS)]
# convert 'QUALHSC' to a factor
autoF.dt[, QUALHSC := factor(QUALHSC)]
# convert 'QUAL_PG' to a factor
autoF.dt[, QUAL_PG := factor(QUAL_PG)]
# convert 'SEXCODE' to a factor
autoF.dt[, SEXCODE := factor(SEXCODE)]
# convert 'FULLPDC' to a factor
autoF.dt[, FULLPDC := factor(FULLPDC)]
# convert 'FRICODE' to a factor
autoF.dt[, FRICODE := factor(FRICODE)]
# convert 'WASHCODE' to a factor
autoF.dt[, WASHCODE := factor(WASHCODE)]
# convert 'DefaulterFlag' to a factor
autoF.dt[, DefaulterFlag := factor(DefaulterFlag)]
# convert 'DefaulterType' to a factor
autoF.dt[, DefaulterType := factor(DefaulterType)]
# convert 'Region' to a factor
autoF.dt[, Region := factor(Region)]
# convert 'Branch' to a factor
autoF.dt[, Branch := factor(Branch)]
# verify conversion
str(autoF.dt)
Classes 'data.table' and 'data.frame':  28906 obs. of  21 variables:
 $ Agmt No       : chr  "AP18100057" "AP18100140" "AP18100198" "AP18100217" ...
 $ ContractStatus: chr  "Closed" "Closed" "Closed" "Closed" ...
 $ StartDate     : chr  "19-01-01" "10-05-01" "05-08-01" "03-09-01" ...
 $ AGE           : int  26 28 32 31 36 33 41 47 43 27 ...
 $ NOOFDEPE      : int  2 2 2 0 2 2 2 0 0 0 ...
 $ MTHINCTH      : num  4.5 5.59 8.8 5 12 ...
 $ SALDATFR      : num  1 1 1 1 1 1 1 1 0.97 1 ...
 $ TENORYR       : num  1.5 2 1 1 1 2 1 2 1.5 2 ...
 $ DWNPMFR       : num  0.27 0.25 0.51 0.66 0.17 0.18 0.37 0.42 0.27 0.47 ...
 $ PROFBUS       : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ QUALHSC       : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 2 1 1 1 ...
 $ QUAL_PG       : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ SEXCODE       : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
 $ FULLPDC       : Factor w/ 2 levels "0","1": 2 2 2 2 2 1 1 2 2 2 ...
 $ FRICODE       : Factor w/ 2 levels "0","1": 1 2 2 2 2 1 1 1 1 1 ...
 $ WASHCODE      : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 1 1 1 ...
 $ Region        : Factor w/ 8 levels "AP1","AP2","Chennai",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Branch        : Factor w/ 14 levels "Bangalore","Chennai",..: 14 14 14 14 14 14 14 14 14 14 ...
 $ DefaulterFlag : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ DefaulterType : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ DATASET       : chr  "" "BUILD" "BUILD" "BUILD" ...
 - attr(*, ".internal.selfref")=<externalptr> 
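The column-by-column conversions above can also be written as one update over a vector of column names. The following is only a sketch of that idea on a small toy table: `toy.dt` and its values are hypothetical stand-ins for `autoF.dt`, and `cat.cols` is a name introduced here for illustration.

```r
library(data.table)

# Toy stand-in for autoF.dt (hypothetical values, same column names)
toy.dt <- data.table(
  PROFBUS = c(0L, 1L, 0L),
  SEXCODE = c(1L, 1L, 0L),
  Region  = c("AP2", "AP1", "AP2"),
  AGE     = c(26L, 28L, 32L)
)

# columns to treat as categorical
cat.cols <- c("PROFBUS", "SEXCODE", "Region")

# convert all of them to factor in a single by-reference update
toy.dt[, (cat.cols) := lapply(.SD, factor), .SDcols = cat.cols]

# verify conversion: the listed columns are factors, AGE stays integer
str(toy.dt)
```

Keeping the names in one vector makes it easy to add or drop a column later without repeating the conversion line.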

DEFAULT

Defaulters (based on DefaulterFlag)

DefaulterFlag
    0     1 
28.82 71.18 
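The slides do not show the code behind this table. A percentage table of this shape can be produced with `table()` and `prop.table()`; the sketch below uses a short hypothetical vector `flag` rather than the real `DefaulterFlag` column, so its numbers differ from those above.

```r
# Toy flag vector (hypothetical values, not the real data)
flag <- factor(c(0, 1, 1, 1, 0, 1, 1, 1, 1, 0))

# counts -> proportions -> percentages rounded to two decimals
flag.pct <- round(prop.table(table(DefaulterFlag = flag)) * 100, 2)
flag.pct
```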

Pie Chart

[Figure: pie chart of defaulters by DefaulterFlag]

Defaulters (based on DefaulterType)

DefaulterType
    0     1     2 
28.82 57.65 13.53 

Pie Chart

[Figure: pie chart of defaulters by DefaulterType]
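The plotting code is not shown in the slides. As a rough sketch, a comparable chart can be drawn with base R's `pie()` from the DefaulterType percentages reported above; the slice labels (Good, OK, Bad) follow the variable description, while `type.pct`, `type.lbl`, and the title text are illustrative choices.

```r
# percentages as reported in the DefaulterType table
type.pct <- c("0" = 28.82, "1" = 57.65, "2" = 13.53)

# slice labels combining the documented meaning of each code with its share
type.lbl <- paste0(c("Good", "OK", "Bad"), " (", type.pct, "%)")

# draw the pie chart
pie(type.pct, labels = type.lbl, main = "Defaulters by DefaulterType")
```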

DEMOGRAPHIC VARIABLES

Defaulters based on Gender

             SEXCODE
DefaulterFlag     0     1
            0  9.17 90.83
            1  7.06 92.94
             SEXCODE
DefaulterFlag      0      1
          0    34.48  28.35
          1    65.52  71.65
          Sum 100.00 100.00
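The two tables above answer different questions. The first holds row percentages: within each DefaulterFlag group, the share of each gender (for example, 90.83% of non-defaulters have SEXCODE = 1). The second holds column percentages: within each gender, the share of defaulters (65.52% for SEXCODE = 0 against 71.65% for SEXCODE = 1). The slides do not show the code; the sketch below illustrates the distinction on hypothetical toy vectors `flag` and `sex`, using the `margin` argument of `prop.table()`.

```r
# Toy data (hypothetical values) with the same two variables
flag <- factor(c(0, 0, 1, 1, 1, 1, 0, 1))
sex  <- factor(c(0, 1, 1, 1, 0, 1, 1, 1))

tab <- table(DefaulterFlag = flag, SEXCODE = sex)

# row percentages: gender mix within each DefaulterFlag group
row.pct <- round(prop.table(tab, margin = 1) * 100, 2)

# column percentages: default rate within each gender, plus a Sum row
col.pct <- addmargins(round(prop.table(tab, margin = 2) * 100, 2), margin = 1)

row.pct
col.pct
```

The column-percentage form is usually the one to read when asking whether one group defaults more often than another.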

SEXCODE = 1 (Male)

SEXCODE = 0 (Female)

Barplot

[Figure: bar plot of defaulters by gender]

Defaulters based on their Profession

             PROFBUS
DefaulterFlag     0     1
            0 84.59 15.41
            1 85.39 14.61
             PROFBUS
DefaulterFlag      0      1
          0    28.63  29.93
          1    71.37  70.07
          Sum 100.00 100.00

PROFBUS = 1 (BUSINESS)

PROFBUS = 0 (PROFESSIONAL)

Barplot

[Figure: bar plot of defaulters by profession]

Education

Defaulters based on their Education (QUALHSC)

             QUALHSC
DefaulterFlag     0     1
            0 79.05 20.95
            1 75.92 24.08

Barplot

[Figure: bar plot of defaulters by education (QUALHSC)]

Defaulters based on their Education (QUAL_PG)

             QUAL_PG
DefaulterFlag     0     1
            0 94.48  5.52
            1 96.56  3.44

Income

Defaulters based on Monthly Income

   DefaulterFlag AvgIncome
1:             0       9.5
2:             1       8.7
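A grouped summary like the one above is typically a single `data.table` expression with `by`. The sketch below assumes the average is the mean of `MTHINCTH` rounded to one decimal, which the slides do not state explicitly; `toy.dt` and its values are hypothetical.

```r
library(data.table)

# Toy stand-in for autoF.dt (hypothetical values)
toy.dt <- data.table(
  DefaulterFlag = factor(c(0, 0, 1, 1, 1)),
  MTHINCTH      = c(9.0, 10.0, 8.0, 9.0, 10.0)
)

# mean monthly income (in thousands) per DefaulterFlag group
avg.dt <- toy.dt[, .(AvgIncome = round(mean(MTHINCTH), 1)), by = DefaulterFlag]
avg.dt
```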

Defaulters based on whether they gave post-dated cheques (FULLPDC)

             FULLPDC
DefaulterFlag     0     1
            0 38.40 61.60
            1 70.03 29.97

Defaulters based on whether they own a Fridge

             FRICODE
DefaulterFlag     0     1
            0 49.51 50.49
            1 61.31 38.69

Defaulters based on whether they own a Washing Machine

             WASHCODE
DefaulterFlag     0     1
            0 74.78 25.22
            1 83.52 16.48