library(tidyverse)  # attaches dplyr and ggplot2, among other packages

app_data <- read.csv("application_record.csv")
credit_data <- read.csv("credit_record.csv") 

Level 1: Understanding the Data (Basic Exploration)

Question 1.1: What is the structure (rows, columns, and data types) of application_record and credit_record?

cat("--- Application Record Structure ---")
## --- Application Record Structure ---
colnames(app_data)
##  [1] "ID"                  "CODE_GENDER"         "FLAG_OWN_CAR"       
##  [4] "FLAG_OWN_REALTY"     "CNT_CHILDREN"        "AMT_INCOME_TOTAL"   
##  [7] "NAME_INCOME_TYPE"    "NAME_EDUCATION_TYPE" "NAME_FAMILY_STATUS" 
## [10] "NAME_HOUSING_TYPE"   "DAYS_BIRTH"          "DAYS_EMPLOYED"      
## [13] "FLAG_MOBIL"          "FLAG_WORK_PHONE"     "FLAG_PHONE"         
## [16] "FLAG_EMAIL"          "OCCUPATION_TYPE"     "CNT_FAM_MEMBERS"
cat("--- Credit Record Structure ---")
## --- Credit Record Structure ---
colnames(credit_data)
## [1] "ID"             "MONTHS_BALANCE" "STATUS"
str(app_data)
## 'data.frame':    438557 obs. of  18 variables:
##  $ ID                 : int  5008804 5008805 5008806 5008808 5008809 5008810 5008811 5008812 5008813 5008814 ...
##  $ CODE_GENDER        : chr  "M" "M" "M" "F" ...
##  $ FLAG_OWN_CAR       : chr  "Y" "Y" "Y" "N" ...
##  $ FLAG_OWN_REALTY    : chr  "Y" "Y" "Y" "Y" ...
##  $ CNT_CHILDREN       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ AMT_INCOME_TOTAL   : num  427500 427500 112500 270000 270000 ...
##  $ NAME_INCOME_TYPE   : chr  "Working" "Working" "Working" "Commercial associate" ...
##  $ NAME_EDUCATION_TYPE: chr  "Higher education" "Higher education" "Secondary / secondary special" "Secondary / secondary special" ...
##  $ NAME_FAMILY_STATUS : chr  "Civil marriage" "Civil marriage" "Married" "Single / not married" ...
##  $ NAME_HOUSING_TYPE  : chr  "Rented apartment" "Rented apartment" "House / apartment" "House / apartment" ...
##  $ DAYS_BIRTH         : int  -12005 -12005 -21474 -19110 -19110 -19110 -19110 -22464 -22464 -22464 ...
##  $ DAYS_EMPLOYED      : int  -4542 -4542 -1134 -3051 -3051 -3051 -3051 365243 365243 365243 ...
##  $ FLAG_MOBIL         : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ FLAG_WORK_PHONE    : int  1 1 0 0 0 0 0 0 0 0 ...
##  $ FLAG_PHONE         : int  0 0 0 1 1 1 1 0 0 0 ...
##  $ FLAG_EMAIL         : int  0 0 0 1 1 1 1 0 0 0 ...
##  $ OCCUPATION_TYPE    : chr  "" "" "Security staff" "Sales staff" ...
##  $ CNT_FAM_MEMBERS    : num  2 2 2 1 1 1 1 1 1 1 ...
str(credit_data)
## 'data.frame':    1048575 obs. of  3 variables:
##  $ ID            : int  5001711 5001711 5001711 5001711 5001712 5001712 5001712 5001712 5001712 5001712 ...
##  $ MONTHS_BALANCE: int  0 -1 -2 -3 0 -1 -2 -3 -4 -5 ...
##  $ STATUS        : chr  "X" "0" "0" "0" ...
  • Interpretation: The initial exploration of the datasets reveals a relational structure between the applicant demographics and their credit history.

  • Application Record: This dataset contains 438,557 records with 18 variables. It includes categorical data such as Gender, Education, and Housing type, alongside numerical data like Income and Family member counts.

  • Credit Record: The dataset contains 1,048,575 records with 3 specific variables. It includes numerical data such as MONTHS_BALANCE and categorical data in the STATUS column.

Question 1.2: How many missing values exist in each dataset, and what is the percentage of missing data in OCCUPATION_TYPE?

# Checking for missing values in Application dataset
colSums(is.na(app_data))
##                  ID         CODE_GENDER        FLAG_OWN_CAR     FLAG_OWN_REALTY 
##                   0                   0                   0                   0 
##        CNT_CHILDREN    AMT_INCOME_TOTAL    NAME_INCOME_TYPE NAME_EDUCATION_TYPE 
##                   0                   0                   0                   0 
##  NAME_FAMILY_STATUS   NAME_HOUSING_TYPE          DAYS_BIRTH       DAYS_EMPLOYED 
##                   0                   0                   0                   0 
##          FLAG_MOBIL     FLAG_WORK_PHONE          FLAG_PHONE          FLAG_EMAIL 
##                   0                   0                   0                   0 
##     OCCUPATION_TYPE     CNT_FAM_MEMBERS 
##                   0                   0
# Checking for missing values in Credit dataset
colSums(is.na(credit_data))
##             ID MONTHS_BALANCE         STATUS 
##              0              0              0
# Calculating the percentage of missing values in `OCCUPATION_TYPE`
occ_missing_count <- sum(app_data$OCCUPATION_TYPE == "" | is.na(app_data$OCCUPATION_TYPE))

occ_missing_percentage <- (occ_missing_count/nrow(app_data)) *100

cat("No. of missing values:", occ_missing_count)
## No. of missing values: 134203
cat("\nPercentage of missing values:",occ_missing_percentage)
## 
## Percentage of missing values: 30.60104
  • Interpretation: A missing-value audit is a crucial data-quality step before any advanced analytics or modeling.
  • Application Record: No column contains NA-type missing values. However, OCCUPATION_TYPE contains 134,203 empty strings (""), which is.na() does not detect. Approximately 30.60% of applicants have not specified their occupation.
  • Credit Record: No column contains NA values.
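Because is.na() ignores empty strings, the colSums(is.na(...)) audit reports zero for OCCUPATION_TYPE even though almost a third of its values are blank. One possible way to make the gap visible is to recode blanks as NA before auditing. The sketch below is only an illustration in base R: toy_app and its values are invented stand-ins, not rows from application_record.csv.

```r
# Toy stand-in for app_data; the values are invented for illustration only.
toy_app <- data.frame(
  ID = 1:5,
  OCCUPATION_TYPE = c("", "Managers", "", "Laborers", "Sales staff"),
  stringsAsFactors = FALSE
)

# Recode empty (or whitespace-only) strings as NA so is.na() can see them.
blank <- trimws(toy_app$OCCUPATION_TYPE) == ""
toy_app$OCCUPATION_TYPE[blank] <- NA

# The NA audit now reports the gap, and mean() gives the share directly.
missing_counts <- colSums(is.na(toy_app))
missing_pct <- mean(is.na(toy_app$OCCUPATION_TYPE)) * 100

print(missing_counts)
cat("Percentage missing:", missing_pct, "\n")
```

An alternative, if it suits the workflow, is to treat blanks as missing at load time through the na.strings argument of read.csv(), for example na.strings = c("", "NA").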

Question 1.3: How many unique Customer IDs are present in both datasets, and how many overlap?

# 1. Unique IDs 
unique_app_ids <- n_distinct(app_data$ID)
unique_credit_ids <- n_distinct(credit_data$ID)
cat("Unique IDs in Application Record:", unique_app_ids, "\n")
## Unique IDs in Application Record: 438510
cat("Unique IDs in Credit Record:", unique_credit_ids)
## Unique IDs in Credit Record: 45985
# 2. Overlapping IDs 
common_ids <- length(intersect(app_data$ID, credit_data$ID))
cat("Common IDs :", common_ids)
## Common IDs : 36457
  • Interpretation: Identifying the unique records and their intersection defines the scope of the predictive modeling:
    • The application record holds 438,510 unique applicant profiles.
    • Payment history is available for 45,985 unique customers.
    • Only 36,457 individuals appear in both datasets. This overlap is the actual working population.
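The reported counts also imply how many IDs fall outside the overlap: 438,510 - 36,457 = 402,053 appear only in the application record, and 45,985 - 36,457 = 9,528 appear only in the credit record. A minimal base R sketch of that set arithmetic follows. The ID vectors are invented for illustration and merely stand in for app_data$ID and credit_data$ID.

```r
# Toy ID vectors; invented values standing in for app_data$ID and credit_data$ID.
app_ids <- c(101, 102, 103, 104, 105)
credit_ids <- c(103, 103, 104, 104, 106)  # repeated IDs: one row per month

# IDs present in both tables (intersect() de-duplicates its result).
common <- intersect(app_ids, credit_ids)

# IDs that exist on one side only.
app_only <- setdiff(app_ids, credit_ids)
credit_only <- setdiff(credit_ids, app_ids)

cat("Common:", length(common),
    "| application only:", length(app_only),
    "| credit only:", length(credit_only), "\n")
```

On the real data, the same logical test (ID %in% common) could be used to subset either table to the working population.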

Question 1.4: What are the different categories of the STATUS column, and which one occurs most frequently?

status_counts <- table(credit_data$STATUS)
print(status_counts)
## 
##      0      1      2      3      4      5      C      X 
## 383120  11090    868    320    223   1693 442031 209230
most_common_status <- names(which.max(status_counts))
most_common_value <- max(status_counts)
cat("\nThe most frequent status category:", most_common_status, "\n")
## 
## The most frequent status category: C
cat("Count for the most frequent category:", most_common_value)
## Count for the most frequent category: 442031
  • Interpretation: STATUS is the most important variable in the credit risk analysis because it describes the payment behaviour of each account in a given month. The column has 8 distinct categories: 0, 1, 2, 3, 4, 5, C, and X. The most frequent is C, with 442,031 of 1,048,575 records (about 42.2%).
  • Meaning of status:
    • ‘C’ (paid off that month) and ‘X’ (no loan for the month) indicate healthy accounts.
    • ‘0’ to ‘5’ represent increasing severity of payment delay: ‘0’ means 1-29 days past due, and ‘5’ means more than 150 days overdue.
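The category meanings above suggest two simple derived views: the relative share of each code, and a flag for months that carry any delay code. The following base R sketch shows one way to compute both. The status vector is invented for illustration, and treating ‘C’ and ‘X’ as non-delinquent is an assumption that follows the interpretation given here.

```r
# Toy STATUS values; invented for illustration, not taken from credit_record.csv.
status <- c("C", "0", "X", "C", "1", "0", "5", "C")

# Relative frequency of each category, in percent.
status_share <- round(prop.table(table(status)) * 100, 1)
print(status_share)

# Flag months with any recorded delay (codes "0" to "5").
# Assumption: "C" and "X" count as non-delinquent months.
is_delayed <- status %in% as.character(0:5)
cat("Months with a delay code:", sum(is_delayed), "of", length(status), "\n")
```

Whether code ‘0’ (1-29 days) should count as a delay for risk labelling is a modeling decision; the cut-off can be moved by changing the set passed to %in%.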

Question 1.5: Are there any duplicate IDs in the application_record that need to be removed?

duplicate_ids_count <- sum(duplicated(app_data$ID))
cat("Total duplicate customer IDs found:", duplicate_ids_count)
## Total duplicate customer IDs found: 47
app_data <- app_data %>% distinct(ID, .keep_all = TRUE)
nrow(app_data)
## [1] 438510
  • Interpretation: Duplicate entries can bias statistics and cause “data leakage”, where the model learns from the same person’s information more than once. The audit found 47 duplicate customer IDs in the application dataset. distinct() removed these redundant entries, keeping the first row for each ID, so the record count fell from 438,557 to 438,510 unique entries.

This step ensures a “One ID = One Customer” relationship. Establishing a unique primary key is essential before merging datasets or beginning predictive modeling, because it ensures that each applicant’s profile is counted exactly once.
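Before dropping duplicates, it can help to look at the affected rows, since two rows that share an ID may not agree on the other columns. The base R sketch below shows one way to inspect them and then keep the first row per ID, which mirrors the behaviour of distinct(ID, .keep_all = TRUE). The toy_app table is invented for illustration.

```r
# Toy table with one repeated ID whose rows disagree; values are invented.
toy_app <- data.frame(
  ID = c(1, 2, 2, 3),
  AMT_INCOME_TOTAL = c(100000, 150000, 180000, 90000)
)

# Pull every row whose ID occurs more than once (all copies, not just the later ones).
dup_ids <- unique(toy_app$ID[duplicated(toy_app$ID)])
dup_rows <- toy_app[toy_app$ID %in% dup_ids, ]
print(dup_rows)

# Keep the first row per ID.
deduped <- toy_app[!duplicated(toy_app$ID), ]

# Sanity check: ID is now a unique key.
stopifnot(anyDuplicated(deduped$ID) == 0)
```

If the duplicated rows conflict, as they do in this toy example, keeping the first one is an arbitrary choice and may deserve a documented rule.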

Level 2: Data Extraction & Filtering or Data Pre-processing

Question 2.1: Perform a filtered extraction to identify the top 10 applicants with the highest reported AMT_INCOME_TOTAL for wealth-tier analysis.

top_ten_income <- app_data %>% 
  arrange(desc(AMT_INCOME_TOTAL)) %>% 
  head(10)

print(top_ten_income)
##         ID CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN
## 1  5987963           M            Y               N            0
## 2  5987964           M            Y               N            0
## 3  5987966           M            Y               N            0
## 4  5987967           M            Y               N            0
## 5  5987968           M            Y               N            0
## 6  5987969           M            Y               N            0
## 7  7987964           M            Y               N            0
## 8  6123707           M            Y               Y            0
## 9  6123708           M            Y               Y            0
## 10 6123709           M            Y               Y            0
##    AMT_INCOME_TOTAL NAME_INCOME_TYPE NAME_EDUCATION_TYPE NAME_FAMILY_STATUS
## 1           6750000          Working    Higher education            Married
## 2           6750000          Working    Higher education            Married
## 3           6750000          Working    Higher education            Married
## 4           6750000          Working    Higher education            Married
## 5           6750000          Working    Higher education            Married
## 6           6750000          Working    Higher education            Married
## 7           6750000          Working    Higher education            Married
## 8           4500000          Working    Higher education            Married
## 9           4500000          Working    Higher education            Married
## 10          4500000          Working    Higher education            Married
##    NAME_HOUSING_TYPE DAYS_BIRTH DAYS_EMPLOYED FLAG_MOBIL FLAG_WORK_PHONE
## 1  House / apartment     -19341          -443          1               1
## 2  House / apartment     -19341          -443          1               1
## 3  House / apartment     -19341          -443          1               1
## 4  House / apartment     -19341          -443          1               1
## 5  House / apartment     -19341          -443          1               1
## 6  House / apartment     -19341          -443          1               1
## 7  House / apartment     -19341          -443          1               1
## 8  House / apartment     -18784         -3618          1               1
## 9  House / apartment     -18784         -3618          1               1
## 10 House / apartment     -18784         -3618          1               1
##    FLAG_PHONE FLAG_EMAIL OCCUPATION_TYPE CNT_FAM_MEMBERS
## 1           1          0        Laborers               2
## 2           1          0        Laborers               2
## 3           1          0        Laborers               2
## 4           1          0        Laborers               2
## 5           1          0        Laborers               2
## 6           1          0        Laborers               2
## 7           1          0        Laborers               2
## 8           0          0        Managers               2
## 9           0          0        Managers               2
## 10          0          0        Managers               2
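  • Interpretation: Extracting high-income individuals is a useful step in segmenting the customer base for premium banking services. High-income segments are often prioritized for higher credit limits and premium cards because income is commonly treated as a signal of repayment capacity. Note that the seven rows at 6,750,000 and the three rows at 4,500,000 carry identical values in every column shown, under different IDs, so the list reflects only two distinct profiles.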
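The top-10 output contains runs of rows that match on every attribute except ID. If the goal is a ranking of distinct profiles rather than distinct IDs, one option is to collapse rows that repeat on all non-ID columns before ranking. The sketch below is a base R illustration with an invented toy_app table; whether such rows really represent the same person is an assumption that the data alone cannot confirm.

```r
# Toy table: three IDs share one profile; values are invented for illustration.
toy_app <- data.frame(
  ID = 1:5,
  AMT_INCOME_TOTAL = c(900, 900, 900, 500, 300),
  DAYS_BIRTH = c(-19000, -19000, -19000, -15000, -12000)
)

# Drop rows that repeat an earlier row on every column except ID.
profile_cols <- setdiff(names(toy_app), "ID")
unique_profiles <- toy_app[!duplicated(toy_app[profile_cols]), ]

# Rank the remaining profiles by income, highest first, and keep the top 2.
ranked <- unique_profiles[order(-unique_profiles$AMT_INCOME_TOTAL), ]
top_profiles <- head(ranked, 2)
print(top_profiles)
```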

Question 2.2: Filter the records for high-asset applicants who own both a car and realty while supporting more than 2 children.

# Apply multiple conditions:
# FLAG_OWN_CAR == "Y"     (car owner)
# FLAG_OWN_REALTY == "Y"  (realty owner)
# CNT_CHILDREN > 2        (more than two children)


high_asset_familes <- app_data %>% 
  filter(FLAG_OWN_CAR == "Y" & FLAG_OWN_REALTY == 'Y' & CNT_CHILDREN >2)


cat("Total high asset families found:", nrow(high_asset_familes))
## Total high asset families found: 2069
head(high_asset_familes[, c("ID","FLAG_OWN_CAR", "FLAG_OWN_REALTY", "CNT_CHILDREN", "AMT_INCOME_TOTAL")])
##        ID FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN AMT_INCOME_TOTAL
## 1 5008836            Y               Y            3           270000
## 2 5008837            Y               Y            3           270000
## 3 5021339            Y               Y            3           270000
## 4 5021340            Y               Y            3           270000
## 5 5021341            Y               Y            3           270000
## 6 5021342            Y               Y            3           270000
  • Interpretation: Applicants who own both a car and realty hold assets that could serve as collateral, which is a positive indicator for credit recovery. The 2,069 applicants in this segment may suit family-oriented financial products or high-limit credit lines because of their established asset base.
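To put the segment size in context, the same three conditions can be expressed as a logical vector, which gives both the matching rows and the segment's share of all applicants (2,069 of 438,510 is roughly 0.47%). The base R sketch below uses an invented toy_app table. It also shows that the strict CNT_CHILDREN > 2 test excludes families with exactly two children.

```r
# Toy stand-in for app_data; the values are invented for illustration only.
toy_app <- data.frame(
  ID = 1:6,
  FLAG_OWN_CAR = c("Y", "Y", "N", "Y", "Y", "N"),
  FLAG_OWN_REALTY = c("Y", "Y", "Y", "N", "Y", "Y"),
  CNT_CHILDREN = c(3, 2, 4, 5, 3, 0)
)

# One logical vector that combines all three conditions.
in_segment <- with(
  toy_app,
  FLAG_OWN_CAR == "Y" & FLAG_OWN_REALTY == "Y" & CNT_CHILDREN > 2
)

# Matching rows and the segment's share of all applicants.
segment <- toy_app[in_segment, ]
share_pct <- mean(in_segment) * 100

print(segment)
cat("Segment share (%):", round(share_pct, 2), "\n")
```

Row 2 of the toy table owns both assets but has exactly two children, so it falls outside the segment; using >= 2 instead would change the definition to "two or more".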

Question 2.3: Extract all demographic records for “Pensioners” who are currently living in “Rented apartments” to assess housing stability.

pensioners_rented <- app_data %>% 
  filter(NAME_INCOME_TYPE == 'Pensioner' & NAME_HOUSING_TYPE == "Rented apartment")

pensioners_rented
##          ID CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN
## 1   5009033           F            N               N            0
## 2   5009034           F            N               N            0
## 3   5009035           F            N               N            0
## 4   5009036           F            N               N            0
## 5   5009037           F            N               N            0
## 6   5009038           F            N               N            0
## 7   5009592           F            N               N            0
## 8   5090748           F            N               N            0
## 9   5024876           F            Y               Y            0
## 10  6063350           F            Y               Y            0
## 11  6063351           F            Y               Y            0
## 12  6063352           F            Y               Y            0
## 13  6063353           F            Y               Y            0
## 14  5052757           F            N               Y            0
## 15  5052758           F            N               Y            0
## 16  5087864           F            Y               Y            0
## 17  5087865           F            Y               Y            0
## 18  5087866           F            Y               Y            0
## 19  5090645           F            N               Y            0
## 20  5791715           F            N               Y            0
## 21  5791716           F            N               Y            0
## 22  5791717           F            N               Y            0
## 23  5791718           F            N               Y            0
## 24  5791719           F            N               Y            0
## 25  5791720           F            N               Y            0
## 26  5791721           F            N               Y            0
## 27  5791722           F            N               Y            0
## 28  5791723           F            N               Y            0
## 29  5091910           F            N               Y            0
## 30  5513299           F            N               Y            0
## 31  5115562           M            N               Y            0
## 32  5115563           M            N               Y            0
## 33  5115564           M            N               Y            0
## 34  5115565           M            N               Y            0
## 35  5115566           M            N               Y            0
## 36  5115567           M            N               Y            0
## 37  5115568           M            N               Y            0
## 38  5115571           M            N               Y            0
## 39  5115572           M            N               Y            0
## 40  5115573           M            N               Y            0
## 41  5115574           M            N               Y            0
## 42  5116264           F            N               Y            0
## 43  5116266           F            N               Y            0
## 44  5116267           F            N               Y            0
## 45  5177428           F            N               Y            0
## 46  5177429           F            N               Y            0
## 47  5177872           F            N               Y            0
## 48  5177873           F            N               Y            0
## 49  5279529           F            N               Y            0
## 50  5279530           F            N               Y            0
## 51  5279531           F            N               Y            0
## 52  5279532           F            N               Y            0
## 53  5279533           F            N               Y            0
## 54  5279534           F            N               Y            0
## 55  5279535           F            N               Y            0
## 56  6268956           F            N               Y            0
## 57  5290729           M            N               N            1
## 58  5876688           M            N               N            1
## 59  5876689           M            N               N            1
## 60  5876690           M            N               N            1
## 61  5403340           F            N               Y            0
## 62  5403341           F            N               Y            0
## 63  5403342           F            N               Y            0
## 64  5403343           F            N               Y            0
## 65  5405297           F            N               N            0
## 66  5405298           F            N               N            0
## 67  5565711           F            N               N            0
## 68  5541487           F            N               Y            0
## 69  5541488           F            N               Y            0
## 70  5541490           F            N               Y            0
## 71  5541491           F            N               Y            0
## 72  5541492           F            N               Y            0
## 73  5541493           F            N               Y            0
## 74  5541495           F            N               Y            0
## 75  5541496           F            N               Y            0
## 76  5541497           F            N               Y            0
## 77  5541498           F            N               Y            0
## 78  5541499           F            N               Y            0
## 79  5626559           F            N               N            0
## 80  5626560           F            N               N            0
## 81  5626561           F            N               N            0
## 82  5627319           F            N               Y            0
## 83  5627320           F            N               Y            0
## 84  6007579           F            N               Y            0
## 85  6007580           F            N               Y            0
## 86  5668131           F            N               N            0
## 87  5707096           F            N               Y            0
## 88  5707097           F            N               Y            0
## 89  5707098           F            N               Y            0
## 90  5707099           F            N               Y            0
## 91  5707100           F            N               Y            0
## 92  5707101           F            N               Y            0
## 93  5707102           F            N               Y            0
## 94  5707103           F            N               Y            0
## 95  5707104           F            N               Y            0
## 96  5707105           F            N               Y            0
## 97  5707106           F            N               Y            0
## 98  5707107           F            N               Y            0
## 99  5707108           F            N               Y            0
## 100 5707109           F            N               Y            0
## 101 5713030           F            N               Y            0
## 102 5713819           M            N               N            0
## 103 5739421           F            N               Y            0
## 104 5739422           F            N               Y            0
## 105 6099143           F            N               Y            0
## 106 6099144           F            N               Y            0
## 107 5808141           F            N               Y            0
## 108 5808142           F            N               Y            0
## 109 5808143           F            N               Y            0
## 110 5808144           F            N               Y            0
## 111 5808145           F            N               Y            0
## 112 5808146           F            N               Y            0
## 113 5872918           M            Y               N            0
## 114 6806204           M            Y               N            0
## 115 6806205           M            Y               N            0
## 116 6806206           M            Y               N            0
## 117 6806207           M            Y               N            0
## 118 5877642           F            N               N            0
## 119 5877643           F            N               N            0
## 120 5877644           F            N               N            0
## 121 5888706           F            N               Y            0
## 122 5888708           F            N               Y            0
## 123 5888709           F            N               Y            0
## 124 5929281           F            N               Y            0
## 125 5929282           F            N               Y            0
## 126 5929283           F            N               Y            0
## 127 5944649           M            N               N            3
## 128 5944650           M            N               N            3
## 129 5944651           M            N               N            3
## 130 5944652           M            N               N            3
## 131 5957938           M            N               N            0
## 132 5969574           F            N               Y            0
## 133 6007217           F            Y               Y            0
## 134 6586225           F            Y               Y            0
## 135 6586226           F            Y               Y            0
## 136 6586227           F            Y               Y            0
## 137 6586228           F            Y               Y            0
## 138 6008211           F            N               Y            0
## 139 6008212           F            N               Y            0
## 140 6008213           F            N               Y            0
## 141 6008214           F            N               Y            0
## 142 6014539           F            Y               Y            0
## 143 6014540           F            Y               Y            0
## 144 6037327           M            N               Y            0
## 145 6037328           M            N               Y            0
## 146 6037329           M            N               Y            0
## 147 6037330           M            N               Y            0
## 148 6037331           M            N               Y            0
## 149 6054391           M            Y               Y            0
## 150 6054392           M            Y               Y            0
## 151 6054393           M            Y               Y            0
## 152 6073730           F            N               N            0
## 153 6684305           F            N               N            0
## 154 6684306           F            N               N            0
## 155 6684307           F            N               N            0
## 156 6684308           F            N               N            0
## 157 6088149           F            N               N            0
## 158 6088150           F            N               N            0
## 159 6088151           F            N               N            0
## 160 6088152           F            N               N            0
## 161 6088153           F            N               N            0
## 162 6088154           F            N               N            0
## 163 6088155           F            N               N            0
## 164 6088156           F            N               N            0
## 165 6088157           F            N               N            0
## 166 6088158           F            N               N            0
## 167 6148226           F            N               Y            0
## 168 6148227           F            N               Y            0
## 169 6230408           F            N               Y            0
## 170 6230409           F            N               Y            0
## 171 6230411           F            N               Y            0
## 172 6230412           F            N               Y            0
## 173 6230413           F            N               Y            0
## 174 6230414           F            N               Y            0
## 175 6230417           F            N               Y            0
## 176 6230418           F            N               Y            0
## 177 6230419           F            N               Y            0
## 178 6230422           F            N               Y            0
## 179 6230423           F            N               Y            0
## 180 6230424           F            N               Y            0
## 181 6230425           F            N               Y            0
## 182 6230426           F            N               Y            0
## 183 6230427           F            N               Y            0
## 184 6230428           F            N               Y            0
## 185 6230429           F            N               Y            0
## 186 6240975           F            N               N            1
## 187 6240976           F            N               N            1
## 188 6240977           F            N               N            1
## 189 6264617           F            N               Y            1
## 190 6336049           F            N               Y            1
## 191 6336050           F            N               Y            1
## 192 6336051           F            N               Y            1
## 193 6336053           F            N               Y            1
## 194 6280517           F            N               N            0
## 195 6280518           F            N               N            0
## 196 6375585           F            N               N            0
## 197 6375586           F            N               N            0
## 198 6375587           F            N               N            0
## 199 6375588           F            N               N            0
## 200 6375590           F            N               N            0
## 201 6375591           F            N               N            0
## 202 6375593           F            N               N            0
## 203 6375594           F            N               N            0
## 204 6397321           F            N               Y            0
## 205 6397322           F            N               Y            0
## 206 6410106           F            N               Y            0
## 207 6424262           F            N               N            0
## 208 6424263           F            N               N            0
## 209 6424264           F            N               N            0
## 210 6424265           F            N               N            0
## 211 6424266           F            N               N            0
## 212 6424267           F            N               N            0
## 213 6424268           F            N               N            0
## 214 6424269           F            N               N            0
## 215 6424270           F            N               N            0
## 216 6424271           F            N               N            0
## 217 6424272           F            N               N            0
## 218 6424273           F            N               N            0
## 219 6424274           F            N               N            0
## 220 6424275           F            N               N            0
## 221 6424276           F            N               N            0
## 222 6424277           F            N               N            0
## 223 6424278           F            N               N            0
## 224 6424279           F            N               N            0
## 225 6424280           F            N               N            0
## 226 6425392           F            N               N            0
## 227 6425393           F            N               N            0
## 228 6425394           F            N               N            0
## 229 6830653           F            N               N            0
## 230 6830654           F            N               N            0
## 231 6830655           F            N               N            0
## 232 6830656           F            N               N            0
## 233 6830658           F            N               N            0
## 234 6830659           F            N               N            0
## 235 6830660           F            N               N            0
## 236 6830661           F            N               N            0
## 237 6830662           F            N               N            0
## 238 6830663           F            N               N            0
## 239 6830664           F            N               N            0
## 240 6830666           F            N               N            0
## 241 6830667           F            N               N            0
## 242 6830668           F            N               N            0
## 243 6830669           F            N               N            0
## 244 6830670           F            N               N            0
## 245 6830671           F            N               N            0
## 246 6830672           F            N               N            0
## 247 6830673           F            N               N            0
## 248 6830674           F            N               N            0
## 249 6830675           F            N               N            0
## 250 6830676           F            N               N            0
## 251 6830677           F            N               N            0
## 252 6830679           F            N               N            0
## 253 6429545           F            N               Y            0
## 254 6429547           F            N               Y            0
## 255 6429550           F            N               Y            0
## 256 6519201           F            N               Y            0
## 257 6519202           F            N               Y            0
## 258 6528317           F            N               Y            0
## 259 6528318           F            N               Y            0
## 260 6528319           F            N               Y            0
## 261 6528320           F            N               Y            0
## 262 6528321           F            N               Y            0
## 263 6528322           F            N               Y            0
## 264 6528323           F            N               Y            0
## 265 6528324           F            N               Y            0
## 266 6528325           F            N               Y            0
## 267 6528326           F            N               Y            0
## 268 6529595           M            N               N            0
## 269 6605845           M            N               N            0
## 270 6625464           F            N               N            0
## 271 6625465           F            N               N            0
## 272 6625466           F            N               N            0
## 273 6661353           F            Y               N            1
## 274 6661354           F            Y               N            1
## 275 6661355           F            Y               N            1
## 276 6661356           F            Y               N            1
## 277 6661357           F            Y               N            1
## 278 6661358           F            Y               N            1
## 279 7424274           F            N               N            0
## 280 7230417           F            N               Y            0
## 281 7007579           F            N               Y            0
## 282 7888709           F            N               Y            0
## 283 7424270           F            N               N            0
## 284 7888706           F            N               Y            0
## 285 7230418           F            N               Y            0
## 286 7872918           M            Y               N            0
## 287 5400793           M            N               Y            0
## 288 6116043           M            Y               Y            0
## 289 6560405           F            N               N            0
##     AMT_INCOME_TOTAL NAME_INCOME_TYPE           NAME_EDUCATION_TYPE
## 1             255150        Pensioner             Incomplete higher
## 2             255150        Pensioner             Incomplete higher
## 3             255150        Pensioner             Incomplete higher
## 4             255150        Pensioner             Incomplete higher
## 5             255150        Pensioner             Incomplete higher
## 6             255150        Pensioner             Incomplete higher
## 7             112500        Pensioner Secondary / secondary special
## 8             112500        Pensioner Secondary / secondary special
## 9             211500        Pensioner Secondary / secondary special
## 10            211500        Pensioner Secondary / secondary special
## 11            211500        Pensioner Secondary / secondary special
## 12            211500        Pensioner Secondary / secondary special
## 13            211500        Pensioner Secondary / secondary special
## 14            126000        Pensioner Secondary / secondary special
## 15            126000        Pensioner Secondary / secondary special
## 16            112500        Pensioner Secondary / secondary special
## 17            112500        Pensioner Secondary / secondary special
## 18            112500        Pensioner Secondary / secondary special
## 19            144000        Pensioner Secondary / secondary special
## 20            144000        Pensioner Secondary / secondary special
## 21            144000        Pensioner Secondary / secondary special
## 22            144000        Pensioner Secondary / secondary special
## 23            144000        Pensioner Secondary / secondary special
## 24            144000        Pensioner Secondary / secondary special
## 25            144000        Pensioner Secondary / secondary special
## 26            144000        Pensioner Secondary / secondary special
## 27            144000        Pensioner Secondary / secondary special
## 28            144000        Pensioner Secondary / secondary special
## 29            157500        Pensioner Secondary / secondary special
## 30            157500        Pensioner Secondary / secondary special
## 31            126000        Pensioner Secondary / secondary special
## 32            126000        Pensioner Secondary / secondary special
## 33            126000        Pensioner Secondary / secondary special
## 34            126000        Pensioner Secondary / secondary special
## 35            126000        Pensioner Secondary / secondary special
## 36            126000        Pensioner Secondary / secondary special
## 37            126000        Pensioner Secondary / secondary special
## 38            126000        Pensioner Secondary / secondary special
## 39            126000        Pensioner Secondary / secondary special
## 40            126000        Pensioner Secondary / secondary special
## 41            126000        Pensioner Secondary / secondary special
## 42            139500        Pensioner Secondary / secondary special
## 43            139500        Pensioner Secondary / secondary special
## 44            139500        Pensioner Secondary / secondary special
## 45            360000        Pensioner Secondary / secondary special
## 46            360000        Pensioner Secondary / secondary special
## 47            247500        Pensioner Secondary / secondary special
## 48            247500        Pensioner Secondary / secondary special
## 49            202500        Pensioner Secondary / secondary special
## 50            202500        Pensioner Secondary / secondary special
## 51            202500        Pensioner Secondary / secondary special
## 52            202500        Pensioner Secondary / secondary special
## 53            202500        Pensioner Secondary / secondary special
## 54            202500        Pensioner Secondary / secondary special
## 55            202500        Pensioner Secondary / secondary special
## 56            202500        Pensioner Secondary / secondary special
## 57            319500        Pensioner Secondary / secondary special
## 58            319500        Pensioner Secondary / secondary special
## 59            319500        Pensioner Secondary / secondary special
## 60            319500        Pensioner Secondary / secondary special
## 61            112500        Pensioner Secondary / secondary special
## 62            112500        Pensioner Secondary / secondary special
## 63            112500        Pensioner Secondary / secondary special
## 64            112500        Pensioner Secondary / secondary special
## 65            135000        Pensioner              Higher education
## 66            135000        Pensioner              Higher education
## 67            135000        Pensioner              Higher education
## 68             74250        Pensioner Secondary / secondary special
## 69             74250        Pensioner Secondary / secondary special
## 70             74250        Pensioner Secondary / secondary special
## 71             74250        Pensioner Secondary / secondary special
## 72             74250        Pensioner Secondary / secondary special
## 73             74250        Pensioner Secondary / secondary special
## 74             74250        Pensioner Secondary / secondary special
## 75             74250        Pensioner Secondary / secondary special
## 76             74250        Pensioner Secondary / secondary special
## 77             74250        Pensioner Secondary / secondary special
## 78             74250        Pensioner Secondary / secondary special
## 79            202500        Pensioner Secondary / secondary special
## 80            202500        Pensioner Secondary / secondary special
## 81            202500        Pensioner Secondary / secondary special
## 82            112500        Pensioner Secondary / secondary special
## 83            112500        Pensioner Secondary / secondary special
## 84            112500        Pensioner Secondary / secondary special
## 85            112500        Pensioner Secondary / secondary special
## 86            162000        Pensioner Secondary / secondary special
## 87            247500        Pensioner              Higher education
## 88            247500        Pensioner              Higher education
## 89            247500        Pensioner              Higher education
## 90            247500        Pensioner              Higher education
## 91            247500        Pensioner              Higher education
## 92            247500        Pensioner              Higher education
## 93            247500        Pensioner              Higher education
## 94            247500        Pensioner              Higher education
## 95            247500        Pensioner              Higher education
## 96            247500        Pensioner              Higher education
## 97            247500        Pensioner              Higher education
## 98            247500        Pensioner              Higher education
## 99            247500        Pensioner              Higher education
## 100           247500        Pensioner              Higher education
## 101           157500        Pensioner Secondary / secondary special
## 102            32139        Pensioner Secondary / secondary special
## 103           157500        Pensioner Secondary / secondary special
## 104           157500        Pensioner Secondary / secondary special
## 105           157500        Pensioner Secondary / secondary special
## 106           157500        Pensioner Secondary / secondary special
## 107            90000        Pensioner Secondary / secondary special
## 108            90000        Pensioner Secondary / secondary special
## 109            90000        Pensioner Secondary / secondary special
## 110            90000        Pensioner Secondary / secondary special
## 111            90000        Pensioner Secondary / secondary special
## 112            90000        Pensioner Secondary / secondary special
## 113           382500        Pensioner              Higher education
## 114           382500        Pensioner              Higher education
## 115           382500        Pensioner              Higher education
## 116           382500        Pensioner              Higher education
## 117           382500        Pensioner              Higher education
## 118           180000        Pensioner Secondary / secondary special
## 119           180000        Pensioner Secondary / secondary special
## 120           180000        Pensioner Secondary / secondary special
## 121           112500        Pensioner Secondary / secondary special
## 122           112500        Pensioner Secondary / secondary special
## 123           112500        Pensioner Secondary / secondary special
## 124           135000        Pensioner Secondary / secondary special
## 125           135000        Pensioner Secondary / secondary special
## 126           135000        Pensioner Secondary / secondary special
## 127           337500        Pensioner Secondary / secondary special
## 128           337500        Pensioner Secondary / secondary special
## 129           337500        Pensioner Secondary / secondary special
## 130           337500        Pensioner Secondary / secondary special
## 131           157500        Pensioner Secondary / secondary special
## 132            45000        Pensioner Secondary / secondary special
## 133           247500        Pensioner               Lower secondary
## 134           247500        Pensioner               Lower secondary
## 135           247500        Pensioner               Lower secondary
## 136           247500        Pensioner               Lower secondary
## 137           247500        Pensioner               Lower secondary
## 138           112500        Pensioner Secondary / secondary special
## 139           112500        Pensioner Secondary / secondary special
## 140           112500        Pensioner Secondary / secondary special
## 141           112500        Pensioner Secondary / secondary special
## 142            73350        Pensioner Secondary / secondary special
## 143            73350        Pensioner Secondary / secondary special
## 144           121500        Pensioner Secondary / secondary special
## 145           121500        Pensioner Secondary / secondary special
## 146           121500        Pensioner Secondary / secondary special
## 147           121500        Pensioner Secondary / secondary special
## 148           121500        Pensioner Secondary / secondary special
## 149           135000        Pensioner Secondary / secondary special
## 150           135000        Pensioner Secondary / secondary special
## 151           135000        Pensioner Secondary / secondary special
## 152           135000        Pensioner Secondary / secondary special
## 153           135000        Pensioner Secondary / secondary special
## 154           135000        Pensioner Secondary / secondary special
## 155           135000        Pensioner Secondary / secondary special
## 156           135000        Pensioner Secondary / secondary special
## 157           135000        Pensioner Secondary / secondary special
## 158           135000        Pensioner Secondary / secondary special
## 159           135000        Pensioner Secondary / secondary special
## 160           135000        Pensioner Secondary / secondary special
## 161           135000        Pensioner Secondary / secondary special
## 162           135000        Pensioner Secondary / secondary special
## 163           135000        Pensioner Secondary / secondary special
## 164           135000        Pensioner Secondary / secondary special
## 165           135000        Pensioner Secondary / secondary special
## 166           135000        Pensioner Secondary / secondary special
## 167           162000        Pensioner Secondary / secondary special
## 168           162000        Pensioner Secondary / secondary special
## 169           162000        Pensioner Secondary / secondary special
## 170           162000        Pensioner Secondary / secondary special
## 171           162000        Pensioner Secondary / secondary special
## 172           162000        Pensioner Secondary / secondary special
## 173           162000        Pensioner Secondary / secondary special
## 174           162000        Pensioner Secondary / secondary special
## 175           162000        Pensioner Secondary / secondary special
## 176           162000        Pensioner Secondary / secondary special
## 177           162000        Pensioner Secondary / secondary special
## 178           162000        Pensioner Secondary / secondary special
## 179           162000        Pensioner Secondary / secondary special
## 180           162000        Pensioner Secondary / secondary special
## 181           162000        Pensioner Secondary / secondary special
## 182           162000        Pensioner Secondary / secondary special
## 183           162000        Pensioner Secondary / secondary special
## 184           162000        Pensioner Secondary / secondary special
## 185           162000        Pensioner Secondary / secondary special
## 186           403650        Pensioner Secondary / secondary special
## 187           403650        Pensioner Secondary / secondary special
## 188           403650        Pensioner Secondary / secondary special
## 189           292500        Pensioner Secondary / secondary special
## 190           292500        Pensioner Secondary / secondary special
## 191           292500        Pensioner Secondary / secondary special
## 192           292500        Pensioner Secondary / secondary special
## 193           292500        Pensioner Secondary / secondary special
## 194           157500        Pensioner Secondary / secondary special
## 195           157500        Pensioner Secondary / secondary special
## 196           121500        Pensioner Secondary / secondary special
## 197           121500        Pensioner Secondary / secondary special
## 198           121500        Pensioner Secondary / secondary special
## 199           121500        Pensioner Secondary / secondary special
## 200           121500        Pensioner Secondary / secondary special
## 201           121500        Pensioner Secondary / secondary special
## 202           121500        Pensioner Secondary / secondary special
## 203           121500        Pensioner Secondary / secondary special
## 204           135000        Pensioner Secondary / secondary special
## 205           135000        Pensioner Secondary / secondary special
## 206           135000        Pensioner Secondary / secondary special
## 207           103500        Pensioner Secondary / secondary special
## 208           103500        Pensioner Secondary / secondary special
## 209           103500        Pensioner Secondary / secondary special
## 210           103500        Pensioner Secondary / secondary special
## 211           103500        Pensioner Secondary / secondary special
## 212           103500        Pensioner Secondary / secondary special
## 213           103500        Pensioner Secondary / secondary special
## 214           103500        Pensioner Secondary / secondary special
## 215           103500        Pensioner Secondary / secondary special
## 216           103500        Pensioner Secondary / secondary special
## 217           103500        Pensioner Secondary / secondary special
## 218           103500        Pensioner Secondary / secondary special
## 219           103500        Pensioner Secondary / secondary special
## 220           103500        Pensioner Secondary / secondary special
## 221           103500        Pensioner Secondary / secondary special
## 222           103500        Pensioner Secondary / secondary special
## 223           103500        Pensioner Secondary / secondary special
## 224           103500        Pensioner Secondary / secondary special
## 225           103500        Pensioner Secondary / secondary special
## 226            72000        Pensioner Secondary / secondary special
## 227            72000        Pensioner Secondary / secondary special
## 228            72000        Pensioner Secondary / secondary special
## 229            72000        Pensioner Secondary / secondary special
## 230            72000        Pensioner Secondary / secondary special
## 231            72000        Pensioner Secondary / secondary special
## 232            72000        Pensioner Secondary / secondary special
## 233            72000        Pensioner Secondary / secondary special
## 234            72000        Pensioner Secondary / secondary special
## 235            72000        Pensioner Secondary / secondary special
## 236            72000        Pensioner Secondary / secondary special
## 237            72000        Pensioner Secondary / secondary special
## 238            72000        Pensioner Secondary / secondary special
## 239            72000        Pensioner Secondary / secondary special
## 240            72000        Pensioner Secondary / secondary special
## 241            72000        Pensioner Secondary / secondary special
## 242            72000        Pensioner Secondary / secondary special
## 243            72000        Pensioner Secondary / secondary special
## 244            72000        Pensioner Secondary / secondary special
## 245            72000        Pensioner Secondary / secondary special
## 246            72000        Pensioner Secondary / secondary special
## 247            72000        Pensioner Secondary / secondary special
## 248            72000        Pensioner Secondary / secondary special
## 249            72000        Pensioner Secondary / secondary special
## 250            72000        Pensioner Secondary / secondary special
## 251            72000        Pensioner Secondary / secondary special
## 252            72000        Pensioner Secondary / secondary special
## 253           180000        Pensioner Secondary / secondary special
## 254           180000        Pensioner Secondary / secondary special
## 255           180000        Pensioner Secondary / secondary special
## 256           135000        Pensioner Secondary / secondary special
## 257           135000        Pensioner Secondary / secondary special
## 258           180000        Pensioner Secondary / secondary special
## 259           180000        Pensioner Secondary / secondary special
## 260           180000        Pensioner Secondary / secondary special
## 261           180000        Pensioner Secondary / secondary special
## 262           180000        Pensioner Secondary / secondary special
## 263           180000        Pensioner Secondary / secondary special
## 264           180000        Pensioner Secondary / secondary special
## 265           180000        Pensioner Secondary / secondary special
## 266           180000        Pensioner Secondary / secondary special
## 267           180000        Pensioner Secondary / secondary special
## 268           180000        Pensioner Secondary / secondary special
## 269           112500        Pensioner Secondary / secondary special
## 270           166500        Pensioner Secondary / secondary special
## 271           166500        Pensioner Secondary / secondary special
## 272           166500        Pensioner Secondary / secondary special
## 273            90000        Pensioner              Higher education
## 274            90000        Pensioner              Higher education
## 275            90000        Pensioner              Higher education
## 276            90000        Pensioner              Higher education
## 277            90000        Pensioner              Higher education
## 278            90000        Pensioner              Higher education
## 279           103500        Pensioner Secondary / secondary special
## 280           162000        Pensioner Secondary / secondary special
## 281           112500        Pensioner Secondary / secondary special
## 282           112500        Pensioner Secondary / secondary special
## 283           103500        Pensioner Secondary / secondary special
## 284           112500        Pensioner Secondary / secondary special
## 285           162000        Pensioner Secondary / secondary special
## 286           382500        Pensioner              Higher education
## 287           135000        Pensioner Secondary / secondary special
## 288           270000        Pensioner              Higher education
## 289           144000        Pensioner              Higher education
##       NAME_FAMILY_STATUS NAME_HOUSING_TYPE DAYS_BIRTH DAYS_EMPLOYED FLAG_MOBIL
## 1         Civil marriage  Rented apartment     -18682        365243          1
## 2         Civil marriage  Rented apartment     -18682        365243          1
## 3         Civil marriage  Rented apartment     -18682        365243          1
## 4         Civil marriage  Rented apartment     -18682        365243          1
## 5         Civil marriage  Rented apartment     -18682        365243          1
## 6         Civil marriage  Rented apartment     -18682        365243          1
## 7   Single / not married  Rented apartment     -23532        365243          1
## 8   Single / not married  Rented apartment     -23532        365243          1
## 9                Married  Rented apartment     -21935        365243          1
## 10               Married  Rented apartment     -21935        365243          1
## 11               Married  Rented apartment     -21935        365243          1
## 12               Married  Rented apartment     -21935        365243          1
## 13               Married  Rented apartment     -21935        365243          1
## 14               Married  Rented apartment     -21823        365243          1
## 15               Married  Rented apartment     -21823        365243          1
## 16               Married  Rented apartment     -24319        365243          1
## 17               Married  Rented apartment     -24319        365243          1
## 18               Married  Rented apartment     -24319        365243          1
## 19  Single / not married  Rented apartment     -21942        365243          1
## 20  Single / not married  Rented apartment     -21942        365243          1
## 21  Single / not married  Rented apartment     -21942        365243          1
## 22  Single / not married  Rented apartment     -21942        365243          1
## 23  Single / not married  Rented apartment     -21942        365243          1
## 24  Single / not married  Rented apartment     -21942        365243          1
## 25  Single / not married  Rented apartment     -21942        365243          1
## 26  Single / not married  Rented apartment     -21942        365243          1
## 27  Single / not married  Rented apartment     -21942        365243          1
## 28  Single / not married  Rented apartment     -21942        365243          1
## 29               Married  Rented apartment     -16006        365243          1
## 30               Married  Rented apartment     -16006        365243          1
## 31               Married  Rented apartment     -16491        365243          1
## 32               Married  Rented apartment     -16491        365243          1
## 33               Married  Rented apartment     -16491        365243          1
## 34               Married  Rented apartment     -16491        365243          1
## 35               Married  Rented apartment     -16491        365243          1
## 36               Married  Rented apartment     -16491        365243          1
## 37               Married  Rented apartment     -16491        365243          1
## 38               Married  Rented apartment     -16491        365243          1
## 39               Married  Rented apartment     -16491        365243          1
## 40               Married  Rented apartment     -16491        365243          1
## 41               Married  Rented apartment     -16491        365243          1
## 42               Married  Rented apartment     -22311        365243          1
## 43               Married  Rented apartment     -22311        365243          1
## 44               Married  Rented apartment     -22311        365243          1
## 45                 Widow  Rented apartment     -20833        365243          1
## 46                 Widow  Rented apartment     -20833        365243          1
## 47             Separated  Rented apartment     -17152        365243          1
## 48             Separated  Rented apartment     -17152        365243          1
## 49             Separated  Rented apartment     -20566        365243          1
## 50             Separated  Rented apartment     -20566        365243          1
## 51             Separated  Rented apartment     -20566        365243          1
## 52             Separated  Rented apartment     -20566        365243          1
## 53             Separated  Rented apartment     -20566        365243          1
## 54             Separated  Rented apartment     -20566        365243          1
## 55             Separated  Rented apartment     -20566        365243          1
## 56             Separated  Rented apartment     -20566        365243          1
## 57               Married  Rented apartment     -20629        365243          1
## 58               Married  Rented apartment     -20629        365243          1
## 59               Married  Rented apartment     -20629        365243          1
## 60               Married  Rented apartment     -20629        365243          1
## 61               Married  Rented apartment     -23510        365243          1
## 62               Married  Rented apartment     -23510        365243          1
## 63               Married  Rented apartment     -23510        365243          1
## 64               Married  Rented apartment     -23510        365243          1
## 65               Married  Rented apartment     -24510        365243          1
## 66               Married  Rented apartment     -24510        365243          1
## 67               Married  Rented apartment     -24510        365243          1
## 68                 Widow  Rented apartment     -23716        365243          1
## 69                 Widow  Rented apartment     -23716        365243          1
## 70                 Widow  Rented apartment     -23716        365243          1
## 71                 Widow  Rented apartment     -23716        365243          1
## 72                 Widow  Rented apartment     -23716        365243          1
## 73                 Widow  Rented apartment     -23716        365243          1
## 74                 Widow  Rented apartment     -23716        365243          1
## 75                 Widow  Rented apartment     -23716        365243          1
## 76                 Widow  Rented apartment     -23716        365243          1
## 77                 Widow  Rented apartment     -23716        365243          1
## 78                 Widow  Rented apartment     -23716        365243          1
## 79               Married  Rented apartment     -23459        365243          1
## 80               Married  Rented apartment     -23459        365243          1
## 81               Married  Rented apartment     -23459        365243          1
## 82                 Widow  Rented apartment     -20704        365243          1
## 83                 Widow  Rented apartment     -20704        365243          1
## 84                 Widow  Rented apartment     -20704        365243          1
## 85                 Widow  Rented apartment     -20704        365243          1
## 86        Civil marriage  Rented apartment     -20884        365243          1
## 87               Married  Rented apartment     -21123        365243          1
## 88               Married  Rented apartment     -21123        365243          1
## 89               Married  Rented apartment     -21123        365243          1
## 90               Married  Rented apartment     -21123        365243          1
## 91               Married  Rented apartment     -21123        365243          1
## 92               Married  Rented apartment     -21123        365243          1
## 93               Married  Rented apartment     -21123        365243          1
## 94               Married  Rented apartment     -21123        365243          1
## 95               Married  Rented apartment     -21123        365243          1
## 96               Married  Rented apartment     -21123        365243          1
## 97               Married  Rented apartment     -21123        365243          1
## 98               Married  Rented apartment     -21123        365243          1
## 99               Married  Rented apartment     -21123        365243          1
## 100              Married  Rented apartment     -21123        365243          1
## 101 Single / not married  Rented apartment     -19635        365243          1
## 102              Married  Rented apartment     -22540        365243          1
## 103                Widow  Rented apartment     -20750        365243          1
## 104                Widow  Rented apartment     -20750        365243          1
## 105                Widow  Rented apartment     -20750        365243          1
## 106                Widow  Rented apartment     -20750        365243          1
## 107              Married  Rented apartment     -20880        365243          1
## 108              Married  Rented apartment     -20880        365243          1
## 109              Married  Rented apartment     -20880        365243          1
## 110              Married  Rented apartment     -20880        365243          1
## 111              Married  Rented apartment     -20880        365243          1
## 112              Married  Rented apartment     -20880        365243          1
## 113              Married  Rented apartment     -10068        365243          1
## 114              Married  Rented apartment     -10068        365243          1
## 115              Married  Rented apartment     -10068        365243          1
## 116              Married  Rented apartment     -10068        365243          1
## 117              Married  Rented apartment     -10068        365243          1
## 118              Married  Rented apartment     -20821        365243          1
## 119              Married  Rented apartment     -20821        365243          1
## 120              Married  Rented apartment     -20821        365243          1
## 121              Married  Rented apartment     -23107        365243          1
## 122              Married  Rented apartment     -23107        365243          1
## 123              Married  Rented apartment     -23107        365243          1
## 124              Married  Rented apartment     -21630        365243          1
## 125              Married  Rented apartment     -21630        365243          1
## 126              Married  Rented apartment     -21630        365243          1
## 127              Married  Rented apartment     -16082        365243          1
## 128              Married  Rented apartment     -16082        365243          1
## 129              Married  Rented apartment     -16082        365243          1
## 130              Married  Rented apartment     -16082        365243          1
## 131 Single / not married  Rented apartment     -20342        365243          1
## 132            Separated  Rented apartment     -21925        365243          1
## 133                Widow  Rented apartment     -20234        365243          1
## 134                Widow  Rented apartment     -20234        365243          1
## 135                Widow  Rented apartment     -20234        365243          1
## 136                Widow  Rented apartment     -20234        365243          1
## 137                Widow  Rented apartment     -20234        365243          1
## 138 Single / not married  Rented apartment     -24290        365243          1
## 139 Single / not married  Rented apartment     -24290        365243          1
## 140 Single / not married  Rented apartment     -24290        365243          1
## 141 Single / not married  Rented apartment     -24290        365243          1
## 142              Married  Rented apartment     -20638        365243          1
## 143              Married  Rented apartment     -20638        365243          1
## 144              Married  Rented apartment     -22171        365243          1
## 145              Married  Rented apartment     -22171        365243          1
## 146              Married  Rented apartment     -22171        365243          1
## 147              Married  Rented apartment     -22171        365243          1
## 148              Married  Rented apartment     -22171        365243          1
## 149              Married  Rented apartment     -23449        365243          1
## 150              Married  Rented apartment     -23449        365243          1
## 151              Married  Rented apartment     -23449        365243          1
## 152              Married  Rented apartment     -23254        365243          1
## 153              Married  Rented apartment     -23254        365243          1
## 154              Married  Rented apartment     -23254        365243          1
## 155              Married  Rented apartment     -23254        365243          1
## 156              Married  Rented apartment     -23254        365243          1
## 157              Married  Rented apartment     -23314        365243          1
## 158              Married  Rented apartment     -23314        365243          1
## 159              Married  Rented apartment     -23314        365243          1
## 160              Married  Rented apartment     -23314        365243          1
## 161              Married  Rented apartment     -23314        365243          1
## 162              Married  Rented apartment     -23314        365243          1
## 163              Married  Rented apartment     -23314        365243          1
## 164              Married  Rented apartment     -23314        365243          1
## 165              Married  Rented apartment     -23314        365243          1
## 166              Married  Rented apartment     -23314        365243          1
## 167                Widow  Rented apartment     -23948        365243          1
## 168                Widow  Rented apartment     -23948        365243          1
## 169                Widow  Rented apartment     -23948        365243          1
## 170                Widow  Rented apartment     -23948        365243          1
## 171                Widow  Rented apartment     -23948        365243          1
## 172                Widow  Rented apartment     -23948        365243          1
## 173                Widow  Rented apartment     -23948        365243          1
## 174                Widow  Rented apartment     -23948        365243          1
## 175                Widow  Rented apartment     -23948        365243          1
## 176                Widow  Rented apartment     -23948        365243          1
## 177                Widow  Rented apartment     -23948        365243          1
## 178                Widow  Rented apartment     -23948        365243          1
## 179                Widow  Rented apartment     -23948        365243          1
## 180                Widow  Rented apartment     -23948        365243          1
## 181                Widow  Rented apartment     -23948        365243          1
## 182                Widow  Rented apartment     -23948        365243          1
## 183                Widow  Rented apartment     -23948        365243          1
## 184                Widow  Rented apartment     -23948        365243          1
## 185                Widow  Rented apartment     -23948        365243          1
## 186 Single / not married  Rented apartment     -14583        365243          1
## 187 Single / not married  Rented apartment     -14583        365243          1
## 188 Single / not married  Rented apartment     -14583        365243          1
## 189       Civil marriage  Rented apartment     -13835        365243          1
## 190       Civil marriage  Rented apartment     -13835        365243          1
## 191       Civil marriage  Rented apartment     -13835        365243          1
## 192       Civil marriage  Rented apartment     -13835        365243          1
## 193       Civil marriage  Rented apartment     -13835        365243          1
## 194              Married  Rented apartment     -22904        365243          1
## 195              Married  Rented apartment     -22904        365243          1
## 196                Widow  Rented apartment     -23375        365243          1
## 197                Widow  Rented apartment     -23375        365243          1
## 198                Widow  Rented apartment     -23375        365243          1
## 199                Widow  Rented apartment     -23375        365243          1
## 200                Widow  Rented apartment     -23375        365243          1
## 201                Widow  Rented apartment     -23375        365243          1
## 202                Widow  Rented apartment     -23375        365243          1
## 203                Widow  Rented apartment     -23375        365243          1
## 204                Widow  Rented apartment     -23153        365243          1
## 205                Widow  Rented apartment     -23153        365243          1
## 206       Civil marriage  Rented apartment     -21319        365243          1
## 207 Single / not married  Rented apartment     -20132        365243          1
## 208 Single / not married  Rented apartment     -20132        365243          1
## 209 Single / not married  Rented apartment     -20132        365243          1
## 210 Single / not married  Rented apartment     -20132        365243          1
## 211 Single / not married  Rented apartment     -20132        365243          1
## 212 Single / not married  Rented apartment     -20132        365243          1
## 213 Single / not married  Rented apartment     -20132        365243          1
## 214 Single / not married  Rented apartment     -20132        365243          1
## 215 Single / not married  Rented apartment     -20132        365243          1
## 216 Single / not married  Rented apartment     -20132        365243          1
## 217 Single / not married  Rented apartment     -20132        365243          1
## 218 Single / not married  Rented apartment     -20132        365243          1
## 219 Single / not married  Rented apartment     -20132        365243          1
## 220 Single / not married  Rented apartment     -20132        365243          1
## 221 Single / not married  Rented apartment     -20132        365243          1
## 222 Single / not married  Rented apartment     -20132        365243          1
## 223 Single / not married  Rented apartment     -20132        365243          1
## 224 Single / not married  Rented apartment     -20132        365243          1
## 225 Single / not married  Rented apartment     -20132        365243          1
## 226              Married  Rented apartment     -21878        365243          1
## 227              Married  Rented apartment     -21878        365243          1
## 228              Married  Rented apartment     -21878        365243          1
## 229              Married  Rented apartment     -21878        365243          1
## 230              Married  Rented apartment     -21878        365243          1
## 231              Married  Rented apartment     -21878        365243          1
## 232              Married  Rented apartment     -21878        365243          1
## 233              Married  Rented apartment     -21878        365243          1
## 234              Married  Rented apartment     -21878        365243          1
## 235              Married  Rented apartment     -21878        365243          1
## 236              Married  Rented apartment     -21878        365243          1
## 237              Married  Rented apartment     -21878        365243          1
## 238              Married  Rented apartment     -21878        365243          1
## 239              Married  Rented apartment     -21878        365243          1
## 240              Married  Rented apartment     -21878        365243          1
## 241              Married  Rented apartment     -21878        365243          1
## 242              Married  Rented apartment     -21878        365243          1
## 243              Married  Rented apartment     -21878        365243          1
## 244              Married  Rented apartment     -21878        365243          1
## 245              Married  Rented apartment     -21878        365243          1
## 246              Married  Rented apartment     -21878        365243          1
## 247              Married  Rented apartment     -21878        365243          1
## 248              Married  Rented apartment     -21878        365243          1
## 249              Married  Rented apartment     -21878        365243          1
## 250              Married  Rented apartment     -21878        365243          1
## 251              Married  Rented apartment     -21878        365243          1
## 252              Married  Rented apartment     -21878        365243          1
## 253              Married  Rented apartment     -20300        365243          1
## 254              Married  Rented apartment     -20300        365243          1
## 255              Married  Rented apartment     -20300        365243          1
## 256 Single / not married  Rented apartment     -21162        365243          1
## 257 Single / not married  Rented apartment     -21162        365243          1
## 258              Married  Rented apartment     -21229        365243          1
## 259              Married  Rented apartment     -21229        365243          1
## 260              Married  Rented apartment     -21229        365243          1
## 261              Married  Rented apartment     -21229        365243          1
## 262              Married  Rented apartment     -21229        365243          1
## 263              Married  Rented apartment     -21229        365243          1
## 264              Married  Rented apartment     -21229        365243          1
## 265              Married  Rented apartment     -21229        365243          1
## 266              Married  Rented apartment     -21229        365243          1
## 267              Married  Rented apartment     -21229        365243          1
## 268       Civil marriage  Rented apartment     -17135        365243          1
## 269 Single / not married  Rented apartment     -12381        365243          1
## 270                Widow  Rented apartment     -23780        365243          1
## 271                Widow  Rented apartment     -23780        365243          1
## 272                Widow  Rented apartment     -23780        365243          1
## 273              Married  Rented apartment     -15183        365243          1
## 274              Married  Rented apartment     -15183        365243          1
## 275              Married  Rented apartment     -15183        365243          1
## 276              Married  Rented apartment     -15183        365243          1
## 277              Married  Rented apartment     -15183        365243          1
## 278              Married  Rented apartment     -15183        365243          1
## 279 Single / not married  Rented apartment     -20132        365243          1
## 280                Widow  Rented apartment     -23948        365243          1
## 281                Widow  Rented apartment     -20704        365243          1
## 282              Married  Rented apartment     -23107        365243          1
## 283 Single / not married  Rented apartment     -20132        365243          1
## 284              Married  Rented apartment     -23107        365243          1
## 285                Widow  Rented apartment     -23948        365243          1
## 286              Married  Rented apartment     -10068        365243          1
## 287 Single / not married  Rented apartment     -10971         -1284          1
## 288              Married  Rented apartment     -21478          -276          1
## 289            Separated  Rented apartment     -13691          -293          1
##     FLAG_WORK_PHONE FLAG_PHONE FLAG_EMAIL OCCUPATION_TYPE CNT_FAM_MEMBERS
## 1                 0          0          0                               2
## 2                 0          0          0                               2
## 3                 0          0          0                               2
## 4                 0          0          0                               2
## 5                 0          0          0                               2
## 6                 0          0          0                               2
## 7                 0          1          0                               1
## 8                 0          1          0                               1
## 9                 0          1          0                               2
## 10                0          1          0                               2
## 11                0          1          0                               2
## 12                0          1          0                               2
## 13                0          1          0                               2
## 14                0          0          0                               2
## 15                0          0          0                               2
## 16                0          0          0                               2
## 17                0          0          0                               2
## 18                0          0          0                               2
## 19                0          0          0                               1
## 20                0          0          0                               1
## 21                0          0          0                               1
## 22                0          0          0                               1
## 23                0          0          0                               1
## 24                0          0          0                               1
## 25                0          0          0                               1
## 26                0          0          0                               1
## 27                0          0          0                               1
## 28                0          0          0                               1
## 29                0          0          0                               2
## 30                0          0          0                               2
## 31                0          0          0                               2
## 32                0          0          0                               2
## 33                0          0          0                               2
## 34                0          0          0                               2
## 35                0          0          0                               2
## 36                0          0          0                               2
## 37                0          0          0                               2
## 38                0          0          0                               2
## 39                0          0          0                               2
## 40                0          0          0                               2
## 41                0          0          0                               2
## 42                0          0          0                               2
## 43                0          0          0                               2
## 44                0          0          0                               2
## 45                0          0          0                               1
## 46                0          0          0                               1
## 47                0          1          1                               1
## 48                0          1          1                               1
## 49                0          0          0                               1
## 50                0          0          0                               1
## 51                0          0          0                               1
## 52                0          0          0                               1
## 53                0          0          0                               1
## 54                0          0          0                               1
## 55                0          0          0                               1
## 56                0          0          0                               1
## 57                0          0          0                               3
## 58                0          0          0                               3
## 59                0          0          0                               3
## 60                0          0          0                               3
## 61                0          0          0                               2
## 62                0          0          0                               2
## 63                0          0          0                               2
## 64                0          0          0                               2
## 65                0          0          0                               2
## 66                0          0          0                               2
## 67                0          0          0                               2
## 68                0          0          0                               1
## 69                0          0          0                               1
## 70                0          0          0                               1
## 71                0          0          0                               1
## 72                0          0          0                               1
## 73                0          0          0                               1
## 74                0          0          0                               1
## 75                0          0          0                               1
## 76                0          0          0                               1
## 77                0          0          0                               1
## 78                0          0          0                               1
## 79                0          0          0                               2
## 80                0          0          0                               2
## 81                0          0          0                               2
## 82                0          0          0                               1
## 83                0          0          0                               1
## 84                0          0          0                               1
## 85                0          0          0                               1
## 86                0          0          0                               2
## 87                0          0          1                               2
## 88                0          0          1                               2
## 89                0          0          1                               2
## 90                0          0          1                               2
## 91                0          0          1                               2
## 92                0          0          1                               2
## 93                0          0          1                               2
## 94                0          0          1                               2
## 95                0          0          1                               2
## 96                0          0          1                               2
## 97                0          0          1                               2
## 98                0          0          1                               2
## 99                0          0          1                               2
## 100               0          0          1                               2
## 101               0          0          0                               1
## 102               0          0          0                               2
## 103               0          0          1                               1
## 104               0          0          1                               1
## 105               0          0          1                               1
## 106               0          0          1                               1
## 107               0          0          0                               2
## 108               0          0          0                               2
## 109               0          0          0                               2
## 110               0          0          0                               2
## 111               0          0          0                               2
## 112               0          0          0                               2
## 113               0          0          0                               2
## 114               0          0          0                               2
## 115               0          0          0                               2
## 116               0          0          0                               2
## 117               0          0          0                               2
## 118               0          0          0                               2
## 119               0          0          0                               2
## 120               0          0          0                               2
## 121               0          1          0                               2
## 122               0          1          0                               2
## 123               0          1          0                               2
## 124               0          0          0                               2
## 125               0          0          0                               2
## 126               0          0          0                               2
## 127               0          0          0                               5
## 128               0          0          0                               5
## 129               0          0          0                               5
## 130               0          0          0                               5
## 131               0          1          1                               1
## 132               0          0          0                               1
## 133               0          1          0                               1
## 134               0          1          0                               1
## 135               0          1          0                               1
## 136               0          1          0                               1
## 137               0          1          0                               1
## 138               0          0          0                               1
## 139               0          0          0                               1
## 140               0          0          0                               1
## 141               0          0          0                               1
## 142               0          0          0                               2
## 143               0          0          0                               2
## 144               0          0          0                               2
## 145               0          0          0                               2
## 146               0          0          0                               2
## 147               0          0          0                               2
## 148               0          0          0                               2
## 149               0          0          0                               2
## 150               0          0          0                               2
## 151               0          0          0                               2
## 152               0          0          0                               2
## 153               0          0          0                               2
## 154               0          0          0                               2
## 155               0          0          0                               2
## 156               0          0          0                               2
## 157               0          0          0                               2
## 158               0          0          0                               2
## 159               0          0          0                               2
## 160               0          0          0                               2
## 161               0          0          0                               2
## 162               0          0          0                               2
## 163               0          0          0                               2
## 164               0          0          0                               2
## 165               0          0          0                               2
## 166               0          0          0                               2
## 167               0          0          0                               1
## 168               0          0          0                               1
## 169               0          0          0                               1
## 170               0          0          0                               1
## 171               0          0          0                               1
## 172               0          0          0                               1
## 173               0          0          0                               1
## 174               0          0          0                               1
## 175               0          0          0                               1
## 176               0          0          0                               1
## 177               0          0          0                               1
## 178               0          0          0                               1
## 179               0          0          0                               1
## 180               0          0          0                               1
## 181               0          0          0                               1
## 182               0          0          0                               1
## 183               0          0          0                               1
## 184               0          0          0                               1
## 185               0          0          0                               1
## 186               0          0          0                               2
## 187               0          0          0                               2
## 188               0          0          0                               2
## 189               0          0          0                               3
## 190               0          0          0                               3
## 191               0          0          0                               3
## 192               0          0          0                               3
## 193               0          0          0                               3
## 194               0          0          0                               2
## 195               0          0          0                               2
## 196               0          0          0                               1
## 197               0          0          0                               1
## 198               0          0          0                               1
## 199               0          0          0                               1
## 200               0          0          0                               1
## 201               0          0          0                               1
## 202               0          0          0                               1
## 203               0          0          0                               1
## 204               0          0          0                               1
## 205               0          0          0                               1
## 206               0          0          0                               2
## 207               0          0          0                               1
## 208               0          0          0                               1
## 209               0          0          0                               1
## 210               0          0          0                               1
## 211               0          0          0                               1
## 212               0          0          0                               1
## 213               0          0          0                               1
## 214               0          0          0                               1
## 215               0          0          0                               1
## 216               0          0          0                               1
## 217               0          0          0                               1
## 218               0          0          0                               1
## 219               0          0          0                               1
## 220               0          0          0                               1
## 221               0          0          0                               1
## 222               0          0          0                               1
## 223               0          0          0                               1
## 224               0          0          0                               1
## 225               0          0          0                               1
## 226               0          0          0                               2
## 227               0          0          0                               2
## 228               0          0          0                               2
## 229               0          0          0                               2
## 230               0          0          0                               2
## 231               0          0          0                               2
## 232               0          0          0                               2
## 233               0          0          0                               2
## 234               0          0          0                               2
## 235               0          0          0                               2
## 236               0          0          0                               2
## 237               0          0          0                               2
## 238               0          0          0                               2
## 239               0          0          0                               2
## 240               0          0          0                               2
## 241               0          0          0                               2
## 242               0          0          0                               2
## 243               0          0          0                               2
## 244               0          0          0                               2
## 245               0          0          0                               2
## 246               0          0          0                               2
## 247               0          0          0                               2
## 248               0          0          0                               2
## 249               0          0          0                               2
## 250               0          0          0                               2
## 251               0          0          0                               2
## 252               0          0          0                               2
## 253               0          0          0                               2
## 254               0          0          0                               2
## 255               0          0          0                               2
## 256               0          0          0                               1
## 257               0          0          0                               1
## 258               0          0          0                               2
## 259               0          0          0                               2
## 260               0          0          0                               2
## 261               0          0          0                               2
## 262               0          0          0                               2
## 263               0          0          0                               2
## 264               0          0          0                               2
## 265               0          0          0                               2
## 266               0          0          0                               2
## 267               0          0          0                               2
## 268               0          0          0                               2
## 269               0          0          0                               1
## 270               0          0          0                               1
## 271               0          0          0                               1
## 272               0          0          0                               1
## 273               0          0          0                               3
## 274               0          0          0                               3
## 275               0          0          0                               3
## 276               0          0          0                               3
## 277               0          0          0                               3
## 278               0          0          0                               3
## 279               0          0          0                               1
## 280               0          0          0                               1
## 281               0          0          0                               1
## 282               0          1          0                               2
## 283               0          0          0                               1
## 284               0          1          0                               2
## 285               0          0          0                               1
## 286               0          0          0                               2
## 287               0          0          0        Managers               1
## 288               0          0          0         Drivers               2
## 289               0          0          0        Laborers               1
total_pensioners_rented <- nrow(pensioners_rented)
cat("Total Pensioners in Rented Housing:",total_pensioners_rented)
## Total Pensioners in Rented Housing: 289
head(pensioners_rented[,c("ID","NAME_INCOME_TYPE", "NAME_HOUSING_TYPE", "AMT_INCOME_TOTAL")])
##        ID NAME_INCOME_TYPE NAME_HOUSING_TYPE AMT_INCOME_TOTAL
## 1 5009033        Pensioner  Rented apartment           255150
## 2 5009034        Pensioner  Rented apartment           255150
## 3 5009035        Pensioner  Rented apartment           255150
## 4 5009036        Pensioner  Rented apartment           255150
## 5 5009037        Pensioner  Rented apartment           255150
## 6 5009038        Pensioner  Rented apartment           255150
  • Interpretation: This filter isolates a group whose income must cover both rent and potential credit card repayments. Pensioners typically have a steady, guaranteed income, but living in a rented property indicates a lack of permanent housing assets.

Question 2.4: Filter the credit record to isolate high-risk users who have reached "Status 5" indicating a serious default of over 150 days.

# Status 5 means  critical default
high_risk_defaults <- credit_data %>% 
  filter(STATUS == '5')

total_high_risk_records <- nrow(high_risk_defaults)
unique_defaulters_count <- length(unique(high_risk_defaults$ID))

cat("Total entries with status 5: ", total_high_risk_records)
## Total entries with status 5:  1693
cat("\nActual unique customers in default:",unique_defaulters_count)
## 
## Actual unique customers in default: 195
  • Interpretation: “Status 5” represents the highest level of payment delay, where an applicant has failed to pay for over 150 days, often leading to a “Written-off” status. The 1,693 matching rows are monthly entries belonging to 195 distinct customers, so this filter isolates the “Bad” category of our population.
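
The gap between 1,693 entries and 195 customers (about 8.7 entries per customer on average) arises because credit_record holds one row per customer per month. The sketch below illustrates the distinction on a small made-up table; the IDs and values in `toy_credit` are hypothetical and are not taken from credit_record.csv.

```r
library(dplyr)

# Toy credit history (hypothetical IDs and values, for illustration only)
toy_credit <- data.frame(
  ID             = c(1, 1, 1, 2, 2, 3),
  MONTHS_BALANCE = c(0, -1, -2, 0, -1, 0),
  STATUS         = c("5", "5", "4", "5", "C", "0"),
  stringsAsFactors = FALSE
)

toy_status5 <- toy_credit %>% filter(STATUS == "5")

n_rows      <- nrow(toy_status5)           # monthly entries in status 5
n_customers <- n_distinct(toy_status5$ID)  # distinct customers behind those entries

# Number of months each customer spent in status 5
months_in_default <- toy_status5 %>% count(ID, name = "Months_In_Status_5")

print(months_in_default)
```

Counting months per customer in this way could also help separate a single bad month from a long-running default, if that distinction matters later.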

Question 2.5: Calculate the count of applicants who demonstrate professional stability with an employment history of more than 5 years.

# Employment history filtering 
# 5 year = 1825 days
stable_employees <- app_data %>% 
  filter(DAYS_EMPLOYED < (-1825))

total_stable_applicants <- nrow(stable_employees)

cat("Total number of applicants with >5 year of employment:",total_stable_applicants)
## Total number of applicants with >5 year of employment: 188948
head(stable_employees[,c("ID", "DAYS_EMPLOYED", "NAME_INCOME_TYPE", "AMT_INCOME_TOTAL")])
##        ID DAYS_EMPLOYED     NAME_INCOME_TYPE AMT_INCOME_TOTAL
## 1 5008804         -4542              Working           427500
## 2 5008805         -4542              Working           427500
## 3 5008808         -3051 Commercial associate           270000
## 4 5008809         -3051 Commercial associate           270000
## 5 5008810         -3051 Commercial associate           270000
## 6 5008811         -3051 Commercial associate           270000
  • Interpretation: The duration of employment is a primary indicator of financial stability and reliable repayment behavior:
    • Applicants in this group are generally considered lower risk because a long tenure at a job suggests a steady income stream and a lower probability of sudden unemployment.
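
Because DAYS_EMPLOYED counts backwards from the application date, "more than 5 years" translates into "less than -1825". A minimal base R sketch of that sign convention follows; the vector `toy_days` is a made-up sample, chosen to include a boundary case and the large positive value that appears in the Pensioner rows.

```r
# Hypothetical DAYS_EMPLOYED values (illustration only)
toy_days <- c(-4542, -1134, -1825, 365243)

# Negative offsets converted to positive years of employment
years <- -toy_days / 365

# Same condition as the filter above: strictly more than 5 years
is_stable <- toy_days < -1825

print(data.frame(toy_days, years = round(years, 2), is_stable))
```

Two consequences are visible here: an applicant with exactly 1825 days is not counted, and the positive value 365243 is excluded automatically because it is not below -1825.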

                    **Detail Report**

      The initial phases of our analysis established the data’s structural integrity and customer segmentation, starting with the removal of 47 duplicate records to ensure high data quality. Through Level 2 filtering, we identified high-value segments, such as the Top 10 high-income earners and asset-rich applicants (Car/Home owners), while isolating the 195 customers with "Status 5" critical defaults to form the foundation of our risk assessment. The audit also showed a professionally stable applicant pool: 188,948 applicants have more than 5 years of employment. By reading the negative values in the DAYS_EMPLOYED column as day offsets counted back from the application date, we have prepared the dataset for feature engineering and predictive credit scoring.

Level 3: Advanced Data Transformation & Insights

Question 3.1: Calculate the average income for each education level while simultaneously engineering a new Age feature (derived from DAYS_BIRTH) to determine how maturity and education interact to influence an applicant’s financial capacity.

app_data <- app_data %>% 
  mutate(Age=round(abs(DAYS_BIRTH)/365,0))

# Grouping by education and calculating mean Income and Age
edu_profile<-app_data %>% 
  group_by(NAME_EDUCATION_TYPE) %>% 
  summarise(
    Average_Income = round(mean(AMT_INCOME_TOTAL,na.rm = TRUE),2),
    Average_Age = round(mean(Age,na.rm=TRUE),1),
    Total_Applicants = n()
  ) %>% 
  arrange(desc(Average_Income))

print(edu_profile)
## # A tibble: 5 × 4
##   NAME_EDUCATION_TYPE           Average_Income Average_Age Total_Applicants
##   <chr>                                  <dbl>       <dbl>            <int>
## 1 Academic degree                      240692.        44.9              312
## 2 Higher education                     226110.        41.5           117512
## 3 Incomplete higher                    207330.        35.4            14847
## 4 Secondary / secondary special        172057.        45.1           301788
## 5 Lower secondary                      143934.        48.2             4051
  • Interpretation: The analysis suggests that higher education acts as a financial accelerator: applicants with Higher education or Incomplete higher education (averaging 35 to 41 years old) out-earn the older Lower secondary group (averaging 48 years old) by a wide margin, roughly 207,000 to 226,000 against 144,000. In this summary, education level separates average income more clearly than average age does, although group averages alone cannot prove that a degree is the better predictor. For credit risk assessment, these younger, highly educated segments are worth prioritizing, since they combine high earning capacity with a longer potential relationship with the bank.

Question 3.2: Handle the outlier to create a clean Years_of_Experience variable and analyze the volume of applications across different employment types.

outliers_check <- app_data %>% 
  group_by(NAME_INCOME_TYPE) %>% 
  summarise(
    Min_Days = min(DAYS_EMPLOYED),
    Max_Days = max(DAYS_EMPLOYED),
    Avg_Days = mean(DAYS_EMPLOYED)
  )
cat("Outliers are :","\n")
## Outliers are :
print(outliers_check)
## # A tibble: 5 × 4
##   NAME_INCOME_TYPE     Min_Days Max_Days Avg_Days
##   <chr>                   <int>    <int>    <dbl>
## 1 Commercial associate   -16495      -12   -2313.
## 2 Pensioner              -11662   365243  364443.
## 3 State servant          -16767      -16   -3569.
## 4 Student                 -3904     -382   -2468.
## 5 Working                -17531      -12   -2610.
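  • Interpretation: We performed a diagnostic check, which revealed a massive outlier of 365,243 in the DAYS_EMPLOYED column, tied specifically to the Pensioner category. Taken literally, this value would mean about 1,000 years of employment (365,243 / 365 ≈ 1,000.7), which is impossible; it is also positive, whereas genuine employment durations in this column are negative.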
app_data <- app_data %>% 
  mutate(Years_of_Experience = ifelse(DAYS_EMPLOYED > 0 ,0 ,abs(DAYS_EMPLOYED)/365))

employment_analysis <- app_data %>% 
  group_by(NAME_INCOME_TYPE) %>% 
  summarise(
    Application_Count = n(),
    Average_Years_Experience = round(mean(Years_of_Experience,na.rm = TRUE),1)) %>% 
  arrange(desc(Application_Count))
  

print(employment_analysis)
## # A tibble: 5 × 3
##   NAME_INCOME_TYPE     Application_Count Average_Years_Experience
##   <chr>                            <int>                    <dbl>
## 1 Working                         226076                      7.2
## 2 Commercial associate            100744                      6.3
## 3 Pensioner                        75488                      0  
## 4 State servant                    36185                      9.8
## 5 Student                             17                      6.8
  • Interpretation: We neutralized this anomaly by treating positive values as zero and converting the remaining negative offsets into a readable Years_of_Experience feature. This correction is why the Pensioner average rounds to 0 years; without it, the 365,243 value would dominate every average it touches.
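
Replacing positive values with zero is the approach taken above. One possible alternative, shown here only as a sketch and not as the method used in this report, is to store NA instead, so that group averages are computed from genuine durations only. The table `toy_app` and the column names `Years_Zero`, `Years_NA`, and `Share_Placeholder` are hypothetical.

```r
library(dplyr)

# Toy applicants (made-up values, for illustration only)
toy_app <- data.frame(
  NAME_INCOME_TYPE = c("Pensioner", "Pensioner", "Pensioner", "Working"),
  DAYS_EMPLOYED    = c(365243, 365243, -3650, -1825)
)

toy_summary <- toy_app %>%
  mutate(
    # Approach used in the report: positive values become 0
    Years_Zero = ifelse(DAYS_EMPLOYED > 0, 0, abs(DAYS_EMPLOYED) / 365),
    # Alternative: positive values become missing
    Years_NA   = ifelse(DAYS_EMPLOYED > 0, NA_real_, abs(DAYS_EMPLOYED) / 365)
  ) %>%
  group_by(NAME_INCOME_TYPE) %>%
  summarise(
    Mean_Zero         = mean(Years_Zero),
    Mean_NA           = mean(Years_NA, na.rm = TRUE),
    Share_Placeholder = mean(DAYS_EMPLOYED > 0)  # fraction of rows with a positive value
  )

print(toy_summary)
```

With zeros, the Pensioner mean is pulled toward 0 by the replaced rows; with NA, it reflects only the rows that carry a real duration. Which behaviour is preferable depends on how the feature is used downstream.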

Question 3.3: Household Per-Capita Financial Assessment: Calculate the median income based on family status and engineer the Income_per_Member feature to evaluate the actual disposable income available per person in the household.

app_data <- app_data %>%
  mutate(Income_per_Member = AMT_INCOME_TOTAL / CNT_FAM_MEMBERS)

family_financial_summary <- app_data %>% 
  group_by(NAME_FAMILY_STATUS) %>% 
  summarise(
    Median_Total_Income = median(AMT_INCOME_TOTAL,na.rm = TRUE),
    Average_Income_Per_Member = round(mean(Income_per_Member,na.rm=TRUE),2),
    Applicant_Count = n()
    ) %>% 
  
  arrange(desc(Average_Income_Per_Member))


print(family_financial_summary)
## # A tibble: 5 × 4
##   NAME_FAMILY_STATUS  Median_Total_Income Average_Income_Per_M…¹ Applicant_Count
##   <chr>                             <dbl>                  <dbl>           <int>
## 1 Single / not marri…              180000                175225.           55258
## 2 Separated                        180000                167838.           27251
## 3 Widow                            157500                162607.           19674
## 4 Civil marriage                   180000                 85780.           36529
## 5 Married                          157500                 79668.          299798
## # ℹ abbreviated name: ¹​Average_Income_Per_Member
  • Interpretation: Single and Separated applicants show the highest average income per person (about 175,000 and 168,000), roughly double the figure for Married and Civil marriage applicants (about 80,000 and 86,000). Total income does not explain the gap: the median total income of Married applicants (157,500) is actually lower than that of Single applicants (180,000), and it must also be divided among more household members. For the bank, this “Per-Capita” metric is a useful complement to total income when judging repayment capacity, as it points to applicants likely to have more surplus cash at the end of the month to service their debt.
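
The per-member calculation is a simple division, which the following base R sketch works through on three made-up households (`toy_households` is hypothetical). The guard against a zero member count is an added assumption; the report's own code divides directly.

```r
# Toy households (made-up values, for illustration only)
toy_households <- data.frame(
  NAME_FAMILY_STATUS = c("Single / not married", "Married", "Married"),
  AMT_INCOME_TOTAL   = c(180000, 157500, 157500),
  CNT_FAM_MEMBERS    = c(1, 2, 3)
)

# Divide income by household size; return NA rather than Inf if a size of 0 ever appears
toy_households$Income_per_Member <- with(
  toy_households,
  ifelse(CNT_FAM_MEMBERS > 0, AMT_INCOME_TOTAL / CNT_FAM_MEMBERS, NA_real_)
)

print(toy_households)
```

The same total income of 157,500 yields 78,750 per person in a two-member household and 52,500 in a three-member household, which is the effect the summary table reflects.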

Question 3.4: Risk Target Engineering: Implement the logic to create the binary Is_Bad target column (1 for high-risk, 0 for healthy) using the credit_record status history to prepare the data for predictive modeling.

credit_labels <- credit_data %>% 
  mutate(Is_Bad_Flag = ifelse(STATUS %in% c("1", "2", "3", "4", "5"),1,0)) %>% 
  group_by(ID) %>% 
  summarise(Is_Bad = max(Is_Bad_Flag))


summary_table <- table(credit_labels$Is_Bad)
print(summary_table)
## 
##     0     1 
## 40635  5350
housing_profile <- app_data %>%
  group_by(NAME_HOUSING_TYPE) %>%
  summarise(Avg_Age = round(mean(Age, na.rm = TRUE), 1))

print(housing_profile)
## # A tibble: 6 × 2
##   NAME_HOUSING_TYPE   Avg_Age
##   <chr>                 <dbl>
## 1 Co-op apartment        39.3
## 2 House / apartment      44.5
## 3 Municipal apartment    45.5
## 4 Office apartment       40.2
## 5 Rented apartment       37  
## 6 With parents           32.6
  • Interpretation: This step is where we decide who is a “Good” or “Bad” customer. We looked at every customer’s payment history: if any month carries a STATUS of 1 through 5, we marked the customer as Risky (1); customers whose history contains no such status are marked Safe (0). The result is 5,350 Risky and 40,635 Safe customers. This “tag” is the most important part of the project because it is the target that any predictive model will learn from.
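
The labelling rule works in two stages: flag each monthly row, then keep the worst flag per customer. The sketch below repeats the same logic on a tiny made-up history (`toy_history` and its IDs are hypothetical) so the effect of `max()` is easy to follow.

```r
library(dplyr)

# Toy payment history (made-up values, for illustration only)
toy_history <- data.frame(
  ID     = c(101, 101, 101, 102, 102, 103),
  STATUS = c("0", "1", "C", "C", "X", "0"),
  stringsAsFactors = FALSE
)

toy_labels <- toy_history %>%
  # Row level: 1 if this month carries a status of "1" through "5"
  mutate(Is_Bad_Flag = ifelse(STATUS %in% c("1", "2", "3", "4", "5"), 1, 0)) %>%
  group_by(ID) %>%
  # Customer level: one flagged month is enough to label the customer as bad
  summarise(Is_Bad = max(Is_Bad_Flag))

print(toy_labels)
```

Customer 101 is labelled 1 because of a single month with status "1", even though the other two months are not flagged; customers 102 and 103 stay at 0.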

Question 3.5: Master Dataset Integration: Perform a robust left_join to merge the demographic and credit datasets and generate a final summary of the “Good” vs “Bad” class distribution to check for data imbalance.

# Removing those who don't have a credit record
master_data <- app_data %>%
  left_join(credit_labels, by = "ID") %>%
  filter(!is.na(Is_Bad)) 


# This tells us if our data is 'Imbalanced' (more Good than Bad)
final_summary <- master_data %>%
  group_by(Is_Bad) %>%
  summarise(
    Total_Count = n(),
    Percentage = round((n() / nrow(master_data)) * 100, 2)
  )


print(final_summary)
## # A tibble: 2 × 3
##   Is_Bad Total_Count Percentage
##    <dbl>       <int>      <dbl>
## 1      0       32166       88.2
## 2      1        4291       11.8
  • Interpretation: This is the final step, where we put all the pieces of the puzzle together. We have joined our two separate files into one Master Dataset, so for every person we can see their Age, Income, and Education right next to their “Good” or “Bad” tag. The summary shows 88.2% Good against 11.8% Bad; this uneven split is known as Class Imbalance.
print(colnames(master_data))
##  [1] "ID"                  "CODE_GENDER"         "FLAG_OWN_CAR"       
##  [4] "FLAG_OWN_REALTY"     "CNT_CHILDREN"        "AMT_INCOME_TOTAL"   
##  [7] "NAME_INCOME_TYPE"    "NAME_EDUCATION_TYPE" "NAME_FAMILY_STATUS" 
## [10] "NAME_HOUSING_TYPE"   "DAYS_BIRTH"          "DAYS_EMPLOYED"      
## [13] "FLAG_MOBIL"          "FLAG_WORK_PHONE"     "FLAG_PHONE"         
## [16] "FLAG_EMAIL"          "OCCUPATION_TYPE"     "CNT_FAM_MEMBERS"    
## [19] "Age"                 "Years_of_Experience" "Income_per_Member"  
## [22] "Is_Bad"
head(master_data)
##        ID CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN
## 1 5008804           M            Y               Y            0
## 2 5008805           M            Y               Y            0
## 3 5008806           M            Y               Y            0
## 4 5008808           F            N               Y            0
## 5 5008809           F            N               Y            0
## 6 5008810           F            N               Y            0
##   AMT_INCOME_TOTAL     NAME_INCOME_TYPE           NAME_EDUCATION_TYPE
## 1           427500              Working              Higher education
## 2           427500              Working              Higher education
## 3           112500              Working Secondary / secondary special
## 4           270000 Commercial associate Secondary / secondary special
## 5           270000 Commercial associate Secondary / secondary special
## 6           270000 Commercial associate Secondary / secondary special
##     NAME_FAMILY_STATUS NAME_HOUSING_TYPE DAYS_BIRTH DAYS_EMPLOYED FLAG_MOBIL
## 1       Civil marriage  Rented apartment     -12005         -4542          1
## 2       Civil marriage  Rented apartment     -12005         -4542          1
## 3              Married House / apartment     -21474         -1134          1
## 4 Single / not married House / apartment     -19110         -3051          1
## 5 Single / not married House / apartment     -19110         -3051          1
## 6 Single / not married House / apartment     -19110         -3051          1
##   FLAG_WORK_PHONE FLAG_PHONE FLAG_EMAIL OCCUPATION_TYPE CNT_FAM_MEMBERS Age
## 1               1          0          0                               2  33
## 2               1          0          0                               2  33
## 3               0          0          0  Security staff               2  59
## 4               0          1          1     Sales staff               1  52
## 5               0          1          1     Sales staff               1  52
## 6               0          1          1     Sales staff               1  52
##   Years_of_Experience Income_per_Member Is_Bad
## 1           12.443836            213750      1
## 2           12.443836            213750      1
## 3            3.106849             56250      0
## 4            8.358904            270000      0
## 5            8.358904            270000      0
## 6            8.358904            270000      0
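
The master dataset keeps only applicants that have a label, which is why it holds 36,457 rows (32,166 + 4,291). A left_join followed by filter(!is.na(Is_Bad)) keeps the same rows as an inner_join, as the sketch below checks on two made-up tables (`toy_app` and `toy_labels` are hypothetical); anti_join is included as one way to inspect what was dropped.

```r
library(dplyr)

# Toy tables (made-up values, for illustration only)
toy_app    <- data.frame(ID = c(1, 2, 3, 4), Age = c(33, 59, 52, 41))
toy_labels <- data.frame(ID = c(1, 2, 5), Is_Bad = c(1, 0, 0))

# Approach used in the report: left join, then drop rows without a label
via_left <- toy_app %>%
  left_join(toy_labels, by = "ID") %>%
  filter(!is.na(Is_Bad))

# Equivalent shortcut: keep only IDs present in both tables
via_inner <- toy_app %>% inner_join(toy_labels, by = "ID")

# Applicants with no credit history, i.e. the rows removed by the filter
unmatched <- toy_app %>% anti_join(toy_labels, by = "ID")

print(via_left)
print(unmatched)
```

Keeping the left_join form makes it easy to count the unlabelled applicants before discarding them, which may be worth reporting alongside the class distribution.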

Level 4: Exploratory Data Analysis (EDA)

Question 4.1: Target Class Frequency Analysis:

  • Task: Generate a frequency bar chart to visualize the distribution of the Is_Bad target variable. This analysis is essential to identify Class Imbalance, which determines the baseline accuracy required for future predictive modeling.
  • Objective: To determine the ratio of healthy (0) vs. risky (1) accounts in the master dataset.
ggplot(master_data, aes(x = as.factor(Is_Bad), fill = as.factor(Is_Bad))) +
  geom_bar(color = "black", width = 0.6) +
  # Adding count labels on top of bars
  # after_stat(count) replaces the ..count.. notation deprecated in ggplot2 3.4.0
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.3, size = 4) +
  scale_fill_manual(values = c("0" = "skyblue", "1" = "orange"), 
                    labels = c("Healthy (0)", "Risky (1)")) +
  labs(title = "Visualizing Class Imbalance in Applicant Data",
       x = "Credit Category (0 = Good, 1 = Bad)",
       y = "Number of Applicants",
       fill = "Legend") +
  theme_minimal()

  • Interpretation: We can observe a clear Class Imbalance: healthy customers (32,166) significantly outweigh risky ones (4,291). Identifying this imbalance is crucial because a predictive model trained on such data can become biased toward the majority class. The chart therefore tells us to evaluate models with metrics such as Precision and Recall instead of relying on overall accuracy alone.
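
Why accuracy alone can mislead is easiest to see with a worked calculation. The base R sketch below takes the class counts from the summary table and scores a hypothetical "model" that labels every applicant as healthy; the variable names are illustrative only.

```r
# Class counts from the final summary table
n_good <- 32166
n_bad  <- 4291
total  <- n_good + n_bad

# A naive baseline that predicts 0 (healthy) for everyone
true_positives  <- 0       # risky customers correctly flagged
false_positives <- 0       # healthy customers wrongly flagged
false_negatives <- n_bad   # risky customers that were missed

baseline_accuracy  <- n_good / total
baseline_recall    <- true_positives / (true_positives + false_negatives)
baseline_precision <- true_positives / (true_positives + false_positives)  # 0 / 0, undefined

cat("Accuracy :", round(baseline_accuracy * 100, 1), "%\n")
cat("Recall   :", baseline_recall, "\n")
cat("Precision:", baseline_precision, "\n")
```

The baseline reaches about 88% accuracy while catching none of the risky customers, so any real model has to beat that figure and show non-zero recall to be useful.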

Question 4.2: Demographic Age Distribution and Skewness:

  • Task: Construct a histogram to analyze the age distribution of the applicant pool using the engineered Age feature. Apply a density overlay to observe the data spread and identify the peak age group for credit applications.
  • Objective: To understand the life-stage and maturity profile of the primary customer base.
# Creating the Age Histogram with KDE Curve
ggplot(master_data, aes(x = Age)) +
  # after_stat(density) replaces the ..density.. notation deprecated in ggplot2 3.4.0
  geom_histogram(aes(y = after_stat(density)), binwidth = 2, fill = "lightgreen", color = "white") +
  geom_density(alpha = 0.3,  color = "red") +
  # Reference line for Average Age (linewidth replaces size for lines since ggplot2 3.4.0)
  geom_vline(aes(xintercept = mean(Age)), color = "black", linetype = "dashed", linewidth = 1) +
  labs(title = "Applicant Age Distribution",
       subtitle = "Identifying the Peak Age Group for Credit Seeking",
       x = "Age (Years)",
       y = "Density",
       caption = "Black line represents the Average Age") +
  theme_classic()

  • Interpretation: By combining a histogram with a KDE (density) curve, we can observe that the majority of credit seekers fall within the 30 to 50 age bracket. From a risk perspective, this is an encouraging sign for the bank: applicants aged 40-50 are traditionally considered low-risk because of job stability and financial experience. The chart itself shows only the age distribution, however, so the link between maturity and lower credit risk remains an assumption until it is checked against Is_Bad.
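
One way to check the age and risk relationship would be to compute the share of risky customers per age band. The base R sketch below shows the mechanics on a made-up table; `toy` and its values are hypothetical, so the resulting rates say nothing about the real dataset, and the band edges are an arbitrary choice.

```r
# Toy applicants (made-up ages and labels, for illustration only)
toy <- data.frame(
  Age    = c(25, 28, 34, 37, 42, 45, 48, 55, 61, 33),
  Is_Bad = c(1,  0,  0,  1,  0,  0,  0,  0,  1,  0)
)

# Assign each applicant to an age band; right = FALSE gives intervals like [40,50)
toy$Age_Band <- cut(toy$Age, breaks = c(20, 30, 40, 50, 70), right = FALSE)

# Mean of a 0/1 flag within a band is the share of risky customers in that band
bad_rate <- tapply(toy$Is_Bad, toy$Age_Band, mean)

print(round(bad_rate, 2))
```

Applied to master_data, the same two steps would produce one bad rate per band, which could then be compared with the expectation stated in the interpretation.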

Question 4.3: Financial Variance & Outliers Detection:

  • Task: Create a categorical boxplot comparing AMT_INCOME_TOTAL across different NAME_HOUSING_TYPE categories. Utilize a log-scale to normalize high-income variations and identify significant financial outliers.
  • Objective: To investigate the relationship between housing stability and income levels.

# 3. Boxplot: Income vs Housing Type
ggplot(master_data, aes(x = NAME_HOUSING_TYPE, y = AMT_INCOME_TOTAL, fill = NAME_HOUSING_TYPE)) +
  geom_boxplot(outlier.color = "red", outlier.shape = 16, outlier.size = 2) +
  # Using Log scale to normalize the huge income gaps
  scale_y_log10() + 
  # Using simple, distinct color names for clarity
  scale_fill_manual(values = c("House / apartment" = "skyblue", 
                               "With parents" = "gold", 
                               "Municipal apartment" = "lightpink",
                               "Rented apartment" = "lightgreen",
                               "Office apartment" = "orange",
                               "Co-op apartment" = "plum")) +
  labs(title = "Income Distribution by Housing Type",
       subtitle = "Comparing Financial Strength across Living Situations",
       x = "Housing Category",
       y = "Total Income (Log Scale)") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), # Rotating names for neatness
        legend.position = "none") # Removing legend since X-axis labels are enough