Mortgage Approval Data

This data set contains information on mortgage approvals. The variable descriptions are as follows.

Data Dictionary:
Loan_ID: Unique loan ID
Gender: Male/Female
Married: Yes/No
Dependents: Number of dependents
Education: Graduate/Not Graduate
Self_Employed: Yes/No
ApplicantIncome: Monthly income of the applicant
CoapplicantIncome: Monthly income of the coapplicant
LoanAmount: Loan amount applied for, in thousands of dollars
Loan_Amount_Term: Loan term in months
Credit_History: 1 if the applicant has a credit history, 0 otherwise
Property_Area: Urban, Semiurban, or Rural
Loan_Approved: 1 = Yes, 0 = No



rm(list=ls())
library(data.table)
library(ggplot2)
library(curl)
## Using libcurl 7.84.0 with Schannel

Q1: Read the loan_approval_data.csv as a data.table and print the first few rows of the data.table.

File: https://raw.githubusercontent.com/dratnadiwakara/fin4820/master/loan_approval_data.csv

##     Loan_ID Gender Married Dependents    Education Self_Employed
## 1: LP001002   Male      No          0     Graduate            No
## 2: LP001003   Male     Yes          1     Graduate            No
## 3: LP001005   Male     Yes          0     Graduate           Yes
## 4: LP001006   Male     Yes          0 Not Graduate            No
## 5: LP001008   Male      No          0     Graduate            No
## 6: LP001011   Male     Yes          2     Graduate           Yes
##    ApplicantIncome CoapplicantIncome LoanAmount Loan_Amount_Term Credit_History
## 1:            5849                 0         NA              360              1
## 2:            4583              1508        128              360              1
## 3:            3000                 0         66              360              1
## 4:            2583              2358        120              360              1
## 5:            6000                 0        141              360              1
## 6:            5417              4196        267              360              1
##    Property_Area Loan_Approved
## 1:         Urban             1
## 2:         Rural             0
## 3:         Urban             1
## 4:         Urban             1
## 5:         Urban             1
## 6:         Urban             1
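
A minimal sketch of how the read step might look, assuming data.table's fread() is used to fetch the CSV directly from the URL given in the question. The object name `loans` is chosen here for illustration and does not come from the original code.

```r
library(data.table)

# URL as given in the question text.
loan_url <- "https://raw.githubusercontent.com/dratnadiwakara/fin4820/master/loan_approval_data.csv"

# fread() accepts a URL and returns a data.table.
# `loans` is a hypothetical name used only in this sketch.
loans <- fread(loan_url)

# head() shows the first six rows by default.
head(loans)
```

fread() guesses column types on its own, so numeric columns such as LoanAmount should come back as numbers with missing entries shown as NA.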



Q2: How many loans in the sample were approved and how many were declined?

##    Loan_Approved   N
## 1:             1 422
## 2:             0 192
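
One way to get counts like these is to group by Loan_Approved and use data.table's `.N` symbol. The sketch below runs on a toy table built from the six rows printed in Q1 (the name `loans_sample` is an assumption made for illustration), so its counts differ from the full-sample output above; the same expression would apply to the full table.

```r
library(data.table)

# Toy sample: Loan_ID and Loan_Approved from the six rows printed in Q1.
loans_sample <- data.table(
  Loan_ID = c("LP001002", "LP001003", "LP001005",
              "LP001006", "LP001008", "LP001011"),
  Loan_Approved = c(1, 0, 1, 1, 1, 1)
)

# .N is the number of rows in each group.
approval_counts <- loans_sample[, .N, by = Loan_Approved]
approval_counts
```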



Q3: Distribution of Loan Amount and the mean loan amount

A vertical line drawn with geom_vline() marks the mean loan amount.

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 22 rows containing non-finite values (`stat_bin()`).

## [1] 146.4122
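
A hedged sketch of a histogram with a mean line, assuming ggplot2 3.4.0 or later (so `linewidth` is used rather than the deprecated `size`). It uses only the LoanAmount values from the six rows printed in Q1, under the hypothetical name `loans_sample`, so the mean it computes is not the full-sample figure shown above.

```r
library(data.table)
library(ggplot2)

# Toy sample: LoanAmount from the six rows printed in Q1 (first value is missing).
loans_sample <- data.table(LoanAmount = c(NA, 128, 66, 120, 141, 267))

# Mean of the non-missing loan amounts, in thousands of dollars.
mean_amount <- loans_sample[, mean(LoanAmount, na.rm = TRUE)]

# Histogram of loan amounts with a dashed vertical line at the mean.
# na.rm = TRUE drops missing values without a warning.
p <- ggplot(loans_sample, aes(x = LoanAmount)) +
  geom_histogram(bins = 30, na.rm = TRUE) +
  geom_vline(xintercept = mean_amount, linewidth = 1, linetype = "dashed")

print(p)
mean_amount
```

Setting `bins` or `binwidth` explicitly also avoids the `stat_bin()` message about the default of 30 bins.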



Q4: What is the approval rate for Males and Females?

##    Gender approval_rate
## 1:   Male     0.6932515
## 2: Female     0.6696429
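
Because Loan_Approved is coded 0/1, its mean within each Gender group is the approval rate. The sketch below assumes rows with a blank or missing Gender are dropped first, since the output above lists only Male and Female; that filter is a guess rather than something the original states. The toy table `loans_sample` (a hypothetical name) holds the six rows printed in Q1, all of which are Male.

```r
library(data.table)

# Toy sample: Gender and Loan_Approved from the six rows printed in Q1.
loans_sample <- data.table(
  Gender = c("Male", "Male", "Male", "Male", "Male", "Male"),
  Loan_Approved = c(1, 0, 1, 1, 1, 1)
)

# Assumption: exclude rows where Gender is NA or an empty string.
# The mean of a 0/1 indicator is the share of approved loans.
approval_by_gender <- loans_sample[
  !is.na(Gender) & Gender != "",
  .(approval_rate = mean(Loan_Approved)),
  by = Gender
]
approval_by_gender
```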



Q5: Scatter plot showing the correlation between income and loan amount. Restrict the sample to income less than 20,000 and LoanAmount less than 400.
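
A minimal sketch of the filtered scatter plot, assuming "income" refers to ApplicantIncome (the question does not name the column). The toy table `loans_sample` is a hypothetical name and holds the two relevant columns from the six rows printed in Q1.

```r
library(data.table)
library(ggplot2)

# Toy sample: ApplicantIncome and LoanAmount from the six rows printed in Q1.
loans_sample <- data.table(
  ApplicantIncome = c(5849, 4583, 3000, 2583, 6000, 5417),
  LoanAmount = c(NA, 128, 66, 120, 141, 267)
)

# Keep rows under both thresholds; rows where the condition is NA
# (missing LoanAmount) are dropped by data.table.
plot_data <- loans_sample[ApplicantIncome < 20000 & LoanAmount < 400]

# Income on the x-axis, loan amount (thousands of dollars) on the y-axis.
p <- ggplot(plot_data, aes(x = ApplicantIncome, y = LoanAmount)) +
  geom_point()

print(p)
```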



Q6: Show how the LoanAmount distribution differs across Urban, Semiurban, and Rural areas.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
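
One possible way to compare the distributions is a histogram split into one panel per Property_Area with facet_wrap(). The `stat_bin()` message above suggests a histogram was used, but the faceting is an assumption of this sketch. The toy table `loans_sample` (a hypothetical name) holds the six rows printed in Q1, which cover only Urban and Rural.

```r
library(data.table)
library(ggplot2)

# Toy sample: LoanAmount and Property_Area from the six rows printed in Q1.
loans_sample <- data.table(
  LoanAmount = c(NA, 128, 66, 120, 141, 267),
  Property_Area = c("Urban", "Rural", "Urban", "Urban", "Urban", "Urban")
)

# One histogram panel per property area, stacked vertically so the
# x-axes line up and the distributions can be compared directly.
p <- ggplot(loans_sample, aes(x = LoanAmount)) +
  geom_histogram(bins = 30, na.rm = TRUE) +
  facet_wrap(~ Property_Area, ncol = 1)

print(p)
```

Overlaid density curves coloured by Property_Area would be an alternative when the group sizes differ a lot.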