Introduction

A reputed finance company deals in home loans. It has a wide presence across all types of communities: Urban, Semi-Urban and Rural. It has introduced a low-interest loan scheme to attract more customers.

As a result, the company has received an alarming number of applications that have to be processed. It wants to automate the task of approving loans in real time, so that a customer fills in their data and is immediately notified whether he/she is eligible for a loan.

Data fields explanation

Variable - Description

Loan_ID - Unique Loan ID

Gender - Male/ Female

Married - Applicant married (Y/N)

Dependents - Number of dependents

Education - Applicant Education (Graduate/ Under Graduate)

Self_Employed - Self employed (Y/N)

ApplicantIncome - Applicant income

CoapplicantIncome - Coapplicant income

LoanAmount - Loan amount in thousands

Loan_Amount_Term - Term of loan in months

Credit_History - Credit history meets guidelines

Property_Area - Urban/ Semi Urban/ Rural

Loan_Status - Loan approved (Y/N)

Solution

Step 1 :

Preprocess the data, which involves filling null data fields based on how each field's data behaves.
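One common way to fill nulls is sketched below, assuming pandas. The sample rows are made up, and the choice of median for LoanAmount and mode for the other columns is an illustrative assumption, not necessarily the rule used in this project.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; only the column names come from the data dictionary.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "Self_Employed": ["No", None, "No", "Yes"],
    "LoanAmount": [128.0, np.nan, 66.0, 120.0],
    "Loan_Amount_Term": [360.0, 360.0, np.nan, 180.0],
    "Credit_History": [1.0, np.nan, 1.0, 0.0],
})

# Assumption: skewed amounts get the median, category-like fields get the mode.
MEDIAN_COLS = ["LoanAmount"]
MODE_COLS = ["Gender", "Self_Employed", "Loan_Amount_Term", "Credit_History"]


def fill_missing(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with nulls filled column by column."""
    out = frame.copy()
    for col in MEDIAN_COLS:
        out[col] = out[col].fillna(out[col].median())
    for col in MODE_COLS:
        # mode() can return several values; take the first one.
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


clean = fill_missing(df)
print(clean.isna().sum())
```

On real data it is worth checking each column's distribution before choosing between mean, median and mode.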

Step 2 :

Perform exploratory data analysis, such as calculating the mean, median and mode, and visualising the data using different parameters.
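A minimal sketch of these summary statistics, assuming pandas and a few made-up rows. The grouping by Property_Area is just one example of looking at the data through a particular parameter.

```python
import pandas as pd

# Made-up sample rows; only the column names come from the data dictionary.
df = pd.DataFrame({
    "ApplicantIncome": [5849, 4583, 3000, 2583, 6000, 3000],
    "LoanAmount": [128.0, 128.0, 66.0, 120.0, 141.0, 95.0],
    "Property_Area": ["Urban", "Rural", "Urban", "Urban", "Semiurban", "Rural"],
    "Loan_Status": ["Y", "N", "Y", "Y", "Y", "N"],
})

# Central tendency of a numeric column.
income_mean = df["ApplicantIncome"].mean()
income_median = df["ApplicantIncome"].median()
income_mode = df["ApplicantIncome"].mode().iloc[0]

# Share of approved loans per property area.
approval_by_area = (
    df.assign(approved=df["Loan_Status"].eq("Y"))
    .groupby("Property_Area")["approved"]
    .mean()
)

print(income_mean, income_median, income_mode)
print(approval_by_area)
```

If matplotlib is installed, a series such as approval_by_area can be drawn with its plot(kind="bar") method.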

Step 3 :

Run significance tests, such as the Chi-square test, on the different categorical variables.
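As a rough illustration, the Chi-square test of independence between one categorical variable and Loan_Status could look like the sketch below. It assumes SciPy's chi2_contingency and uses made-up rows; the choice of Credit_History as the variable is only an example.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up sample rows; only the column names come from the data dictionary.
df = pd.DataFrame({
    "Credit_History": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "Loan_Status": ["Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N", "Y", "Y", "N"],
})

# Contingency table of observed counts.
table = pd.crosstab(df["Credit_History"], df["Loan_Status"])

# Null hypothesis: the two variables are independent.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2={chi2:.3f}, p={p_value:.4f}, dof={dof}")
```

With so few rows the expected counts are small, so the p-value here is not meaningful. The sketch only shows the mechanics.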

Step 4 :

Run hypothesis tests using Student's t-test.
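A two-sample Student's t-test can be sketched as follows, assuming SciPy's ttest_ind. The rows are made up, and comparing ApplicantIncome between approved and rejected loans is a hypothetical choice of hypothesis.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Made-up sample rows; only the column names come from the data dictionary.
df = pd.DataFrame({
    "ApplicantIncome": [5849, 3000, 2583, 6000, 5417, 4583, 3036, 1853, 2333],
    "Loan_Status": ["Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N"],
})

approved = df.loc[df["Loan_Status"] == "Y", "ApplicantIncome"]
rejected = df.loc[df["Loan_Status"] == "N", "ApplicantIncome"]

# Null hypothesis: both groups have the same mean income.
# equal_var=True gives the classic Student's t-test (pooled variance).
t_stat, p_value = ttest_ind(approved, rejected, equal_var=True)

print(f"t={t_stat:.3f}, p={p_value:.4f}")
```

Passing equal_var=False instead runs Welch's t-test, which does not assume equal variances in the two groups.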

Step 5 :

Predict the status of loans using different models. The models we used here are:

1.) Logistic regression: predicted with 77% accuracy

2.) Support Vector Machine (SVM): predicted with 68% accuracy

The likely reason for the gap is that SVM tends to work better when there are more than 20 variables, and this dataset has fewer.
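The comparison of the two models can be sketched with scikit-learn as below. The data is randomly generated and the label rule is invented, so the scores it prints say nothing about the accuracies reported above. The pipeline layout, the RBF kernel and the train/test split are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic data; only the column names come from the data dictionary.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "ApplicantIncome": rng.integers(1500, 10000, n),
    "LoanAmount": rng.integers(50, 300, n),
    "Credit_History": rng.integers(0, 2, n),
    "Property_Area": rng.choice(["Urban", "Semiurban", "Rural"], n),
})
# Invented label rule: follows credit history, flipped for about 10% of rows.
noise = rng.random(n) < 0.1
df["Loan_Status"] = np.where((df["Credit_History"] == 1) ^ noise, "Y", "N")

X = df.drop(columns="Loan_Status")
y = df["Loan_Status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def make_preprocessor() -> ColumnTransformer:
    """Scale numeric columns, one-hot encode the categorical one."""
    return ColumnTransformer(
        [
            ("num", StandardScaler(), ["ApplicantIncome", "LoanAmount"]),
            ("cat", OneHotEncoder(handle_unknown="ignore"), ["Property_Area"]),
        ],
        remainder="passthrough",  # Credit_History is already 0/1
    )


models = {
    "logistic_regression": make_pipeline(
        make_preprocessor(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(make_preprocessor(), SVC(kernel="rbf")),
}

# Fit each model and record its accuracy on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

print(scores)
```

SVMs are sensitive to feature scale, which is why the numeric columns are standardised before fitting in this sketch.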