A reputed finance company deals in home loans. It has a wide presence across all types of communities: Urban, Semi-Urban and Rural. It has introduced a low-interest loan scheme to attract more customers.
As a result, the company received an alarming number of applications that have to be processed. It wants to automate the task of approving loans in real time,
where the customer fills in the data and is immediately notified whether he/she is eligible for a loan.
Variable - Description
Loan_ID - Unique Loan ID
Gender - Male/ Female
Married - Applicant married (Y/N)
Dependents - Number of dependents
Education - Applicant Education (Graduate/ Under Graduate)
Self_Employed - Self employed (Y/N)
ApplicantIncome - Applicant income
CoapplicantIncome - Coapplicant income
LoanAmount - Loan amount in thousands
Loan_Amount_Term - Term of loan in months
Credit_History - Credit history meets guidelines
Property_Area - Urban/ Semi Urban/ Rural
Loan_Status - Loan approved (Y/N)
Preprocessing the data, which involves filling null fields based on the behaviour of the data
Doing exploratory data analysis, such as calculating the mean, median and mode, and visualising the data using different parameters
Running significance tests, such as the Chi-Square test, on different categorical variables
Running hypothesis tests using Student's t-test
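The preprocessing and testing steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in this project: the toy rows are made up, and the helper names (fill_missing, chi_square_2x2, student_t) are hypothetical. It assumes numeric nulls are filled with the median and categorical nulls with the mode, runs a Pearson Chi-Square test without continuity correction on a 2x2 table, and computes only the pooled-variance Student's t statistic (no p-value).

```python
import math
from collections import Counter
from statistics import mean, median, mode, variance

# Hypothetical toy rows (NOT the real dataset); None marks a missing value.
rows = [
    {"LoanAmount": 120.0, "ApplicantIncome": 5000, "Credit_History": 1, "Loan_Status": "Y"},
    {"LoanAmount": None, "ApplicantIncome": 4200, "Credit_History": 1, "Loan_Status": "Y"},
    {"LoanAmount": 100.0, "ApplicantIncome": 6100, "Credit_History": 1, "Loan_Status": "Y"},
    {"LoanAmount": 150.0, "ApplicantIncome": 3900, "Credit_History": None, "Loan_Status": "Y"},
    {"LoanAmount": 80.0, "ApplicantIncome": 2500, "Credit_History": 1, "Loan_Status": "N"},
    {"LoanAmount": 200.0, "ApplicantIncome": 3100, "Credit_History": 0, "Loan_Status": "N"},
    {"LoanAmount": 60.0, "ApplicantIncome": 1800, "Credit_History": 0, "Loan_Status": "N"},
    {"LoanAmount": 110.0, "ApplicantIncome": 4700, "Credit_History": 0, "Loan_Status": "Y"},
]


def fill_missing(rows, column, numeric):
    """Fill None in `column` with the median (numeric) or the mode (categorical)."""
    present = [r[column] for r in rows if r[column] is not None]
    fill = median(present) if numeric else mode(present)
    for r in rows:
        if r[column] is None:
            r[column] = fill
    return fill


def chi_square_2x2(rows, col_a, col_b):
    """Pearson chi-square statistic and p-value for two binary columns (1 d.o.f.)."""
    counts = Counter((r[col_a], r[col_b]) for r in rows)
    a_levels = sorted({a for a, _ in counts})
    b_levels = sorted({b for _, b in counts})
    if len(a_levels) != 2 or len(b_levels) != 2:
        raise ValueError("this sketch only handles 2x2 tables")
    n = sum(counts.values())
    stat = 0.0
    for a in a_levels:
        for b in b_levels:
            row_total = sum(counts[(a, x)] for x in b_levels)
            col_total = sum(counts[(x, b)] for x in a_levels)
            expected = row_total * col_total / n
            stat += (counts[(a, b)] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def student_t(sample_a, sample_b):
    """Pooled-variance Student's t statistic for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / na + 1 / nb))


loan_fill = fill_missing(rows, "LoanAmount", numeric=True)
credit_fill = fill_missing(rows, "Credit_History", numeric=False)
chi2_stat, p_value = chi_square_2x2(rows, "Credit_History", "Loan_Status")

approved = [r["ApplicantIncome"] for r in rows if r["Loan_Status"] == "Y"]
rejected = [r["ApplicantIncome"] for r in rows if r["Loan_Status"] == "N"]
t_stat = student_t(approved, rejected)

print("fills:", loan_fill, credit_fill)
print("chi-square:", chi2_stat, "p-value:", p_value)
print("t statistic:", t_stat)
```

On the real data, the same checks would be run per column, and the choice of median or mode as the fill value is only one possible reading of "based on the behaviour of the data".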
Predicting the status of loans using different models. The models we used here are:
1.) Logistic regression: predicted with 77% accuracy
2.) Support Vector Machine (SVM): predicted with 68% accuracy
The likely reason for the lower SVM accuracy is that SVM tends to work better when there are more than 20 variables, and this dataset has fewer.
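To make the model comparison concrete, here is a small standard-library sketch of how two such classifiers can be trained and scored on a hold-out set. It is not the project's code and does not reproduce the accuracies above: the data is synthetic, the features (credit_history, scaled income) are hypothetical stand-ins, and the SVM is assumed to be a linear one trained by subgradient descent on the hinge loss, since the kernel and library actually used are not stated here. In practice a library implementation (for example scikit-learn's LogisticRegression and SVC) would normally be used instead.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the synthetic data is repeatable


def make_row():
    """One hypothetical applicant: [credit_history, scaled_income], label +1 (Y) or -1 (N)."""
    credit = rng.choice([0.0, 1.0])
    income = rng.random()
    label = 1 if credit == 1.0 else -1
    if rng.random() < 0.1:  # flip about 10% of labels so the classes overlap
        label = -label
    return [credit, income], label


data = [make_row() for _ in range(200)]
train, test = data[:150], data[150:]  # simple hold-out split


def score(w, b, x):
    """Linear decision score w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear(samples, loss, epochs=50, lr=0.05, reg=0.001):
    """Stochastic (sub)gradient descent on the logistic or hinge loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * score(w, b, x)
            if loss == "logistic":
                # derivative of log(1 + exp(-margin)) with respect to the score
                grad = -y / (1.0 + math.exp(min(margin, 50.0)))
            else:
                # subgradient of the hinge loss max(0, 1 - margin)
                grad = -y if margin < 1.0 else 0.0
            w = [wj - lr * (grad * xj + reg * wj) for wj, xj in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(model, samples):
    """Fraction of samples whose predicted sign matches the label."""
    w, b = model
    hits = sum(1 for x, y in samples if (1 if score(w, b, x) >= 0 else -1) == y)
    return hits / len(samples)


log_acc = accuracy(train_linear(train, "logistic"), test)
svm_acc = accuracy(train_linear(train, "hinge"), test)
print("logistic regression accuracy:", log_acc)
print("linear SVM accuracy:", svm_acc)
```

Both models share the same training loop and differ only in the loss, which makes it easier to see that any accuracy gap comes from the loss function and the data rather than from the evaluation procedure.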