loan_data <- read.csv("D:/R work folder/R project/Loan_Data.csv")
str(loan_data)
## 'data.frame': 289 obs. of 13 variables:
## $ Date : chr "10/31/2022" "7/27/2022" "5/11/2022" "2/23/2022" ...
## $ Loan_ID : chr "LP001015" "LP001022" "LP001031" "LP001051" ...
## $ Gender : chr "Male" "Male" "Male" "Male" ...
## $ Married : chr "Yes" "Yes" "Yes" "No" ...
## $ Dependents : chr "0" "1" "2" "0" ...
## $ Education : chr "Graduate" "Graduate" "Graduate" "Not Graduate" ...
## $ Self_Employed : chr "No" "No" "No" "No" ...
## $ ApplicantIncome : int 5720 3076 5000 3276 2165 2226 3881 2400 3091 4666 ...
## $ CoapplicantIncome: int 0 1500 1800 0 3422 0 0 2400 0 0 ...
## $ LoanAmount : int 110 126 208 78 152 59 147 123 90 124 ...
## $ Loan_Amount_Term : int 360 360 360 360 360 360 360 360 360 360 ...
## $ Credit_History : int 1 1 1 1 1 1 0 1 1 1 ...
## $ Property_Area : chr "Urban" "Urban" "Urban" "Urban" ...
head(loan_data)
## Date Loan_ID Gender Married Dependents Education Self_Employed
## 1 10/31/2022 LP001015 Male Yes 0 Graduate No
## 2 7/27/2022 LP001022 Male Yes 1 Graduate No
## 3 5/11/2022 LP001031 Male Yes 2 Graduate No
## 4 2/23/2022 LP001051 Male No 0 Not Graduate No
## 5 4/30/2022 LP001054 Male Yes 0 Not Graduate Yes
## 6 6/10/2022 LP001055 Female No 1 Not Graduate No
## ApplicantIncome CoapplicantIncome LoanAmount Loan_Amount_Term Credit_History
## 1 5720 0 110 360 1
## 2 3076 1500 126 360 1
## 3 5000 1800 208 360 1
## 4 3276 0 78 360 1
## 5 2165 3422 152 360 1
## 6 2226 0 59 360 1
## Property_Area
## 1 Urban
## 2 Urban
## 3 Urban
## 4 Urban
## 5 Urban
## 6 Semiurban
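The dataset contains information about loan applications: 289 observations in 13 columns that describe the applicants and their loan requests. The key variables are the application date and identifier (Date, Loan_ID), applicant characteristics (Gender, Married, Dependents, Education, Self_Employed), financial details (ApplicantIncome, CoapplicantIncome, LoanAmount, Loan_Amount_Term, Credit_History), and the property location (Property_Area).
The dataset makes it possible to analyze relationships between an applicant's financial background (income, loan amount, credit history) and the loan terms. The goal of this project is to analyze how applicant characteristics influence their loan eligibility.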
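As a first step toward that goal, a grouped summary can compare applicants with and without a credit history. The sketch below is only illustrative: `loan_sample` is a hypothetical stand-in built from the first ten values that `str(loan_data)` prints above, so the numbers it produces describe those ten rows, not the full dataset.

```r
# First ten rows of four columns, copied from the str(loan_data) output above;
# they stand in for the full data frame so the sketch runs on its own.
loan_sample <- data.frame(
  ApplicantIncome   = c(5720, 3076, 5000, 3276, 2165, 2226, 3881, 2400, 3091, 4666),
  CoapplicantIncome = c(0, 1500, 1800, 0, 3422, 0, 0, 2400, 0, 0),
  LoanAmount        = c(110, 126, 208, 78, 152, 59, 147, 123, 90, 124),
  Credit_History    = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 1)
)

# Count missing values per column before comparing groups
missing_counts <- colSums(is.na(loan_sample))
print(missing_counts)

# Median income and loan amount for each credit-history group
summary_by_credit <- aggregate(
  cbind(ApplicantIncome, LoanAmount) ~ Credit_History,
  data = loan_sample,
  FUN = median
)
print(summary_by_credit)
```

On the real data, replacing `loan_sample` with `loan_data` should give the same kind of table; note that `aggregate()` with a formula drops rows that have missing values in the columns used.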
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.3
ggplot(loan_data, aes(x = ApplicantIncome, y = LoanAmount, color = as.character(Credit_History))) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Loan Amount vs Applicant Income",
       subtitle = "Colored by Credit History",
       x = "Applicant Income",
       y = "Loan Amount",
       color = "Credit History") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
The scatter plot shows Applicant Income on the x-axis and Loan Amount on the y-axis. Point color differs by the applicant's Credit History, as given in the legend: Credit History = 1 marks applicants with a credit history and Credit History = 0 marks applicants without one. The red regression line shows the overall relationship between Applicant Income and Loan Amount.
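The mapping between color and credit history can be made explicit instead of being left to the default palette. The following is a minimal sketch, assuming a labelled factor and a named color vector; `loan_sample`, `Credit_Label`, and `credit_colors` are hypothetical names, the rows are the first ten values printed by `str(loan_data)`, and the two colors are arbitrary choices.

```r
# First ten rows, copied from the str(loan_data) output above (stand-in data)
loan_sample <- data.frame(
  ApplicantIncome = c(5720, 3076, 5000, 3276, 2165, 2226, 3881, 2400, 3091, 4666),
  LoanAmount      = c(110, 126, 208, 78, 152, 59, 147, 123, 90, 124),
  Credit_History  = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 1)
)

# Labelled factor: the legend then reads in words rather than 0/1
loan_sample$Credit_Label <- factor(
  loan_sample$Credit_History,
  levels = c(0, 1),
  labels = c("No credit history", "Has credit history")
)

# Named vector: each color is tied to a label, not to the order of the levels
credit_colors <- c("No credit history" = "darkorange",
                   "Has credit history" = "steelblue")

# Build the plot only if ggplot2 is installed; print(p) would draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(loan_sample,
              aes(x = ApplicantIncome, y = LoanAmount, color = Credit_Label)) +
    geom_point(alpha = 0.5) +
    scale_color_manual(values = credit_colors) +
    labs(color = "Credit History") +
    theme_minimal()
}
```

With named values, the legend and the text cannot drift apart if the order of the levels changes.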
library(plotly)
## Warning: package 'plotly' was built under R version 4.4.3
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
interactive_plot <- ggplot(loan_data, aes(x = ApplicantIncome, y = LoanAmount, color = as.character(Credit_History))) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Loan Amount vs Applicant Income",
       subtitle = "Colored by Credit History",
       x = "Applicant Income",
       y = "Loan Amount",
       color = "Credit History")  # legend title, matching the static plot
ggplotly(interactive_plot)
## `geom_smooth()` using formula = 'y ~ x'
In the interactive chart, the regression line shows no clear trend between Applicant Income and Loan Amount.
The animation shows how the relationship between Applicant Income and Loan Amount changes over time. It is driven by the Date column of the dataset, so each frame shows the loan applications as the date advances.
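A reading taken from the chart can be checked numerically by fitting the same linear model that `geom_smooth(method = "lm")` draws. This is a sketch only: `loan_sample` is a hypothetical stand-in holding the first ten values printed by `str(loan_data)`, so its slope and p-value say nothing about the full 289 rows.

```r
# First ten rows, copied from the str(loan_data) output above (stand-in data)
loan_sample <- data.frame(
  ApplicantIncome = c(5720, 3076, 5000, 3276, 2165, 2226, 3881, 2400, 3091, 4666),
  LoanAmount      = c(110, 126, 208, 78, 152, 59, 147, 123, 90, 124)
)

# Same model as the regression line in the plot
fit <- lm(LoanAmount ~ ApplicantIncome, data = loan_sample)

# Slope estimate, standard error, t value and p-value for ApplicantIncome
slope_row <- summary(fit)$coefficients["ApplicantIncome", ]
print(slope_row)

# Share of the variance in LoanAmount explained by ApplicantIncome
r_squared <- summary(fit)$r.squared
print(r_squared)

# Pearson correlation with its significance test
ct <- cor.test(loan_sample$ApplicantIncome, loan_sample$LoanAmount)
print(ct$estimate)
print(ct$p.value)
```

Run on `loan_data` itself, a small slope with a large p-value and a low R-squared would support the "no clear trend" reading; `lm()` drops rows with missing values by default.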
library(ggplot2)
library(gganimate)
## Warning: package 'gganimate' was built under R version 4.4.3
library(gifski)
## Warning: package 'gifski' was built under R version 4.4.3
library(av)
## Warning: package 'av' was built under R version 4.4.3
loan_data$Date <- as.Date(loan_data$Date, format = "%m/%d/%Y")
animated_plot <- ggplot(loan_data, aes(x = ApplicantIncome, y = LoanAmount, color = as.character(Credit_History), size = LoanAmount)) +
  geom_point(alpha = 0.7) +
  labs(title = 'Date: {frame_time}',
       x = "Applicant Income",
       y = "Loan Amount",
       color = "Credit History",
       size = "Loan Amount") +
  transition_time(Date) +
  ease_aes('linear') +
  scale_size_continuous(range = c(1, 10)) +
  scale_color_manual(values = c("red", "green"))
anim_save("loan_animation_fixed.gif", animated_plot, renderer = gifski_renderer())
animate(animated_plot, nframes = 100, fps = 10, renderer = av_renderer())
In the animation, point size reflects the Loan Amount, with larger loans drawn as larger points. As the animation moves through the dates, it shows how the distribution of Loan Amount and Applicant Income changes. Point color indicates whether the applicant has a credit history. Because scale_color_manual(values = c("red", "green")) assigns its unnamed colors in the order of the levels "0" and "1", red marks applicants with no credit history (Credit History = 0) and green marks applicants with one (Credit History = 1). The animation appears to show that applicants with **No Credit History** mostly requested larger loan amounts, although such requests are not frequent. Applicants with no credit history also appear to request loans more often than applicants with a credit history. Moreover, during a certain period the loan requests are very frequent, which may be due to an economic shift or a change in demand.
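The observation about a busy period can be checked by counting applications per month after parsing the dates. The sketch below assumes monthly bins are a reasonable granularity; `date_strings` holds only the six dates printed by `head(loan_data)` above, and `month_key` and `monthly_counts` are hypothetical names.

```r
# The six dates shown by head(loan_data) above (stand-in for loan_data$Date)
date_strings <- c("10/31/2022", "7/27/2022", "5/11/2022",
                  "2/23/2022", "4/30/2022", "6/10/2022")

# Same month/day/year format as the as.Date() call used for the animation
dates <- as.Date(date_strings, format = "%m/%d/%Y")

# Reduce each date to a year-month key, e.g. "2022-10"
month_key <- format(dates, "%Y-%m")

# Number of applications per month, busiest month first
monthly_counts <- sort(table(month_key), decreasing = TRUE)
print(monthly_counts)
```

Applied to the full Date column, the months at the top of this table are the periods in which loan requests cluster.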