# Create Loan Dataset
loan_data <- data.frame(
  Income = c(
    2500, 3200, 4000, 5200, 6100,
    7200, 8500, 9200, 10000, 11500,
    12500, 13500, 14500, 15500, 17000
  ),
  CreditScore = c(
    580, 600, 620, 640, 660,
    680, 700, 720, 740, 760,
    780, 790, 800, 820, 850
  ),
  YearsEmployed = c(
    1, 2, 2, 3, 4,
    5, 5, 6, 7, 8,
    9, 10, 10, 11, 12
  ),
  ExistingDebt = c(
    500, 700, 900, 1000, 1200,
    1500, 1700, 1800, 2000, 2200,
    2500, 2600, 2800, 3000, 3200
  ),
  LoanAmount = c(
    3000, 5000, 7000, 9000, 11000,
    13000, 15000, 17000, 19000, 21000,
    23000, 25000, 27000, 29000, 32000
  )
)
# Display dataset
head(loan_data)
##   Income CreditScore YearsEmployed ExistingDebt LoanAmount
## 1   2500         580             1          500       3000
## 2   3200         600             2          700       5000
## 3   4000         620             2          900       7000
## 4   5200         640             3         1000       9000
## 5   6100         660             4         1200      11000
## 6   7200         680             5         1500      13000
str(loan_data)
## 'data.frame':    15 obs. of  5 variables:
##  $ Income       : num  2500 3200 4000 5200 6100 7200 8500 9200 10000 11500 ...
##  $ CreditScore  : num  580 600 620 640 660 680 700 720 740 760 ...
##  $ YearsEmployed: num  1 2 2 3 4 5 5 6 7 8 ...
##  $ ExistingDebt : num  500 700 900 1000 1200 1500 1700 1800 2000 2200 ...
##  $ LoanAmount   : num  3000 5000 7000 9000 11000 13000 15000 17000 19000 21000 ...
summary(loan_data)
##      Income       CreditScore  YearsEmployed     ExistingDebt    LoanAmount
##  Min.   : 2500   Min.   :580   Min.   : 1.000   Min.   : 500   Min.   : 3000
##  1st Qu.: 5650   1st Qu.:650   1st Qu.: 3.500   1st Qu.:1100   1st Qu.:10000
##  Median : 9200   Median :720   Median : 6.000   Median :1800   Median :17000
##  Mean   : 9360   Mean   :716   Mean   : 6.333   Mean   :1840   Mean   :17067
##  3rd Qu.:13000   3rd Qu.:785   3rd Qu.: 9.500   3rd Qu.:2550   3rd Qu.:24000
##  Max.   :17000   Max.   :850   Max.   :12.000   Max.   :3200   Max.   :32000
# Fit regression model
loan_model <- lm(
  LoanAmount ~ Income + CreditScore + YearsEmployed + ExistingDebt,
  data = loan_data
)
# View results
summary(loan_model)
##
## Call:
## lm(formula = LoanAmount ~ Income + CreditScore + YearsEmployed +
## ExistingDebt, data = loan_data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -452.64 -215.38   62.38  189.80  326.39
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -1.682e+04  7.558e+03  -2.226  0.05022 .
## Income         1.285e+00  3.674e-01   3.499  0.00574 **
## CreditScore    2.854e+01  1.388e+01   2.057  0.06676 .
## YearsEmployed  3.405e+01  2.776e+02   0.123  0.90481
## ExistingDebt   6.551e-01  1.958e+00   0.335  0.74484
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 307.2 on 10 degrees of freedom
## Multiple R-squared: 0.9992, Adjusted R-squared: 0.9988
## F-statistic: 3041 on 4 and 10 DF, p-value: 2.244e-15
For every increase of 1 unit in income, loan amount increases by approximately 1.29 units, holding other variables constant (p = 0.006). Income is the only predictor that is statistically significant at the 5% level.
A 1-point increase in credit score is associated with an increase in loan amount of approximately 28.5 units.
This effect is only marginally significant (p = 0.067).
Each additional year of employment is associated with an increase in loan amount of about 34 units.
The standard error (277.6) is far larger than the estimate, so this effect cannot be distinguished from zero (p = 0.905).
Existing debt has a positive coefficient (about 0.66) that is not statistically significant (p = 0.745).
This model therefore provides no evidence that customers with higher debt qualify for smaller loans.
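One way to see how precisely each coefficient is estimated is to place a confidence interval next to it. The sketch below rebuilds the dataset so that it runs on its own, refits the same model, and calls `confint()` from base R. The 95% level is an assumption chosen for illustration. An interval that spans zero indicates that the sign of the effect is not settled by these data.

```r
# Rebuild the dataset from above so the snippet is self-contained
loan_data <- data.frame(
  Income = c(2500, 3200, 4000, 5200, 6100, 7200, 8500, 9200, 10000, 11500,
             12500, 13500, 14500, 15500, 17000),
  CreditScore = c(580, 600, 620, 640, 660, 680, 700, 720, 740, 760,
                  780, 790, 800, 820, 850),
  YearsEmployed = c(1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12),
  ExistingDebt = c(500, 700, 900, 1000, 1200, 1500, 1700, 1800, 2000, 2200,
                   2500, 2600, 2800, 3000, 3200),
  LoanAmount = c(3000, 5000, 7000, 9000, 11000, 13000, 15000, 17000, 19000,
                 21000, 23000, 25000, 27000, 29000, 32000)
)

loan_model <- lm(
  LoanAmount ~ Income + CreditScore + YearsEmployed + ExistingDebt,
  data = loan_data
)

# 95% confidence intervals for every coefficient (level is an assumption)
ci <- confint(loan_model, level = 0.95)

# Show each estimate beside its lower and upper bound
print(cbind(Estimate = coef(loan_model), ci))
```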
# Model summary
summary(loan_model)$r.squared
## [1] 0.9991784
About 99.9% of the variation in Loan Amount is explained by the predictors (adjusted R-squared 0.9988). The model fits these 15 observations very closely.
# Diagnostic plots
par(mfrow = c(2,2))
plot(loan_model)
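The four diagnostic plots are used to check:
- Linearity (Residuals vs Fitted)
- Normality of residuals (Normal Q-Q)
- Constant variance (Scale-Location)
- Outliers and influential points (Residuals vs Leverage)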
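The plots can be supplemented with numeric checks. The sketch below is one possible approach and not part of the original analysis: it applies the Shapiro-Wilk test to the residuals and computes Cook's distance for each observation. The cutoff of 4/n for flagging influential points is a common rule of thumb and is an assumption here. With only 15 observations, both checks have limited power.

```r
# Rebuild the dataset from above so the snippet is self-contained
loan_data <- data.frame(
  Income = c(2500, 3200, 4000, 5200, 6100, 7200, 8500, 9200, 10000, 11500,
             12500, 13500, 14500, 15500, 17000),
  CreditScore = c(580, 600, 620, 640, 660, 680, 700, 720, 740, 760,
                  780, 790, 800, 820, 850),
  YearsEmployed = c(1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12),
  ExistingDebt = c(500, 700, 900, 1000, 1200, 1500, 1700, 1800, 2000, 2200,
                   2500, 2600, 2800, 3000, 3200),
  LoanAmount = c(3000, 5000, 7000, 9000, 11000, 13000, 15000, 17000, 19000,
                 21000, 23000, 25000, 27000, 29000, 32000)
)

loan_model <- lm(
  LoanAmount ~ Income + CreditScore + YearsEmployed + ExistingDebt,
  data = loan_data
)

# Normality: Shapiro-Wilk test on the model residuals
sw <- shapiro.test(residuals(loan_model))
print(sw)

# Influence: Cook's distance per observation
cd <- cooks.distance(loan_model)

# Flag observations above the 4/n rule-of-thumb cutoff (an assumption)
flagged <- which(cd > 4 / nrow(loan_data))
print(flagged)
```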
# Correlation matrix
cor(loan_data)
##                  Income CreditScore YearsEmployed ExistingDebt LoanAmount
## Income        1.0000000   0.9969160     0.9962972    0.9985626  0.9993296
## CreditScore   0.9969160   1.0000000     0.9948066    0.9972470  0.9980156
## YearsEmployed 0.9962972   0.9948066     1.0000000    0.9957446  0.9962036
## ExistingDebt  0.9985626   0.9972470     0.9957446    1.0000000  0.9985536
## LoanAmount    0.9993296   0.9980156     0.9962036    0.9985536  1.0000000
pairs(loan_data)
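Every pairwise correlation in the matrix exceeds 0.99, which points to strong multicollinearity among the predictors. A common way to quantify this is the variance inflation factor (VIF): regress each predictor on the others and compute 1 / (1 - R-squared). The sketch below does this in base R. The name `vif_manual` is hypothetical and introduced only for illustration. Values above roughly 10 are usually read as a sign that individual coefficient estimates are unstable.

```r
# Rebuild the dataset from above so the snippet is self-contained
loan_data <- data.frame(
  Income = c(2500, 3200, 4000, 5200, 6100, 7200, 8500, 9200, 10000, 11500,
             12500, 13500, 14500, 15500, 17000),
  CreditScore = c(580, 600, 620, 640, 660, 680, 700, 720, 740, 760,
                  780, 790, 800, 820, 850),
  YearsEmployed = c(1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12),
  ExistingDebt = c(500, 700, 900, 1000, 1200, 1500, 1700, 1800, 2000, 2200,
                   2500, 2600, 2800, 3000, 3200),
  LoanAmount = c(3000, 5000, 7000, 9000, 11000, 13000, 15000, 17000, 19000,
                 21000, 23000, 25000, 27000, 29000, 32000)
)

predictors <- c("Income", "CreditScore", "YearsEmployed", "ExistingDebt")

# VIF for each predictor: regress it on the remaining predictors
# (vif_manual is a hypothetical name, not from the original analysis)
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  aux <- lm(reformulate(others, response = p), data = loan_data)
  1 / (1 - summary(aux)$r.squared)
})

print(round(vif_manual, 1))
```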
# Predict for new customer
new_customer <- data.frame(
  Income = 9000,
  CreditScore = 730,
  YearsEmployed = 6,
  ExistingDebt = 1500
)
predict(loan_model, new_customer)
## 1
## 16769.42
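A point prediction says nothing about its own uncertainty. As a sketch, the same `predict()` call can return a prediction interval through its `interval` argument. The 95% level is an assumption, and the snippet rebuilds the data and model so that it runs on its own.

```r
# Rebuild the dataset from above so the snippet is self-contained
loan_data <- data.frame(
  Income = c(2500, 3200, 4000, 5200, 6100, 7200, 8500, 9200, 10000, 11500,
             12500, 13500, 14500, 15500, 17000),
  CreditScore = c(580, 600, 620, 640, 660, 680, 700, 720, 740, 760,
                  780, 790, 800, 820, 850),
  YearsEmployed = c(1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12),
  ExistingDebt = c(500, 700, 900, 1000, 1200, 1500, 1700, 1800, 2000, 2200,
                   2500, 2600, 2800, 3000, 3200),
  LoanAmount = c(3000, 5000, 7000, 9000, 11000, 13000, 15000, 17000, 19000,
                 21000, 23000, 25000, 27000, 29000, 32000)
)

loan_model <- lm(
  LoanAmount ~ Income + CreditScore + YearsEmployed + ExistingDebt,
  data = loan_data
)

new_customer <- data.frame(
  Income = 9000,
  CreditScore = 730,
  YearsEmployed = 6,
  ExistingDebt = 1500
)

# Point prediction plus a 95% prediction interval (level is an assumption)
pred <- predict(loan_model, new_customer, interval = "prediction", level = 0.95)

# Columns: fit (point prediction), lwr and upr (interval bounds)
print(pred)
```

The `lwr` and `upr` columns bound the loan amount expected for a single new customer with these characteristics, which is wider than a confidence interval for the mean response.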
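The multiple linear regression model estimated positive coefficients for Income, Credit Score, Years Employed, and Existing Debt, but only Income was statistically significant at the 5% level. The model explained about 99.9% of the variability in Loan Amount. Because the predictors are almost perfectly correlated with one another (all pairwise correlations exceed 0.99), the individual coefficients are estimated imprecisely, so the high R-squared reflects the predictors as a group and does not establish the separate effect of each one on customer loan eligibility.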