Data Description

Research Question

This research has several objectives; chief among them is exploring the association between Saving_amount and Credit_score. The research question is therefore: is the probability that a borrower has a savings amount greater than or equal to the mean savings amount across borrowers related to the borrower's credit score?

Exploratory Data Analysis

Exploratory data analysis is now conducted. Saving_amount is a continuous variable, but simple logistic regression requires a dichotomous response. It was therefore dichotomized at its mean, with \(Y=1\) assigned to Saving_amount values >= mean(Saving_amount) and \(Y=0\) assigned to all other values. The dichotomized variable was named SA.g. Afterwards, the Credit_score values were z-score normalized.
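The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the function names and the five data values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) for the z-scores is an assumption, since the source does not say which was used.

```python
from statistics import mean, stdev

def dichotomize_at_mean(values):
    """Return 1 where a value is >= the mean of all values, else 0."""
    m = mean(values)
    return [1 if v >= m else 0 for v in values]

def z_score(values):
    """Center on the mean and scale by the sample standard deviation (assumed)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical toy values, NOT taken from the data set analyzed here.
saving_amount = [1200.0, 350.0, 5000.0, 800.0, 2650.0]
credit_score = [610.0, 540.0, 720.0, 580.0, 690.0]

sa_g = dichotomize_at_mean(saving_amount)   # stands in for SA.g
credit_z = z_score(credit_score)            # stands in for normalized Credit_score

print(sa_g)
print([round(z, 3) for z in credit_z])
```

After normalization, a one-unit change in the predictor corresponds to a one-standard-deviation change in the original Credit_score, which matters when the odds ratio is interpreted.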

Note that the response variable is binary but stored as an integer, with 1 indicating Saving_amount >= mean(Saving_amount), so \(P(Y=1)=P(SA.g=1)\). Next, the simple logistic regression is fitted.
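The source does not show how the model was fitted. As a rough sketch of what fitting a one-predictor logistic regression involves, the following uses Newton-Raphson iterations on the log-likelihood, written in plain Python. The function names and the twelve observations are hypothetical, so the printed estimates are unrelated to the results reported in this analysis.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def score_and_information(b0, b1, x, y):
    """Gradient of the log-likelihood and entries of the 2x2 information matrix."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (g0, g1), (i00, i01, i11)

def fit_simple_logistic(x, y, max_iter=50, tol=1e-10):
    """Maximum-likelihood fit of logit(P(Y=1)) = b0 + b1 * x."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    # Standard errors: square roots of the diagonal of the inverse information matrix.
    _, (i00, i01, i11) = score_and_information(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return (b0, b1), (math.sqrt(i11 / det), math.sqrt(i00 / det))

# Hypothetical toy data: a z-scored predictor and a 0/1 response.
credit_z = [-2.0, -1.5, -1.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.0, 1.5, 2.0]
sa_g = [0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1]

(b0_hat, b1_hat), (se0, se1) = fit_simple_logistic(credit_z, sa_g)
print("intercept:", round(b0_hat, 4), "slope:", round(b1_hat, 4))
print("z value for slope:", round(b1_hat / se1, 4))
```

In practice a statistics package would be used for this step; the sketch only makes explicit where the estimate, standard error, and z value in a summary table come from.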

Simple Logistic Regression Model

The summary of relevant statistics follows.

Summary statistics: estimates, 95% CI, and odds ratios
               Estimate    Std. Error   z value    Pr(>|z|)    2.5 %        97.5 %      odds.ratio
(Intercept)    0.1251667   0.0642219    1.948972   0.0512988   -0.0005875   0.2512476   1.133337
Credit_score   0.3368738   0.0668700    5.037744   0.0000005   0.2074455    0.4698873   1.400562
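The z value and odds ratio in the Credit_score row follow directly from the estimate and its standard error, and the confidence limits can be moved to the odds-ratio scale by exponentiating them. The short check below uses only the numbers reported in the table; the variable names are illustrative. It does not recompute the confidence interval itself, because the source does not state which method produced it.

```python
import math

# Values copied from the Credit_score row of the summary table.
estimate = 0.3368738
std_error = 0.0668700
ci_low, ci_high = 0.2074455, 0.4698873

z_value = estimate / std_error               # Wald z statistic
odds_ratio = math.exp(estimate)              # odds ratio per one-unit increase
or_ci = (math.exp(ci_low), math.exp(ci_high))  # reported limits on the odds-ratio scale
percent_change = (odds_ratio - 1.0) * 100.0  # percent change in the odds per unit

print("z value:", round(z_value, 4))
print("odds ratio:", round(odds_ratio, 4))
print("odds-ratio CI:", tuple(round(v, 4) for v in or_ci))
print("percent change in odds:", round(percent_change, 1))
```

Small differences from the table in the last decimal places are expected, since the table values are themselves rounded.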

Based on the output, the association between Credit_score and SA.g is statistically significant at the 0.05 level (z = 5.0377, p < .001, 95% CI for the coefficient = [0.2074, 0.4699], which excludes zero). The odds ratio associated with Credit_score is 1.4006. Because Credit_score was z-score normalized, one unit corresponds to one standard deviation: for each one-standard-deviation increase in Credit_score, the odds of a borrower having a savings amount greater than or equal to the mean savings amount are multiplied by about 1.40, an increase of roughly 40%. The goodness-of-fit measures provided by the model are not interpreted, because there is no other model to compare against. Next, the probability curve and the rate of change of that curve are plotted.

The preceding graphs depict the S-shaped curve fitted to the data in the regression, as well as the rate of change of \(P(Y=1)\).
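For a simple logistic regression the fitted curve is \(P(Y=1 \mid x) = 1/(1+e^{-(\beta_0+\beta_1 x)})\), and its rate of change with respect to \(x\) is \(\beta_1 \, p \, (1-p)\), where \(p = P(Y=1 \mid x)\). The sketch below evaluates both using the estimates from the summary table; the function names and the grid of z-scores are illustrative choices, and it tabulates values rather than reproducing the original graphs.

```python
import math

# Estimates copied from the summary table (Credit_score is z-score normalized).
B0 = 0.1251667
B1 = 0.3368738

def prob_y1(z):
    """Fitted P(Y=1) at a given normalized Credit_score."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * z)))

def rate_of_change(z):
    """Derivative of the fitted probability with respect to normalized Credit_score."""
    p = prob_y1(z)
    return B1 * p * (1.0 - p)

for z in (-3, -2, -1, 0, 1, 2, 3):
    print(z, round(prob_y1(z), 4), round(rate_of_change(z), 4))
```

The rate of change is largest where the fitted probability equals 0.5, which occurs at \(x = -\beta_0/\beta_1\), and its value there is \(\beta_1/4\).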

Summary and Conclusion

This analysis focused on the association between a borrower's savings amount and the borrower's credit score. A simple logistic regression (SLR) was conducted. It found that the association between Credit_score and SA.g is statistically significant (z = 5.0377, p < .001, 95% CI = [0.2074, 0.4699]). The association is positive, \(\hat{\beta}_1=0.3369\). The odds ratio associated with Credit_score is 1.4006, which means that for each one-unit increase in the z-score normalized Credit_score (one standard deviation), the odds of a borrower having a savings amount greater than or equal to the mean savings amount are multiplied by about 1.40. No goodness-of-fit measures were interpreted, because there were no other models to compare against. Graphs depicting the S-shaped curve fitted to the data and the rate of change of \(P(Y=1)\) were output.