This research has several objectives, chief among them exploring the association between Saving_amount and Credit_score. The research question is therefore: is the probability that a borrower has a savings amount greater than or equal to the mean savings amount across borrowers related to that borrower's credit score?
Exploratory data analysis is now conducted. Currently, Saving_amount is a continuous variable. However, simple logistic regression requires a dichotomous response, so Saving_amount was dichotomized at its mean: \(Y=1\) was assigned to Saving_amount values >= mean(Saving_amount) and \(Y=0\) to all other values. The dichotomized variable was named SA.g. Afterwards, the Credit_score values were z-score normalized.
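The two preprocessing steps can be sketched as follows. This is a minimal illustration in Python, not the code that produced the report: the data values are hypothetical toy numbers, the names `saving_amount`, `credit_score`, `sa_g`, and `credit_z` are illustrative, and the use of the sample standard deviation (n - 1 denominator) for the z-score is an assumption.

```python
from statistics import mean, stdev

# Hypothetical toy data; the real borrower data are not shown in this report.
saving_amount = [120.0, 450.0, 300.0, 80.0, 610.0, 240.0]
credit_score = [580.0, 720.0, 650.0, 540.0, 790.0, 610.0]

# Dichotomize at the mean: 1 if the value is >= the mean, otherwise 0.
sa_mean = mean(saving_amount)
sa_g = [1 if s >= sa_mean else 0 for s in saving_amount]

# z-score normalize: subtract the mean, divide by the sample standard deviation.
cs_mean = mean(credit_score)
cs_sd = stdev(credit_score)  # assumption: n - 1 denominator
credit_z = [(c - cs_mean) / cs_sd for c in credit_score]

print(sa_g)
print([round(z, 3) for z in credit_z])
```

Note that a value exactly equal to the mean is coded as 1, matching the >= rule described above.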
Note that the response variable is a binary factor variable that was stored as an integer, with the integer 1 associated with Saving_amount >= mean(Saving_amount), so \(P(Y=1)=P(SA.g=1)\). Next, the simple logistic regression is fitted.
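For reference, the model being fitted can be written out as below, assuming the standard logit link for simple logistic regression. The symbol \(x\) for the z-scored Credit_score and the intercept \(\beta_0\) are notation introduced here for illustration.

\[
\log\frac{P(SA.g=1 \mid x)}{1-P(SA.g=1 \mid x)} = \beta_0 + \beta_1 x
\qquad\Longleftrightarrow\qquad
P(SA.g=1 \mid x) = \frac{1}{1+e^{-(\beta_0+\beta_1 x)}}
\]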
The summary of relevant statistics follows.
| | Estimate | Std. Error | z value | Pr(>\|z\|) | 2.5 % | 97.5 % | odds.ratio |
|---|---|---|---|---|---|---|---|
| (Intercept) | 0.1251667 | 0.0642219 | 1.948972 | 0.0512988 | -0.0005875 | 0.2512476 | 1.133337 |
| Credit_score | 0.3368738 | 0.0668700 | 5.037744 | 0.0000005 | 0.2074455 | 0.4698873 | 1.400562 |
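
The derived columns in the Credit_score row can be checked by hand from the estimate and its standard error. The sketch below is an illustrative Python check, not the original analysis code. It assumes the z value is the Wald statistic (estimate divided by standard error), that the p-value is two-sided under a standard normal reference, and that the odds ratio is the exponentiated estimate. The helper names are hypothetical.

```python
import math

# Values copied from the Credit_score row of the summary table.
estimate = 0.3368738
std_error = 0.0668700
ci_low, ci_high = 0.2074455, 0.4698873  # interval on the log-odds scale


def wald_z(est, se):
    """Wald statistic: the estimate divided by its standard error."""
    return est / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


z_value = wald_z(estimate, std_error)
p_value = two_sided_p(z_value)
odds_ratio = math.exp(estimate)

# Exponentiating the interval endpoints moves it to the odds-ratio scale.
or_ci = (math.exp(ci_low), math.exp(ci_high))

print(f"z = {z_value:.4f}, p = {p_value:.2e}")
print(f"odds ratio = {odds_ratio:.4f}, interval = ({or_ci[0]:.4f}, {or_ci[1]:.4f})")
```

Exponentiating the tabulated interval is often convenient because it puts the interval on the same scale as the odds.ratio column.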
Based on the output, the association between Credit_score and SA.g appears to be statistically significant (z=5.0377, p <.001, 95% CI for the coefficient=[0.2074, 0.4699]). Further, the odds ratio associated with Credit_score is equal to 1.4006. Because Credit_score was z-score normalized, a one-unit increase corresponds to an increase of one standard deviation. This means that for each one-standard-deviation increase in Credit_score, the odds of a borrower having a savings amount greater than or equal to the mean savings amount for borrowers are multiplied by about 1.40, an increase of roughly 40%. The goodness-of-fit measures provided by the model will not be interpreted; there is no other model present for a comparison. Next, the probability curve and the rate of change of that curve are output.
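The odds ratio compares the odds of SA.g = 1 at two Credit_score values one unit apart. It does not compare the odds of SA.g = 1 with the odds of SA.g = 0. As a brief worked step, assuming the standard logit form of the model and writing \(x\) for the z-scored Credit_score (notation introduced here for illustration):

\[
\frac{\operatorname{odds}(x+1)}{\operatorname{odds}(x)}
= \frac{e^{\beta_0+\beta_1(x+1)}}{e^{\beta_0+\beta_1 x}}
= e^{\beta_1}
= e^{0.3368738} \approx 1.4006
\]

The ratio does not depend on \(x\), so the same multiplicative change in the odds applies at every starting credit score.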
The preceding graphs depict the S-shaped curve fitted to the data by the regression, as well as the rate of change of \(P(Y=1)\).
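The values behind such graphs can be reproduced from the fitted coefficients. The following Python sketch is illustrative and is not the plotting code used for the report. It assumes the standard logistic form for the curve, uses the analytic derivative \(\beta_1 p(1-p)\) for the rate of change, and introduces the hypothetical function names `prob` and `slope`.

```python
import math

# Fitted coefficients from the summary table (Credit_score is z-scored).
B0 = 0.1251667
B1 = 0.3368738


def prob(x):
    """P(Y=1) at z-scored credit score x under the fitted logistic curve."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * x)))


def slope(x):
    """Rate of change dP/dx, which equals B1 * p * (1 - p)."""
    p = prob(x)
    return B1 * p * (1.0 - p)


# Tabulate the curve and its slope over a range of z-scores.
for x in (-2, -1, 0, 1, 2):
    print(f"x = {x:+d}  P(Y=1) = {prob(x):.3f}  dP/dx = {slope(x):.4f}")

# The curve is steepest where P(Y=1) = 0.5, that is, at x = -B0 / B1.
steepest_x = -B0 / B1
print(f"steepest at x = {steepest_x:.3f}, slope = {slope(steepest_x):.4f}")
```

The slope can never exceed \(\beta_1/4\), which is why the rate-of-change graph peaks where the probability curve crosses 0.5 and falls off on either side.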
This analysis focused on the association between the savings amount of a borrower and the borrower's credit score. A simple logistic regression (SLR) was conducted. The results of that SLR indicate that the association between Credit_score and SA.g appears to be statistically significant (z=5.0377, p <.001, 95% CI for the coefficient=[0.2074, 0.4699]). The association was positive, \(\beta_1=0.3369\). The odds ratio associated with Credit_score is equal to 1.4006, which means that for each one-unit increase in the z-scored Credit_score (one standard deviation), the odds of a borrower having a savings amount greater than or equal to the mean savings amount for borrowers are multiplied by about 1.40. Further, no goodness-of-fit measures were interpreted; there were no other models present for a comparison. Graphs depicting the S-shaped curve fitted to the data in the regression, as well as the rate of change of \(P(Y=1)\), were output.