The loan data set contains 81 variables gathered from prosper.com. Each row describes one loan, including the loan amount, borrower rate (interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information.
For these data we will explore the following hypothesis questions:
To explore this question, we first needed to create a new factor variable indicating the quarter in which each loan was taken, regardless of the year. We called this variable \(quarter\) and assigned it four levels: \(Q1:Jan-Mar\), \(Q2:Apr-Jun\), \(Q3:Jul-Sep\), and \(Q4:Oct-Dec\). We then subset the data by factor level and plot the density function of the loan amount for each subset.
Above we can see peaks at $4,000, $10,000, $15,000, $20,000, and $25,000. These local maxima represent the most popular loan amounts. We also notice that \(Q2\) and \(Q3\) group together, as do \(Q1\) and \(Q4\); these groupings can be interpreted as summer and winter, respectively. In the summer months we see relatively more of the smaller loans (around $4,000), whereas in the winter months we see relatively more of the larger loans (> $20,000).
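The construction of the \(quarter\) variable can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name `quarter_label`, the variable names, and the four toy rows are all made up for the example, and it assumes each loan carries an origination date.

```python
from collections import defaultdict
from datetime import date

# The four factor levels named in the text.
QUARTER_LABELS = ("Q1:Jan-Mar", "Q2:Apr-Jun", "Q3:Jul-Sep", "Q4:Oct-Dec")


def quarter_label(d: date) -> str:
    """Return the quarter level for a date, ignoring the year."""
    return QUARTER_LABELS[(d.month - 1) // 3]


# Made-up rows for illustration only: (origination date, loan amount in dollars).
toy_loans = [
    (date(2012, 2, 14), 4000),
    (date(2013, 5, 3), 10000),
    (date(2011, 8, 21), 4000),
    (date(2013, 12, 30), 25000),
]

# Subset the loan amounts by quarter level.
amounts_by_quarter = defaultdict(list)
for origination_date, amount in toy_loans:
    amounts_by_quarter[quarter_label(origination_date)].append(amount)

for label in QUARTER_LABELS:
    print(label, amounts_by_quarter[label])
```

Each list in `amounts_by_quarter` is one of the subsets whose density would then be estimated and plotted.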
Potential Prosper lenders may want to know whether they should target a short-term investment or whether it is worth having their money tied up for a longer term. Below we can see that the median estimated return is about 9.2%, with 95% of the data falling between 4.6% and 15.6%.
## 2.5% 50% 97.5%
## 0.0460 0.0917 0.1561
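For readers who want to see how such a three-number summary is obtained, here is a small sketch of an empirical quantile with linear interpolation between order statistics. The interpolation rule is an assumption (the source does not say which quantile definition was used), and the sample values are made up; they are not the loan data.

```python
def quantile(values, p):
    """Empirical quantile with linear interpolation between order statistics.

    Assumption: position p * (n - 1) in the sorted sample, which is one
    common convention among several.
    """
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] + frac * (xs[upper] - xs[lower])


# Made-up estimated returns for illustration only.
toy_returns = [0.02, 0.05, 0.08, 0.09, 0.10, 0.12, 0.16]

# The same three probabilities reported in the output above.
summary = {p: quantile(toy_returns, p) for p in (0.025, 0.5, 0.975)}
for p, q in summary.items():
    print(f"{p:.1%}: {q:.4f}")
```

The interval between the 2.5% and 97.5% quantiles contains the central 95% of the observations, which is how the range quoted in the text should be read.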
Prosper offers 1-year, 3-year, and 5-year loan terms. We would like to investigate the estimated return for each of those terms.
## Term Median
## 1 12 0.05630
## 2 36 0.09211
## 3 60 0.09414
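A per-term median like the table above can be computed by grouping the estimated returns by term and taking the median of each group. The sketch below uses only the standard library; the rows are made-up placeholders and the variable names are illustrative, not taken from the data set.

```python
from collections import defaultdict
from statistics import median

# Made-up rows for illustration only: (term in months, estimated return).
toy_loans = [
    (12, 0.04), (12, 0.05), (12, 0.07),
    (36, 0.08), (36, 0.09), (36, 0.10),
    (60, 0.09), (60, 0.10), (60, 0.11),
]

# Group the estimated returns by loan term.
returns_by_term = defaultdict(list)
for term_months, estimated_return in toy_loans:
    returns_by_term[term_months].append(estimated_return)

# One median per term, ordered by term length.
median_by_term = {
    term: median(values) for term, values in sorted(returns_by_term.items())
}

for term, med in median_by_term.items():
    print(term, med)
```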
Above we can see the median expected return for each term individually. For the 1-year term, a lender can expect a median return of 5.63% on their investment. For a 3-year term the median jumps substantially, to 9.21%, and for a 5-year term it rises only slightly further, to 9.41%. In conclusion, a Prosper lender who wants to maximize their expected return should consider targeting borrowers looking for a 3-year term: the two additional years of the 5-year term add only about 0.2 percentage points to the median expected return. The histogram below illustrates this more clearly. The green vertical line shows the median for the term, and the black lines show the bounds of the central 95% of the expected return as a whole (the 2.5% and 97.5% quantiles), independent of the term.
Lenders will likely want to know whether a borrower is likely to default on the loan in order to protect their investment. To investigate this question, we first needed to create a new binary variable indicating whether the borrower defaulted. We can then plot the borrower's indicated monthly payment against their APR and compare the two subsets. This is done below.
Above we can see that the most frequent loans in general have monthly payments of less than $500 and an APR between 10% and 40%. We also notice that the relative frequency is a bit higher for monthly payments . The range of monthly payments is narrower for the defaulted group, almost exclusively less than $1,000. In conclusion, a borrower with indicated monthly payments around $100 and an APR above 15% is more at risk of default. Also, if a borrower has monthly payments over $1,000, the data suggest they are most likely not to default on the loan.
A student borrower might question how their interest rate compares to the rest of the population. We first needed to subset the data; we can then create a scatter plot and compare the students to the non-students. This is done below. We can see a positive correlation between the rate and the loan amount for student borrowers, which is unexpected. For smaller loans (< $8,000), students on average have a lower interest rate than the rest of the population. However, for loans greater than $8,000, students can on average expect higher interest rates than the rest of the population.
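The binary default indicator can be sketched as below. Which loan statuses count as a default is not stated in the text, so the set `DEFAULT_STATUSES` is an assumption, as are the status labels, the function name `default_flag`, and the made-up rows.

```python
# Assumption: only this status label counts as a default. The text does not
# say which statuses were treated as defaulted, so adjust as needed.
DEFAULT_STATUSES = {"Defaulted"}


def default_flag(loan_status: str) -> int:
    """Return 1 if the loan status counts as a default, otherwise 0."""
    return 1 if loan_status in DEFAULT_STATUSES else 0


# Made-up rows for illustration only: (loan status, monthly payment, APR).
toy_loans = [
    ("Current", 320.0, 0.18),
    ("Defaulted", 105.0, 0.29),
    ("Completed", 1100.0, 0.09),
    ("Defaulted", 240.0, 0.35),
]

# Attach the indicator, then split into the two subsets to be compared.
flagged = [(default_flag(status), payment, apr) for status, payment, apr in toy_loans]
defaulted = [(payment, apr) for flag, payment, apr in flagged if flag == 1]
not_defaulted = [(payment, apr) for flag, payment, apr in flagged if flag == 0]

print(len(defaulted), "defaulted;", len(not_defaulted), "not defaulted")
```

Each subset holds (monthly payment, APR) pairs, which are the two quantities plotted against each other.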
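One way to check a comparison like this numerically is to average the rate within each combination of borrower group and loan-size bucket. The sketch below is illustrative only: the rows are made-up placeholders that do not reproduce the findings, the names are hypothetical, and placing loans of exactly $8,000 in the larger bucket is an assumption, since the text does not say where they belong.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 8000  # dollar cut-off quoted in the text

# Made-up rows for illustration only: (is_student, loan amount, borrower rate).
toy_loans = [
    (True, 3000, 0.15),
    (True, 12000, 0.25),
    (False, 5000, 0.20),
    (False, 15000, 0.18),
    (False, 2000, 0.22),
]

# Group rates by (student or not, loan-size bucket).
rates = defaultdict(list)
for is_student, amount, rate in toy_loans:
    # Assumption: a loan of exactly $8,000 falls in the "large" bucket.
    bucket = "small" if amount < THRESHOLD else "large"
    rates[(is_student, bucket)].append(rate)

mean_rate = {key: mean(values) for key, values in rates.items()}

for (is_student, bucket), value in sorted(mean_rate.items()):
    group = "student" if is_student else "non-student"
    print(f"{group:12s} {bucket:5s} {value:.3f}")
```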