LendingClub, headquartered in San Francisco, California, was the first peer-to-peer lender to register its offerings as securities with the SEC. As a leader in this industry, LendingClub had issued $41 billion in loans by September 2018. In 2020, LendingClub made several critical decisions. First, it halted its business loan service but partnered with Opportunity Fund, so borrowers can still apply for a business loan through the platform, although the loan is provided by a different company. Second, it acquired Radius Bank in October of the same year while slowing down P2P individual lending because of the pandemic. Although the P2P lending service came to an end in December, the large volume of transaction data it left behind remains highly valuable for analysis in this field.
The empirical analysis of this paper focuses on data from LendingClub because it is the world’s largest P2P lending platform. Using this database as a case study to develop a portrait of the qualified borrower can contribute to understanding risk assessment in lending services and to minimizing the financial losses caused by lending to the wrong borrowers.
With the development of the internet, peer-to-peer (P2P) lending networks bring more possibilities to the current loan system and have greatly facilitated the funding process for microfinance institutions, low-income enterprises and individuals with unsecured loans all over the world (Galloway, 2009). Instead of funding through banks or third-party agencies, some P2P lending platforms create a direct dynamic between borrowers and lenders via an auction mechanism (Ashta & Assadi, 2009), in which both parties can match their expectations to a suitable counterparty at the best rate. This is in sharp contrast to traditional funding via banks.
The earliest P2P lending platform is Zopa.com, which was established in the UK in March 2005. By 2008, Zopa.com had successfully entered the markets of Italy, the US and Japan (Everett, 2015). In the same year, Kiva.org, an American non-profit P2P lender, was established to serve microenterprises in developing countries (Galloway, 2009). MicroPlace is another global funding platform with a similar market segment and mission statement to Kiva.org; the only difference is that investors in MicroPlace can earn a financial return, as it is a registered broker-dealer in the US. Driven by huge global demand for unsecured loans, this crowdfunding model (Mach, 2014) has been successful worldwide. Key players in the European market (Everett, 2015) include MyC4 (Denmark), Smava.de (Germany), Boober.it (Italy), and Loanland (Sweden). In the US, six competitors, namely Prosper, Peerform, Funding Circle, StreetShares, Kiva and Lending Club, are now actively funding small businesses and individuals (Proctor, 2020).
This section compares the business models of two P2P lending platforms: Prosper and Lending Club. According to Proctor (2020), Prosper offers loans ranging from $2,000 to $300,000 with a payment term of 3-5 years. The cost consists of interest rates ranging from 7.95% to 35.99% and a small origination fee. Prosper adopts a double-blind auction mechanism (Everett, 2009) in which both investors and borrowers can maximize their welfare. For example, borrowers post “listings” that specify the amount, the purpose of the loan, the maturity, the highest interest rate they are willing to pay and optional soft personal information (Morse, 2015). The loan is then bid upon and funded by multiple investors until the auction closes. In this way, the crowdfunding ends with the most favorable interest rate for the borrower.
Lending Club provides personal loans, business loans, auto refinancing and patient solutions, with APRs ranging from 4.99% to 35.89% depending on the borrower’s characteristics. The loan amount is between $1,000 and $500,000, with a payment term of 1-5 years. Lending Club uses an intermediate pricing model (Galloway, 2009), which means that the interest rate is predetermined by the platform.
The profit of P2P platforms comes mainly from the origination fee, which is usually 1%-6% of the amount raised from investors and is charged after the loan is fully funded. Lending Club also charges delinquency fees to borrowers and collection fees to investors (Morse, 2015). The APR is therefore the measure that captures the total borrowing cost.
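To illustrate why the APR exceeds the quoted interest rate once an origination fee is charged, the following is a minimal sketch in base R. The loan terms are hypothetical and chosen only for illustration, and the sketch assumes that the fee is deducted from the amount the borrower receives; neither the numbers nor this fee treatment is taken from the platforms' actual pricing rules.

```r
# Hypothetical loan terms (illustrative assumptions, not platform data).
principal   <- 10000   # amount of the loan
annual_rate <- 0.10    # quoted nominal annual interest rate
n_months    <- 36      # payment term in months
fee_share   <- 0.05    # origination fee, assumed deducted from proceeds

# Standard annuity formula for the level monthly payment.
monthly_rate <- annual_rate / 12
payment <- principal * monthly_rate / (1 - (1 + monthly_rate)^(-n_months))

# Cash actually received by the borrower under the fee assumption.
net_proceeds <- principal * (1 - fee_share)

# The APR is 12 times the monthly rate at which the present value of the
# payments equals the net proceeds; uniroot() solves for that rate.
pv_gap <- function(r) payment * (1 - (1 + r)^(-n_months)) / r - net_proceeds
apr <- 12 * uniroot(pv_gap, interval = c(1e-6, 0.05), tol = 1e-12)$root

cat(sprintf("Monthly payment: %.2f, APR: %.2f%%\n", payment, 100 * apr))
```

Because the borrower repays the full principal but receives less than that amount, the solved APR is higher than the quoted 10% rate.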
Online private loans to individuals and small businesses are considered riskier. First, information asymmetry (Bachmann et al., 2011) has been a “headache” for risk assessment in a trust-based funding system. Borrowers and lenders face a dilemma: borrowers try to hide adverse information, while lenders want the borrower’s full financial status in order to assess the risk (Mach et al., 2014). With an unsecured loan, private investors are exposed to greater default risk and principal losses because of underwriting and servicing challenges (Galloway, 2009). They might fail to screen out negative information in the borrower “listings” (Morse, 2015), and when default or fraud occurs, they have difficulty obtaining help from the lending mechanism. It is therefore necessary for investors to be familiar with the funding process and with risk precautions from the outset. In the following part of the essay, we discuss how investors perceive the creditworthiness and credibility of borrowers when selecting investment profiles from the pool. Reviewing the literature, we summarize characteristics of those who are successfully funded, including social connection, financial characteristics, soft personal information and the manner of profile description. Firstly, analyzing the trust portrait helps to offset the downside of information asymmetry. Ashta and Assadi (2009) identify three dimensions as sources of trust: the personality of the trustor, of the trustee and of the third party. This finding is consistent with the investment group features found by Everett (2010).
For a lender, identifying the right person to fund is essential. In the absence of an intermediary in the loan process, P2P lending emphasizes the social connection between borrowers and investors. Morse (2015) further explains that investors tend to consider borrowers with whom they have proximity and a relationship in their pool. Freedman and Jin (2014) reveal that proximity loans have a 4% lower delinquency rate relative to borrowers of similar risk. Earlier, Nahapiet and Ghoshal (1998) explored the quality of relations as a form of social capital. These findings highlight the importance of the social distance between borrower and lender in loan approval.
Which factors determine whether a loan is funded successfully? Intuitively, we first consider the financial characteristics of borrowers as the most important determinant. Like traditional lending by banks or third-party institutions, online lending verifies borrowers’ qualifications based on their credit scores; for example, the scoring system of Lending Club is divided into seven levels from A to G. Borrowers’ creditworthiness is validated by an external agency (Bachmann et al., 2011), relying mainly on financial characteristics. Klafft (2008) finds that credit rating has the most significant impact on the interest rate, while a verified bank account or home ownership does not show statistical significance in the test. Herzenstein, Sonenshein and Dholakia (2011) find that an economic hardship identity is associated with a 0.009-point reduction in default.
The second factor discussed here is soft personal information. This refers to demographics, economic condition, age, gender, race and occupation (Morse, 2015; Bachmann et al., 2011), but the findings on these matters are mixed. Demographic factors have little impact on the success of being funded (Pope & Sydnor, 2010), while sharing a similar background in race, gender or living community increases the chance of getting a loan or reduces the interest rate (Ravina, 2007). Agrawal, Catalini and Goldfarb (2011) further demonstrate the relation between distance and an investor’s willingness to fund a loan. Their empirical result shows an average diffusion distance of 3000 miles within the same region.
The manner in which borrowers disclose their information affects investors’ first impression during profile selection. In the online P2P funding process, investors rarely have the opportunity to meet borrowers in person. As a result, the borrower profiles posted on the platforms serve as the first window through which investors mine and screen valuable information to support their decisions.
Given the problems mentioned above, namely information asymmetry, verification of soft information, and underwriting and servicing challenges, we summarize some methodologies and findings regarding the borrower profile narrative. Herzenstein, Sonenshein and Dholakia (2011) find that the trustworthy and successful identity claims stand out among the six identity claims (Miles and Huberman, 1997) when investors read the application narrative. To give more insight into profile selection, the criteria and definitions of these two dimensions should be decided carefully, but we do not discuss this topic in detail in this paper.
Large-scale textual analysis techniques are used to sketch the portrait of qualified borrowers. Mitra and Gilbert (2014) generate a “winning list” of phrases and words in funding bids, which indicates the preferences and biases of investors. Gao and Lin (2013) empirically test the relationship between linguistic choices and the default rate. Their findings demonstrate that higher complexity of the narrative is associated with a 3.6 percentage point higher default rate, holding other factors constant. Michel (2012) evaluates the narrative along nine dimensions: purpose of the loan, income, income source, education, other debt, interest rate on other debt, an explanation for a poor credit grade, expenses, and picture (Morse, 2015). He highlights the importance of complete information disclosure, in which the purpose of the loan, other debt and the explanation for a poor credit grade are the most persuasive elements.
The main data source for this paper is the Lending Club data set from Kaggle, covering 2007 to 2018. The original Kaggle database contains 74 variables in total. We select the 38 variables, with 2260701 observations, that are most related to the areas we are interested in investigating. To describe and monitor a loan profile, we select areas such as interest rate, payment term, loan amount, installment, loan purpose, loan status (fully paid, current and charged off) and application type (individual or joint application with two individuals). A full table with descriptions of all variables is included. We define a charged-off loan as a bad loan. These and other financial characteristics are the most important determinants of the interest rate and of successful bidding behavior. This is reflected in the assigned loan grade from A to G and in financial document verification. We categorize soft personal information as follows: occupation, annual income and home ownership.
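The bad-loan definition above can be sketched in base R as follows. The data frame `toy_loans` is a hypothetical stand-in for the loan data, and the exact status labels (for example "Charged Off") are assumptions about how the statuses are spelled in the data.

```r
# Toy stand-in for the loan data (hypothetical rows, for illustration only).
toy_loans <- data.frame(
  loan_status = c("Fully Paid", "Charged Off", "Current", "Charged Off"),
  grade       = c("A", "D", "B", "G"),
  stringsAsFactors = FALSE
)

# A loan is flagged as bad when its status is charged off.
# The label "Charged Off" is an assumed spelling of that status.
toy_loans$bad_loan <- toy_loans$loan_status == "Charged Off"

# Share of bad loans in the toy sample.
bad_share <- mean(toy_loans$bad_loan)
print(toy_loans)
print(bad_share)
```

Keeping the flag as a logical column makes it easy to average within groups such as loan grade.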
This database offers multiple dimensions along which to analyze successful bidding behavior and quantitative loan performance. However, we did not include rejected loans (the rejected-loan database was incomplete), which limits our ability to identify what raises doubts about creditworthiness. The database also contains few narrative variables, so our experimental design does not focus on textual analysis.

| Variable | Description |
|---|---|
| loan_amnt | The listed amount of the loan applied for by the borrower. If at some point in time, the credit department reduces the loan amount, then it will be reflected in this value. |
| funded_amnt | The total amount committed to that loan at that point in time. |
| disbursement_method | The method by which the borrower receives their loan. Possible values are: CASH, DIRECT_PAY. |
| term | The number of payments on the loan. Values are in months and can be either 36 or 60. |
| int_rate | Interest Rate on the loan |
| installment | The monthly payment owed by the borrower if the loan originates. |
| grade | Lending Club assigned loan grade. |
| emp_title | The job title supplied by the Borrower when applying for the loan. |
| emp_length | Employment length in years. Possible values are between 0 and 10 where 0 means less than one year and 10 means ten or more years. |
| home_ownership | The home ownership status provided by the borrower during registration or obtained from the credit report. Our values are: RENT, OWN, MORTGAGE, OTHER |
| annual_inc | The self-reported annual income provided by the borrower during registration. |
| annual_inc_joint | The self-reported joint annual income provided by the borrower during registration. |
| loan_status | Current status of the loan |
| pymnt_plan | Indicates if a payment plan has been put in place for the loan |
| purpose | A category provided by the borrower for the loan request. |
| title | The loan title provided by the borrower |
| addr_state | The state provided by the borrower in the loan application |
| dti | A ratio calculated using the borrower’s total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income. |
| dti_joint | Joint debt to income ratio. |
| delinq_2yrs | The number of 30+ days past-due incidences of delinquency in the borrower’s credit file for the past 2 years. |
| delinq_amnt | The past-due amount owed for the accounts on which the borrower is now delinquent. |
| fico_range_low | The lower boundary range the borrower’s FICO at loan origination belongs to. |
| fico_range_high | The upper boundary range the borrower’s FICO at loan origination belongs to. |
| inq_last_6mths | The number of inquiries in past 6 months (excluding auto and mortgage inquiries). |
| open_acc | The number of open credit lines in the borrower’s credit file. |
| pub_rec | Number of derogatory public records. |
| revol_bal | Total credit revolving balance. |
| revol_util | Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit. |
| total_acc | The total number of credit lines currently in the borrower’s credit file. |
| total_rev_hi_lim | Total revolving high credit/credit limit. |
| total_rec_late_fee | Late fees received to date. |
| collections_12_mths_ex_med | Number of collections in 12 months excluding medical collections. |
| application_type | Indicates whether the loan is an individual application or a joint application with two co-borrowers. |
| max_bal_bc | Maximum current balance owed on all revolving accounts. |
| inq_fi | Number of personal finance inquiries. |
| avg_cur_bal | Average current balance of all accounts. |
| tax_liens | Number of tax liens. |
| hardship_flag | Flags whether or not the borrower is on a hardship plan. |
The cleaned dataset offers a remarkably complete array of variables. Note, however, that annual_inc_joint and dti_joint (the joint debt-to-income ratio) are missing for 94.66% of observations, because whether these columns are populated depends on the applicant’s filing status (individual or joint).
| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|
| loan_amnt | 2260668 | 15046.931 | 9190.245 | 500 | 8000 | 20000 | 40000 |
| funded_amnt | 2260668 | 15041.664 | 9188.413 | 500 | 8000 | 20000 | 40000 |
| term | 2260668 | ||||||
| … 36 months | 1609754 | 71.2% | |||||
| … 60 months | 650914 | 28.8% | |||||
| int_rate | 2260668 | 13.093 | 4.832 | 5.31 | 9.49 | 15.99 | 30.99 |
| installment | 2260668 | 445.807 | 267.174 | 4.93 | 251.65 | 593.32 | 1719.83 |
| annual_inc | 2260664 | 77992.429 | 112696.2 | 0 | 46000 | 93000 | 1.1e+08 |
| annual_inc_joint | 120710 | 123624.637 | 74161.346 | 5693.51 | 83400 | 147995 | 7874821 |
| pymnt_plan | 2260668 | ||||||
| … n | 2260048 | 100% | |||||
| … y | 620 | 0% | |||||
| dti | 2258957 | 18.824 | 14.183 | -1 | 11.89 | 24.49 | 999 |
| dti_joint | 120706 | 19.252 | 7.822 | 0 | 13.53 | 24.62 | 69.49 |
| delinq_2yrs | 2260639 | 0.307 | 0.867 | 0 | 0 | 0 | 58 |
| delinq_amnt | 2260639 | 12.37 | 726.465 | 0 | 0 | 0 | 249925 |
| fico_range_low | 2260668 | 698.588 | 33.01 | 610 | 675 | 715 | 845 |
| fico_range_high | 2260668 | 702.588 | 33.011 | 614 | 679 | 719 | 850 |
| inq_last_6mths | 2260638 | 0.577 | 0.886 | 0 | 0 | 1 | 33 |
| open_acc | 2260639 | 11.612 | 5.641 | 0 | 8 | 14 | 101 |
| pub_rec | 2260639 | 0.198 | 0.571 | 0 | 0 | 0 | 86 |
| revol_bal | 2260668 | 16658.458 | 22948.305 | 0 | 5950 | 20246 | 2904836 |
| revol_util | 2258866 | 50.338 | 24.713 | 0 | 31.5 | 69.4 | 892.3 |
| total_acc | 2260639 | 24.163 | 11.988 | 1 | 15 | 31 | 176 |
| total_rev_hi_lim | 2190392 | 34573.943 | 36728.495 | 0 | 14700 | 43200 | 9999999 |
| total_rec_late_fee | 2260668 | 1.518 | 11.842 | 0 | 0 | 0 | 1484.34 |
| collections_12_mths_ex_med | 2260523 | 0.018 | 0.151 | 0 | 0 | 0 | 20 |
| application_type | 2260668 | ||||||
| … Individual | 2139958 | 94.7% | |||||
| … Joint App | 120710 | 5.3% | |||||
| max_bal_bc | 1394539 | 5806.393 | 5690.561 | 0 | 2284 | 7598 | 1170668 |
| inq_fi | 1394539 | 1.013 | 1.489 | 0 | 0 | 1 | 48 |
| avg_cur_bal | 2190322 | 13547.798 | 16474.075 | 0 | 3080 | 18783 | 958084 |
| tax_liens | 2260563 | 0.047 | 0.378 | 0 | 0 | 0 | 85 |
| hardship_flag | 2260668 | ||||||
| … N | 2259836 | 100% | |||||
| … Y | 832 | 0% | |||||
| disbursement_method | 2260668 | ||||||
| … Cash | 2182546 | 96.5% | |||||
| … DirectPay | 78122 | 3.5% |
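The missing share quoted for the joint variables can be checked directly from the N column of the summary table above. The sketch below does this arithmetic in base R and then shows, on a small hypothetical data frame, how such shares could be computed per column; the data frame `toy` and its values are illustrative assumptions.

```r
# Observation counts taken from the N column of the summary table.
n_total <- 2260668
n_joint <- c(annual_inc_joint = 120710, dti_joint = 120706)

# Percentage of observations with no value for each joint variable.
missing_pct <- round(100 * (1 - n_joint / n_total), 2)
print(missing_pct)

# The same idea on a data frame: share of NA values per column.
# `toy` is a hypothetical example, not the real data.
toy <- data.frame(
  annual_inc       = c(50000, 62000, 48000, 91000),
  annual_inc_joint = c(NA, 120000, NA, NA)
)
na_share <- colMeans(is.na(toy))
print(na_share)
```

Both joint variables come out at roughly 94.66% missing, which matches the 94.7% share of individual applications in the table.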
Additional plots reveal the distribution of the dataset.
Among accepted applications between 2007 and 2018, application frequency is highest in Montana, North Dakota, Wyoming and South Dakota in the northwest and in Maine in the northeast; however, according to a plot by Chandra on Kaggle, the loan amount invested goes largely to Chicago, Texas, Florida, Illinois and New York. The distribution of funding is therefore disproportionate to where people need loans the most, which at the same time reflects a large potential market in the under-funded regions.
Geographical Distribution
Distribution of Applications based on Grade
The bar plots show that around 58.2% of applicants fall in grades B and C. Qualified applicants with grade A creditworthiness account for only 19.2%; we conjecture that this is because low interest rates are available to them at banks. As for the purpose of borrowing from P2P platforms, debt consolidation clearly takes up an overwhelming share at every credit grade. The share of loans used to pay off credit card debt decreases as the credit grade decreases. People with poor grades tend to seek funds to support their small businesses through P2P lending, as the purple section is largest in grade G.

Makeup of Applications based on Purpose and Grade
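Shares of this kind can be tabulated with `table()` and `prop.table()` in base R. The sketch below uses a small hypothetical sample; the vectors `toy_grade` and `toy_purpose` and their values are assumptions for illustration and do not reproduce the reported percentages.

```r
# Hypothetical grades and purposes for ten applications (illustrative only).
toy_grade   <- c("A", "B", "B", "C", "C", "C", "D", "B", "A", "G")
toy_purpose <- c("credit_card", "debt_consolidation", "debt_consolidation",
                 "debt_consolidation", "credit_card", "debt_consolidation",
                 "debt_consolidation", "credit_card", "credit_card",
                 "small_business")

# Share of applications in each grade.
grade_share <- prop.table(table(toy_grade))
share_b_c <- sum(grade_share[c("B", "C")])

# Purpose mix within each grade: each row sums to 1.
purpose_by_grade <- prop.table(table(toy_grade, toy_purpose), margin = 1)

print(round(grade_share, 2))
print(share_b_c)
print(round(purpose_by_grade, 2))
```

Setting `margin = 1` normalizes within each grade, which corresponds to reading the purpose mix grade by grade.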
We open our regression analysis with a very simple fact about LendingClub’s pool of accepted applicants: approved loans are funded for essentially the full requested amount.
\[ \mathrm{loan\_amnt} = \beta_0 + \beta_1\,\mathrm{funded\_amnt} + \epsilon \]
```
##
## ==========================================================
##                             Dependent variable:
##                     --------------------------------------
##                                  loan_amnt
## ----------------------------------------------------------
## funded_amnt                       1.000***
##                                  (0.00001)
##
## Constant                          5.949***
##                                   (0.259)
##
## ----------------------------------------------------------
## Observations                     2,260,668
## R2                                 1.000
## Adjusted R2                        1.000
## Residual Std. Error        203.311 (df = 2260666)
## F Statistic         4,616,976,779.000*** (df = 1; 2260666)
## ==========================================================
## Note:                          *p<0.1; **p<0.05; ***p<0.01
```
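A regression of this form can be sketched in base R as follows. The data frame `toy_loans` and its six rows are hypothetical and serve only to show the mechanics; the sketch does not reproduce the estimates above.

```r
# Hypothetical loans: all but one are funded for the requested amount.
funded_amnt <- c(5000, 8000, 12000, 15000, 20000, 35000)
loan_amnt   <- funded_amnt
loan_amnt[3] <- 12500   # one loan funded below the amount requested

toy_loans <- data.frame(loan_amnt, funded_amnt)

# Regress the requested amount on the funded amount.
fit <- lm(loan_amnt ~ funded_amnt, data = toy_loans)
slope <- unname(coef(fit)["funded_amnt"])

# Direct check: share of loans funded for exactly the requested amount.
share_fully_funded <- mean(toy_loans$loan_amnt == toy_loans$funded_amnt)

print(coef(fit))
print(share_fully_funded)
```

A slope close to one says that the two amounts move together, while the direct share gives the fraction of loans for which they coincide exactly.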
Next, we address the findings of Herzenstein, Sonenshein and Dholakia (2011) and Klafft (2008) on the effects of credit scores and home ownership on hardship and on interest rate determination, respectively:
\[ Y = \beta_0 + X\beta_1 + \beta_2\,\mathrm{fico\_range\_low} + \beta_3\,\mathrm{fico\_range\_high} + \epsilon \]
\[ Y = \beta_0 + X\beta_1 + \epsilon \]
Here \(Y\) is the outcome of interest and \(X\) collects the remaining borrower characteristics.
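```r
# Regress the interest rate on FICO range, grade, home ownership, income and DTI.
library(stargazer)
factors_affecting_interest_rate <- lm(int_rate ~ fico_range_low + fico_range_high + grade + home_ownership + annual_inc + dti, data = accepted_loans)
stargazer(factors_affecting_interest_rate, type = "text")
```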
```
##
## ==========================================================
##                               Dependent variable:
##                           --------------------------------
##                                      int_rate
## ----------------------------------------------------------
## fico_range_low                       0.217***
##                                      (0.070)
##
## fico_range_high                     -0.220***
##                                      (0.070)
##
## gradeB                               3.478***
##                                      (0.003)
##
## gradeC                               6.900***
##                                      (0.003)
##
## gradeD                              10.873***
##                                      (0.004)
##
## gradeE                              14.551***
##                                      (0.005)
##
## gradeF                              18.168***
##                                      (0.008)
##
## gradeG                              20.780***
##                                      (0.013)
##
## home_ownershipMORTGAGE              -0.506***
##                                      (0.046)
##
## home_ownershipNONE                   -0.231
##                                      (0.211)
##
## home_ownershipOTHER                 -1.487***
##                                      (0.117)
##
## home_ownershipOWN                   -0.465***
##                                      (0.046)
##
## home_ownershipRENT                  -0.445***
##                                      (0.046)
##
## annual_inc                         -0.00000***
##                                      (0.000)
##
## dti                                  0.004***
##                                      (0.0001)
##
## Constant                            10.891***
##                                      (0.282)
##
## ----------------------------------------------------------
## Observations                        2,258,953
## R2                                    0.909
## Adjusted R2                           0.909
## Residual Std. Error           1.454 (df = 2258937)
## F Statistic           1,512,665.000*** (df = 15; 2258937)
## ==========================================================
## Note:                          *p<0.1; **p<0.05; ***p<0.01
```
Agrawal, A., Catalini, C., & Goldfarb, A. (2010). The Geography of Crowdfunding. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1692661
Ashta, A., & Assadi, D. (2009). An Analysis of European Online Micro-Lending Websites. The EMN 6th Annual Conference, 4–5 June 2009, 33(1768–3394), 4–28. https://www.researchgate.net/publication/228321506_An_Analysis_of_European_Online_Micro-Lending_Websites
Bachmann, A., Becker, A., Buerckner, D., Hilker, M., Kock, F., Lehmann, M., & Tiburtius, P. (2011). Online Peer-to-Peer Lending – A Literature Review. Journal of Internet Banking and Commerce, 16(2). https://www.researchgate.net/publication/288764128_Online_Peer-to-Peer_Lending_-_A_Literature_Review
Everett, C. R. (2008). Group Membership, Relationship Banking and Loan Default Risk: The Case of Online Social Lending. Banking and Finance Review, 7(2). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1114428
Freedman, S., & Jin, G. Z. (2008). Do Social Networks Solve Information Problems for Peer-to-Peer Lending? Evidence from Prosper.com. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1936057
Galloway, I. J. (2009). Peer-to-Peer Lending and Community Development Finance. Federal Reserve Bank of San Francisco. https://ideas.repec.org/p/fip/fedfcw/2009-06.html
Gao, Q., & Lin, M. (2013). Linguistic Features and Peer-to-Peer Loan Quality: A Machine Learning Approach. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2446114
Herzenstein, M., Dholakia, U. M., & Andrews, R. (2010). Strategic Herding Behavior in Peer-to-Peer Loan Auctions. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1596899
Klafft, M. (2008). Peer to Peer Lending: Auctioning Microcredits over the Internet. The International Conference on Information Systems, Technology and Management, A. Agarwal, R. Khurana, Eds., IMT, Dubai. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1352383
Mach, T., Carter, C., & Slattery, C. R. (2014). Peer-to-Peer Lending to Small Businesses. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2390886
Morse, A. (2015). Peer-to-Peer Crowdfunding: Information and the Potential for Disruption in Consumer Lending. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2551272
Nahapiet, J., & Ghoshal, S. (1998). Social Capital, Intellectual Capital, and the Organizational Advantage. Academy of Management Review, 23(2), 242–266. https://doi.org/10.5465/amr.1998.533225
Pope, D. G., & Sydnor, J. R. (2010). What’s in a Picture? Journal of Human Resources, 46(1), 53–92. https://doi.org/10.3368/jhr.46.1.53
Ravina, E. (2007). Beauty, Personal Characteristics, and Trust in Credit Markets. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.972801
Mitra, T., & Gilbert, E. (2014). The language that gets people to give: Phrases that predict success on Kickstarter. The 17th ACM Conference on Computer Supported Cooperative Work & Social Computing. https://doi.org/10.1145/2531602.2531656