Thursday, May 31, 2012

Lending Club Loan Listing Date - Not All Patterns are Relevant

Loan Volume

As the chart below shows, the volume of loan listings goes down during the last week of the month. As I noted in my previous post Month-end Rush to Issue Loans at Lending Club, there is a pattern of high volume of loans issued at the end of the month. It is possible that one team within Lending Club manages both listing and issuing loans, and that at the end of the month the group's focus shifts from listing to issuing. But then, I don't have any insight into the inner workings of the Lending Club organization.

Also, as Peter Renton commented and my previous post Lending Club Loan Application Date - When to Invest? mentioned, the spike in the volume of loans issued shifted to the start of the month in 2012. There doesn't appear to be any such shift in loan volume by Loan Listing Date in 2012.

The chart above shows the volume of loan listings by week of the year. The rapid increase in the volume of loan listings masks any discernible patterns in the chart, except for a spike in loan listings after Thanksgiving (getting ready to shop for Christmas) and after the new year (when holiday shopping bills start showing up).

Loan Status

The chart below shows the status of loans based on loan listing date by the day of the month. I observed two interesting patterns in this chart:
  1. The Late (16 - 30 days) status only appears for loans that were listed later in the month.
  2. The In Grace Period status mostly appears for loans that were listed earlier in the month.

The chart below shows the status of loans based on loan listing date by the week of the year. This chart is as confounding as the previous one.
  1. Both loans with Late (16 - 30 days) and loans with In Grace Period status appear to be listed on certain weeks within the year for two to three weeks consecutively.
  2. The loans with In Grace Period status appear to be listed right after the loans with Late (16 - 30 days) status were listed.

At this point, I didn't have the faintest idea how to explain these patterns in loans with Late (16 - 30 days) and In Grace Period status. So I decided to chart loan defaults by day of the month and by week number using Application Expiration Date, to see whether the patterns for loans with Late (16 - 30 days) and In Grace Period status change or shift.

As the chart below shows, the loans with Late (16 - 30 days) and In Grace Period status have approximately swapped positions when charted by the day of the month using Application Expiration Date.

As the chart below shows, the loans with Late (16 - 30 days) and In Grace Period status have also approximately swapped positions when charted by the week of the year using Application Expiration Date.

These patterns exist for both Application Listing Date and Application Expiration Date. As the listing and expiration dates are approximately two weeks apart and the pattern shift is approximately two weeks too, these patterns appear to be a function of when the historical loan data file was last updated. I am working with a data file that was up to date as of the end of the month. I am sure a data file that was up to date as of the middle of the month would show patterns for loans with Late (16 - 30 days) and In Grace Period status that are the reverse of the patterns shown above.
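To make the snapshot-timing argument concrete, here is a minimal Python sketch. It is not based on Lending Club's actual rules. It assumes 30-day months, a payment due day that falls about 14 days after the listing day of the month, a borrower who has missed only the most recent payment, and a grace period covering days 1 to 15 past due. The function name and its parameters are hypothetical.

```python
def status_at_snapshot(listing_day, snapshot_day, offset_days=14, month_len=30):
    """Status of a loan that missed its latest payment, as seen on snapshot_day.

    All day arguments are days of the month (1..month_len). offset_days is the
    assumed gap between the listing day and the monthly payment due day.
    """
    due_day = (listing_day + offset_days - 1) % month_len + 1
    days_past_due = (snapshot_day - due_day) % month_len
    if days_past_due == 0:
        # Assumption: a snapshot taken on the due day counts as a full month late.
        days_past_due = month_len
    if days_past_due <= 15:
        return "In Grace Period"
    return "Late (16 - 30 days)"


# Compare an end-of-month snapshot with a mid-month snapshot.
for snapshot in (30, 15):
    early = status_at_snapshot(listing_day=5, snapshot_day=snapshot)
    late = status_at_snapshot(listing_day=25, snapshot_day=snapshot)
    print(f"snapshot on day {snapshot}: listed on day 5 -> {early}; listed on day 25 -> {late}")
```

Under these assumptions, an end-of-month snapshot puts loans listed late in the month in Late (16 - 30 days) and loans listed early in the month in In Grace Period, while a mid-month snapshot reverses the two.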

Key Takeaway

Originally, I was as lost in explaining the patterns in the above charts for Application Listing Date as I was after watching the series finale of the TV series LOST! But after reviewing the similar charts for Application Expiration Date, I am confident that the patterns are related neither to anything specific about the loans nor to Lending Club's process. They are an artifact of loans transitioning from In Grace Period status to Late (16 - 30 days) status during the month and of the time of month when the historical loan data file was updated.

This particular analysis shows that not every data analysis results in insights. Sometimes observed patterns are simply the result of how and when the data was collected.

By all means, if you have any other ideas, please share using comments or direct messages. Thank you for reading.

Check out a venture capitalist's view on P2P lending - Grass Roots Capitalism: P2P Lending.

Tuesday, May 29, 2012

Lending Club Loan Application Date - When to Invest?

Loan Issued Date Recap

As discussed in my previous post Month-end Rush to Issue Loans at Lending Club, I noticed patterns of high volume and high default for the loans issued at the end of the month. Peter Renton mentioned in his comment that the rush to issue loans at the end of the month was due to LC Advisors making most of its investments in the last few days of the month. He also mentioned that the high loan origination flow has recently shifted to early in the month. As the year-over-year chart below shows, the volume of issued loans was definitely high at month-end for 2010 and 2011. In 2012, the loan origination volume seems to have shifted to the first few days of the month.

It is possible that LC Advisors was funding the remaining portion of loans at month-end that were partially funded by other lenders and, due to its close relationship with Lending Club, had a shortened loan approval process. It is also possible that LC Advisors shifted to investing early in the month in 2012 after noticing higher defaults for loans issued at month-end. But I suspect that the spike in loans issued was not due solely to the activities of LC Advisors. In my opinion, such actions would create serious doubts about the integrity of the Lending Club platform. Lending Club needs to be more transparent about such potential conflict-of-interest scenarios.

The analysis of Loan Issued Date identified quirks in the loan underwriting process and generated some useful observations for trading in the secondary market. But it is not a useful factor in a selection strategy for new notes, as lenders don't know when Lending Club will issue the loans.

Loan Application Date

I assumed that the Loan Application Date, as the name implies, is the date when a borrower submits a loan application to Lending Club. This parameter might provide insights into the behavior of borrowers. But soon after I started reviewing the Loan Application Date data in the historical loans data file, I realized my assumption was wrong. The Loan Application Date is actually the Loan Listing Date on the Lending Club platform, when lenders can start buying notes in the loan.

It was disappointing, as my hypothesis was that by using the actual loan application date I would be able to separate two clusters of borrowers: those who see Lending Club as "one more" source of unsecured loans to borrow from, and those who see Lending Club as an "alternate" source of unsecured loans with attractive terms. I expect the latter to have a lower default rate than the former.

Unfortunately, all date parameters in the historical loans data file tell more about the underwriting process and lenders than about borrowers. There are no data parameters that give insight into the behavior of borrowers. Hence, here is my request to Lending Club.
"Hey Lending Club! Can you please consider renaming the Loan Application Date to Loan Listing Date and providing the actual date when a borrower submits the loan application to Lending Club? Thanks."

When to Buy Notes During the Week?

As the chart below shows, the volume of new loan listings declines as the week progresses: it is highest on Monday and Tuesday and lowest on the weekend.
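A count like this can be reproduced with a few lines of Python. The sketch below is only an illustration: the function name is hypothetical, and the sample dates are made up rather than taken from the Lending Club data file.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def listings_by_weekday(listing_dates):
    """Count loan listings per day of the week, Monday first."""
    counts = Counter(WEEKDAYS[d.weekday()] for d in listing_dates)
    return {name: counts.get(name, 0) for name in WEEKDAYS}


# Hypothetical listing dates, for illustration only.
sample_dates = [date(2012, 5, 28), date(2012, 5, 28),
                date(2012, 5, 29), date(2012, 6, 2)]
print(listings_by_weekday(sample_dates))
```

With the real data file, the list of dates would come from the Loan Application Date column, which (as discussed above) is really the listing date.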

The middle of the workweek appears to be a good time to buy notes because a lender gets the chance to invest in the largest number of new listings, early enough that good loans are not yet fully funded. This suggestion assumes that the note selection strategy doesn't depend on:
  • Percentage of loan amount already funded, or
  • Number of lenders already invested in the loan, or
  • Average amount already invested in the loan per lender, or
  • A focus on only high-interest-bearing notes.
Personally, after this analysis, I switched my note selection day from Tuesday night to Wednesday night, as I review available notes only once a week.

Loan Listing Day of the Week and Status

As the chart below shows, there is no specific pattern that stands out for Charged Off, Default, Performing Payment Plan, and Late loan status. But with day of the week, from Saturday to Friday, there is a rise in percentage of loans with In Grace Period status.

Why would loans listed later in the workweek go into Grace Period more often? At this point, I have no idea why the share of loans in grace period rises. As I further analyze Loan Listing Date and Loan Status, I hope to gain some insight into this peculiar trend.

Key Takeaways

  1. Don't assume that the data represents what the label implicitly means.
  2. The volume of new loan listings declines as the workweek progresses.
  3. Middle of the week is good for lenders who focus on selecting loans once a week. The lenders interested in high-interest bearing notes may need to select loans more frequently during the week.

Zack Miller on Seeking Alpha wrote a very good article "Why I'm A Converted Believer In Investing In P2P Loans." I agree with him that P2P lending deserves to be a new asset class in an investor's portfolio.

Thursday, May 24, 2012

Lending Club Loan Interest Rate and Return - Do Defaults Matter?

Interest Rate

As the chart below shows, the interest rate profile of Lending Club loans has been broadening since inception, from a range of 7.12% to 15.96% in 2007 to a range of 5.42% to 24.59% in 2011. Lending Club offers lenders loans with a broad range of interest rates to choose from. A densely colored area within a bar indicates that a high volume of loans was issued, whereas white space within a bar indicates that no loans were issued.

For further analysis, I decided to allocate loans to 11 different buckets (called bins) based on the interest rate of the loan. Each interest rate bin has a spread of 1.99%, with its midpoint listed on the charts below. For example, the interest rate bin labeled 12% includes loans with interest rates from 11% to 12.99%.
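As a rough illustration of this binning rule, the Python sketch below maps an interest rate to its bin label. It assumes every bin is aligned the same way as the 12% example (an odd lower bound and an even midpoint label); the function name is hypothetical.

```python
import math


def rate_bin_label(rate_pct):
    """Return the even midpoint label of the bin containing rate_pct.

    Assumes bins such as 11.00-12.99 -> 12 and 13.00-14.99 -> 14.
    """
    return 2 * math.floor((rate_pct + 1.0) / 2.0)


for rate in (5.42, 11.0, 12.99, 13.0, 24.59):
    print(f"{rate:.2f}% falls in the bin labeled {rate_bin_label(rate)}%")
```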

From the above chart, two observations stand out right away:

  1. The number of interest rate bins has increased with time. In 2007, there were only five interest rate bins, while in 2011 the number of bins reached 11, most likely as a result of the increasing volume of loans, diversity of borrowers, return expectations of lenders, and the general economic environment.
  2. Though rising, the loan volume at higher interest rates is only a small fraction of total loan volume. This scarcity of loans with higher interest rates is a challenge for lenders trying to build a sizable and diverse portfolio of high-interest loans in expectation of higher returns.

Default Rate

Peter Renton pointed out in his comment on my previous post Lending Club Loan Issue Date and Default Rate that loans from 2007 and 2008 may not be representative of loans issued in 2009 and after, due to the financial crisis in 2007 and 2008 and major changes made by Lending Club to its loan underwriting model.

The very small volume of loans issued in 2007 and 2008 also creates larger uncertainty in expected defaults and returns due to the small sample size. Though the loan volume in 2009 is still not sufficiently large, it is the best we have available to analyze defaults and returns for a 3 year term loan.

The above chart shows the percentage of loans charged off and in default as well as fully paid. As the chart indicates, the aging of loans has a major impact on both rates, charged off and fully paid. As expected, the default rate rises with rising interest rates (almost linearly with the interest rate bins for years 2009 through 2011). Though the default rates at various interest levels in 2009 are significantly better than in 2007 and 2008, readers need to keep in mind that only 3 year term loans issued in the first four months of 2009 have matured and only about 45% of loans issued in 2009 were fully paid by May 1, 2012.

In my opinion, the default rates for loans issued in 2009 are understated and most likely will rise to somewhere between the current numbers and the ones for 2007 and 2008. It is just a hunch based on the slope of the trend lines for different years and the expectation that the slope of the 2009 trend line will be similar to those of 2007 and 2008.

Expected Return

At this point, I have information about expected default rates for different ages of loans and for different interest rate bins. It shouldn't be that difficult to calculate the expected return on three year term loans.

I made the following assumptions to calculate the expected return:

  1. The default rates for past three years 2011, 2010, and 2009 are representative of expected defaults for first, second, and third year respectively of a three year term loan.
  2. The portfolio contains a large number of loans and same amount invested in each loan. I haven't determined an optimum number of loans in portfolio yet. Lending Club claims no negative returns for a portfolio with 800 loans.
  3. All loans that default in a given year default in the same month of that year, and the same month applies in each year. For example, default in month 1 indicates that loans defaulted in the first month of the first year, the first month of the second year, and the first month of the third year, i.e. the 1st, 13th, and 25th months in the three year life of the loans.
  4. All payments received until the month of default are full (no partial payments) and after defaults no payments are received.
  5. Late payment fees, collection fees, tax deductions from principal write-off, taxes on interest received are not considered in expected return calculations.
  6. The loan service charge is 1% of monthly payment received.
  7. All loans in portfolio are issued same month and are part of same interest rate bin as defined above.
  8. Inflation and cost of capital are not considered in expected return calculations.
The chart below shows the value of the portfolio at the end of 3 years for an initial investment of $2,500 in 100 $25 notes at different interest rate bins and for various default months.
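The assumptions above can be turned into a small calculation. The Python sketch below is one possible reading of them, not the exact spreadsheet behind the chart. It assumes level monthly payments on a fully amortizing 36 month loan, default rates expressed as shares of the original loans, that the payment for the default month itself is missed, and that collected payments are not reinvested. The function names and the sample default rates are hypothetical.

```python
def monthly_payment(principal, annual_rate_pct, n_months=36):
    """Level monthly payment on a fully amortizing loan."""
    r = annual_rate_pct / 100.0 / 12.0
    if r == 0:
        return principal / n_months
    return principal * r / (1.0 - (1.0 + r) ** -n_months)


def portfolio_value(principal, annual_rate_pct, yearly_default_rates,
                    default_month, fee=0.01, n_months=36):
    """Total cash collected over the loan term, net of the service fee.

    yearly_default_rates: assumed share of the ORIGINAL loans that default in
    years 1, 2, and 3 (fractions, e.g. 0.02 for 2%).
    default_month: month within each year (1..12) in which that year's
    defaults occur; the payment for that month is assumed to be missed.
    """
    payment = monthly_payment(principal, annual_rate_pct, n_months)
    surviving = 1.0  # share of the original loans still paying
    total = 0.0
    for month in range(1, n_months + 1):
        year_index = (month - 1) // 12
        month_in_year = (month - 1) % 12 + 1
        if month_in_year == default_month:
            surviving = max(0.0, surviving - yearly_default_rates[year_index])
        total += surviving * payment * (1.0 - fee)
    return total


# Hypothetical default rates, for illustration only (not taken from the data).
example_rates = [0.02, 0.04, 0.06]
for m in (1, 6, 12):
    value = portfolio_value(2500.0, 12.0, example_rates, default_month=m)
    print(f"defaults in month {m}: portfolio value ${value:,.2f}")
```

Because equal amounts are invested in every loan, treating the portfolio as one $2,500 loan with a shrinking share of paying borrowers gives the same total as summing 100 separate $25 notes.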

Key Takeaways

  1. The portfolio has positive return for all interest rate bins and various default months. It leads me to ask the question "Do defaults really matter?"
  2. Even though the expected return analysis doesn't consider fluctuations in the expected default rate, a hands-off approach with a PRIME Account and the preset Options with targeted returns offered by Lending Club appear to be attractive, feasible options for time-constrained lenders.
  3. Due to low volume of loans with high interest rate, it is difficult to build a diversified portfolio solely from such loans. Also, the expected default rate, and in turn expected return, may have wide variance due to the small sample size for loans with high interest rate.

Brady at Lucrative Lending recently wrote about the importance of credit report inquiries in a lender's decision to invest in a loan. I agree with his assertion about filtering available loans based on the number of inquiries in the past six months. The number of inquiries in the past six months is the sixth most important criterion in my loan selection process at Lending Club.

Monday, May 21, 2012

Month-end Rush to Issue Loans at Lending Club

Before I move on to analyzing the next variable in the Lending Club historical loan database, I decided to explore the Loan Issued Date data a little bit more.


First, I was curious to find out if there is any seasonality during the year in issuing loans. As the weekly loans issued count chart below shows, the high growth in issued loans, especially in the first four months of 2012, masks any discernible seasonal pattern during the year except for some potential summer doldrums.

End-of-month Rush

Later, I noticed something interesting in the chart above. If you look at the height of each bar, there appears to be a peak at every fourth or fifth bar. This pattern seems to be consistent for most of the year. Is it possible that these peaks represent an end-of-month and end-of-quarter rush to issue loans to meet the quota for the period?

As the chart of loans issued by day of the month below shows, sure enough there is a big jump in the number of loans issued daily during the last four days of the month (about 50% above the monthly average). This unusual increase in loans issued at the end of the month exceeds the average (gray horizontal line) plus one standard deviation (shaded region above the average line).
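The "average plus one standard deviation" check can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the toy counts are made up, and the use of the population standard deviation is an assumption, since the chart does not say which variant it uses.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev


def counts_by_day_of_month(issue_dates):
    """Count issued loans per day of the month (1..31)."""
    return dict(Counter(d.day for d in issue_dates))


def days_above_one_sigma(day_counts):
    """Return the days whose count exceeds mean + 1 standard deviation."""
    values = list(day_counts.values())
    threshold = mean(values) + pstdev(values)  # population std dev: an assumption
    return sorted(day for day, n in day_counts.items() if n > threshold)


# Toy counts, for illustration only: day 5 stands out from the rest.
toy_counts = {1: 10, 2: 10, 3: 10, 4: 10, 5: 30}
print(days_above_one_sigma(toy_counts))
```

With the real data, `counts_by_day_of_month` would be fed the Loan Issued Date column and its result passed to `days_above_one_sigma`.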

It is very common for salespeople to extend good discounts at the end of a period in order to close sales and meet the quota for the period; car dealers are one example. Is it possible that Lending Club is rushing the approval process in order to issue loans before month-end? Can this end-of-month rush impact the quality of loans?

Rising Default Rate as Month Progresses

Default rate is one potential measure of loan quality. As the Loan Default by Day of the Month chart below shows, there is definitely an uptick in the default rate at the end of the month for loans with Charged-off and Default status. The trend line (red dashes) is upward sloping for the whole month too.

The chart also shows patterns for loans with status of In Grace Period, Late, and Performing Payment Plan, nothing that I can pinpoint to a certain behavior. If you do, your insights via comments are appreciated.

Start-of-week Rush

Similarly, when I created a chart of the number of loans issued by day of the week, there was a pattern of a greater number of loans issued at the start of the week, especially Monday and Tuesday. Also, a few loans get issued during the weekend. The increase in loans issued at the start of the week seems reasonable, as a backlog of loans to be issued builds up during the weekend. It is also possible that most borrowers provide information for loan approval at the end of the week or over the weekend.

If you read up to this point, I am sure you are as curious as I was about whether the default rate for loans issued by day of the week shows any patterns. As the chart below, LC Loan Default by Day of the Week, shows, there is no discernible impact on the default rate from issuing more loans at the start of the week.

Only the loans issued on the weekend show a significantly higher default rate. Considering the small sample of loans issued on the weekend, this default rate may or may not be representative. An amusing thought occurred to me when I noticed the higher default rate for loans issued on the weekend: perhaps these loans were in a gray area with respect to the approve-or-deny criteria and were bumped up to a supervisor who primarily reviews loans on the weekend. As these loans were borderline cases, approving them resulted in higher defaults.

Key Takeaways

This analysis suggests two key takeaways from my perspective. Your comments are appreciated.

  1. The loans issued at the end of the month have a higher risk of default. As lenders don't have control over when loans are issued, they will be better off investing in loans at Lending Club in the last week of the month or the first few days of the month.
  2. In addition to borrower default risk, lenders shouldn't ignore the risk resulting from loan underwriting process.

Fast Company recently published an interesting article, Shaking Up Crowdfunding, about the crowdfunding portion of the JOBS (Jumpstart Our Business Startups) Act from the point of view of several people involved in the creation of the Act. Check it out!

Thursday, May 17, 2012

Lending Club Loan Issued Date and Default Rate

Loan Issued Date

There is no doubt that loan transactions at Lending Club have skyrocketed, almost doubling between 2010 and 2011, as the chart below shows. The bars indicate the number of loans issued in each quarter, and the straight line across each year indicates the total number of loans issued in that year. The total number of loans issued to date in 2012 is 11,656 (a digit is clipped in the image). A high number of loans is positive for the continued viability of the Lending Club platform. It also reduces the risk of lenders losing their investment because Lending Club, as the middleman, goes out of business.

Default Rate and Loan-at-Risk Rate

Too many recently issued loans (44,913 loans since 2010) and not enough aged loans (6,529 loans prior to 2010) in the historical loan database create a challenge in calculating the default rate. As the recent loans haven't aged enough, they most likely have far fewer bad outcomes so far (charged off, default, in grace period, late, and performing payment plan). The high percentage (87.3%) of recent loans in the analysis will cause the population default rate to be underestimated.

As the chart below shows, the average annual default rate, counting only Charged Off and Default status, is 8.74%, much higher than the 3.64% calculated in my previous post Lending Club Data and Default Rate for the whole population of loans. The average annual default rate using my definition is 10.29%, much higher than the 5.85% calculated in my previous post. Going forward, let's call the average default rate calculated using my definition the Loan-at-risk Rate to better differentiate it from the other default rate. For aged loans, those issued in 2009 and earlier, the default rate is much higher than average and is most likely the truer representation of default risk over the life of loans to maturity.

The bars for each status have their own independent scale for % of total loans (Y-axis). The bars for all statuses for a particular year (vertically) add up to 100%. The straight horizontal line across the bars shows the annualized average for the last six years. The shaded band shows +/- 1 standard deviation from the average.

One way to interpret this data is that, for example, on average you can expect to have 8.73% of loans in Charged Off and Default status, with the uncertainty that this percentage could be as low as 1.35% or as high as 16.11% (one standard deviation either side of the average). Another way to interpret it is that, on average, the probability of a loan being in Charged Off or Default status is 8.73%.

The probability of a loan being in Charged Off and Default status rises as the age of the loan increases. On both an annual and a quarterly basis, the age of the loan is found to have a very significant effect (p < 0.0001) on the probability of a loan having Charged Off and Default status. For the mathematically minded:
Probability of Loan with Charged Off and Default status (%) = 0.0103481 * Quarters since Loan Issued - 0.0127628
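The fitted line is easy to evaluate in Python. In the sketch below, the slope and intercept are the ones quoted above. Reading the result as a fraction (so 0.11 means about 11%) is an assumption based on the size of the coefficients, even though the formula is labeled "(%)". Clipping negative values to zero is also an addition for illustration, and the function name is hypothetical.

```python
SLOPE = 0.0103481       # change in probability per quarter since issue
INTERCEPT = -0.0127628  # fitted intercept


def charged_off_probability(quarters_since_issue):
    """Evaluate the fitted line, clipped at zero for very young loans."""
    return max(0.0, SLOPE * quarters_since_issue + INTERCEPT)


for quarters in (4, 8, 12):
    p = charged_off_probability(quarters)
    print(f"{quarters} quarters since issue: {p:.4f}")
```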

Key Takeaways

The analysis of loan status with respect to loan issued date suggests three takeaways from my perspective. Please feel free to share your insights in comments.

  1. Lending Club loans have a higher default rate, i.e. higher risk on an annualized basis, than most "popular" opinion suggests. The wisdom of the crowd doesn't necessarily overcome the advantage of the extensive background information available to commercial banks.
  2. In order to assess the real performance of your loan portfolio on Lending Club, only consider seasoned loans.
  3. In addition to diversification across lots of loans, selling loans on the secondary market before maturity is a potential way to reduce the impact of loan default risk on portfolio.
I am almost finished analyzing the Loan Issued Date for loans in the Lending Club historical loan data file. If you are interested in any particular variable, preferably loan related, please feel free to suggest it in a comment.

Brady at Lucrative Lending recently discussed investing in P2P lending while in debt. He has great advice; check it out. Personally, I believe you should only be a P2P lender once you have built a solid financial foundation. Considering the evolving state of P2P lending, you should consider investing only a small fraction of the funds you allocate to junk bonds or other high-risk assets in P2P lending.

Monday, May 14, 2012

Lending Club Data and Default Rate


For the Lending Club loan analysis, I selected the historical loan data provided by Lending Club (LC). I downloaded the data file on May 1, 2012. It contains 41 distinct variables about Loan, Loan Application, Borrower, and Loan Repayment.

The data file provides information on 51,768 loans issued between June 2007 and April 2012. There are an additional 2,749 loans in the data file that are listed as loans that do not meet the current credit policy. These 2,749 loans were not included in this analysis. There are also 326 loans with an application date from April 3, 2012 onward that are funded and have status 'In Review' but are not yet issued. These 326 loans were not included in this analysis either.

Default Rate

Each loan in the data file has one of nine statuses: Current, Fully Paid, Issued, Charged Off, Default, Performing Payment Plan, Late (31-120 days), Late (16-30 days), and In Grace Period.

According to the data on the site of the Board of Governors of the Federal Reserve System, in Q4 of 2011 the Credit Card Charge-off Rate was 4.53% while the Delinquency Rate was 3.32% for all commercial banks on a non-seasonally adjusted basis. For Lending Club, the charge-off rate (loans with Charged Off and Default status) was 3.64% and the delinquency rate (loans with payment past due 30 days) was 0.96%, as shown in the chart below.

Even though the LC charge-off and delinquency numbers may look as attractive as the ones for credit cards, this comparison is not a fair one because the Federal Reserve defines charge-offs and delinquencies differently, as follows:
Charge-offs, which are the value of loans removed from the books and charged against loss reserves, are measured net of recoveries as a percentage of average loans and annualized.
Delinquent loans are those past due thirty days or more and still accruing interest as well as those in nonaccrual status. They are measured as a percentage of end-of-period loans.
The LC charge-offs were calculated using the number of loans instead of the amount charged off, and over multiple years instead of on an annualized basis. If possible, I will try to rectify this difference in a future post when the loan issue date and loan status are reviewed.

I am not sure if Lending Club reports charge-offs and delinquencies to the SEC, as I didn't find this information in LC's SEC filings. If I missed this information, please let me know. At this point, I assume that the profile of Lending Club charge-off and delinquency rates is similar to or worse than that of commercial banks, considering that commercial banks can access much more personal information about borrowers than the lenders on Lending Club can.

Lending Club reports the following recovery rates by Loan Status (over 6 months):

Loan Status             Recovery Rate   Unrecoverable
In Grace Period         84%             16%
Late (16-30 days)       77%             23%
Late (31 - 120 days)    53%             47%
Default (120+ days)     4%              96%

As shown in chart below, 94.15% of loans are current, fully paid, and issued (a good outcome) while 5.85% of loans are charged off, default, performing payment plan, and late (a bad outcome). Using the unrecoverable rate as the probability of default, the more realistic default rate (a bad outcome) would come out as 4.44%. In this analysis, I will not use this more realistic expectation of default primarily due to two reasons:
  1. Lending Club reports the recovery rate as "partially or fully recovered." Without knowing more about "partially recovered," I believe the calculated default rate of 4.44% underestimates the actual default.
  2. Incorporating unrecoverable rate in default expectation makes the analysis more complex, especially when I start to slice and dice the data. I may attempt to incorporate this complexity in future iteration of analysis.
For the purpose of this analysis, I will consider any loan as "Default" that doesn't have a status of Current, Fully Paid, or Issued. Based on this assumption, the default rate is 5.85% for the whole population of loans in this data file from Lending Club.
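The two definitions discussed above can be sketched in Python. The status counts in the example are made up and are not taken from the data file. The unrecoverable rates come from the recovery table above. Treating statuses that are missing from that table (Charged Off, Performing Payment Plan) as fully unrecoverable is an assumption, so this sketch is not expected to reproduce the 4.44% figure. The function names are hypothetical.

```python
GOOD_STATUSES = {"Current", "Fully Paid", "Issued"}

# Unrecoverable share by status, from the recovery table reported by Lending Club.
UNRECOVERABLE = {
    "In Grace Period": 0.16,
    "Late (16-30 days)": 0.23,
    "Late (31-120 days)": 0.47,
    "Default": 0.96,
}


def broad_default_rate(status_counts):
    """Share of loans whose status is not Current, Fully Paid, or Issued."""
    total = sum(status_counts.values())
    bad = sum(n for status, n in status_counts.items()
              if status not in GOOD_STATUSES)
    return bad / total


def recovery_adjusted_rate(status_counts):
    """Weight each bad status by its unrecoverable share.

    Assumption: bad statuses missing from the table count as 100% unrecoverable.
    """
    total = sum(status_counts.values())
    expected_losses = sum(n * UNRECOVERABLE.get(status, 1.0)
                          for status, n in status_counts.items()
                          if status not in GOOD_STATUSES)
    return expected_losses / total


# Hypothetical counts, for illustration only.
toy_counts = {"Current": 90, "Fully Paid": 4, "Charged Off": 3,
              "In Grace Period": 2, "Default": 1}
print(f"broad default rate: {broad_default_rate(toy_counts):.2%}")
print(f"recovery-adjusted rate: {recovery_adjusted_rate(toy_counts):.2%}")
```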

Starting with the next post, I will explore the loan data further and review each loan variable and its associated default rate. Next up on the block is Loan Issued Date and Default Rate.

Recently Ross Asset Advisors published a good overview of peer-to-peer consumer lending "Investing in P2P Loans." Check it out.

Thursday, May 10, 2012

Crowdfunding and Lending Club


There are a lot of different types of crowdfunding sites online where people come together to support non-profits (Kiva), fund projects (Kickstarter), and borrow and lend money (Lending Club, Prosper). With the crowdfunding portion of the JOBS Act, I believe crowdfunding will become more popular in a year or two.

Last year, while discussing the negligible interest rate on savings, an EMBA classmate brought my attention to Lending Club where lenders can earn attractive returns on their notes while borrowers can get unsecured loans, similar to credit cards, at attractive interest rates. In fact, LC loan statistics show that 68.50% of borrowers use the loan to consolidate debt or pay off their credit cards and 91.73% of lenders with 800+ notes earn returns between 6% and 18%.

Lending Club

As I looked more into Lending Club, it appeared to be a very attractive alternative investment option. Having participated in the equity market for almost two decades, I saw it as my first opportunity to participate directly in the debt market, specifically in consumer loans. LC also provides an export of its historical loan database, and I saw the opportunity to use it to refine my knowledge of statistical and data mining tools and techniques.

One of the main risks from a lender's perspective is that borrowers may not repay their loans. As there is no collateral for the loan (unsecured), there is very little recourse in such events other than relying on LC's recovery process. Though LC claims the overall annualized default rate is below 3%, the objective of most lenders is to identify parameters that lead to a higher default rate or that minimize it.


Recently, I thought: why not share my analysis of the LC historical loan data with my blog readers as the analysis progresses? This may enable instant feedback and improvements in the analysis and techniques.

The specific goals of historical loan data analysis are:

  1. Identify parameters that influence higher default rate or minimize default rate.
  2. Identify lending strategies that result in greater return compared to a randomly selected lending portfolio.
  3. Compute excess return versus excess risks in comparison to a benchmark.
Can you think of any other goals that should be included for such analysis?

I plan to use Microsoft Excel, Python, R, Google Fusion Tables, and Tableau Public as the basic tools for analysis and visualization. Any tips and tricks in using these tools are welcome.

In the next post, I will review the data contained in the loan database and define the default rate.

Monday, May 07, 2012

Rebooting this Blog

It has been several years since I updated this blog. For the last couple of years, I have been busy with the Executive MBA program at the University of Washington. My interests have also expanded from purely data storage to broader markets and technologies for data storage, data analytics, data mining, statistics, and machine learning. In addition, I became interested in microfinance, specifically peer-to-peer lending sites such as Prosper and Lending Club.

After deliberating whether to start a new blog for my newfound interests or reboot this blog, I decided on the latter. This blog was always about my random thoughts, not necessarily focused on a specific topic, so as my interests change, so does this blog.

Once again, I look forward to slowly becoming a regular poster on this blog, and to welcoming returning and new readers.