## Estimating the Probability of Default

With Lending Club, an investor can invest in a portfolio of loans.  But how will those loans perform, and which loans should we pick?  To say anything meaningful about which loans to choose, we must first estimate how loans will do over time.  What percentage of loans will default?  What percentage of loans will get paid off in full?

Loans in repayment can transition through the following states:

• Current,
• In Grace Period,
• Late (16-30 days),
• Late (31-120 days),
• Default,
• Charged Off,
• Fully Paid

From one month to the next, a loan could move, for example, from Current -> In Grace Period, or Current -> Fully Paid, or Late -> Current.

When selecting any given loan, we want to know the probability that the loan will end up in one of two end states: Charged Off (bad loans) or Fully Paid (good loans).

The simplest approach would be to just count up the number of loans with those two end states.  As of April 1, 2011, we have 23,007 total loans. Of those, 1,152 have been Charged Off and 3,436 have been Fully Paid.  So 25% of the loans that have reached an end state were Charged Off.  Not good. But wait, those Charged Off loans represent only 5% of our total loan data.  Most loans are still in repayment, and although we expect them to be paid off or charged off eventually, this count alone cannot tell us how they will turn out.
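To make the two denominators explicit, here is the same arithmetic as a short Python sketch. The loan counts are the ones quoted above; the variable names are only illustrative.

```python
# Counts quoted in the post (as of April 1, 2011).
total_loans = 23_007
charged_off = 1_152
fully_paid = 3_436

# Loans that have already reached one of the two end states.
finished = charged_off + fully_paid

# Charged Off as a share of finished loans vs. as a share of all loans.
share_of_finished = charged_off / finished
share_of_all = charged_off / total_loans

# Share of loans that have not yet reached an end state.
share_still_open = (total_loans - finished) / total_loans

print(f"Charged Off / finished loans: {share_of_finished:.1%}")  # 25.1%
print(f"Charged Off / all loans:      {share_of_all:.1%}")       # 5.0%
print(f"Loans still open:             {share_still_open:.1%}")   # 80.1%
```

Roughly four out of five loans have not finished yet, which is why the simple count says so little about them.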

A better way is to look at how the loans evolve over time.  Take a loan that is Current in February.  We want to know the probability that it will be Current in March.  Using snapshots of the loan data over time, we can construct a Transition Matrix.
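As a rough illustration of the idea, the sketch below tallies transitions between two monthly snapshots and turns each row of counts into probabilities. It is not necessarily how the matrix in this post was built: the function name `transition_matrix`, the dictionary layout (loan id mapped to status), and the five toy loans are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def transition_matrix(prev_snapshot, next_snapshot):
    """Row-normalized transition frequencies between two snapshots.

    Each snapshot is assumed to be a dict mapping loan id -> status.
    Loans missing from the later snapshot are skipped.
    """
    counts = defaultdict(Counter)
    for loan_id, old_status in prev_snapshot.items():
        new_status = next_snapshot.get(loan_id)
        if new_status is None:
            continue
        counts[old_status][new_status] += 1

    matrix = {}
    for old_status, row in counts.items():
        row_total = sum(row.values())
        matrix[old_status] = {new: n / row_total for new, n in row.items()}
    return matrix


# Hypothetical toy snapshots, NOT real Lending Club data.
february = {1: "Current", 2: "Current", 3: "Current",
            4: "In Grace Period", 5: "In Grace Period"}
march = {1: "Current", 2: "Current", 3: "Fully Paid",
         4: "Current", 5: "Late (31-120 days)"}

matrix = transition_matrix(february, march)
for status, row in matrix.items():
    print(status, row)
```

Each row sums to 1, so a row can be read as "given this status now, here is the distribution of statuses next month".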

Transition Matrix as of April 2011: All Loans

How to interpret this matrix?  Well, if we select a loan that has a status of Current (left side), then it has a 94.84% chance of remaining current in the next month, and a 2.79% chance of being Fully Paid in the next month.  A loan with a status of In Grace Period has a 42.62% chance of returning to a Current status in the next month, and a 30.35% chance of moving to Late (31-120 days).

Why such a low probability for moving from In Grace Period to Late (16-30 days)?  Because we are taking a snapshot only once per month.  So a loan that is In Grace Period in February will most likely either have had a payment made by the March snapshot or have fallen another 30 days behind.  Those 30 days, plus the days already spent in the grace period, mean that the loan will be more than 30 days late by the next month, thus skipping the Late (16-30 days) category.

Notice how late loans degrade quickly in future performance.  A loan In Grace Period has a 42.62% chance of going back to Current status in the next month.  But a Late (31-120 days) loan has only a 5.80% chance of going back to Current status in the next month.  Late loans tend to default.  Loans in Default either remain in Default or are Charged Off in the next month.

Charged Off and Fully Paid are end states, represented by the 100% probability.  Once a loan enters one of those two states, it cannot return to any other state.
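Because those two states can never be left, applying a monthly matrix over and over eventually pushes all of the probability into them, which gives the split between Charged Off and Fully Paid that we are after. The sketch below shows the mechanics on a deliberately simplified four-state chain. The numbers in `TOY_MATRIX` are invented for illustration and are not the April 2011 matrix; the helper names are likewise assumptions.

```python
STATES = ["Current", "Late", "Charged Off", "Fully Paid"]

# Made-up monthly transition probabilities; each row sums to 1.
# The last two rows are the end states: 100% chance of staying put.
TOY_MATRIX = [
    [0.95, 0.02, 0.00, 0.03],  # from Current
    [0.10, 0.60, 0.30, 0.00],  # from Late
    [0.00, 0.00, 1.00, 0.00],  # from Charged Off
    [0.00, 0.00, 0.00, 1.00],  # from Fully Paid
]


def step(distribution, matrix):
    """Advance a probability distribution over states by one month."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n))
            for j in range(n)]


def distribution_after(start_state, matrix, months):
    """Distribution over STATES after `months` steps from `start_state`."""
    dist = [1.0 if state == start_state else 0.0 for state in STATES]
    for _ in range(months):
        dist = step(dist, matrix)
    return dict(zip(STATES, dist))


# Run long enough that almost nothing is left in Current or Late.
long_run = distribution_after("Current", TOY_MATRIX, months=1000)
for state, p in long_run.items():
    print(f"{state:12s} {p:.4f}")
```

With these toy numbers a loan that starts Current ends up Charged Off about one third of the time and Fully Paid about two thirds of the time. A real estimate would need the full matrix built from actual snapshots, and it assumes the monthly probabilities stay the same over the life of the loan.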

This Transition Matrix allows us to create estimates of future loan performance for any given loan.  The example here is admittedly simple, and there are a lot of modifications that we might consider to make our estimate more sophisticated.  For example, we might want to make separate matrices for each Credit Letter: A loans through G loans.  We would expect to see a different set of probabilities for loans in different credit letter categories.

Future posts will explore that possibility, and also go into more detail about how one constructs a transition matrix.


### 2 Responses to Estimating the Probability of Default

1. melondonkey says:

So just to clarify, this transition matrix was constructed using multiple months or just February to March? I ask because there could be seasonal factors influencing the matrix (tax returns, for example). I’m going to start archiving data so I can have a better view of loan performance with monthly snapshots of performance.