# Choosing Loans with Monte Carlo, Diversification Part 2

# Introduction

In a previous article, we conducted a preliminary investigation of applying Modern Portfolio Theory to Lending Club. We assumed that each loan grade on Lending Club (\(A\) through \(G\)) corresponded to an asset class, calculated the variances (\(\sigma^{2}_{i}\)) and covariances (\(\sigma^{2}_{i,j}\)) of each asset class, and constructed an efficient frontier for Lending Club portfolios. While this was a useful exercise in theory, the results were not applicable to an individual investor, because treating all \(A\) grade loans as a single \(A\) grade asset class only reflects reality for someone who owns every \(A\) grade loan. Realistically, no individual investor can achieve that. This time we aim to find variances and covariances at the individual loan level, which allows us to calculate the variance of a portfolio by treating each loan in the portfolio as an asset.

With the variance and covariance numbers, when faced with several potential new loans to invest in, we can add prospective loans to the portfolio, analyze how the standard deviation (\(\sigma\)) and Expected Return (\(E(r)\)) change, and decide which loans to invest in based on a desired risk/return profile.

The following analysis was done on mature loans, with a grade breakdown depicted in the table below:

Grade | A | B | C | D | E | F | G | Total |
---|---|---|---|---|---|---|---|---|
Number of loans | 20,757 | 27,351 | 16,903 | 9,569 | 2,722 | 736 | 343 | 78,381 |

## Assumptions

**As in our first diversification article, we take our calculated Expected Return at each month as the return you could get for buying/selling your loan at that month on the secondary market.** We believe our Expected Return serves as a reasonable proxy because it behaves as one would expect loans to behave based on maturity and status; seasoned notes that are current are likely to be more valuable than their counterparts of younger age and/or undesirable loan status.

**Loans of the same grade and term are assumed to have the same Expected Return, and loans of the same grade, term, and age (or age difference) are assumed to have the same amount of expected variance (covariance).** Since our methodology relies on Monte Carlo simulations, we attempt to categorize loans as specifically as possible before making our generalization assumptions. In the case of Expected Return, our granularity reached the term and grade levels; for variance/covariance we could go one step deeper and found that the age of loans (months beyond issuance date, capped at the loan’s term) was important as well. Further details are in the methodology section.

## Methods & Methodologies

Given \(n\) assets with known Expected Returns, variances, and covariances between the \(n\) assets, a portfolio of those assets will have an Expected Return and standard deviation calculated with the following formulas:

\[E(r) = \sum_{i=1}^{n}w_iE(r_i)\]

\[\sigma = \sqrt{\sum_{i=1}^{n}w_i^2\sigma_i^2 + \sum_{i=1}^{n}\sum_{j=i+1}^{n}2w_iw_j\sigma_{ij}^2}\]

where \(w_i\) is the portfolio weight of asset \(i\) and \(\sigma_{ij}^2\) is the covariance between the \(i\)th and \(j\)th asset in the portfolio.
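As a minimal sketch of these two formulas, the Python below computes a portfolio's Expected Return and standard deviation from weights, per-asset Expected Returns, and a covariance matrix, counting each pair of assets once. The function names and the three-asset inputs are hypothetical and chosen only for illustration; they are not numbers from this analysis.

```python
import math


def portfolio_return(weights, expected_returns):
    """E(r) = sum_i w_i * E(r_i)."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_std(weights, cov):
    """Portfolio standard deviation from a full covariance matrix.

    cov[i][i] is the variance of asset i and cov[i][j] is the covariance
    between assets i and j. Each unordered pair is visited once and doubled.
    """
    n = len(weights)
    variance = sum(weights[i] ** 2 * cov[i][i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            variance += 2 * weights[i] * weights[j] * cov[i][j]
    return math.sqrt(variance)


# Hypothetical inputs (returns in percent, covariances in percent squared).
weights = [0.5, 0.3, 0.2]
expected_returns = [4.0, 6.0, 9.0]
cov = [
    [9.0, 2.0, 1.0],
    [2.0, 16.0, 3.0],
    [1.0, 3.0, 25.0],
]

print(portfolio_return(weights, expected_returns))
print(portfolio_std(weights, cov))
```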

With these, we can look at a portfolio, add new loans to the portfolio, and see how the standard deviation and Expected Return change. Then, based on an investor’s selected levels of risk (standard deviation/variance) and return, we can choose the appropriate new loans to invest in.

### Expected Return

To generate the Expected Returns of individual loans, we aggregated all of the cashflows of loans within a specified term and grade and calculated a compound annualized Internal Rate of Return (\(IRR\)). More details can be found in this article.
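As a rough illustration of the aggregation step, the sketch below sums monthly cashflows across loans, solves for the monthly IRR by bisection, and compounds it to an annual rate. The cashflow numbers, the function names, the alignment of loans by months since issuance, and the \((1 + r)^{12} - 1\) annualization are all assumptions made for illustration, not a description of the exact procedure used here.

```python
def npv(rate, cashflows):
    """Net present value of monthly cashflows at a monthly discount rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def monthly_irr(cashflows, lo=-0.99, hi=1.0, tol=1e-10):
    """Find the monthly rate where NPV is zero, by bisection.

    Assumes money goes out first and comes back later (one sign change),
    so NPV decreases as the rate increases on [lo, hi].
    """
    if npv(lo, cashflows) < 0 or npv(hi, cashflows) > 0:
        raise ValueError("IRR is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def aggregate(loans):
    """Sum cashflows month by month across loans of one term and grade."""
    length = max(len(cf) for cf in loans)
    return [sum(cf[t] for cf in loans if t < len(cf)) for t in range(length)]


def annualize(monthly_rate):
    """Compound a monthly rate over 12 months."""
    return (1.0 + monthly_rate) ** 12 - 1.0


# Hypothetical loans: one pays 12 installments, one stops after 4.
loans = [
    [-100.0] + [9.0] * 12,
    [-100.0] + [9.0] * 4,
]
print(annualize(monthly_irr(aggregate(loans))))
```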

### Variance

For the variance of a loan of specific term, grade, and age, we identified all matching loans, calculated the variance of each loan’s return series, and took the average of the variances. More concretely, we calculated the variance of an \(A\) grade 36 term 0 months old loan, an \(A\) grade 36 term 1 month old loan, etc.

When determining the variance of individual loans, we realized that the term, grade, and age of the loan all impact its variance. That term and grade affect a loan’s variance should be self-evident, but age requires a little explanation. Take loans that have an age of 9 months (a fairly dangerous time based on our hazard curve findings). They could have little variance if paying consistently or a lot of variance if defaulting:

Date | Sep 2011 | Oct 2011 | Nov 2011 | Dec 2011 | Jan 2012 | Feb 2012 | Mar 2012 | Apr 2012 | May 2012 | Variance \(\sigma^{2}\) |
---|---|---|---|---|---|---|---|---|---|---|
Paying | 7.60% | 7.85% | 8.15% | 8.48% | 8.88% | 9.19% | 9.56% | 9.93% | 13.42% | 2.7\(\%^{2}\) |
Defaulting | 7.60% | 7.85% | 8.15% | 8.48% | 8.83% | -7.00% | -42.3% | -56.4% | -99.9% | 1369.0\(\%^{2}\) |

Compare this to loans that are 35 months old. Regardless of whether the loan defaults or prepays by the end of the 35th month, the return series in each case will be similar, with differences only in the last few months of returns, and the difference between the variances of the defaulting and paying loans will be much smaller than in the 9 month example above.

Date | Sep 2011 | Oct 2011 | Nov 2011 | Dec 2011 | … | June 2014 | July 2014 | Aug 2014 | Sept 2014 | Variance \(\sigma^{2}\) |
---|---|---|---|---|---|---|---|---|---|---|
Paying | 7.60% | 7.85% | 8.15% | 8.48% | … | 14.40% | 14.41% | 14.42% | 14.57% | 5.1\(\%^{2}\) |
Defaulting | 7.60% | 7.85% | 8.15% | 8.48% | … | 14.34% | 11.04% | 3.60% | 1.05% | 9.8\(\%^{2}\) |

So for the expected variance of a loan of specific term, grade, and age, we identified all loans that matched the criteria, calculated the variance based off of the loan’s return series, and took the average of the variances for our expected variance.
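The averaging step can be sketched with the two return series from the 9 month example table above. Using the population variance (dividing by the number of observations) reproduces the table's 2.7 and 1369.0 figures to one decimal, which suggests, but does not confirm, that this is the convention used. The function name `expected_variance` is hypothetical, and a real bucket would contain many more than two loans.

```python
from statistics import mean, pvariance

# Return series (in percent) copied from the 9 month example table.
paying = [7.60, 7.85, 8.15, 8.48, 8.88, 9.19, 9.56, 9.93, 13.42]
defaulting = [7.60, 7.85, 8.15, 8.48, 8.83, -7.00, -42.3, -56.4, -99.9]


def expected_variance(return_series_list):
    """Average the per-loan variances of loans sharing term, grade, and age."""
    return mean(pvariance(series) for series in return_series_list)


print(round(pvariance(paying), 1))      # 2.7
print(round(pvariance(defaulting), 1))  # 1369.0
print(expected_variance([paying, defaulting]))
```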

### Covariance

Our method for calculating covariance used Monte Carlo simulations to randomly select two loans (term1/grade1 and term2/grade2) and calculate a covariance wherever the two return series had a computable one (e.g., the two loans existed at the same time for at least two months). For covariance, we also found that age difference (e.g., loans issued in Jan 2010 and Feb 2010 have an age difference of 1 month) is important, for reasons similar to those demonstrated for variance; the covariance for two loans of term1/grade1 and term2/grade2 could be significantly different depending on the age difference. To illustrate this, we’ll look at correlations between loans (a “normalized” covariance), since they are much more interpretable than covariances.

Here’s a snippet of two different correlation tables, the first being of 36 month loans with no age difference, and the second of 36 month loans with 11 months age difference.

0 months age difference:

11 months age difference:

Note how if the two loans are issued at the same time (0 age difference), there is a more “average” amount of correlation (range of .24 – .69) but at 11 months age difference the correlation expands towards the extremes (range of .17 – .75). One way to interpret this is that two 36 term \(A\) grade loans issued 11 months apart are slightly more correlated than two 36 \(A\) grade loans issued at the same time. The logic behind this can be illustrated with an example:

Pretend that 36 term \(A\) grade loans have an 80% chance to do well (pay consistently until the end of the term) and a 20% chance to default. Based on the hazard curve, we know that most of those defaults will happen in the earlier months rather than the later months. If we have two 36 term \(A\) grade loans at issuance, they are positively correlated (move together) if they both do well (80% x 80% = 64%) or both default (20% x 20% = 4%), for a total of 68%. Now compare this to a newly issued 36 term \(A\) loan and an 11 month old 36 term \(A\) loan. The new loan still has an 80% chance to do well and a 20% chance to default, but the 11 month old loan (having “lived” past some dangerous months) is now closer to 90% doing well and 10% defaulting. The probability of being positively correlated has now increased to 74% (90% x 80% + 10% x 20%). So, based on where two loans are on the hazard curve (which is determined by each loan’s age), the correlation (and thus covariance) is different.
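The arithmetic in this example can be checked in a few lines. Multiplying the probabilities treats the two loans' outcomes as independent, which is the simplification the example itself makes; the function name is hypothetical.

```python
def prob_same_outcome(p_good_first, p_good_second):
    """Probability that two independent loans both do well or both default."""
    both_good = p_good_first * p_good_second
    both_default = (1 - p_good_first) * (1 - p_good_second)
    return both_good + both_default


print(round(prob_same_outcome(0.80, 0.80), 2))  # 0.68
print(round(prob_same_outcome(0.90, 0.80), 2))  # 0.74
```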

One thing to note is that these correlation/covariance tables are actually pseudo-correlation/covariance tables, because the main diagonals do not contain the values you’d expect (1s, meaning perfect correlation, in the correlation matrix, and variances in the covariance matrix). This is because we aren’t actually comparing a loan with itself, but instead with a loan that has characteristics similar to its own.
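The pair-sampling procedure described in this section could be sketched roughly as follows. Everything here is an assumption made for illustration: the loan records (dicts with `term`, `grade`, an integer `issued` month, and a list of monthly `returns`), the function names, the use of population covariance, and the bucketing key of older loan, younger loan, and age difference. The sample data is made up.

```python
import random
from collections import defaultdict
from statistics import mean


def overlap_covariance(loan_a, loan_b):
    """Population covariance over the calendar months both loans existed.

    Returns None when the two loans overlap for fewer than two months.
    """
    start = max(loan_a["issued"], loan_b["issued"])
    end = min(loan_a["issued"] + len(loan_a["returns"]),
              loan_b["issued"] + len(loan_b["returns"]))
    if end - start < 2:
        return None
    xs = loan_a["returns"][start - loan_a["issued"]:end - loan_a["issued"]]
    ys = loan_b["returns"][start - loan_b["issued"]:end - loan_b["issued"]]
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simulate_covariances(loans, n_draws, seed=0):
    """Randomly draw loan pairs and average the covariances per bucket."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for _ in range(n_draws):
        a, b = rng.sample(loans, 2)
        cov = overlap_covariance(a, b)
        if cov is None:
            continue
        # Put the older loan first so (a, b) and (b, a) share a bucket.
        older, younger = sorted(
            (a, b), key=lambda loan: (loan["issued"], loan["term"], loan["grade"]))
        key = (older["term"], older["grade"], younger["term"], younger["grade"],
               younger["issued"] - older["issued"])
        buckets[key].append(cov)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up loans; "issued" counts months from an arbitrary starting month.
sample_loans = [
    {"term": 36, "grade": "A", "issued": 0, "returns": [7.6, 7.9, 8.2, 8.5, 8.9]},
    {"term": 36, "grade": "A", "issued": 1, "returns": [7.5, 7.8, 8.1, -7.0]},
    {"term": 36, "grade": "D", "issued": 0, "returns": [14.0, 14.4, 14.9, 3.0]},
]
print(simulate_covariances(sample_loans, n_draws=100))
```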

## Results

So with all the numbers we need, let’s dive into an example of how we might use our findings, keeping remaining amounts invested as nice whole numbers for illustration purposes. Pretend you have a conservative portfolio on Lending Club consisting of two loans:

Loan | Term | Grade | Age | Amount Invested |
---|---|---|---|---|
1 | 36 | A | 1 | 24 |
2 | 36 | A | 2 | 23 |

The Expected Return and standard deviation of the portfolio (in linear algebra notation, writing \(\omega_{i}\) for the weight of loan \(i\)) are:

\[E(r)_{p} = \omega_{1}E(r_{1}) + \omega_{2}E(r_{2})\]

\[\sigma_{p} = \sqrt{\left( \begin{array}{cc}
\omega_{1} & \omega_{2}
\end{array} \right)
\left( \begin{array}{cc}
\sigma^{2}_{11} & \sigma^{2}_{12} \\
\sigma^{2}_{21} & \sigma^{2}_{22}
\end{array} \right)
\left( \begin{array}{c}
\omega_{1} \\
\omega_{2}
\end{array} \right)}\]

Plugging in the appropriate numbers yields \(E(r)_{p} = 4.34\%\) and \(\sigma_{p} = 3.49\%\).

Now say you have two loans that you can potentially invest your next $25 in:

Loan | Term | Grade | Age | Amount Invested |
---|---|---|---|---|
A | 36 | A | 0 | 25 |
D | 36 | D | 0 | 25 |

You wonder “which loan should I invest in if I want to be conservative/aggressive?” It’s easy to find out; just extend the above formulas to 3 loans and fill with Loan A values to see the risk/return profile of the new portfolio with Loan A, and with Loan D values to see the risk/return profile with Loan D. The expanded formulas become:

\[E(r)_{p} = \omega_{1}E(r_{1}) + \omega_{2}E(r_{2}) + \omega_{3}E(r_{3})\]

\[\sigma_{p} = \sqrt{\left( \begin{array}{ccc}
\omega_{1} & \omega_{2} & \omega_{3}
\end{array} \right)
\left( \begin{array}{ccc}
\sigma^{2}_{11} & \sigma^{2}_{12} & \sigma^{2}_{13} \\
\sigma^{2}_{21} & \sigma^{2}_{22} & \sigma^{2}_{23} \\
\sigma^{2}_{31} & \sigma^{2}_{32} & \sigma^{2}_{33}
\end{array} \right)
\left( \begin{array}{c}
\omega_{1} \\
\omega_{2} \\
\omega_{3}
\end{array} \right)}\]

Plugging in again, we see that if you choose Loan A you have \(E(r)_{p} = 4.34\%\) and \(\sigma_{p} = 3.38\%\) whereas if you choose Loan D you have \(E(r)_{p} = 5.49\%\) and \(\sigma_{p} = 3.21\%\). We have a surprising result; a \(D\) grade loan was actually able to reduce the portfolio’s standard deviation (risk) more than an \(A\) grade loan. What about the risk/return profiles of portfolios if we’d instead tried a \(B\)/\(C\)/\(E\)/\(F\)/\(G\) loan? We ran those numbers and here’s how it looks visually (1 is the original portfolio of two loans):

In this instance, moving from an original portfolio of two loans to a portfolio of three, we see that anything you buy helps reduce the risk of the portfolio. This is diversification at work; spreading your money across more loans reduces your risk. But obviously, one might want to put their money into an \(E\) or \(D\) loan in this instance, because not only do they reduce the risk the most, they also happen to increase the Expected Return the most.
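The comparison of candidate loans described above can be sketched as follows. Only the amounts invested (24, 23, and 25) come from the example; every Expected Return, variance, and covariance below is a made-up placeholder, so the printed numbers will not match the figures quoted in this article. Weights are assumed to be proportional to the amount invested, and the function names are hypothetical.

```python
import math


def portfolio_stats(amounts, expected_returns, cov):
    """Return (E(r), sigma), with weights proportional to amounts invested."""
    total = sum(amounts)
    w = [a / total for a in amounts]
    n = len(w)
    exp_ret = sum(w[i] * expected_returns[i] for i in range(n))
    # Matrix form: sum over all i, j of w_i * w_j * cov[i][j].
    variance = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))
    return exp_ret, math.sqrt(variance)


def with_candidate(amounts, expected_returns, cov, cand_amount, cand_return,
                   cand_variance, cand_covs):
    """Stats of the portfolio after adding one candidate loan.

    cand_covs[i] is the covariance between the candidate and existing loan i.
    """
    new_cov = [row + [cand_covs[i]] for i, row in enumerate(cov)]
    new_cov.append(list(cand_covs) + [cand_variance])
    return portfolio_stats(amounts + [cand_amount],
                           expected_returns + [cand_return], new_cov)


# Amounts are from the example; all other numbers are placeholders.
amounts = [24.0, 23.0]
expected_returns = [4.0, 4.0]          # percent, hypothetical
cov = [[12.0, 6.0], [6.0, 12.0]]       # percent squared, hypothetical

candidates = {
    "A": {"ret": 4.0, "var": 12.0, "covs": [6.0, 6.0]},   # hypothetical
    "D": {"ret": 9.0, "var": 30.0, "covs": [2.0, 2.0]},   # hypothetical
}

print("original", portfolio_stats(amounts, expected_returns, cov))
for name, c in candidates.items():
    stats = with_candidate(amounts, expected_returns, cov, 25.0,
                           c["ret"], c["var"], c["covs"])
    print("with loan", name, stats)
```

Each candidate is evaluated against the same original portfolio, so the resulting pairs of Expected Return and standard deviation can be compared directly against an investor's desired risk/return profile.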

Here’s another example: a conservative portfolio (marked 2) of young \(A\), \(B\), and \(C\) loans, to which we add 10 notes of the same grade at $25 each:

Number of Loans | Term | Grade | Age | Remaining Amount Invested |
---|---|---|---|---|
25 | 36 | A | 1 | 24 |
25 | 36 | A | 2 | 23 |
25 | 36 | B | 4 | 21 |
25 | 36 | B | 2 | 23 |
5 | 36 | C | 1 | 24 |
5 | 36 | C | 3 | 22 |

Again, it is easy to see how diversification helps in reducing risk: almost all versions of our conservative portfolio (110 loans originally, or 120 after investing in additional notes) have lower risk than the first example portfolio. The importance of diversification cannot be overstated. To add notes to a portfolio optimally, we can simply add the combination of available notes that increases return the most for the least added risk.

We’re near the end of this long read, but here’s why it matters: 1) **A conservative portfolio is not necessarily made up of only conservative loan grades**; as we saw above, adding an \(E\) grade loan actually reduced the portfolio’s risk more than adding another \(A\) grade loan. 2) When faced with the choice of which loans to invest in, this analysis can be done to choose the loans that best fit the desired risk/return profile of the portfolio.

**Disclaimer**

The numbers used in this article are for mature loans on the Lending Club platform. Expected Return, variance, and covariance numbers for mature loans picked by LendingRobot’s scoring algorithm are different from those presented in the article.

- Justin Hsi
- 3 Comments

I don’t think you can use Correlation to define risk the way you have above. The only risk in your example is default, since I can’t imagine anyone having a problem with all of the loans in a portfolio paying off. If you instead approached the value of the portfolio from what it is worth on the secondary market, you may be able to show the effects of diversifying in different grades of loans. The risks then would be paying too much for a note or not taking advantage of market mispricing between the different grades of loans. Since these notes are fully amortized in a relatively short period of time, it would be very hard to compare older loans to new ones. There also appears to be an information problem on the secondary market: it is not clear what other people are paying. At least for stock and bond markets you have pricing info that tells you what an asset is selling for, so you could mark your assets to market value. Using portfolio theory, if you had the ability to see what other people may be willing to pay for your notes, you would try to make purchases or sales to decrease the correlation of market movements in the value of your portfolio. I suspect that you would find that the trading bands between the grades of notes on the secondary market are tight.

Hi Cory,

I’m not sure what you mean by “use correlation to define risk” when we define risk to be the standard deviation in returns. Ideally we would be able to mark everything to market to have our time series of values/returns. Only more time and transactions will give us the data we need.

Informative details are given by you thanks for it. Details are results oriented and practical based.