Do filtering strategies work?

Posted by admin on May 19, 2013

Probably the most popular way to pick which loans to invest in is to use some type of filtering strategy. This means selecting only loans whose statistics fall within specified ranges. For example, you might decide to filter out loans to people who have had any delinquencies in the past 2 years. Or maybe you only want to lend to people with a 700+ credit score. You can find all sorts of discussion about what filters to use. Most of it is just people's hunches about what will make a good filter, perhaps backed by some analysis of how their portfolio has done after a year. Both the time span of such analyses and the size of the portfolios are much too small to provide any meaningful insight into what a good filter is.

There are also some websites that let you play with filters and see the historical rate of return on loans that fit your filter. You can find a filter that has worked well in the past and then apply it to select future loans. This is certainly better than just going with your hunches, and you can increase your returns by applying good filters. However, there are some problems with creating a filtering strategy this way that you should be aware of.


1. Overfitting

The biggest problem with filtering strategies is overfitting. Most people keep applying more and more specific filters until they see good historical returns. The problem is that by then the sample of loans that fit their criteria is usually pretty small. Even with a group of 100 loans, just a few more or fewer defaults would make a big difference in the result. It could be that the filter you are using turned up good historical results purely by chance.
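To get a feel for how much chance matters at that sample size, here is a minimal sketch that puts an approximate 95% Wilson score interval around an observed default rate. The function name and the example counts (5 defaults out of 100 loans) are made up for illustration, and it treats each loan as an independent draw, which is a simplification.

```python
import math

def default_rate_interval(n_loans, n_defaults, z=1.96):
    """Approximate 95% Wilson score interval for the true default rate."""
    p = n_defaults / n_loans
    denom = 1 + z**2 / n_loans
    center = (p + z**2 / (2 * n_loans)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n_loans + z**2 / (4 * n_loans**2)) / denom
    return center - half_width, center + half_width

# Same observed 5% default rate, two very different sample sizes (illustrative numbers).
small_low, small_high = default_rate_interval(100, 5)
big_low, big_high = default_rate_interval(10_000, 500)

print(f"100 loans:    {small_low:.1%} to {small_high:.1%}")
print(f"10,000 loans: {big_low:.1%} to {big_high:.1%}")
```

With only 100 loans, an observed 5% default rate is consistent with a true rate anywhere from roughly 2% to 11%, which is easily wide enough to turn a "great" filter into a mediocre one. The interval for 10,000 loans is far narrower.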

Overfitting is a common problem in many applications of statistics and machine learning. One way to guard against overfitting your data is to divide it into 2 groups: a training group and a testing group. You use the training group to find good loan filters as you did before. Once you have found one you like, you apply it to the testing group to see how it performs. If it also does well on the testing group, there is a better chance that the results weren't just random chance.
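The training/testing idea can be sketched in a few lines. Everything here is an assumption made for illustration: the loan records are synthetic dictionaries, and the field names (`fico`, `delinquencies_2y`, `annual_return`) are hypothetical rather than taken from any real data file.

```python
import random
from statistics import mean

def split_loans(loans, test_fraction=0.3, seed=0):
    """Shuffle a copy of the loans and split it into (training, testing) lists."""
    shuffled = list(loans)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def filtered_return(loans, loan_filter):
    """Return (mean return, number of loans) for the loans that pass the filter."""
    matched = [loan["annual_return"] for loan in loans if loan_filter(loan)]
    return (mean(matched) if matched else None), len(matched)

# Synthetic stand-in data; real records would come from historical loan files.
rng = random.Random(42)
loans = [
    {"fico": rng.randint(640, 800),
     "delinquencies_2y": rng.choice([0, 0, 0, 1, 2]),
     "annual_return": rng.gauss(0.06, 0.10)}
    for _ in range(5000)
]

def candidate_filter(loan):
    """Hypothetical filter: 700+ credit score and no recent delinquencies."""
    return loan["fico"] >= 700 and loan["delinquencies_2y"] == 0

train, test = split_loans(loans)
train_return, train_count = filtered_return(train, candidate_filter)
test_return, test_count = filtered_return(test, candidate_filter)

print("training:", train_return, train_count)
print("testing: ", test_return, test_count)
```

The important discipline is to tune the filter using only `train` and to look at `test` once, at the end. If you keep adjusting the filter after peeking at the testing group, it stops being an honest check.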

2. Latent Variables

Another problem with applying filtering strategies is getting good historical returns because of latent variables unrelated to your filter. A latent variable is an unobserved factor that affects the results. The most common case of this is applying a filter that is biased towards newer loans. Lending Club has been gradually changing its underwriting requirements over time. Also, different types of borrowers may be borrowing now than several years ago. Imagine you are filtering for large loans, say loans of over $25,000, and you find better historical returns for this group. Great! You can just invest in large loans only and increase your earnings. Not so fast: we can't be sure that the size of the loan is causing the better returns. Lending Club has been growing and may not have given out as many large loans in the past. The overall quality of borrowers has been increasing, so loans of all sizes are doing better now than several years ago. When you filter for large loans you are also filtering for newer loans, because large loans just weren't as common in the past. Perhaps all loans are doing better, and if you filtered for more recent smaller loans you'd see good returns there as well.
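One way to check for this kind of hidden effect is to compare matching and non-matching loans within the same issue year rather than across the whole history. The sketch below uses a tiny made-up data set, with hypothetical field names (`issue_year`, `amount`, `annual_return`), built so that large loans look better overall even though loan size makes no difference within any single year.

```python
from collections import defaultdict
from statistics import mean

def mean_return_by_year(loans, loan_filter):
    """For each issue year, return (mean return of matching loans, mean return of the rest)."""
    by_year = defaultdict(lambda: {True: [], False: []})
    for loan in loans:
        by_year[loan["issue_year"]][bool(loan_filter(loan))].append(loan["annual_return"])
    result = {}
    for year, groups in sorted(by_year.items()):
        matched = mean(groups[True]) if groups[True] else None
        rest = mean(groups[False]) if groups[False] else None
        result[year] = (matched, rest)
    return result

def make_loans(year, amount, annual_return, count):
    """Build `count` identical made-up loan records."""
    return [{"issue_year": year, "amount": amount, "annual_return": annual_return}
            for _ in range(count)]

# Made-up data: large loans are rare in 2009 and common in 2012,
# and every 2012 loan does better regardless of size.
loans = (make_loans(2009, 10_000, 0.04, 3) + make_loans(2009, 30_000, 0.04, 1)
         + make_loans(2012, 10_000, 0.08, 1) + make_loans(2012, 30_000, 0.08, 3))

def is_large(loan):
    return loan["amount"] > 25_000

overall_large = mean([l["annual_return"] for l in loans if is_large(l)])
overall_small = mean([l["annual_return"] for l in loans if not is_large(l)])
per_year = mean_return_by_year(loans, is_large)

print("overall large vs small:", overall_large, overall_small)
print("per year (large, small):", per_year)
```

Overall, the large loans appear to earn more than the small ones, but within each year the two groups are identical. If a filter's advantage disappears once you compare loans of the same vintage, the filter was probably just picking up loan age.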


3. Inaccurate Historical Return Estimates

The whole strategy of filtering is based on the idea of using past returns to learn a good filter. But what if those past return numbers are inaccurate? For several reasons that we explain here, the estimated return on a portfolio of loans drops as the loans age. So any filter that selects more young loans will have exaggerated historical return estimates. For this reason, it is probably best to develop loan filters using only loans that are at least 18 months old.
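Restricting the analysis to seasoned loans is a simple pre-filtering step. The sketch below assumes each loan record carries a hypothetical `issue_date` field and measures age in whole calendar months, ignoring the day of the month; the sample loans and the "as of" date are made up.

```python
from datetime import date

def age_in_months(issue_date, as_of):
    """Whole calendar months between issue_date and as_of (day of month is ignored)."""
    return (as_of.year - issue_date.year) * 12 + (as_of.month - issue_date.month)

def seasoned_loans(loans, as_of, min_months=18):
    """Keep only loans that are at least min_months old on the as_of date."""
    return [loan for loan in loans
            if age_in_months(loan["issue_date"], as_of) >= min_months]

# Made-up example records.
loans = [
    {"id": 1, "issue_date": date(2011, 1, 15)},
    {"id": 2, "issue_date": date(2011, 11, 1)},
    {"id": 3, "issue_date": date(2012, 12, 20)},
]

usable = seasoned_loans(loans, as_of=date(2013, 5, 19))
print([loan["id"] for loan in usable])
```

Only the loans returned by `seasoned_loans` would then be used to search for and evaluate filters. The price is a smaller sample, so this step makes the overfitting concern above more pressing, not less.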


4. Not Enough Targets

Maybe you have developed a great filter, but there just aren't enough loans matching it to invest all your money. A filter also tosses out loans that don't meet every criterion but that might still be great investments. Maybe one person had a delinquency recently but all of their other stats are perfect. They still might be a good borrower, but your filter will toss them out.
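A quick way to see how fast the pool shrinks is to count the loans that survive each successive criterion. This is only a sketch: the six loan records, the field names, and the three criteria are all invented for illustration.

```python
def pool_after_each_filter(loans, filters):
    """Apply filters in order and return [(filter name, loans remaining)]."""
    remaining = list(loans)
    counts = []
    for name, predicate in filters:
        remaining = [loan for loan in remaining if predicate(loan)]
        counts.append((name, len(remaining)))
    return counts

# Made-up pool of currently available loans.
available = [
    {"fico": 720, "delinquencies_2y": 0, "amount": 30_000},
    {"fico": 710, "delinquencies_2y": 0, "amount": 12_000},
    {"fico": 690, "delinquencies_2y": 0, "amount": 28_000},
    {"fico": 730, "delinquencies_2y": 1, "amount": 8_000},
    {"fico": 705, "delinquencies_2y": 0, "amount": 26_000},
    {"fico": 680, "delinquencies_2y": 2, "amount": 15_000},
]

# Hypothetical criteria, applied one after another.
filters = [
    ("700+ credit score", lambda loan: loan["fico"] >= 700),
    ("no recent delinquencies", lambda loan: loan["delinquencies_2y"] == 0),
    ("over $25,000", lambda loan: loan["amount"] > 25_000),
]

counts = pool_after_each_filter(available, filters)
for name, remaining in counts:
    print(f"{name}: {remaining} loans left")
```

In this toy pool, three reasonable-sounding criteria cut six loans down to two. Running the same count against the loans actually listed at a given moment shows whether a filter leaves enough targets to put your money to work.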



Filtering strategies can increase your return on investment but you need to be wary of the potential pitfalls. If you decide to go ahead and create your own filter strategies:

  • Try to use filters that leave a large sample of loans.
  • Don't get too elaborate by filtering on every possible statistic.
  • When designing your filter, use only loans that have matured for a year and a half or more.
  • Reserve some of the historical data as a test set that you will check only after you think you've created a good filter.

Follow those guidelines and you should do OK with filtering strategies.

