Vocabulary as an indicator of creditworthiness: An analysis of public loan data

By Justin Wagers. University of Puget Sound.

This research examines whether a borrower’s vocabulary is useful in determining their creditworthiness. The analysis takes a word-frequency approach to 36,055 loans from the peer-to-peer lending platform Lending Club, using a naïve Bayes classifier to evaluate whether text submitted by borrowers improves the prediction of whether they will repay their loans. Vocabulary, when paired with traditional creditworthiness measures, is found to significantly improve prediction accuracy compared with traditional credit measures alone.
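To make the word-frequency idea concrete, the following is a minimal sketch of a multinomial naïve Bayes classifier over word counts, written in plain Python. It is not the paper's implementation: the function names (`tokenize`, `train`, `predict`), the add-one smoothing parameter `alpha`, and the four toy loan descriptions with their labels are all invented for illustration. It also scores text only, whereas the paper pairs vocabulary with traditional credit measures.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def train(docs, labels, alpha=1.0):
    """Fit per-class log priors and smoothed per-word log likelihoods."""
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for text, label in zip(docs, labels):
        counts[label].update(tokenize(text))
    vocab = sorted(set().union(*counts.values()))
    n_docs = Counter(labels)
    log_prior = {c: math.log(n_docs[c] / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        # Additive (Laplace) smoothing so unseen class/word pairs keep nonzero probability.
        total = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_like

def predict(model, text):
    """Return the class with the highest posterior log score."""
    log_prior, log_like = model
    scores = {}
    for c in log_prior:
        # Words outside the training vocabulary are ignored.
        scores[c] = log_prior[c] + sum(
            log_like[c][w] for w in tokenize(text) if w in log_like[c]
        )
    return max(scores, key=scores.get)

# Toy, made-up loan descriptions and outcomes (assumptions, not Lending Club data).
docs = [
    "consolidate credit card debt with steady job",
    "pay off credit card with stable income",
    "need money urgently please help",
    "please help emergency bills need cash",
]
labels = ["paid", "paid", "default", "default"]

model = train(docs, labels)
print(predict(model, "please help need cash"))
print(predict(model, "steady income credit card"))
```

On real data, the word-based score would be one input alongside traditional credit variables, and accuracy would be judged on loans held out from training.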

Read the full paper here.