In this project, I developed machine learning models to predict delinquent loans, aiming to achieve the highest possible recall score. The models used included Logistic Regression, XGB, Random Forest, Voting Classifier, Stacking, and a Deep Learning model using Keras. The best performing model was a Logistic Regression model, achieving an average recall score of 0.76. These models can help Freddie Mac purchase mortgage securities at the correct price, minimizing the negative impact of delinquent loans on the market.
Notebook Link | For a more complete summary please visit my portfolio website Complete Summary
-
Data Analysis: I conducted a comprehensive exploratory data analysis, handling null values and identifying key features for the model.
-
Model Selection: Various models were tested including Logistic Regression, XGB, Random Forest, Voting Classifier, Stacking, and a Deep Learning model using Keras. The Logistic Regression model outperformed others in terms of recall score.
-
Model Interpretation: I used LIME (Local Interpretable Model-Agnostic Explanations) for model interpretation, ensuring the model's decisions can be understood and validated.
-
Compliance with Fair Housing Act: The interpretability of the Logistic Regression model ensures compliance with the Fair Housing Act, which prohibits discrimination in housing financing.
Further exploration and optimization of Deep Learning models could potentially improve precision, although their compliance with the Fair Housing Act needs to be examined.
Please note that the provided links are placeholders. Replace them with the actual notebook link and website link for the notebook and complete summary, respectively.