Credit-Fraud-Assessment

Tech Stack: AI/ML, Pandas, NumPy, Seaborn, scikit-learn, Matplotlib, Linear Regression, Random Forest Regression, K-Nearest Neighbours Regression, LGBM Classifier, Random Forest Classifier

Dataset Description: The Credit Fraud Assessment dataset contains 32582 data points and 14 variables, including age, income, home ownership status, employment length, loan intent, and loan grade. The target variable is loan status, which is 1 for loan approved and 0 for loan not approved.

Data Cleaning and Training

Finding and dropping duplicates
Removing redundant values
Removing absurd values
Managing missing values
- Iterative Imputing
- Standard Scaling
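
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration on a toy frame, not the notebook's actual code: the column names (person_age, person_income, person_emp_length), the max_age threshold, and the helper clean_and_scale are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
# IterativeImputer is experimental in scikit-learn and needs this enabling import first.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler


def clean_and_scale(df, max_age=100):
    """Hypothetical helper: drop duplicates and implausible ages, impute, then standardize."""
    # Find and drop exact duplicate rows.
    df = df.drop_duplicates()
    # Remove absurd values; the age cut-off is an assumed threshold.
    df = df[df["person_age"] <= max_age].reset_index(drop=True)
    # Iterative imputing: each column with gaps is modelled from the other columns.
    imputed = IterativeImputer(random_state=0).fit_transform(df)
    # Standard scaling: zero mean and unit variance per column.
    scaled = StandardScaler().fit_transform(imputed)
    return pd.DataFrame(scaled, columns=df.columns)


# Toy data with one duplicate row, one absurd age, and two missing values.
toy = pd.DataFrame({
    "person_age": [22, 22, 35, 144, 41, 29],
    "person_income": [30000, 30000, 52000, 61000, np.nan, 45000],
    "person_emp_length": [1.0, 1.0, 8.0, 4.0, 12.0, np.nan],
})
cleaned = clean_and_scale(toy)
```

For brevity the sketch fits the imputer and scaler on the whole frame. To avoid leaking test data into training, they would normally be fitted on the training split only and then applied to the test split.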

Following the train-test split, a randomized search is run over each of the following models to find the best set of hyperparameters.

LGBM Classifier
Random Forest Classifier
Linear Regression
Random Forest Regression
K-Nearest Neighbours Regression
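
As an illustration of the search step, here is a minimal sketch of a randomized search for one of the models, the random forest classifier. It runs on synthetic data from make_classification rather than the credit dataset, and the search ranges, n_iter, and cv values are assumptions chosen to keep the example fast, not the settings used in the notebook.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the cleaned credit data (assumption for the example).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Distributions to sample hyperparameters from; the ranges are illustrative.
param_distributions = {
    "n_estimators": randint(20, 60),
    "max_depth": randint(2, 8),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of sampled parameter settings
    cv=3,                # 3-fold cross-validation on the training split
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

best_params = search.best_params_
# The best estimator is refit on the full training split, then scored on held-out data.
test_accuracy = search.score(X_test, y_test)
```

The same pattern should apply to the other models by swapping the estimator and its parameter distributions, with a regression metric in place of accuracy for the regressors.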

Learning Curve: A learning curve is then plotted to assess accuracy. It shows high variance, which points to overfitting. The model is therefore rebuilt with less complexity, using only the random forest classifier with pruned decision trees, to reduce overfitting.
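
One way to inspect this is sketched below: compute learning curves for an unconstrained forest and for a pruned one, then compare the gap between training and validation accuracy. The data is synthetic, and the pruning settings (max_depth, min_samples_leaf, ccp_alpha) and the helper mean_curves are assumptions for illustration, not the values used in the notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the cleaned credit data (assumption for the example).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)


def mean_curves(model):
    """Hypothetical helper: mean train/validation accuracy at each training size."""
    sizes, train_scores, val_scores = learning_curve(
        model, X, y, cv=3, scoring="accuracy",
        train_sizes=np.linspace(0.2, 1.0, 4),
    )
    return sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)


# Fully grown trees: prone to memorizing the training data.
full_forest = RandomForestClassifier(n_estimators=50, random_state=0)
# Pruned trees: limited depth, larger leaves, and cost-complexity pruning.
pruned_forest = RandomForestClassifier(
    n_estimators=50, max_depth=3, min_samples_leaf=10, ccp_alpha=0.001, random_state=0
)

sizes, full_train, full_val = mean_curves(full_forest)
_, pruned_train, pruned_val = mean_curves(pruned_forest)

# A large train/validation gap at the largest training size suggests high variance.
full_gap = full_train[-1] - full_val[-1]
pruned_gap = pruned_train[-1] - pruned_val[-1]
```

Plotting the training and validation curves against sizes with Matplotlib gives the usual learning-curve picture; a gap that stays wide as the training size grows is the high-variance pattern described above.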

Parul-Mann/Credit-Fraud-Assessment
