Skip to content

Parul-Mann/Credit-Fraud-Assessment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 

Repository files navigation

Credit-Fraud-Assessment

Tech Stack: AI/ML, Pandas, Numpy, Seaborn, SkLearn, Matplotlib, Linear Regression, Random Forest Regression, K-Nearest Neighbours Regression, LGBM Classifier, Random Forest Classifier


Dataset Description: A total of 32582 data points for Credit Fraud Assessment are used with 14 variables including age, income, home ownership status, employment length, loan intent, loan grade, etc. The target variable is loan status which has 1 for loan approved and 0 for loan not approved.


Data Cleaning and Training

  • Finding and dropping duplicates
  • Removing redundant values
  • Removing absurd values
  • Managing missing values
    • Iterative Imputing
    • Standard Scaling

Following the train-test split, randomized search using the following is done to find the most optimal set of paraments.

  • LGBM Classifier
  • Random Forest Classifier
  • Linear Regression
  • Random Forest Regression
  • K-Nearest Neighbours Regression

Learning Curve: Learning Curve is then made for accuracy assessment. There is a high variance to be found. The model is then rebuilt with less complexity using just the random forest classifier and pruned decision trees to reduce overfitting.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published