mbelk059/SortMyTrash
♻️ SortMyTrash

Image classifier for household waste: given a photo, predict one of seven material classes (plastic, metal, paper, cardboard, glass, organic, trash). Training uses transfer learning (ResNet or EfficientNet). gradcam.py overlays a heatmap so you can see which regions influenced the prediction.

Project Scope

  • Train a custom image classifier on TrashNet
  • Evaluate with standard metrics: accuracy, precision, recall, F1, confusion matrix
  • Produce Grad-CAM visual explanations for model decisions

See FINAL_RESULTS.md for full training history, test metrics, and baseline comparison.

Note: .pth weight files are not committed (they can exceed GitHub’s 100 MB per-file limit). Clone the repo, add data/ if needed, run the Train step once to create outputs/model_best.pth, then run Evaluate / Grad-CAM. Metrics and images in outputs/ can still be committed.

Setup

conda create -n sortmytrash python=3.10
conda activate sortmytrash
pip install -r requirements.txt

If conda activate does not work in PowerShell, prefix all commands with:

conda run -n sortmytrash python src/train.py ...

For example:

conda run -n sortmytrash python src/dataset_stats.py --data_dir data

Data

Download TrashNet from https://github.com/garythung/trashnet and place trashnet-master.zip in the project root, then unzip it.

Expected layout under data/:

data/train/<class>/*.jpg
data/val/<class>/*.jpg
data/test/<class>/*.jpg

Class folder names: plastic, metal, paper, cardboard, glass, organic, trash.

If you start from one folder per class (e.g. after downloading datasets), build splits with:

python src/data_prep.py --src_dir raw_dataset --dest_dir data --train_ratio 0.70 --val_ratio 0.15 --test_ratio 0.15 --clear_dest
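Under the hood, data_prep.py presumably shuffles each class folder and copies files out by ratio. A minimal stdlib sketch of that idea (the function below is a hypothetical reimplementation, not the script's actual code):

```python
import random
import shutil
from pathlib import Path

def split_class(src_dir, dest_dir, cls, train_ratio=0.70, val_ratio=0.15, seed=42):
    """Shuffle one class folder and copy its images into train/val/test splits."""
    files = sorted(Path(src_dir, cls).glob("*.jpg"))
    random.Random(seed).shuffle(files)  # deterministic shuffle per seed
    n = len(files)
    n_train = int(n * train_ratio)
    n_val = int(n * val_ratio)
    splits = {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],  # remainder goes to test
    }
    for split, items in splits.items():
        out = Path(dest_dir, split, cls)
        out.mkdir(parents=True, exist_ok=True)
        for f in items:
            shutil.copy2(f, out / f.name)
    return {k: len(v) for k, v in splits.items()}
```

With 0.70/0.15/0.15 ratios, 20 images per class would land as 14/3/3.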

Count images per split (useful before training):

python src/dataset_stats.py --data_dir data --output_json outputs/dataset_counts.json
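dataset_stats.py likely just walks the split folders and counts files; a stdlib sketch of that (the function name and returned shape are assumptions, and --output_json would simply dump this dict):

```python
from pathlib import Path

def dataset_counts(data_dir):
    """Count .jpg files per split and class under data_dir."""
    counts = {}
    for split in ("train", "val", "test"):
        split_dir = Path(data_dir, split)
        if not split_dir.is_dir():
            continue  # tolerate a missing split
        counts[split] = {
            d.name: sum(1 for _ in d.glob("*.jpg"))
            for d in sorted(split_dir.iterdir()) if d.is_dir()
        }
    return counts
```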

TrashNet does not ship an organic class. If you keep organic in the list above, add your own images under organic/ in the raw folder before running data_prep.py.

Using trashnet-master.zip: unzip it. Class images may live under trashnet-master/data/dataset/ or under a nested resized path such as trashnet-master/data/dataset-reszied/dataset-resized/ (folder names vary). The importer searches under trashnet-master/data by default and recognizes common typos (e.g. cardbord → cardboard).

python src/import_trashnet.py
python src/data_prep.py --src_dir raw_dataset --dest_dir data --train_ratio 0.70 --val_ratio 0.15 --test_ratio 0.15 --clear_dest

To search only a specific folder: python src/import_trashnet.py --trashnet_root trashnet-master/data/dataset-reszied/dataset-resized

import_trashnet.py fills raw_dataset/ with the six TrashNet classes. Add photos under raw_dataset/organic/ if you want the seventh class represented.
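The typo tolerance mentioned above can be approximated with difflib; this is a hedged sketch, not the matching logic actually used by src/import_trashnet.py:

```python
import difflib

CLASSES = ["plastic", "metal", "paper", "cardboard", "glass", "organic", "trash"]

def normalize_class(folder_name):
    """Map a (possibly misspelled) folder name to a canonical class, or None."""
    name = folder_name.strip().lower()
    if name in CLASSES:
        return name
    # Fuzzy match tolerates common typos such as "cardbord" -> "cardboard".
    matches = difflib.get_close_matches(name, CLASSES, n=1, cutoff=0.8)
    return matches[0] if matches else None
```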

Train

Example (what was used for the numbers in FINAL_RESULTS.md — CPU-friendly):

python src/train.py --data_dir data --epochs 3 --batch_size 16 --lr 1e-4 --backbone resnet18 --pretrained --seed 42

Longer run if you have time or a GPU:

python src/train.py --data_dir data --epochs 20 --batch_size 32 --lr 1e-4 --backbone resnet18 --pretrained --seed 42

Frozen backbone (train only the linear head / use as a simpler comparison model):

python src/train.py --data_dir data --epochs 3 --batch_size 16 --lr 1e-4 --backbone resnet18 --pretrained --freeze_backbone --seed 42 --output_dir outputs_baseline

Then evaluate on the test split:

python src/evaluate.py --checkpoint outputs_baseline/model_best.pth --data_dir data --split test --prefix test_baseline --output_dir outputs_baseline

EfficientNet:

python src/train.py --data_dir data --epochs 20 --backbone efficientnet_b0 --pretrained

Checkpoints and training_history.json go under outputs/ (or outputs_baseline/ if you set --output_dir).

Evaluate

On the held-out test split:

python src/evaluate.py --checkpoint outputs/model_best.pth --data_dir data --split test --prefix test

Writes test_metrics.json, test_classification_report.txt, test_confusion_matrix.png, and test_confusion_matrix.csv under outputs/.
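For reference, those metrics are straightforward to recompute from raw predictions; a dependency-free sketch of accuracy and macro-F1 (evaluate.py itself may rely on scikit-learn):

```python
def accuracy_and_macro_f1(y_true, y_pred, classes):
    """Accuracy and macro-averaged F1 from parallel label lists."""
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1s) / len(f1s)  # macro-F1 weights every class equally
```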

Grad-CAM

One image (prints predicted class and an illustrative blue/green/black bin hint — not legal advice; see src/bin_hint.py):

python src/gradcam.py --checkpoint outputs/model_best.pth --image_path path\to\image.jpg --output outputs/gradcam.png

Several images (glob):

python src/gradcam_batch.py --checkpoint outputs/model_best.pth --glob_pattern "path\to\images\**\*.jpg" --output_dir outputs/gradcam_batch
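Grad-CAM itself is a small computation: average each channel's gradient to get a per-channel weight, take the weighted sum of the activation maps, then clamp negatives to zero. A dependency-free sketch of that core step (the real gradcam.py hooks a ResNet layer with PyTorch):

```python
def grad_cam(activations, gradients):
    """activations, gradients: K channels of HxW nested lists (same shapes).
    Returns the HxW heatmap ReLU(sum_k alpha_k * A_k), where alpha_k is the
    spatial mean of channel k's gradient."""
    h, w = len(activations[0]), len(activations[0][0])
    # alpha_k: global-average-pooled gradient per channel
    alphas = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    heatmap = [[0.0] * w for _ in range(h)]
    for act, alpha in zip(activations, alphas):
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += alpha * act[i][j]
    return [[max(0.0, v) for v in row] for row in heatmap]  # ReLU
```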

Sample Output

After training and evaluation, the following outputs are saved under outputs/:

Test set metrics (3 epochs, ResNet-18):

| Metric | Value |
| --- | --- |
| Accuracy | ~0.91 |
| Balanced Accuracy | ~0.89 |
| F1 (Weighted) | ~0.91 |
| F1 (Macro) | ~0.77 |
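The gap between weighted and macro F1 is what class imbalance looks like: macro F1 averages the per-class scores equally, so one small, poorly predicted class drags it down, while weighted F1 scales each class by its support. Illustrative arithmetic with made-up numbers (not the project's actual per-class scores):

```python
def macro_and_weighted_f1(per_class_f1, supports):
    """Macro F1: plain mean. Weighted F1: support-weighted mean."""
    macro = sum(per_class_f1) / len(per_class_f1)
    weighted = sum(f * s for f, s in zip(per_class_f1, supports)) / sum(supports)
    return macro, weighted

# Six well-predicted classes plus one tiny, badly predicted class:
f1s = [0.95, 0.93, 0.92, 0.94, 0.90, 0.91, 0.10]
supports = [80, 80, 80, 80, 80, 80, 5]
```

Here macro F1 lands near 0.81 while weighted F1 stays near 0.92, mirroring the pattern in the table above.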

Grad-CAM example: the model correctly identifies a plastic container with 0.98 confidence and highlights the body and edges of the item.

Confusion matrix: see outputs/test_confusion_matrix.png, written by the Evaluate step.

See FINAL_RESULTS.md for full per-class breakdown and baseline comparison.

Source files

| File | Role |
| --- | --- |
| src/train.py | Training loop, saves best checkpoint |
| src/evaluate.py | Accuracy, balanced accuracy, macro/weighted F1, confusion matrix |
| src/gradcam.py | Grad-CAM visualization |
| src/gradcam_batch.py | Batch Grad-CAM |
| src/data_prep.py | Train/val/test split from class folders |
| src/import_trashnet.py | Copy TrashNet folders into raw_dataset/ |
| src/dataset_stats.py | Per-split counts |
| src/model.py | Backbone + classifier head |
| src/dataset.py | Dataset and augmentations |
| src/bin_hint.py | Illustrative class → bin labels for console output / JSON |
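src/bin_hint.py is presumably little more than a lookup table from predicted class to a bin colour. A hypothetical sketch; the mapping below is illustrative only, since bin colours vary by municipality:

```python
# Illustrative class -> bin mapping; real rules depend on local regulations.
BIN_HINTS = {
    "plastic": "blue (recycling)",
    "metal": "blue (recycling)",
    "paper": "blue (recycling)",
    "cardboard": "blue (recycling)",
    "glass": "green (glass collection)",
    "organic": "green (organic/compost)",
    "trash": "black (general waste)",
}

def bin_hint(predicted_class):
    """Return a bin colour hint for a predicted class, or a fallback message."""
    return BIN_HINTS.get(predicted_class, "unknown - check local guidance")
```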
