"Quality control is only as strong as its weakest inspection point; let's make it smarter."
This project automates defect localization in paper manufacturing by combining classical feature extraction with machine learning models. It shows how HOG, Gabor filters, Canny edge detection, and wavelet transforms, paired with models such as SVMs, Logistic Regression, and CNNs, can improve defect detection in industrial pipelines.
Pre-trained Models: Hugging Face Repository
Articles:
- Defect Detection in Papers & Model Comparison
- Feature Extraction & Data Preprocessing in Computer Vision
The complete pipeline is available in the master branch along with the report. Here's how the workflow was structured:
- Image Loading & Verification
- Class Distribution analysis (balanced dataset ensured)
- Preprocessing for consistent inputs
- Color Histograms & Binning: minimal impact
- HOG (Histogram of Oriented Gradients): highly effective
- Gabor Filters: captured fine-grained patterns
- Canny Edge Detection: sharp defect localization
- Wavelet Transform: insights across resolutions
- Local Binary Patterns: limited by blur
- Built a feature set from HOG, Gabor, Canny, and Wavelets
- Applied PCA: reduced dimensionality while retaining 90% of the variance
- Logistic Regression: 99% train | 79% test
- Naive Bayes (Gaussian): comparable to Logistic Regression after tuning
- SVM (Support Vector Machines): 86% train | 80% test
- CNNs: struggled (38% test accuracy, optimization needed)
- Ensemble (SVM + LR + NB): 90% train | 81% test
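The feature steps listed above (HOG, Gabor, Canny, wavelets) could be combined into a single vector roughly as follows. This is a minimal sketch, not the project's actual code: the function names `haar_level1` and `extract_features`, the parameter values, the summary statistics, and the synthetic test patch are all assumptions chosen for illustration. The wavelet step is a hand-written one-level Haar decomposition standing in for whatever wavelet library the project uses.

```python
import numpy as np
from skimage.feature import canny, hog
from skimage.filters import gabor


def haar_level1(image):
    """One-level 2-D Haar decomposition in plain NumPy (illustrative stand-in
    for a wavelet library). Returns approximation plus three detail bands."""
    rows = image.shape[0] // 2 * 2
    cols = image.shape[1] // 2 * 2
    img = image[:rows, :cols]
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 4.0
    horizontal = (a + b - c - d) / 4.0
    vertical = (a - b + c - d) / 4.0
    diagonal = (a - b - c + d) / 4.0
    return approx, horizontal, vertical, diagonal


def extract_features(image):
    """Concatenate HOG, Gabor, Canny, and Haar summaries into one 1-D vector.
    All parameter values here are assumptions, not the project's settings."""
    image = image.astype(np.float64)

    # HOG: gradient-orientation histograms over a grid of cells
    hog_vec = hog(image, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

    # Gabor: mean and spread of the response magnitude at two frequencies
    gabor_stats = []
    for frequency in (0.1, 0.3):
        real, imag = gabor(image, frequency=frequency)
        magnitude = np.hypot(real, imag)
        gabor_stats += [magnitude.mean(), magnitude.std()]

    # Canny: fraction of pixels marked as edges
    edge_density = [canny(image, sigma=1.0).mean()]

    # Wavelets: energy of each detail band
    wavelet_energy = [np.mean(band ** 2) for band in haar_level1(image)[1:]]

    return np.concatenate([hog_vec, gabor_stats, edge_density, wavelet_energy])


# Synthetic grayscale patch with a bright streak standing in for a defect
rng = np.random.default_rng(0)
patch = rng.random((64, 64))
patch[30:34, 10:50] = 1.0

features = extract_features(patch)
print(features.shape)
```

Summarizing the Gabor, Canny, and wavelet outputs as a few statistics keeps the vector short; flattening the full response maps is an equally valid choice when more detail is needed before PCA.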
Key takeaway: Classical + ensemble methods outperformed deep CNNs for this dataset.
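The PCA step and the SVM + LR + NB ensemble could be wired together with scikit-learn along these lines. This is a sketch under stated assumptions: the feature matrix is synthetic, the standardization step, soft voting, and all hyperparameters are illustrative guesses, and the accuracies it prints have nothing to do with the figures reported above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted feature matrix (assumption)
rng = np.random.default_rng(42)
n_samples, n_features = 300, 120
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)
X[y == 1, :10] += 1.5  # shift a few features so the classes are separable

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Soft voting averages class probabilities; the voting scheme is an assumption
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf", probability=True, random_state=42)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)

# A float n_components keeps the fewest components explaining >= 90% variance
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.90, svd_solver="full"),
    ensemble,
)
model.fit(X_train, y_train)

n_kept = model.named_steps["pca"].n_components_
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"components kept: {n_kept}, train: {train_acc:.2f}, test: {test_acc:.2f}")
```

Fitting the scaler and PCA inside the pipeline means they only ever see training data, which avoids leaking test-set statistics into the reported test accuracy.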
- Generated visual heatmaps of detected defects on sample paper images
- Enhanced interpretability of classification results
- Languages & Libraries: Python (NumPy, Pandas, Scikit-learn, TensorFlow/Keras)
- Feature Extraction: OpenCV, skimage (HOG, Gabor, Canny, Wavelets, LBP)
- Modeling: Logistic Regression, Naive Bayes, SVM, CNN, Ensemble Learning
- Visualization: Matplotlib, Seaborn
- Only optimized code sections included for clarity
- Redundant/less effective parts omitted
- Improvement potential: CNN architectures & hyperparameter tuning
- Data visualizations available in full report, trimmed here for brevity
- Improve CNN training with data augmentation & better architectures
- Integrate explainable AI (Grad-CAM, SHAP) for defect interpretability
- Deploy as an industrial quality-control dashboard
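The defect heatmaps mentioned in the list above can be approximated with a sliding-window scorer. The sketch below is one possible approach, not necessarily how the project generates its heatmaps: `sliding_window_heatmap` and `contrast_score` are hypothetical names, the window and stride are arbitrary, and the local-contrast score is a placeholder for a trained classifier's defect probability.

```python
import numpy as np


def sliding_window_heatmap(image, score_patch, window=16, stride=8):
    """Average per-patch defect scores over every pixel each patch covers."""
    height, width = image.shape
    heat = np.zeros((height, width), dtype=np.float64)
    counts = np.zeros((height, width), dtype=np.float64)
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            patch = image[top:top + window, left:left + window]
            score = float(score_patch(patch))
            heat[top:top + window, left:left + window] += score
            counts[top:top + window, left:left + window] += 1.0
    # Pixels never covered by a window keep a score of zero
    return heat / np.maximum(counts, 1.0)


def contrast_score(patch):
    """Hypothetical stand-in for a trained model's defect probability."""
    return patch.std()


# Nearly uniform synthetic sheet with one dark square acting as a defect
rng = np.random.default_rng(1)
paper = 0.8 + 0.01 * rng.standard_normal((64, 64))
paper[20:28, 36:44] = 0.1

heatmap = sliding_window_heatmap(paper, contrast_score)
peak_row, peak_col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(f"hottest pixel at row {peak_row}, column {peak_col}")
```

In practice `score_patch` would wrap feature extraction plus a fitted classifier, and the resulting array can be drawn semi-transparently over the original image with Matplotlib.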
Your insights, feedback, and suggestions for model improvement are welcome! Feel free to fork, experiment, and share results.
With this pipeline, industrial paper defect detection becomes faster, more accurate, and more explainable.