lung-cancer-image-classification

Lung cancer image classification in Python using the LIDC dataset. Images are processed with local feature descriptors and image transformation methods before being fed into classifiers.

Project Objective

To identify the best local feature extraction and image transformation methods for lung cancer image classification
To develop a model for lung cancer classification
To develop a prototype of an image classification tool that categorizes lung nodules as malignant or benign

Methods Used

Image Transformation
Dimensionality Reduction
Machine Learning

Technologies

Python
scikit-learn
pandas, Flask
Jupyter

Project Description

config.py - global variables
preprocessing.py - preprocessing methods
image_processing.py - image transformation methods
import_data.py - read and convert raw data
data_lidc.py - generates features from LIDC dataset
main.py - train models
Models Comparison.ipynb - models comparison

The data, sourced from cancerimagingarchive.net, consists of 1018 labelled CT scan cases.


CT scan slices from the dataset.

Data in DICOM format is read into arrays.
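As a rough illustration of this step, here is a minimal sketch of reading one DICOM slice into a NumPy array. It is not the code in import_data.py: the function names `to_hounsfield` and `load_slice` are hypothetical, the use of the third-party pydicom package is an assumption, and the rescale to Hounsfield units is an optional extra that this project may or may not apply.

```python
import numpy as np


def to_hounsfield(raw_pixels, slope, intercept):
    """Apply the linear DICOM rescale (raw * slope + intercept) to stored pixel values."""
    return raw_pixels.astype(np.float32) * np.float32(slope) + np.float32(intercept)


def load_slice(path):
    """Read one DICOM file into a 2-D float array (assumes pydicom is installed)."""
    import pydicom  # imported lazily so to_hounsfield works without it

    ds = pydicom.dcmread(path)
    # RescaleSlope / RescaleIntercept are standard CT attributes;
    # fall back to the identity rescale if a file lacks them.
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    return to_hounsfield(ds.pixel_array, slope, intercept)
```

Slices loaded this way can be stacked with `np.stack` to form one volume per scan, provided they share the same shape.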


Flow of data to classifiers.

The K-means algorithm is used to group the features extracted from images, while transformed images are fed directly into the classifiers. The diagram compares each local feature descriptor and image transformation method.
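One common way to turn clustered local descriptors into a fixed-length input for a classifier is a bag-of-features histogram. The sketch below assumes that approach and uses synthetic descriptors. The function name `bag_of_features`, the cluster count, and the descriptor sizes are all illustrative and not taken from this repository.

```python
import numpy as np
from sklearn.cluster import KMeans


def bag_of_features(descriptors_per_image, n_clusters=8, seed=0):
    """Cluster all local descriptors, then describe each image
    by a normalized histogram of the clusters its descriptors fall into."""
    all_desc = np.vstack(descriptors_per_image)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(all_desc)
    histograms = []
    for desc in descriptors_per_image:
        words = kmeans.predict(desc)  # cluster index for each descriptor
        hist = np.bincount(words, minlength=n_clusters).astype(float)
        histograms.append(hist / hist.sum())
    return np.array(histograms), kmeans


# Synthetic stand-in data (assumption): 5 "images", each with 40 descriptors of length 16.
rng = np.random.default_rng(0)
fake_descriptors = [rng.normal(size=(40, 16)) for _ in range(5)]
features, model = bag_of_features(fake_descriptors, n_clusters=8)
```

Each row of `features` has one entry per cluster, so images with different numbers of descriptors still give vectors of the same length.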


One example of an image transformation: the wavelet transform.
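To show what a wavelet transform does to an image, here is a minimal NumPy sketch of one level of the 2-D Haar transform. The Haar wavelet is chosen only because it is the simplest; the wavelet family used in image_processing.py is not stated here and may differ. The function name `haar_dwt2` is hypothetical.

```python
import numpy as np


def haar_dwt2(image):
    """One level of the orthonormal 2-D Haar transform (both image sides must be even)."""
    x = np.asarray(image, dtype=float)
    # Along each row: scaled pairwise sums (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Repeat along each column of both intermediate results.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)  # approximation band
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)  # detail bands
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, (lh, hl, hh)


# Apply the transform three times, each time to the previous approximation band.
approx = np.arange(64 * 64, dtype=float).reshape(64, 64)
for _ in range(3):
    approx, _details = haar_dwt2(approx)
```

Each level halves both image sides, so a 64 x 64 input leaves an 8 x 8 approximation band after three levels.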


The best accuracy was obtained after the 3rd wavelet transformation and LBP clustering.
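For readers unfamiliar with LBP (local binary patterns), the sketch below computes the basic 3x3 variant in plain NumPy. It is an assumption that this resembles the descriptor used here: the project may use a different radius, neighbor count, or a library implementation. The function name `lbp_8neighbors` is hypothetical.

```python
import numpy as np


def lbp_8neighbors(image):
    """Basic 3x3 local binary pattern code for every interior pixel."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbor offsets (row, col), clockwise from top-left; each sets one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dr, dc) in enumerate(offsets):
        neighbor = img[1 + dr : h - 1 + dr, 1 + dc : w - 1 + dc]
        # The bit is set where the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


# A 256-bin histogram of the codes is a typical texture feature vector.
example = np.random.default_rng(0).integers(0, 256, size=(16, 16))
histogram = np.bincount(lbp_8neighbors(example).ravel(), minlength=256)
```

The output is two pixels smaller than the input along each axis because border pixels lack a full neighborhood.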


Screenshot of the Flask app running.

Process Flow

frontend development
data collection
data processing/cleaning
image transformation
model training
writeup/reporting

Future Improvements

This was my first time experimenting with a large dataset. Planned improvements: use a data pipeline for clean, reusable code, and try Hadoop to work around insufficient memory.
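One possible shape for such a pipeline is scikit-learn's `Pipeline`, which chains preprocessing, dimensionality reduction, and a classifier into a single reusable object. The sketch below is only a suggestion: the choice of `StandardScaler`, `PCA` with 5 components, and an RBF `SVC` is an assumption, and the data is synthetic rather than LIDC features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Each step is fitted on the training folds only, which avoids leaking test data.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA(n_components=5)),
    ("classify", SVC(kernel="rbf")),
])

# Synthetic stand-in data (assumption): 60 samples, 20 features, 2 balanced classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = np.repeat([0, 1], 30)
X[y == 1] += 1.5  # shift one class so the problem is learnable

scores = cross_val_score(pipeline, X, y, cv=3)
print("mean accuracy:", scores.mean())
```

Swapping the `"classify"` step for another estimator is enough to compare models under identical preprocessing.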

yeexunwei/lung-cancer-image-classification

Folders and files

Latest commit

History

Repository files navigation

lung-cancer-image-classification

Project Objective

Methods Used

Technologies

Project Description

Process Flow

Future Improvements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages