This repository contains the prediction code for a classifier that predicts which country a temple is in. The code takes the path to a directory of images as a parameter and produces a CSV file with the results. Three pretrained models are available: EfficientNet B3, EfficientNet B0, and VGG16. EfficientNet B3 is the default model because it gives the highest accuracy.
library | version |
---|---|
numpy | 1.22.1 |
opencv-python | 4.5.5.62 |
pandas | 1.4.0 |
Pillow | 9.0.0 |
python-dateutil | 2.8.2 |
pytz | 2021.3 |
six | 1.16.0 |
torch | 1.10.1 |
torchvision | 0.11.2 |
typing_extensions | 4.0.1 |
pip3 install -r requirements.txt
bash predict.sh --input_path [PATH TO IMAGE DIRECTORY] --model [MODEL NAME] --pretrained_path [Path of the Pretrained Model] --output_dir [Path to the directory where output will be written] --image_shape [Dimension of the input shape]
Argument | Required/Optional | Meaning |
---|---|---|
--input_path | Required | Path of the Input Directory |
--model | Optional [Default: EFFB3] | Prediction Model to be used |
--pretrained_path | Optional [Default: models/temple-classifier-eff-best.pt] | Path of the pretrained model |
--output_dir | Optional [Default: data/output/] | Path of the Output Directory |
--image_shape | Optional [Default: 300] | Width and Height of the image |
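The argument table above can be mirrored in a small `argparse` parser. This is only a sketch of how such a command-line interface might be declared: the function name `build_parser` is hypothetical, the internals of `predict.sh` are not shown in this README, and the list of allowed model names is taken from the usage examples rather than from the actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Temple country classifier")
    parser.add_argument("--input_path", required=True,
                        help="Path of the input directory")
    # Model names are assumed from the usage examples in this README.
    parser.add_argument("--model", default="EFFB3",
                        choices=["EFFB3", "EFFB0", "VGG16"],
                        help="Prediction model to be used")
    parser.add_argument("--pretrained_path",
                        default="models/temple-classifier-eff-best.pt",
                        help="Path of the pretrained model")
    parser.add_argument("--output_dir", default="data/output/",
                        help="Path of the output directory")
    parser.add_argument("--image_shape", type=int, default=300,
                        help="Width and height of the image")
    return parser


# Only the required argument is given, so every other value is a default.
args = build_parser().parse_args(["--input_path", "data/input/"])
print(args.model, args.pretrained_path, args.output_dir, args.image_shape)
```

Note that the defaults only fit together for EfficientNet B3: when another model is selected, `--pretrained_path` and `--image_shape` have to be set to match, as in the examples that follow.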
bash predict.sh --input_path data/input/
bash predict.sh --input_path data/input/ --model EFFB0 --pretrained_path models/temple-classifier-effb0.pt --image_shape 224
bash predict.sh --input_path data/input/ --model VGG16 --pretrained_path models/temple-classifier-vgg.pt --image_shape 224
Place the pretrained models downloaded from the link below in the models folder.
bash predict.sh --help
The images were divided into training and test datasets using an 80:20 split.
After the train-test split, the distribution is as follows:
Category | #Images |
---|---|
Train | 576 |
Test | 138 |
Data augmentation was performed on training data.
The following augmentation techniques were applied:
- Random 90 Degree Rotation
- Random Crop
- Adding Gaussian Noise
- Adding fog-like noise
- Changing the color temperature (to give a night-like view)
- Redacting Random Parts of image
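Three of the techniques listed above (90 degree rotation, Gaussian noise, and redacting a random region) can be sketched with NumPy alone. This is an illustrative sketch, not the code from DataSplit-Augmentation.ipynb: the function names, the noise level `sigma`, and the patch size limit `max_frac` are all assumptions.

```python
import numpy as np


def random_rot90(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Rotate an H x W x C image by 90, 180, or 270 degrees."""
    k = int(rng.integers(1, 4))  # 1, 2, or 3 quarter turns
    return np.rot90(img, k=k)


def add_gaussian_noise(img: np.ndarray, rng: np.random.Generator,
                       sigma: float = 10.0) -> np.ndarray:
    """Add zero-mean Gaussian noise; sigma is an assumed value."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def redact_random_patch(img: np.ndarray, rng: np.random.Generator,
                        max_frac: float = 0.3) -> np.ndarray:
    """Black out one random rectangle; max_frac is an assumed size limit."""
    h, w = img.shape[:2]
    ph = int(rng.integers(1, max(2, int(h * max_frac))))
    pw = int(rng.integers(1, max(2, int(w * max_frac))))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    out = img.copy()  # leave the input image untouched
    out[top:top + ph, left:left + pw] = 0
    return out


rng = np.random.default_rng(0)
image = np.full((40, 60, 3), 128, dtype=np.uint8)  # stand-in grey image
rotated = random_rot90(image, rng)
noisy = add_gaussian_noise(image, rng)
redacted = redact_random_patch(image, rng)
```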
Each class is augmented until it reaches a maximum of 400 images, or until every image in the class has been augmented by a factor of 10, whichever comes first.
Class | Count |
---|---|
Australia | 28 |
Indonesia-Bali | 36 |
Germany | 86 |
Armenia | 9 |
Portugal+Brazil | 44 |
Japan | 50 |
Thailand | 84 |
Spain | 55 |
Malaysia+Indonesia | 44 |
Hungary+Slovakia+Croatia | 40 |
Russia | 100 |
Total | 576 |
Class | CountOfImagesAfterAugmentation |
---|---|
Australia | 280 |
Indonesia-Bali | 360 |
Germany | 400 |
Armenia | 90 |
Portugal+Brazil | 400 |
Japan | 400 |
Thailand | 400 |
Spain | 400 |
Malaysia+Indonesia | 400 |
Hungary+Slovakia+Croatia | 400 |
Russia | 400 |
Total | 3930 |
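The augmented counts follow from the rule stated above: each class grows to ten times its original size, capped at 400 images. The short check below reproduces the second table from the first; the constant and function names are illustrative only.

```python
MAX_PER_CLASS = 400  # cap on images per class after augmentation
MAX_FACTOR = 10      # maximum growth factor per class

# Training image counts per class, copied from the first table above.
train_counts = {
    "Australia": 28,
    "Indonesia-Bali": 36,
    "Germany": 86,
    "Armenia": 9,
    "Portugal+Brazil": 44,
    "Japan": 50,
    "Thailand": 84,
    "Spain": 55,
    "Malaysia+Indonesia": 44,
    "Hungary+Slovakia+Croatia": 40,
    "Russia": 100,
}


def augmented_target(count: int) -> int:
    """Number of images a class holds once augmentation stops."""
    return min(MAX_PER_CLASS, count * MAX_FACTOR)


targets = {name: augmented_target(n) for name, n in train_counts.items()}
total = sum(targets.values())
print(targets["Armenia"], targets["Germany"], total)
```

Only Australia, Indonesia-Bali, and Armenia stay below the cap, which is why Armenia remains by far the smallest class after augmentation.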
The augmented training images of each class are then randomly split into train and val datasets in an 80:20 ratio.
Category | #Images |
---|---|
train | 3144 |
val | 786 |
test | 138 |
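A per-class 80:20 split of this kind can be sketched with the standard library. This is a minimal illustration, not the code from DataSplit-Augmentation.ipynb: the function name `stratified_split`, the seed, and the placeholder file names are assumptions. The per-class counts are the augmented counts from the table above.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split (path, label) pairs into train and val lists, class by class."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)

    rng = random.Random(seed)
    train, val = [], []
    for label, paths in sorted(by_class.items()):
        rng.shuffle(paths)
        n_val = round(len(paths) * val_fraction)
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val


# Augmented image counts per class; the file names are placeholders.
augmented_counts = {
    "Australia": 280, "Indonesia-Bali": 360, "Germany": 400, "Armenia": 90,
    "Portugal+Brazil": 400, "Japan": 400, "Thailand": 400, "Spain": 400,
    "Malaysia+Indonesia": 400, "Hungary+Slovakia+Croatia": 400, "Russia": 400,
}
samples = [(f"{label}_{i}.jpg", label)
           for label, n in augmented_counts.items() for i in range(n)]
train_set, val_set = stratified_split(samples)
print(len(train_set), len(val_set))
```

Splitting within each class keeps the class proportions the same in both subsets, so small classes such as Armenia still appear in val.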
Several CNN architectures pretrained on ImageNet are then fine-tuned on this dataset. The following changes are made to each network:
- The final layer is resized to suit the dataset: its shape becomes (x, 11), where x is the output size of the previous layer
- The weights of all but the last CNN layer are frozen during training
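The two changes above can be sketched in PyTorch as follows. This is one possible reading, not the training code from the notebooks: it freezes every existing parameter and leaves only the newly created 11-way head trainable, and exactly which layers the notebooks keep trainable is not specified here. The helper `prepare_for_finetuning` and the stand-in `TinyNet` are hypothetical; `TinyNet` only imitates the `classifier` layout (a `Sequential` ending in `Linear`) that torchvision's EfficientNet and VGG16 models use.

```python
import torch
from torch import nn

NUM_CLASSES = 11  # number of classes in the tables above


def prepare_for_finetuning(model: nn.Module,
                           num_classes: int = NUM_CLASSES) -> nn.Module:
    """Freeze existing weights and replace the final linear layer."""
    for param in model.parameters():
        param.requires_grad = False  # frozen: not updated by the optimizer

    # Assumes `model.classifier` is an nn.Sequential ending in nn.Linear.
    old_head = model.classifier[-1]
    # A freshly created layer is trainable by default.
    model.classifier[-1] = nn.Linear(old_head.in_features, num_classes)
    return model


class TinyNet(nn.Module):
    """Small stand-in for a pretrained backbone (illustration only)."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Sequential(nn.Dropout(0.2), nn.Linear(8, 1000))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(torch.flatten(self.features(x), 1))


model = prepare_for_finetuning(TinyNet())
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
logits = model(torch.zeros(2, 3, 32, 32))
print(trainable, tuple(logits.shape))
```

With a real backbone, an optimizer would then be given only the parameters whose `requires_grad` flag is still set.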
Model | Accuracy | Weights |
---|---|---|
EfficientNet B3 | 84.061 | effb3weights |
EfficientNet B0 | 81.88 | effb0weights |
VGG16 | 81.15 | vggweights |
See the notebook modelTraining.ipynb in the notebooks folder for further analysis; it explains why the two lowest-scoring classes underperform.
Armenia is excluded from testing because it has only 9 unaugmented images in the train+val dataset.
The notebooks folder contains the following notebooks:
- ImageClassification_Efficientnet_B3.ipynb: Contains the training code for Efficient Net B3
- ImageClassification_Efficientnet_B0.ipynb: Contains the training code for Efficient Net B0
- ImageClassification_Efficientnet_VGG.ipynb: Contains the training code for VGG16
- modelTesting.ipynb: Contains the result calculation and the analysis of the results on test dataset
- DataSplit-Augmentation.ipynb: Contains the code to split the data into train and test sets, augment the training set, and split the augmented data into train and val sets
- DataVisualisation.ipynb: Visualization of the data distribution
Category | Datalink |
---|---|
train | train |
val | val |
test | test |