Image Detection and Classification

Dataset Preparation

The training dataset was prepared by scraping images with a Python script from the Oriental Image Database and The IBC Bird Collection (15,436 images across 35 classes). Bounding boxes for the training dataset were then annotated using RectLabel. The data was split into training, validation, and test sets (70:20:10) so that the model could be evaluated at a later stage.
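A 70:20:10 split like the one described can be sketched as follows. This is a minimal illustration, not the script used for this project: the function name `stratified_split`, the `(path, label)` sample format, and the choice to split class by class (so each bird class keeps roughly the same ratio) are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(samples, ratios=(0.7, 0.2, 0.1), seed=42):
    """Split (path, label) pairs into train/val/test lists, class by class.

    Hypothetical helper: the per-class (stratified) split is an assumption,
    the write-up only states the overall 70:20:10 ratio.
    """
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val, test = [], [], []
    for label, paths in sorted(by_class.items()):
        rng.shuffle(paths)
        n_train = round(len(paths) * ratios[0])
        n_val = round(len(paths) * ratios[1])
        train += [(p, label) for p in paths[:n_train]]
        val += [(p, label) for p in paths[n_train:n_train + n_val]]
        # Whatever remains goes to the test set.
        test += [(p, label) for p in paths[n_train + n_val:]]
    return train, val, test
```

Splitting per class rather than over the whole pool keeps small classes from ending up with almost no validation or test images.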

Image Detection Model Training

For detection, we tried several pre-trained models, ranging from MobileNet Single Shot Detectors (SSD) to Mask R-CNN. We fine-tuned these models and evaluated the detection accuracy and the IoU (intersection over union) of their predicted bounding boxes against the bounding boxes prepared with RectLabel. The best results came from the MobileNet SSD architecture with weights pre-trained on the COCO dataset. After fine-tuning, around 94.2% of the birds in the test images were correctly detected with a confidence value above 0.5. The mean IoU of the detected birds was 0.647 on the test dataset, so we chose this model as our detection model.
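The two detection metrics mentioned here can be computed roughly as shown below. This is a sketch under stated assumptions rather than the evaluation code used for this project: boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, each test image is assumed to contain one annotated bird with one top prediction, and the names `iou` and `evaluate_detections` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate_detections(predictions, ground_truth, conf_threshold=0.5):
    """Return (detection_rate, mean_iou) over paired predictions and labels.

    predictions: list of (box, confidence), one per image (assumption).
    ground_truth: list of annotated boxes, in the same order.
    """
    ious = [
        iou(box, gt)
        for (box, conf), gt in zip(predictions, ground_truth)
        if conf > conf_threshold  # only confident detections are counted
    ]
    detection_rate = len(ious) / len(ground_truth) if ground_truth else 0.0
    mean_iou = sum(ious) / len(ious) if ious else 0.0
    return detection_rate, mean_iou
```

For example, two 2x2 boxes offset by one pixel in each direction overlap in a 1x1 square, so their IoU is 1 / (4 + 4 - 1) = 1/7.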

Classification Model Training

We then cropped the bounding boxes from the training images for classification. We tried several pre-trained networks, from VGG to ResNet-50, and finally settled on the ResNet-50 architecture. We followed a transfer learning approach for each model. In the initial trials, we took the ImageNet weights and froze the pre-trained layers, so that only the final dense layers after the global average pooling were trained at each stage. Augmentation was applied to each image sample. For the dense layers, we also tried different combinations of layer counts and node counts, recording the validation accuracy for each. With these approaches, however, we could not get beyond an F1 score of 0.65 and 70% accuracy. So we started experimenting with the frozen layers: unfreezing more layers gave a better F1 score, at the cost of longer training time. In our final model, we froze the first 150 layers of the ResNet-50 model and trained only the last 25 layers plus the dense layers. With this configuration, we achieved 90% accuracy on the test dataset. When we unfroze more layers than this, the test accuracy started decreasing, signalling that the model had overfit given the limited number of image samples and the large number of trainable parameters.
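The partial-freezing setup can be sketched as follows. The write-up does not name the framework, so Keras (via TensorFlow) is an assumption here, as are the 224x224 input size, the single softmax dense layer, the Adam optimizer, and the learning rate. The names `freeze_first_n` and `build_classifier` are hypothetical.

```python
def freeze_first_n(layers, n):
    """Freeze the first n layers, leave the rest trainable.

    Works on any objects with a `trainable` attribute (e.g. Keras layers).
    Returns the number of layers left trainable.
    """
    for index, layer in enumerate(layers):
        layer.trainable = index >= n
    return sum(1 for layer in layers if layer.trainable)


def build_classifier(num_classes=35, n_frozen=150, weights="imagenet"):
    """Build a ResNet-50 classifier with the first n_frozen layers frozen."""
    # Imported lazily so the helper above can be used without TensorFlow.
    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        weights=weights, include_top=False, input_shape=(224, 224, 3)
    )
    freeze_first_n(base.layers, n_frozen)

    # Classification head: global average pooling followed by a dense layer.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed value
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Changing `n_frozen` is the knob discussed above: a smaller value unfreezes more of the network, which increases both the number of trainable parameters and the training time.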

Results:

Split	Accuracy	F1 Score
Training	95.9%	0.957
Validation	90.1%	0.886
Test	90.0%	0.897

F1 score (Classification)
0.8936170212765957
0.8947368421052632
0.9534883720930233
0.9448818897637795
0.8
0.9051094890510949
0.9
0.8450704225352113
0.9174311926605505
0.875
0.9354838709677419
0.9007633587786259
0.8918918918918919
0.9
0.7272727272727273
0.8990825688073395
0.9009009009009009
0.868421052631579
0.8985507246376812
0.875
0.905982905982906
0.8767123287671232
0.9333333333333333
0.9421487603305785
0.926829268292683
0.9357798165137615
0.9108910891089109
0.8695652173913043
0.9647058823529412
0.9541284403669725
0.8888888888888888
0.8620689655172413
0.8163265306122449
0.9565217391304348
0.9583333333333334
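Per-class F1 scores like those listed above can be computed from true and predicted labels as in this sketch. It is an illustration, not the code that produced these numbers. The function names are hypothetical, and the macro average shown is only one way to aggregate; the write-up does not say which averaging was used for the overall F1 scores.

```python
def per_class_f1(y_true, y_pred):
    """Return {class: F1} using F1 = 2*TP / (2*TP + FP + FN) per class."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores (an assumed aggregation)."""
    return sum(scores.values()) / len(scores) if scores else 0.0
```

Looking at the scores per class rather than only at the aggregate makes it easier to spot the classes the model confuses most often.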

Bird Class	Counts
Brahminy_maina	234
Bulbul	593
Collared_dove	439
Common_myna	631
Common_sparrow	442
Coppersmith	729
Crow_pheasant	310
Drongo	372
Golden_backed_woodpecker	512
Green Barbet	314
Hoopoe	315
House_Crow	675
Indian_hornbill	392
Indian_robin	300
Jungle_Crow	306
Jungle_babbler	553
Koel	595
Little_green_beeeater	378
Magpie_robin	342
Owlet	333
Parakeet	563
Pariah_kite	342
Partridge	413
Peacock	639
Pied_myna	189
Pied_wagtail	580
Pigeon	505
Pond_heron	311
Red_wattled_lapwing	421
Rufous_backed_shrike	541
Shikra	573
Sunbird	517
Tailor_bird	430
White_breasted_kingfisher	358
White_breasted_water_hen	488
