Image Detection and Classification

Dataset Preparation

The training dataset was prepared by scraping images with a Python script from the Oriental Image Database and The IBC Bird Collection (15,436 images across 35 classes). Bounding boxes for the training images were then annotated using the RectLabel software. The data was split into training, validation, and test sets (70:20:10) so that the models could be evaluated at a later stage.
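A 70:20:10 split like the one described above can be sketched as follows. This is only an illustration: the per-class (stratified) splitting, the fixed seed, and the function name `split_dataset` are assumptions, not details taken from the original pipeline.

```python
import random
from collections import defaultdict


def split_dataset(samples, ratios=(0.7, 0.2, 0.1), seed=42):
    """Split (path, label) pairs into train/val/test sets, class by class.

    Splitting per class keeps the class proportions roughly equal in
    every subset (an assumption; the original split may not be stratified).
    """
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(by_class.items()):
        rng.shuffle(paths)
        n_train = round(len(paths) * ratios[0])
        n_val = round(len(paths) * ratios[1])
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        # Whatever remains after train and val goes to the test set.
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]
    return splits
```

Calling `split_dataset` on a list of `(image_path, class_name)` pairs returns a dictionary with `"train"`, `"val"`, and `"test"` lists that do not share any image.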

Image Detection Model Training

For detection, we tried several pretrained models, ranging from MobileNet Single Shot Detectors (SSD) to Mask R-CNN. We fine-tuned these models and evaluated the detection accuracy and the IoU (intersection over union) of their predicted bounding boxes against the bounding boxes prepared with RectLabel. The best results came from the MobileNet SSD architecture with weights pre-trained on the COCO dataset. After fine-tuning, around 94.2% of the birds in the test images were correctly detected with a confidence value above 0.5. The mean IoU of the detected birds was 0.647 on the test dataset. We therefore selected this model as our detection model.
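The two detection metrics mentioned above can be sketched as follows. The box format `(x_min, y_min, x_max, y_max)`, the one-bird-per-image simplification, and the helper names `iou` and `evaluate` are assumptions made for illustration; the original evaluation code may define a "correct detection" differently.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero for boxes that do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(predictions, ground_truth, conf_threshold=0.5):
    """Return (detection rate, mean IoU of the detected birds).

    predictions:  image id -> (box, confidence), missing if nothing was found
    ground_truth: image id -> annotated box (one bird per image is assumed)
    """
    ious = []
    for image_id, gt_box in ground_truth.items():
        pred = predictions.get(image_id)
        if pred is None:
            continue
        box, confidence = pred
        if confidence > conf_threshold:
            ious.append(iou(box, gt_box))
    detection_rate = len(ious) / len(ground_truth) if ground_truth else 0.0
    mean_iou = sum(ious) / len(ious) if ious else 0.0
    return detection_rate, mean_iou
```

Under these assumptions, the reported figures would correspond to a detection rate of about 0.942 and a mean IoU of 0.647 on the test set.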

Classification Model Training

We then cropped the bounding boxes from the training images for classification. Several pre-trained networks, from VGG to ResNet-50, were tried, and we finally chose the ResNet-50 architecture. A transfer learning approach was followed for each model. In the initial trials, the ImageNet weights were loaded, the pre-trained layers were frozen, and only the final dense layers after the global average pooling layer were trained. Augmentation was applied to each image sample. For the dense head, different combinations of the number of layers and the number of nodes were tried, and the validation accuracy was recorded for each. With these approaches, however, we could not get beyond an F1 score of 0.65 and an accuracy of 70%.

We therefore started experimenting with the number of frozen layers. Unfreezing more layers gave a better F1 score, at the cost of longer training time. In our final model, we froze the first 150 layers of ResNet-50 and trained only the last 25 layers plus the dense layers. With this configuration, we achieved 90% accuracy on the test dataset. Unfreezing further layers caused the test accuracy to drop, signalling that the model was overfitting given the limited number of image samples and the large number of trainable parameters.
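The final configuration (first 150 ResNet-50 layers frozen, remaining layers and the dense head trainable) could look roughly like this in Keras. The framework, the 224x224 input size, the single softmax layer as the head, the Adam optimizer, and the learning rate are all assumptions; only the 150-layer freeze, the global average pooling, and the 35 classes come from the text.

```python
import tensorflow as tf


def build_classifier(num_classes=35, n_frozen=150, weights="imagenet"):
    """Build a ResNet-50 classifier with the first `n_frozen` layers frozen."""
    base = tf.keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=(224, 224, 3)
    )
    # Freeze the early layers; leave the remaining ones trainable.
    for layer in base.layers[:n_frozen]:
        layer.trainable = False
    for layer in base.layers[n_frozen:]:
        layer.trainable = True

    # Classification head: global average pooling followed by a dense layer.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)

    # Optimizer and learning rate are illustrative choices only.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return base, model
```

Calling `build_classifier()` downloads the ImageNet weights on first use. Changing `n_frozen` is one way to reproduce the experiments with different numbers of frozen layers.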

Results:

  • Training Accuracy: 95.9%, Training F1 Score: 0.957
  • Validation Accuracy: 90.1%, Validation F1 Score: 0.886
  • Test Accuracy: 90.0%, Test F1 Score: 0.897
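
For reference, accuracy and F1 scores of this kind can be computed from true and predicted labels as sketched below. Whether the overall F1 scores reported above are macro-averaged is not stated, so the `macro_f1` helper is an assumption, as are all function names.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_f1(y_true, y_pred):
    """F1 score for every class, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    return sum(scores.values()) / len(scores)
```

For example, with true labels `["Koel", "Koel", "Bulbul", "Bulbul"]` and predictions `["Koel", "Bulbul", "Bulbul", "Bulbul"]`, the accuracy is 0.75 and the per-class F1 scores are 0.8 for Bulbul and about 0.667 for Koel.
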
F1 score (Classification)
0.8936170212765957
0.8947368421052632
0.9534883720930233
0.9448818897637795
0.8
0.9051094890510949
0.9
0.8450704225352113
0.9174311926605505
0.875
0.9354838709677419
0.9007633587786259
0.8918918918918919
0.9
0.7272727272727273
0.8990825688073395
0.9009009009009009
0.868421052631579
0.8985507246376812
0.875
0.905982905982906
0.8767123287671232
0.9333333333333333
0.9421487603305785
0.926829268292683
0.9357798165137615
0.9108910891089109
0.8695652173913043
0.9647058823529412
0.9541284403669725
0.8888888888888888
0.8620689655172413
0.8163265306122449
0.9565217391304348
0.9583333333333334
Bird Class Counts
Brahminy_maina 234
Bulbul 593
Collared_dove 439
Common_myna 631
Common_sparrow 442
Coppersmith 729
Crow_pheasant 310
Drongo 372
Golden_backed_woodpecker 512
Green Barbet 314
Hoopoe 315
House_Crow 675
Indian_hornbill 392
Indian_robin 300
Jungle_Crow 306
Jungle_babbler 553
Koel 595
Little_green_beeeater 378
Magpie_robin 342
Owlet 333
Parakeet 563
Pariah_kite 342
Partridge 413
Peacock 639
Pied_myna 189
Pied_wagtail 580
Pigeon 505
Pond_heron 311
Red_wattled_lapwing 421
Rufous_backed_shrike 541
Shikra 573
Sunbird 517
Tailor_bird 430
White_breasted_kingfisher 358
White_breasted_water_hen 488