Keras/TensorFlow implementation of SSD (Single Shot MultiBox Detector)
The base source code is in the ssd directory. Currently the SSD300 network based on the VGG16 model is implemented; see ssd300_vgg16.py.
SSD300-VGG16 is trained on the Pascal VOC 2007+2012 datasets. First, the ground-truth boxes are extracted from the Pascal VOC dataset and stored in a hashtable. The keys are filenames and the values are NumPy arrays containing the normalized bounding-box coordinates, the one-hot-encoded class, and the difficulty flag: [xmin, ymin, xmax, ymax, one-hot-encoded-class, is-difficult]. The ground-truth boxes are stored in the following pickle files:
- pascal_voc_2007_test.p (see http://host.robots.ox.ac.uk/pascal/VOC/voc2007/#testdata)
- pascal_voc_2007_trainval.p (see http://host.robots.ox.ac.uk/pascal/VOC/voc2007/#devkit)
- pascal_voc_2012_trainval.p (see http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit)
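The ground-truth layout described above can be illustrated with a minimal sketch. The helper `make_gt_row`, the class order, the example filename, and the box coordinates are all hypothetical; the sketch also assumes 20 one-hot entries with no background class, which may differ from what the pickle files actually contain.

```python
import pickle
import numpy as np

# The 20 Pascal VOC object classes. The order is an assumption.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def make_gt_row(xmin, ymin, xmax, ymax, width, height, class_name, is_difficult):
    """Hypothetical helper: build one row
    [xmin, ymin, xmax, ymax, one-hot-encoded-class, is-difficult]."""
    # Normalize pixel coordinates to [0, 1] by the image size.
    coords = np.array(
        [xmin / width, ymin / height, xmax / width, ymax / height],
        dtype=np.float32,
    )
    one_hot = np.zeros(len(VOC_CLASSES), dtype=np.float32)
    one_hot[VOC_CLASSES.index(class_name)] = 1.0
    difficult = np.array([float(is_difficult)], dtype=np.float32)
    return np.concatenate([coords, one_hot, difficult])

# Hashtable: filename -> array with one row per ground-truth box.
# The filename and boxes below are made-up illustration values.
gt_boxes = {
    "000001.jpg": np.stack([
        make_gt_row(48, 240, 195, 371, 353, 500, "dog", False),
        make_gt_row(8, 12, 352, 498, 353, 500, "person", False),
    ]),
}

# Round-trip through pickle, in memory instead of a .p file.
restored = pickle.loads(pickle.dumps(gt_boxes))
print(restored["000001.jpg"].shape)
```

Under these assumptions each row has 4 + 20 + 1 = 25 values.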
PriorBoxes are generated as in the original Caffe implementation.
PriorBoxes.ipynb contains samples of how PriorBoxes might look.
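As a rough illustration of the Caffe-style scheme, the sketch below generates the prior boxes for a single feature map. The function name `prior_boxes_for_layer` is hypothetical, and the parameter values in the usage lines (38x38 feature map, min size 30, max size 60, aspect ratio 2, step 8) are assumed to match the first SSD300 prediction layer in the Caffe configuration; the code in the ssd directory may be organized differently.

```python
import numpy as np

def prior_boxes_for_layer(fmap_size, img_size, min_size, max_size,
                          aspect_ratios, step, offset=0.5, clip=False):
    """Hypothetical helper: normalized [xmin, ymin, xmax, ymax] priors
    for one square feature map."""
    # Box shapes shared by every cell: a square of side min_size,
    # a square of side sqrt(min_size * max_size), then each aspect
    # ratio together with its flipped counterpart.
    sizes = [(float(min_size), float(min_size))]
    if max_size is not None:
        side = float(np.sqrt(min_size * max_size))
        sizes.append((side, side))
    for ar in aspect_ratios:
        for ratio in (ar, 1.0 / ar):
            sizes.append((min_size * np.sqrt(ratio), min_size / np.sqrt(ratio)))

    boxes = []
    for i in range(fmap_size):          # rows (y direction)
        for j in range(fmap_size):      # columns (x direction)
            cx = (j + offset) * step    # cell center in pixels
            cy = (i + offset) * step
            for w, h in sizes:
                boxes.append([
                    (cx - w / 2.0) / img_size,
                    (cy - h / 2.0) / img_size,
                    (cx + w / 2.0) / img_size,
                    (cy + h / 2.0) / img_size,
                ])
    boxes = np.array(boxes, dtype=np.float32)
    if clip:
        boxes = np.clip(boxes, 0.0, 1.0)
    return boxes

# Assumed parameters of the first SSD300 prediction layer.
priors = prior_boxes_for_layer(38, 300, 30, 60, [2], 8)
print(priors.shape)
```

With these parameters each cell gets 4 priors, so the layer contributes 38 * 38 * 4 = 5776 boxes.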
See DataAugmentation.ipynb for samples of the whole data augmentation process and Imaging.ipynb for photometric distortion samples.
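To give a feel for what photometric distortion does, here is a minimal sketch covering only random brightness and contrast. The function name `photometric_distort`, the 0.5 application probability, and the ranges (brightness delta 32, contrast factor 0.5 to 1.5) are assumptions chosen to resemble the Caffe SSD settings; the notebooks may apply more steps, such as saturation and hue changes, or use different values.

```python
import numpy as np

def photometric_distort(image, rng, brightness_delta=32.0,
                        contrast_range=(0.5, 1.5)):
    """Hypothetical helper: randomly shift brightness and scale contrast
    of a uint8 image of shape (H, W, 3)."""
    out = image.astype(np.float32)
    # Each distortion is applied with probability 0.5 (assumed).
    if rng.random() < 0.5:
        out += rng.uniform(-brightness_delta, brightness_delta)
    if rng.random() < 0.5:
        out *= rng.uniform(*contrast_range)
    # Keep pixel values in the valid 8-bit range.
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Random stand-in for a 300x300 RGB input image.
image = rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8)
distorted = photometric_distort(image, rng)
print(distorted.shape, distorted.dtype)
```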
Hard example mining is implemented in the SsdLoss class.
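The idea behind hard negative mining can be sketched in plain NumPy for a single image: keep only the negative prior boxes with the highest confidence loss, at most three per positive box, as in the SSD paper. The function `hard_negative_mask` is hypothetical and is not the SsdLoss code, which may implement the selection differently.

```python
import numpy as np

def hard_negative_mask(conf_loss, is_positive, neg_pos_ratio=3):
    """Hypothetical helper: boolean mask of the negatives kept for the loss.

    conf_loss   -- per-prior confidence loss, shape (num_priors,)
    is_positive -- True where a prior is matched to a ground-truth box
    """
    conf_loss = np.asarray(conf_loss, dtype=np.float64)
    is_positive = np.asarray(is_positive, dtype=bool)

    num_pos = int(is_positive.sum())
    num_neg_available = int((~is_positive).sum())
    num_neg = min(neg_pos_ratio * num_pos, num_neg_available)

    # Exclude positives from the ranking by giving them -inf loss.
    neg_loss = np.where(is_positive, -np.inf, conf_loss)
    order = np.argsort(-neg_loss)  # indices sorted by descending loss

    mask = np.zeros(is_positive.shape, dtype=bool)
    mask[order[:num_neg]] = True
    return mask

# One positive prior, so the three hardest negatives are kept.
loss = [0.1, 2.0, 0.5, 0.05, 1.5, 0.3, 0.9, 0.2]
positive = [False, True, False, False, False, False, False, False]
print(hard_negative_mask(loss, positive))
```

The confidence loss would then be summed over the positives plus the masked negatives only.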
SSD300-VGG16.ipynb contains the training process.
This implementation performs worse than the original Caffe implementation: overall mAP = 66%
Class | AP (%) |
---|---|
aeroplane | 76 |
bicycle | 76 |
bird | 66 |
boat | 63 |
bottle | 39 |
bus | 71 |
car | 80 |
cat | 80 |
chair | 36 |
cow | 59 |
diningtable | 64 |
dog | 70 |
horse | 75 |
motorbike | 72 |
person | 70 |
pottedplant | 44 |
sheep | 56 |
sofa | 71 |
train | 76 |
tvmonitor | 68 |
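As a quick check, the overall mAP is the unweighted mean of the per-class AP values in the table. The dictionary name `ap_percent` below is only for illustration.

```python
# Per-class AP (%) copied from the table above.
ap_percent = {
    "aeroplane": 76, "bicycle": 76, "bird": 66, "boat": 63, "bottle": 39,
    "bus": 71, "car": 80, "cat": 80, "chair": 36, "cow": 59,
    "diningtable": 64, "dog": 70, "horse": 75, "motorbike": 72,
    "person": 70, "pottedplant": 44, "sheep": 56, "sofa": 71,
    "train": 76, "tvmonitor": 68,
}

# mAP is the mean of the 20 class APs: 1312 / 20.
mean_ap = sum(ap_percent.values()) / len(ap_percent)
print(f"mAP = {mean_ap:.1f}%")  # prints "mAP = 65.6%"
```

Rounded to the nearest percent, this gives the 66% quoted above.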
See the Anaconda environment file for dependencies.
To create the environment, run:
conda env create -f anaconda-dl-env.yml