We use Faster R-CNN to detect objects because it takes less time than R-CNN [11]. It has two modules:
a region proposal network (RPN) and a Fast R-CNN [12] detector.
The RPN outputs a set of rectangular object proposals together with their objectness scores, which tells
the detector module where to look. We use a ResNet [13] network, with a maximum overlap threshold of 0.7
for the RPN and an RPN stride of 16. Our competition involves small objects, which makes the data quite
different from standard image datasets such as PASCAL VOC and COCO. State-of-the-art object detection
algorithms fail to give satisfactory results on small objects [14]. We also had limited hardware, so we
adopted several techniques that let us obtain satisfactory results in a short time.
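The role of the 0.7 overlap threshold and the stride of 16 can be illustrated with a minimal sketch. It assumes the threshold is the intersection-over-union (IoU) value above which an anchor is labeled positive, as in the original Faster R-CNN formulation. The function names and the square anchor shape are illustrative choices, not the authors' code, and only the threshold rule is shown.

```python
import numpy as np

RPN_STRIDE = 16        # value stated in the text
RPN_MAX_OVERLAP = 0.7  # value stated in the text; assumed to be an IoU threshold


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def anchor_centers(width, height, stride=RPN_STRIDE):
    """One anchor center per stride x stride cell of the input image."""
    xs = np.arange(width // stride) * stride + stride / 2
    ys = np.arange(height // stride) * stride + stride / 2
    return [(x, y) for y in ys for x in xs]


def positive_anchors(gt_box, width, height, size, threshold=RPN_MAX_OVERLAP):
    """Square anchors of side `size` whose IoU with gt_box exceeds the threshold."""
    half = size / 2
    hits = []
    for cx, cy in anchor_centers(width, height):
        anchor = (cx - half, cy - half, cx + half, cy + half)
        if iou(anchor, gt_box) > threshold:
            hits.append(anchor)
    return hits
```

With a stride of 16, anchor centers are 16 pixels apart, so a small sign is matched by only a few anchors. This is one reason anchor sizes have to be close to the object size when the objects are small.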
Our dataset has some distinctive properties, and we exploit them. Because traffic signs usually lie above
the horizon when viewed from a vehicle, we cut off the lower portion of each image, which loses nothing
relevant but reduces computation time. The original images were 1628×1236 pixels; after cropping they
are 1628×660. We then split each image into a left part and a right part, keeping an overlap of 70
pixels. We first trained one model for the left side and one for the right, which let us use two GPUs
simultaneously, one per model, and keep the original high resolution. This exposed a problem: our dataset
contains fewer ROIs on the left side than on the right, so the left-side ROIs alone could not train the
left-side model well. We therefore trained a single model on the ROIs from both sides and applied it
twice, since the left and right images are similar. With this approach, both sides give satisfactory
results. We also modified the ground-truth boxes used for training.
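The cropping and splitting arithmetic can be sketched as follows. The image sizes and the 70-pixel overlap come from the text. Two things are assumptions made for illustration: that the top 660 rows are the ones kept, and that the two halves have equal width. The function names are hypothetical.

```python
import numpy as np

FULL_W, FULL_H = 1628, 1236  # original image size from the text
CROP_H = 660                 # height kept after removing the lower portion
OVERLAP = 70                 # horizontal overlap between the halves, in pixels


def crop_and_split(image):
    """Keep the top CROP_H rows, then cut into overlapping left/right halves."""
    top = image[:CROP_H]                  # assumption: the kept rows are the top ones
    width = top.shape[1]
    half_w = (width + OVERLAP) // 2       # 849 for a 1628-pixel-wide image
    right_x0 = width - half_w             # 779: x offset of the right half
    left = top[:, :half_w]
    right = top[:, right_x0:]
    return left, right, right_x0


def to_full_coords(box, x_offset):
    """Map an (x1, y1, x2, y2) box from a half back to the cropped full frame."""
    x1, y1, x2, y2 = box
    return (x1 + x_offset, y1, x2 + x_offset, y2)
```

Detections from the left half need no offset, while detections from the right half are shifted by the returned offset before the two result sets are merged. Signs that fall inside the 70-pixel overlap can be detected in both halves, so duplicates have to be resolved after merging.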
We use enlarged ground-truth boxes so that some background is included. Training the model on signs
together with their background, i.e. context, helps detection [14]. There are 14 different traffic signs,
but we grouped them into 7 groups of similar signs for the CNN classifier, which helps detection. We also
apply non-maximum suppression to get the biggest bounding box [14]. We use anchor box scales [50, 150]
and ratio [1, 1], which is smaller than
11. S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region
12. R. Girshick, “Fast R-CNN,” in IEEE International Conference on Computer Vision (ICCV), 2015.
13. K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,”
14. C. Chen, M.-Y. Liu, C. O. Tuzel, and J. Xiao, “R-CNN for small object detection.”