Object detection in deep learning
The goal is to predict the location of objects in an image, via bounding boxes, together with the class of each located object. Deep learning approaches tackle this problem with two types of models, called one-stage and two-stage detectors. Although they share a common goal, the two approaches are slightly different and can be summarized as follows:
Common tricks used in object detection models
Bounding Box Regression
Let p $=(p_{x},p_{y},p_{w},p_{h})$ denote the prediction of the bounding box regressor, where $p_{x}$ and $p_{y}$ are the coordinates of the box center and $p_{w}$ and $p_{h}$ are its width and height. Given the corresponding ground truth box g $=(g_{x},g_{y},g_{w},g_{h})$, the regressor is configured to learn a scale-invariant transformation between the two centers and a log-scale transformation between the widths and heights.
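To make the transformation described above concrete, here is a minimal sketch. It assumes the R-CNN-style parameterization, which the text does not spell out: the center offsets are divided by the box size, $t_{x}=(g_{x}-p_{x})/p_{w}$ and $t_{y}=(g_{y}-p_{y})/p_{h}$, and the size ratios are taken in log space, $t_{w}=\log(g_{w}/p_{w})$ and $t_{h}=\log(g_{h}/p_{h})$. The function names `encode_box` and `decode_box` and the symbol t are illustrative and do not come from the original text.

```python
import math


def encode_box(p, g):
    """Compute regression targets t that map box p onto ground truth box g.

    Both boxes are (center_x, center_y, width, height) tuples.
    Assumed parameterization (R-CNN style), not confirmed by the source text.
    """
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    tx = (gx - px) / pw       # center shift, normalized by the box width
    ty = (gy - py) / ph       # center shift, normalized by the box height
    tw = math.log(gw / pw)    # width ratio in log space
    th = math.log(gh / ph)    # height ratio in log space
    return (tx, ty, tw, th)


def decode_box(p, t):
    """Invert encode_box: apply targets t to box p to recover a box."""
    px, py, pw, ph = p
    tx, ty, tw, th = t
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))


if __name__ == "__main__":
    p = (50.0, 40.0, 20.0, 10.0)
    g = (56.0, 38.0, 30.0, 12.0)
    t = encode_box(p, g)
    print(t)
    print(decode_box(p, t))
```

Under this assumption, scaling both boxes by the same factor leaves the targets unchanged, which is what "scale-invariant" refers to for the centers, and the log keeps the width and height targets symmetric around zero for growing and shrinking boxes.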