Object Detection
Detection of physical objects in still images or videos

The object detection mining service detects one or more physical objects in images and videos.
Input: Image file or video file. You can specify which frames are to be processed for a video.
Output: A set of detected objects is returned for the image or for each processed frame. For each detected object, an axis-aligned bounding box, an object category, and a rating are returned. The rating indicates how certain the model is about the category of the object identified within the bounding box.
In addition, an automatically generated ID is assigned to each detected object so that every detected object in a media file can be identified unambiguously. This ID has no relation to the category of the detected object.
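As a minimal sketch of how a client might hold this output, the following Python snippet models one detected object and filters a result set by rating. The class name `DetectedObject`, its field names, and the helper `filter_by_rating` are hypothetical and not part of the service's interface. The box is assumed to be given as two opposite corner points in pixels.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DetectedObject:
    """Hypothetical client-side record for one detected object."""
    object_id: str                 # automatically generated ID, unrelated to the category
    top_left: Tuple[int, int]      # assumed (x, y) corner of the box, in pixels
    bottom_right: Tuple[int, int]  # assumed opposite (x, y) corner, in pixels
    category: str                  # object category reported by the model
    rating: float                  # model certainty for the category


def filter_by_rating(objects: List[DetectedObject], min_rating: float) -> List[DetectedObject]:
    """Keep only the objects whose rating is at least min_rating."""
    return [obj for obj in objects if obj.rating >= min_rating]


detections = [
    DetectedObject("object-1", (12, 27), (401, 317), "bicycle", 0.82),
    DetectedObject("object-2", (467, 156), (591, 323), "dog", 0.74),
]
confident = filter_by_rating(detections, min_rating=0.8)
print([obj.object_id for obj in confident])
```

A threshold such as 0.8 is an arbitrary illustration; a suitable value depends on how many false detections the application can tolerate.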
Model:
The mining service uses an EfficientDet-D4 model trained on the COCO dataset. The model can detect objects from the 80 object categories of the COCO dataset.
Metrics:
The model achieves a mean average precision (mAP) of 0.485 on the validation set of the COCO dataset.
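COCO-style mAP is built on matching predicted boxes to ground-truth boxes by their intersection over union (IoU). As an illustration of that building block only, and not of the full evaluation protocol, the following sketch computes the IoU of two axis-aligned boxes. The function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Guard against degenerate boxes with zero area.
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

In the usage line, the two boxes overlap in a 5 x 10 region, so the IoU is 50 / 150, or one third.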
Mining results details:
For the example image (see above), the object detection mining service detected the following objects:
ID; bounding box (px); category; rating
object-1; (12, 27), (401, 317); bicycle; 0.82
object-2; (467, 156), (591, 323); dog; 0.74
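The rows above can be read programmatically. The following sketch parses the semicolon-separated rows exactly as they are printed here and derives the width and height of each box. It assumes that the two points are opposite corners of the box, with the smaller coordinates first. The names `ROW_PATTERN` and `parse_row` are hypothetical, and the actual response format of the service may differ from this textual listing.

```python
import re

RESULT_TEXT = """\
object-1; (12, 27), (401, 317); bicycle; 0.82
object-2; (467, 156), (591, 323); dog; 0.74
"""

# Matches: ID; (x1, y1), (x2, y2); category; rating
ROW_PATTERN = re.compile(
    r"^\s*(?P<id>[^;]+);\s*\((?P<x1>\d+),\s*(?P<y1>\d+)\),\s*"
    r"\((?P<x2>\d+),\s*(?P<y2>\d+)\);\s*(?P<category>[^;]+);\s*(?P<rating>[\d.]+)\s*$"
)


def parse_row(line):
    """Parse one result row into a dict; raise ValueError if it does not match."""
    match = ROW_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    x1, y1, x2, y2 = (int(match[key]) for key in ("x1", "y1", "x2", "y2"))
    return {
        "id": match["id"].strip(),
        "category": match["category"].strip(),
        "rating": float(match["rating"]),
        "width": x2 - x1,    # assumes the two points are opposite corners
        "height": y2 - y1,
    }


rows = [parse_row(line) for line in RESULT_TEXT.splitlines() if line.strip()]
for row in rows:
    print(row["id"], row["category"], row["width"], row["height"], row["rating"])
```

Under the corner assumption, object-1 is 389 px wide and 290 px high, and object-2 is 124 px wide and 167 px high.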
References:
- Mingxing Tan, Ruoming Pang, Quoc V. Le. EfficientDet: Scalable and Efficient Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 10781-10790
- Tsung-Yi Lin et al. Microsoft COCO: Common Objects in Context. In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds), Computer Vision – ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham, 2014. https://doi.org/10.1007/978-3-319-10602-1_48