3D Object Detection and Tracking in Monocular Images

We present a method that estimates the 3D pose of a known object in real time and under challenging conditions. Our method relies only on grayscale images, since depth cameras fail on metallic objects; it can handle poorly textured objects and cluttered, changing environments; and the pose it predicts degrades gracefully in the presence of large occlusions. As a result, in contrast with the state of the art, our method is suitable for practical Augmented Reality applications, even in industrial environments. To be robust to occlusions, we first learn to detect some parts of the target object. Our key idea is to then predict the 3D pose of each part in the form of the 2D projections of a few control points. The advantages of this representation are three-fold: we can predict the 3D pose of the object even when only one part is visible; when several parts are visible, we can combine them easily to compute a better pose of the object; and the 3D pose we obtain is usually very accurate, even when only a few parts are visible.
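The representation described above can be illustrated with a minimal sketch. It only shows how a part's pose can be encoded as the 2D projections of a few 3D control points, and how the correspondences from several visible parts can be stacked into one set. Everything specific here is an assumption made for illustration: the seven-point layout, the function names, the intrinsics, and the pose are hypothetical. In particular, the 2D projections are generated from a known pose to stand in for the learned predictions.

```python
# Sketch: a part's pose encoded as 2D projections of its 3D control points.
# Layout, names, and numeric values are illustrative assumptions only.

def project(point_3d, rotation, translation, focal, center):
    """Pinhole projection of one 3D point (object frame) into the image."""
    # Camera-frame coordinates: X_cam = R * X + t
    x, y, z = (
        sum(rotation[i][j] * point_3d[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (focal * x / z + center[0], focal * y / z + center[1])


def part_control_points(part_center, half_size):
    """Hypothetical layout: the part centre plus six axis-aligned offsets."""
    cx, cy, cz = part_center
    points = [(cx, cy, cz)]
    for axis in range(3):
        for sign in (-1.0, 1.0):
            offset = [0.0, 0.0, 0.0]
            offset[axis] = sign * half_size
            points.append((cx + offset[0], cy + offset[1], cz + offset[2]))
    return points


def stack_correspondences(visible_parts, rotation, translation, focal, center):
    """Collect (3D point, 2D projection) pairs over all visible parts."""
    pairs = []
    for part_center in visible_parts:
        for p in part_control_points(part_center, half_size=0.02):
            pairs.append((p, project(p, rotation, translation, focal, center)))
    return pairs
```

Each visible part contributes its own 3D-to-2D correspondences, so a single part already constrains the object pose and additional parts simply add more pairs. In practice, such a stacked set could be passed to a generic PnP solver (for example OpenCV's `cv2.solvePnP`) to recover the rotation and translation.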

Code and Data

Data:

3D Object Tracking from Monocular Images
ICCV 2015 Challenge: 3D Rigid Tracking from RGB Images

Publications

Robust 3D Object Tracking from Monocular Images using Stable Parts
Alberto Crivellaro, Mahdi Rad, Yannick Verdie, Kwang Moo Yi, Pascal Fua, and Vincent Lepetit
In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.

A Novel Representation of Parts for Accurate 3D Object Detection and Tracking in Monocular Images
Alberto Crivellaro, Mahdi Rad, Yannick Verdie, Kwang Moo Yi, Pascal Fua, and Vincent Lepetit
In Proceedings of the International Conference on Computer Vision, 2015.