Research And Projects

We develop computer vision methods, including 3D object detection, hand pose estimation, geo-localization, and indoor 3D reconstruction, with applications to augmented reality and robotics.


Keypoint Transformer: Solving Joint Identification in Challenging Hands and Object Interactions for Accurate 3D Pose Estimation

We propose an efficient transformer-based architecture for estimating the 3D pose of two hands and an object during complex interactions from a single RGB image.

Project Info

MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans

We propose a novel method for reconstructing floor plans from noisy 3D point clouds. It extends Monte Carlo Tree Search (MCTS) with an integrated refinement step.

Project Info

Monte Carlo Scene Search for 3D Scene Understanding

We explore how a general AI algorithm can be used for 3D scene understanding in order to reduce the need for training data. More precisely, we propose a modification of the Monte Carlo Tree Search (MCTS) algorithm to retrieve objects and room layouts from noisy RGB-D scans.

Project Info
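For readers unfamiliar with MCTS, the following is a minimal sketch of the standard UCB1 selection rule used in generic MCTS implementations. It is not the modification proposed in these projects; the function names, the dictionary layout of a child node, and the exploration constant are assumptions made for illustration.

```python
import math

def ucb1_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score: mean value plus an exploration bonus."""
    if visits == 0:
        # Unvisited children are always tried first.
        return float("inf")
    exploitation = total_value / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration

def select_child(children, parent_visits):
    """Pick the child with the highest UCB1 score.

    Each child is a hypothetical dict with 'value' (sum of rewards)
    and 'visits' (number of simulations through that child).
    """
    return max(
        children,
        key=lambda ch: ucb1_score(ch["value"], ch["visits"], parent_visits),
    )

children = [
    {"name": "a", "value": 3.0, "visits": 5},
    {"name": "b", "value": 0.0, "visits": 0},
]
best = select_child(children, parent_visits=5)
```

The selection step balances exploiting children with a high average reward against exploring children that have been visited rarely.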

Hand-Object Pose Annotation and Estimation

We develop a method for automatic 3D pose annotation of hands and objects captured with one or more RGB-D cameras. Using this method, we create a large-scale hand-object dataset and make it public, along with baseline results for hand pose estimation from a single RGB image.

Project Info

General 3D Room Layout from a Single View by Render-and-Compare

We introduce a novel method for estimating the 3D room layout from a single image.

Project Info

Domain Transfer for 3D Pose Estimation

Because acquiring annotations for color images is a difficult task, we introduce a novel learning method for 3D pose estimation from color images.

Project Info

Robust Object Pose Estimation

We introduce a novel approach for 3D object pose estimation that is inherently robust to partial occlusions of the object.

Project Info


3D Pose Estimation and 3D Model Retrieval

We present a scalable approach to retrieve 3D models for objects in the wild. Our method builds on the fact that knowing the object pose significantly reduces the complexity of the task.

Project Info


Feature Mapping

We propose a simple and efficient method for exploiting synthetic images when training a Deep Network to predict a 3D pose from an image.

Project Info


Physics-Based Hand Object Interaction

We propose a simple and efficient method for physics-based hand-object interaction in VR.

Project Info


Segmentation-Based 3D Tracking

Given simple 2.5D city maps, we show how to exploit recent results in semantic segmentation to efficiently track a camera in urban environments.

Project Info


3D Pose Estimation

BB8 is a novel method for 3D object detection and pose estimation from color images only. It predicts the 3D poses of the objects in the form of 2D projections of the 8 corners of their 3D bounding boxes.

Project Info
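To illustrate the representation described above, here is a minimal sketch that projects the 8 corners of a 3D bounding box into the image with a pinhole camera model. It only shows what the 2D targets look like for a known pose; it is not the BB8 network or its training code. The helper names, box size, pose, and camera intrinsics are all assumptions chosen for the example.

```python
import math

def box_corners(size):
    """Return the 8 corners of a box of the given (x, y, z) size, centered at the origin."""
    hx, hy, hz = (s / 2.0 for s in size)
    return [(x, y, z) for x in (-hx, hx) for y in (-hy, hy) for z in (-hz, hz)]

def rotation_y(angle):
    """3x3 rotation matrix for a rotation of `angle` radians about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project_corners(corners, R, t, f, cx, cy):
    """Project 3D corners to pixels: Xc = R X + t, then pinhole projection."""
    projected = []
    for X in corners:
        xc, yc, zc = (
            sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)
        )
        projected.append((f * xc / zc + cx, f * yc / zc + cy))
    return projected

# Hypothetical object: a 20 x 10 x 30 cm box placed 2 m in front of the camera.
corners = box_corners((0.2, 0.1, 0.3))
points_2d = project_corners(
    corners, rotation_y(0.0), (0.0, 0.0, 2.0), f=500.0, cx=320.0, cy=240.0
)
```

Given such 8 predicted 2D points and the known 3D corners, the 3D pose can in principle be recovered by solving a Perspective-n-Point problem.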


ALCN: Adaptive Local Contrast Normalization

We propose a novel illumination normalization method that lets us learn to detect objects and estimate their 3D poses under challenging illumination conditions from very few training samples.

Project Info


Hand Detection and 3D Pose Estimation

We introduce novel methods for predicting the 3D joint locations of a hand from a depth map using Convolutional Neural Networks (CNNs).

Project Info


Geo-localization from Images and 2.5D Maps

We propose methods for accurate camera pose estimation in urban environments from single images and 2.5D maps made of the surrounding buildings’ outlines and their heights.

Project Info


Accurate Geo-Localization from Images

We present a method for large-scale geo-localization and global tracking of mobile devices in urban outdoor environments.

Project Info


Object Detection and 3D Pose Estimation

We introduce a simple but powerful approach to computing descriptors for object views that efficiently capture both the object identity and 3D pose.

Project Info


3D Object Tracking

We present a method that estimates the 3D pose of a known object in real time and under challenging conditions.

Project Info


Learning to Detect Keypoints (at CVLab, EPFL)

We introduce a learning-based approach to detect repeatable keypoints under drastic changes in weather and lighting conditions, to which state-of-the-art keypoint detectors are surprisingly sensitive.

Project Info


Flying Object Detection from a Single Moving Camera (at CVLab, EPFL)

We propose an approach to detect flying objects such as UAVs and aircraft when they occupy a small portion of the field of view, possibly moving against complex backgrounds, and are filmed by a camera that is itself moving.

Project Info


Realistic Synthetic Data Generation (at CVLab, EPFL)

We propose a novel approach to synthesizing images that are effective for training object detectors.

Project Info


Older Projects at CVLab

CVLab