The ability to capture 3D data as easily as taking a photograph has the potential to revolutionize the way we perceive and interact with our technical world. In particular, the advent of Time-of-Flight camera technology has triggered a vision of new products: 3D body scanners in our living rooms, mobile phones that are aware of their indoor location, and augmented reality glasses that make monitor and mouse obsolete.
3D camera modules based on the Time-of-Flight (ToF) sensing principle, which are as small as a two-Euro coin and inexpensive, would be the ideal backend technology to make all of this possible. In a single shot, these cameras capture the geometry of a scene as roughly 50,000 depth measurements, which together form a depth image. These measurements can be used to measure human body shape and posture, to recognize and interpret hand gestures, and even to create 3D maps of buildings.
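To illustrate the underlying sensing principle: a continuous-wave ToF pixel measures the phase shift between emitted and reflected modulated light, from which depth follows directly. The sketch below shows this standard textbook relation (it is not specific to the project's hardware); the modulation frequency of 20 MHz is an assumed example value.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def cw_tof_depth(phase_rad, f_mod_hz=20e6):
    """Depth from the measured phase shift of a continuous-wave ToF pixel.

    d = c * phi / (4 * pi * f_mod); depths beyond the unambiguous
    range c / (2 * f_mod) wrap around (phase ambiguity).
    """
    return C * phase_rad / (4.0 * np.pi * f_mod_hz)


def unambiguous_range(f_mod_hz=20e6):
    """Maximum depth measurable without phase wrapping."""
    return C / (2.0 * f_mod_hz)
```

At 20 MHz modulation the unambiguous range is about 7.5 m, which is why typical ToF modules target indoor scenes.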
The algorithms we developed are based on the mathematical principle of energy minimization, in which one seeks to minimize the energy (i.e., the cost) of a problem-specific functional. In this way we formulated the computer vision problems of scene-flow estimation, image super-resolution, and guided image denoising as global optimization problems, which we solved efficiently with a primal-dual approach. We further developed machine-learning techniques to increase image resolution from a single image and to recognize human head and hand poses, enabling novel applications for end users.
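As a minimal, self-contained illustration of this family of methods, the sketch below applies a primal-dual scheme (in the style of Chambolle and Pock) to the classic ROF total-variation denoising energy, min_x ||∇x||_1 + (λ/2)||x − f||². This is a standard example of energy minimization via primal-dual iteration, not the project's actual scene-flow or super-resolution formulation; the step sizes and λ are assumed example values.

```python
import numpy as np


def grad(u):
    """Forward-difference image gradient with Neumann boundary."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy


def div(px, py):
    """Divergence, the negative adjoint of grad."""
    d = np.zeros_like(px)
    d[0, :] = px[0, :]
    d[1:-1, :] = px[1:-1, :] - px[:-2, :]
    d[-1, :] = -px[-2, :]
    d[:, 0] += py[:, 0]
    d[:, 1:-1] += py[:, 1:-1] - py[:, :-2]
    d[:, -1] += -py[:, -2]
    return d


def tv_denoise(f, lam=8.0, n_iter=200):
    """Primal-dual minimization of the ROF energy ||grad x||_1 + lam/2 ||x-f||^2."""
    x = f.copy()
    x_bar = f.copy()
    px = np.zeros_like(f)
    py = np.zeros_like(f)
    tau = sigma = 1.0 / np.sqrt(8.0)  # ||grad||^2 <= 8 for this discretization
    for _ in range(n_iter):
        # dual ascent on p, followed by projection onto the unit ball
        gx, gy = grad(x_bar)
        px += sigma * gx
        py += sigma * gy
        norm = np.maximum(1.0, np.sqrt(px**2 + py**2))
        px /= norm
        py /= norm
        # primal proximal step (closed form for the quadratic data term)
        x_old = x
        x = (x + tau * div(px, py) + tau * lam * f) / (1.0 + tau * lam)
        # over-relaxation of the primal variable
        x_bar = 2.0 * x - x_old
    return x
```

The same pattern (dual ascent, primal proximal step, over-relaxation) carries over to the more complex functionals mentioned above; only the operators and proximal maps change.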
Single-image super-resolution is an important task in computer vision with many practical applications. Current state-of-the-art methods typically rely on machine learning to infer a mapping from low- to high-resolution images. These methods use a single, fixed blur kernel during training and consequently assume that exactly the same kernel underlies the image formation process of every test image. This assumption is unrealistic in practice, because the blur typically differs from one test image to the next.