V-MAV: Cooperative micro aerial vehicles using onboard visual sensors

The DACH project V-MAV aimed to improve the image-based algorithms used to control Micro Aerial Vehicles (MAVs). The three partners, TU Graz, TU Munich and ETH Zurich, worked on three topics: localization and pose estimation for MAVs using multi-camera and visual-inertial systems (i.e., camera systems that also use accelerometer, gyroscope and compass data); embedded image processing (i.e., purpose-built image processing hardware, which makes it possible to build smaller MAVs and process images faster); and mapping of the environment from images taken with MAVs. The mapping work also used additional scene meta-information (e.g., semantic information such as which part of the scene is a tree and which part is a house) to improve the mapping result.

In our part of the project, we focused mainly on visual localization and mapping. Our work on visual localization resulted in an image-based localization method that runs in real time and can therefore be used to navigate an MAV. The system mainly uses vertical lines to compute the camera motion. Such lines occur frequently in man-made environments (e.g., at windows, doors and building outlines), so they are particularly useful for improving localization there. To detect vertical lines quickly, we used an Inertial Measurement Unit (IMU), which delivers accelerometer and gyroscope data, to estimate the gravity direction and then detected the lines that are parallel to it. In a next step, we incorporated the IMU information directly into our localization algorithm to improve the localization quality.
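The idea of using the gravity direction to pick out vertical lines can be illustrated with a minimal sketch. It is not the project's implementation. It assumes a pinhole camera with a known intrinsic matrix K, a gravity vector already expressed in the camera frame, and 2D line segments supplied by some line detector. The function name vertical_line_mask and the angle threshold are hypothetical. All 3D lines parallel to gravity project to image lines that meet in one vanishing point, K·g, so a segment is kept if it points towards that vanishing point.

```python
import numpy as np

def vertical_line_mask(segments, K, gravity_cam, max_angle_deg=2.0):
    """Flag 2D segments that are consistent with 3D lines parallel to gravity.

    segments:     (N, 4) array of endpoints x1, y1, x2, y2 in pixels
    K:            (3, 3) pinhole intrinsic matrix (assumed known)
    gravity_cam:  gravity direction in the camera frame (e.g. from an IMU)
    """
    segments = np.asarray(segments, dtype=float)
    g = np.asarray(gravity_cam, dtype=float)
    # Homogeneous vanishing point of all lines parallel to gravity.
    v = np.asarray(K, dtype=float) @ (g / np.linalg.norm(g))

    p1, p2 = segments[:, :2], segments[:, 2:]
    mid = 0.5 * (p1 + p2)
    seg_dir = p2 - p1
    # Direction from each midpoint to the vanishing point. Written in
    # homogeneous form so it also works when the point lies at infinity.
    vp_dir = v[:2] - v[2] * mid

    eps = 1e-12
    cos = np.abs(np.sum(seg_dir * vp_dir, axis=1)) / (
        np.linalg.norm(seg_dir, axis=1) * np.linalg.norm(vp_dir, axis=1) + eps
    )
    angle = np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
    return angle <= max_angle_deg
```

For a level camera whose y-axis points along gravity, the vanishing point lies at infinity and the test reduces to checking that a segment is close to vertical in the image.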

In the area of 3D mapping, we investigated several reconstruction techniques for creating 3D models, especially of urban environments, that are both compact, so that they can be transmitted easily over a network, and visually appealing. Our first algorithm delivered very compact and visually appealing representations of buildings and specific scene structures. To get similar results for arbitrary urban environments, we developed an additional approach, which detects planes in the scene and makes these plane surfaces selectable in the final reconstruction process. The reconstruction parameters control how closely the reconstruction follows the planes. Additionally, we used semantic information (i.e., which parts of the image are trees, buildings or streets) to improve the 3D reconstruction result. We used artificial intelligence methods to segment images into different semantic classes. Then, using this semantic information, we adjusted the 3D reconstruction parameters depending on the semantic class (e.g., a façade should be planar, a tree should have a smooth surface) and showed that this improves the reconstruction result.
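The text above does not say how planes are detected in the scene. As a generic illustration only, the following sketch fits a single dominant plane to a 3D point cloud with RANSAC, a common choice for this kind of task. The function name ransac_plane, the distance threshold and the iteration count are assumptions made for the example and are not taken from the project.

```python
import numpy as np

def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Fit one plane n.x + d = 0 to 3D points with RANSAC.

    Returns ((normal, offset), inlier_mask), or (None, empty mask)
    if no valid plane hypothesis was found.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_plane = None
    best_inliers = np.zeros(len(points), dtype=bool)

    for _ in range(iterations):
        # Hypothesis: the plane through three randomly chosen points.
        a, b, c = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(b - a, c - a)
        length = np.linalg.norm(normal)
        if length < 1e-12:  # skip (nearly) collinear samples
            continue
        normal = normal / length
        offset = -normal @ a

        # Score: number of points within `threshold` of the plane.
        inliers = np.abs(points @ normal + offset) < threshold
        if inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = (normal, offset), inliers

    return best_plane, best_inliers
```

To find several planes, a simple option is to remove the inliers of the best plane and run the function again on the remaining points.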




The V-MAV project is a collaborative DACH project with partners from ETH Zürich, TU München and TU Graz.

Institute for Computer Graphics and Vision, Graz University of Technology 
Institute of Visual Computing, ETH Zürich
Remote Sensing Technology, Technische Universität München




Project related publications


  • Plane-based Surface Regularization for Urban 3D Reconstruction
    Holzmann, T., Oswald, M.R., Pollefeys, M., Fraundorfer, F., Bischof, H.
    2017 in: 28th British Machine Vision Conference (BMVC)
    [PDF] [Supp. Material] [Video]
  • Regularized 3D Modeling from Noisy Building Reconstructions
    Holzmann, T., Fraundorfer, F., Bischof, H.
    2016 in: 4th International Conference on 3D Vision (3DV)
    [PDF] [Video]
  • Direct Stereo Visual Odometry Based on Lines
    Holzmann, T., Fraundorfer, F., Bischof, H.
    2016 in: International Conference on Computer Vision Theory and Applications (VISAPP), Best Paper Award
  • A New Paradigm for Matching UAV- and Aerial Images
    Koch, Tobias; Zhuo, Xiangyu; Reinartz, Peter; Fraundorfer, Friedrich.
    ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences. 2016. pp. 83-90.
  • Automatic Alignment of Indoor and Outdoor Building Models using 3D Line Segments
    Koch, Tobias; Fraundorfer, Friedrich.
    IEEE/CVF CVPR Workshop on Visual Analysis of Satellite to Street Imagery, USA. 2016.
  • The TUM-DLR Multimodal Earth Observation Evaluation Benchmark
    Koch, Tobias; d'Angelo, Pablo; Kurz, Franz; Fraundorfer, Friedrich; Reinartz, Peter; Körner, Marco.
    IEEE/CVF CVPR Workshop on Visual Analysis of Satellite to Street Imagery, USA. 2016.





Visual Odometry

Our paper "Direct Stereo Visual Odometry Based on Lines" won the Best Paper Award at the International Conference on Computer Vision Theory and Applications (VISAPP) in February 2016 in Rome! More details about our work can be found here.

droneSpace inauguration

Our droneSpace at TUG is now complete. It is equipped with an OptiTrack tracking system to control our Pixhawk UAVs.

A video of one of the first flight tests is shown below.




Mavmap is a structure-from-motion system developed by the V-MAV partners. It is designed to compute 3D reconstructions from typical UAV imagery.

Mavmap is open source and hosted on GitHub (Mavmap repository).

Project team:
Friedrich Fraundorfer
Ass.Prof. Dipl.-Ing. Dr.techn.
Thomas Holzmann
Dipl.-Ing. BSc