High-Quality Urban Reconstructions by Fitting Shape Grammars to Images and derived Textured Point Clouds
Partners At CGV:
This project was funded by the Austrian initiative FIT-IT.
Challenge, Problem Statement
The goal of the CITYFIT project is, given highly redundant input imagery and range maps of an arbitrary building in Graz, to synthesize a shape grammar that, when evaluated, creates a clean, CAD-quality reconstruction of that building. The reconstruction should fit the original data very closely and make the semantics of all major architectural features explicit.
Within the CITYFIT project we have developed an end-to-end workflow for high-quality 3D urban reconstruction. Starting from image sequences and sparse LIDAR information, we utilized a piecewise planar 3D model to fuse recognition confidences. The common interface between the modules was a 2D orthonormal view of each facade represented by an irregular lattice, which encodes the semantic repetition of architectural elements (doors, windows, balconies, etc.) in a compact way. This lattice enabled a dynamic programming solution for the shape grammar matching and resulted in a high-level parse tree of the facade structures. The resulting parse tree was then represented using the Generative Modeling Language (GML). The overall workflow was evaluated on a dataset in Graz consisting of 27000 images and 95 gigabytes of visual and 3D input data, which was reduced to a total of two megabytes in the GML model.
The work of the CGV concentrated mostly on two areas: procedural modeling of facades, and the grammar parsing algorithm that runs on top of the initial object recognition pass.
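To illustrate how an irregular lattice can encode repetition compactly, the following minimal Python sketch collapses a rectangular grid of per-pixel labels into bands of identical rows and columns. It is an illustration under simplifying assumptions (a noise-free, axis-aligned label grid), not the lattice construction used in the project; the function name `build_irregular_lattice` and the toy labels are hypothetical.

```python
def build_irregular_lattice(labels):
    """Collapse a rectangular label grid into an irregular lattice.

    Returns (row_bounds, col_bounds, cells). The bounds list the index at
    which each band starts, followed by the grid size as the final bound.
    """
    n_rows, n_cols = len(labels), len(labels[0])
    # A new horizontal band starts wherever a row differs from the previous one.
    row_bounds = [0] + [r for r in range(1, n_rows)
                        if labels[r] != labels[r - 1]] + [n_rows]
    # A new vertical band starts wherever a column differs from the previous one.
    columns = list(zip(*labels))
    col_bounds = [0] + [c for c in range(1, n_cols)
                        if columns[c] != columns[c - 1]] + [n_cols]
    # One label per lattice cell, sampled at the cell's top-left pixel.
    cells = [[labels[r][c] for c in col_bounds[:-1]] for r in row_bounds[:-1]]
    return row_bounds, col_bounds, cells


# Toy facade (hypothetical labels): "." is wall, "W" is window.
facade = [
    "........",
    ".WW..WW.",
    ".WW..WW.",
    "........",
    ".WW..WW.",
    ".WW..WW.",
    "........",
]
row_bounds, col_bounds, cells = build_irregular_lattice(facade)
for row in cells:
    print("".join(row))
```

Because all rows within a band and all columns within a band are identical, each lattice cell carries a single label, so the 7 x 8 pixel grid is represented without loss by a 5 x 5 lattice plus the two lists of band boundaries.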
Procedural facade modeling
In this part of the project, a novel methodology for rule-based facade modeling using convex polyhedra as modeling primitives was developed. The structure of buildings modeled this way can be varied by exchanging a few lines of code.
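The idea of rule-based modeling can be sketched with a deliberately simplified 2D analogue in Python. The sketch uses axis-aligned rectangles rather than the convex polyhedra developed in the project, and it is not GML code; the names `Rect`, `split`, and `facade_rule` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float
    label: str


def split(rect, axis, n, label):
    """Split rect into n equal parts along axis 'x' or 'y'."""
    parts = []
    for i in range(n):
        if axis == "x":
            parts.append(Rect(rect.x + i * rect.w / n, rect.y,
                              rect.w / n, rect.h, label))
        else:
            parts.append(Rect(rect.x, rect.y + i * rect.h / n,
                              rect.w, rect.h / n, label))
    return parts


def facade_rule(facade, floors, tiles_per_floor):
    """Rule: facade -> floors (vertical split) -> tiles (horizontal split)."""
    tiles = []
    for floor in split(facade, "y", floors, "floor"):
        tiles.extend(split(floor, "x", tiles_per_floor, "window_tile"))
    return tiles


facade = Rect(0.0, 0.0, 12.0, 9.0, "facade")
tiles = facade_rule(facade, floors=3, tiles_per_floor=4)
print(len(tiles), tiles[0])
```

Changing the two arguments of `facade_rule`, or swapping the order of the two splits, produces a differently structured facade, which is the sense in which a few lines of code control the overall structure.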
Facade Grammar Parsing
The structure is determined by a machine learning approach: a classifier was trained to estimate the probabilities of the classes window, wall, door, and sky. Maximum a posteriori (MAP) estimation per pixel yields a noisy segmentation. After grammar parsing, symmetries and repetitions are obtained (right). The parse trees are then converted to a procedural model.
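The per-pixel MAP step described above can be sketched in a few lines of Python. The sketch assumes that the classifier output is available as one probability per class and pixel; the data layout, the function name `map_labels`, and the toy values are hypothetical and not taken from the project.

```python
CLASSES = ("window", "wall", "door", "sky")


def map_labels(probabilities):
    """Assign each pixel the class with the highest probability.

    probabilities[r][c] is a dict mapping each class name to its
    probability at pixel (r, c). Ties go to the class listed first
    in CLASSES.
    """
    return [[max(CLASSES, key=lambda name: pixel[name]) for pixel in row]
            for row in probabilities]


# Toy 2 x 2 image with made-up classifier output.
probs = [
    [{"window": 0.1, "wall": 0.2, "door": 0.0, "sky": 0.7},
     {"window": 0.6, "wall": 0.3, "door": 0.1, "sky": 0.0}],
    [{"window": 0.2, "wall": 0.5, "door": 0.3, "sky": 0.0},
     {"window": 0.1, "wall": 0.3, "door": 0.6, "sky": 0.0}],
]
labels = map_labels(probs)
print(labels)
```

Each pixel is labeled independently of its neighbors, so isolated misclassifications are not corrected at this stage. This is why the result is noisy and why the subsequent grammar parsing is needed to recover regular structure.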
Hohmann, B., Havemann, S., Krispel, U. & Fellner, D., (2010), "A GML shape grammar for semantically enriched 3D building models", Computers & Graphics, Vol.34(4), pp.322-334.
Thaller, W., Krispel, U., Zmugg, R., Havemann, S. & Fellner, D.W., (2013), "Shape Grammars on Convex Polyhedra", Computers & Graphics, Vol.37(6), pp.707-717.
Riemenschneider, H., Krispel, U., Thaller, W., Donoser, M., Havemann, S., Fellner, D.W. & Bischof, H., (2012), "Irregular lattices for complex shape grammar facade parsing", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1640-1647, IEEE.