Machines learn how to see

How does seeing work? What is a good image, and what is a bad image? How do we filter out the essential information from an image – the information we need to recognise what we see?

How does seeing work?

These are key questions for brain researchers as well as computer scientists, such as Thomas Pock, who is always looking for international cooperation to achieve progress in image processing. Since 2014, Thomas Pock has held an AIT-endowed professorship for Mobile Computer Vision at the Institute for Computer Graphics and Vision (ICG). His research work, developed in co-operation with colleagues from New York and Paris, focuses on mathematical models to distinguish between “good” images and “bad” images. Ultimately the objective is to filter machine-supplied image signals, extracting only the visual information that is absolutely essential for the reconstruction of a meaningful image with maximum detail.

Research cooperation with New York

One of Thomas Pock’s current projects is to build mathematical models for the reconstruction of two-dimensional images from magnetic resonance imaging (MRI) signals. His aim is to get the best possible result from the smallest possible amount of signal data. Needing less data shortens the scanning time, which in turn increases the number of patients who can be scanned with one machine in a single day and thus reduces costs. In a research cooperation with Florian Knoll and Daniel K. Sodickson from the Department of Radiology at New York University School of Medicine, Thomas Pock and his PhD student Kerstin Hammernik developed an algorithm that does precisely that: it builds high-quality images from undersampled MRI signal data, taking just one sixth of the scan time of previous MRI scans. The research partners in the USA provide the MRI data needed to develop this method as well as the know-how in physics and the operating principles of MRI devices, while Thomas Pock and Kerstin Hammernik, working in Graz, devised the mathematical model that reconstructs the images from the undersampled MRI signal data.

One fundamental problem of machine-based image recognition is the virtually endless number of possible images. “If you calculate all theoretical image variations for an image size of no more than 65 times 65 pixels and 256 grey levels, you obtain far more different images than there are atoms in the universe, namely 256^(65×65) ≈ 10^10000 versus 10^80. This unimaginably large number of possible images explains why simple comparison-based databases cannot work,” explains Thomas Pock.
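The size of that number can be checked with a few lines of arithmetic. The following Python sketch works with the base-10 logarithm so that the result stays printable; the function and constant names are chosen here for illustration, and the figure of 10^80 atoms is simply the one quoted in the text.

```python
import math


def log10_image_count(width: int, height: int, grey_levels: int) -> float:
    """Return log10 of the number of distinct images of the given size."""
    # Every pixel independently takes one of `grey_levels` values, so the
    # count is grey_levels ** (width * height). Its log10 is computed
    # directly to avoid handling a number with thousands of digits.
    return width * height * math.log10(grey_levels)


ATOMS_LOG10 = 80  # the article's figure of roughly 10^80 atoms

exponent = log10_image_count(65, 65, 256)
print(f"possible 65x65 images with 256 grey levels: about 10^{exponent:.0f}")
print(f"ratio to the atom count: about 10^{exponent - ATOMS_LOG10:.0f}")
```

The exponent comes out slightly above 10,000, which matches the rounded value quoted above.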

Machine learning combined with image processing

“As human beings we know what high-quality images look like,” says Thomas Pock. “Now we want computers to do the same – to be able to recognise and classify images within fractions of a second.” In the past, mathematical models capable of coping with this massive computing task were designed largely by hand. Now Thomas Pock combines new methods of machine learning with image processing methods. He designs image models with a large number of degrees of freedom that reconstruct two-dimensional images from the signal data, constantly comparing the result with the ideal image. Thomas Pock adds: “The learning problem involves a loss function that measures how much the currently reconstructed solution differs from the target image, i.e. the appearance of the tissue as reflected in the MRI data. This error is in turn backpropagated into the model. This is done by calculating the gradient of the loss function, which points in the direction of the strongest change. In this manner the model parameters can be varied to make the error smaller. Then you let the process continue until no detectable improvement can be achieved any more.”
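The learning loop described in the quote can be illustrated with a deliberately tiny example. The Python sketch below is not the researchers' model: it assumes a toy "reconstruction" with a single free parameter w that scales each signal value, a mean squared error as the loss function, and a hand-derived gradient in place of backpropagation through a real network. All names, the step size and the stopping tolerance are hypothetical.

```python
def loss(w, signals, targets):
    """Mean squared difference between the reconstructions w*s and the targets."""
    return sum((w * s - t) ** 2 for s, t in zip(signals, targets)) / len(signals)


def loss_gradient(w, signals, targets):
    """Derivative of the loss with respect to the single parameter w."""
    return sum(2 * (w * s - t) * s for s, t in zip(signals, targets)) / len(signals)


def train(signals, targets, step=0.05, tol=1e-12, max_iters=10_000):
    """Gradient descent: move w against the gradient until the loss stops improving."""
    w = 0.0
    previous = loss(w, signals, targets)
    for _ in range(max_iters):
        w -= step * loss_gradient(w, signals, targets)
        current = loss(w, signals, targets)
        if previous - current < tol:  # no detectable improvement any more
            break
        previous = current
    return w


# Toy data: the "ideal images" are simply three times the measured signals.
signals = [1.0, 2.0, 3.0, 4.0]
targets = [3.0 * s for s in signals]
learned_w = train(signals, targets)
print(f"learned parameter: {learned_w:.6f}")
```

A real model of the kind described here has thousands of parameters rather than one, but the loop has the same shape: measure the error, compute its gradient, adjust the parameters, and repeat until the improvement becomes negligible.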

Extract from the learned model parameters. Filter kernels on the left, evaluation functions on the right.

The figure shows an extract from the learned model parameters, which basically consist of a large number of different filter kernels and evaluation functions. The design of Pock and Hammernik’s mathematical models, with their thousands of free parameters, is inspired by neural networks and based on the findings of more than 50 years of research. The calculation is carried out on a high-performance computer at TU Graz. For the institute’s research work, it was decided in cooperation with the Central IT Service to purchase a supercomputer equipped with 16 of the most powerful graphics cards, each with a processing power of approximately four TeraFLOPS, meaning that each card can perform approximately four trillion (4×10^12) floating-point operations per second.
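To give a rough idea of what a filter kernel paired with a pointwise function does, the following Python sketch slides a small kernel over an image and then applies a function to every filter response. The kernel values, the use of tanh and all function names are illustrative assumptions; the learned kernels and functions of the actual model are not given in this article.

```python
import math


def filter_image(image, kernel):
    """Slide `kernel` over `image` (valid region only) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def apply_pointwise(response, func):
    """Apply `func` to every filter response separately."""
    return [[func(v) for v in row] for row in response]


# Hypothetical 1x3 kernel that responds to horizontal brightness changes.
kernel = [[-1.0, 0.0, 1.0]]

# Tiny test image: dark on the left, bright on the right.
image = [
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
]

response = filter_image(image, kernel)          # non-zero where the edge lies
features = apply_pointwise(response, math.tanh)  # stand-in pointwise function
print(response)
print(features)
```

In a learned model, both the kernel entries and the shape of the pointwise function would be among the free parameters adjusted during training, rather than fixed by hand as they are here.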

US patent

Pock’s method reconstructs MR images of comparable quality to that of currently generated images, but in a much shorter time: it needs only one sixth of the sampling time. The figure shows the clear advantage of the learned method, in which the sampling time is reduced by a factor of six. A US patent for this new method is pending, and one particular manufacturer of MRI scanners has already shown strong interest.

The reconstruction of an MRI slice with an acceleration factor of 6. On the left, the result using a traditional method, which here leads to too many artifacts. In the centre, a reconstruction using the developed method; on the right, a reconstruction from the complete data as a comparison.
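Why a naive reconstruction from one sixth of the data produces artifacts can be seen in a one-dimensional toy example. The Python sketch below assumes a signal of 12 samples containing a single bright point, keeps only every sixth frequency sample, sets the rest to zero and transforms back. The regular sampling pattern, the signal length and all names are assumptions made for illustration; the sampling scheme used in the study is not described in this article.

```python
import cmath


def dft(x):
    """Discrete Fourier transform (direct O(n^2) sum, fine for tiny inputs)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


ACCELERATION = 6  # keep one frequency sample in six

# Toy "image": one bright point at position 2.
signal = [0.0] * 12
signal[2] = 1.0

full_data = dft(signal)
undersampled = [v if k % ACCELERATION == 0 else 0
                for k, v in enumerate(full_data)]

# Zero-filled reconstruction: simply invert the incomplete data.
zero_filled = [abs(v) for v in idft(undersampled)]
print([round(v, 3) for v in zero_filled])
```

Instead of one bright point, the zero-filled result contains six weaker copies spread across the signal. This fold-over is the kind of artifact that a more capable reconstruction method has to suppress.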

Paper in Acta Numerica

Thomas Pock recently summarized his theoretical insights in a comprehensive review article written jointly with Antonin Chambolle, professor at the Centre de Mathématiques Appliquées at the Ecole Polytechnique in Paris. Their paper, entitled “An introduction to continuous optimization for imaging”, will be published in the Cambridge journal “Acta Numerica”, currently the world’s top-cited journal in the field of mathematics.

Contact

Thomas POCK
Univ.-Prof. Dipl.-Ing. Dr.techn.
Institute for Computer Graphics and Vision
Inffeldgasse 16/II
8010 Graz, Austria
Phone: +43 316 873 5056
Fax: +43 316 873 5050
pock@icg.tugraz.at
