Thanks to our sponsors.

Title: Managed Processing on the GPU
Time period: 2011 - 2014

FWF Stand-alone Project


Project Number: P23329

Volumetric data is very common in medicine, geology, and engineering, but the high complexity of both the data and the algorithms has prevented widespread use of volume graphics. Recently, however, 3D image processing and visualization algorithms have been parallelized and ported to graphics processing units (GPUs). This proposal is concerned with new ways of designing volume graphics algorithms for the GPU that can cope interactively with these huge problems by making better use of GPU capacity.
Unfortunately, only certain parts of common image or volume processing algorithms can be mapped to the standard GPU stream processing model. For most real-world problems, writing programs for this architecture is a tedious task. As a result, most algorithms use the available processing power only for small subtasks, namely the number crunching in inner loops. For example, direct volume rendering (DVR) methods send rays into a volumetric object, accumulate intensities, divide rays into sub-rays, scatter rays in materials and/or extract certain features. All GPU implementations of DVR use one processing unit per pixel, regardless of whether the pixel requires very complex calculations or not. This strategy frequently leads to strong load imbalances.
A particular problem of interactive applications such as volume graphics is that they are not traditional number crunching tasks, which require only optimal computational throughput and have relaxed or no constraints on latency. On the contrary, interactive applications must meet real-time deadlines to ensure interactive response. This is a classical real-time resource scheduling problem. It can only be solved by adaptive algorithms that rely on complex flow control and memory management decisions during parallel execution. Both are currently available only on the CPU, which allows access to privileged mode through the operating system.
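
The load imbalance caused by assigning one processing unit per pixel can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not a model of any particular GPU: the per-ray costs are invented, the function names `static_time` and `dynamic_time` are hypothetical, and a group of lanes is assumed to run in lockstep until its slowest ray finishes.

```python
import heapq


def static_time(costs, lanes):
    """One ray per lane, lanes grouped in lockstep.

    Each group of `lanes` consecutive rays takes as long as its most
    expensive ray, so cheap rays leave their lanes idle.
    """
    return sum(max(costs[i:i + lanes]) for i in range(0, len(costs), lanes))


def dynamic_time(costs, lanes):
    """Each lane fetches the next ray from a shared queue once it is free."""
    finish = [0] * lanes              # time at which each lane becomes free
    heapq.heapify(finish)
    for cost in costs:
        start = heapq.heappop(finish)  # earliest available lane
        heapq.heappush(finish, start + cost)
    return max(finish)


# Hypothetical per-ray costs: most rays terminate early, a few are expensive.
costs = [1, 1, 1, 9, 1, 1, 1, 1, 9, 1, 1, 1, 1, 1, 1, 1]
lanes = 4

print("static :", static_time(costs, lanes))
print("dynamic:", dynamic_time(costs, lanes))
```

With these assumed costs, the lockstep mapping needs 20 time units while the queue-based mapping needs 11, because lanes that finish a cheap ray immediately pick up more work instead of waiting for an expensive neighbor.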
On the GPU, components for high-level scheduling involving latency hiding and memory management are missing or inaccessible. As a result, the desired full utilization of the GPU is very difficult to achieve for complex graphics algorithms with real-time demands. The main goal of this proposal is to build a toolset that harnesses the full power of the GPU for a general class of real-time volume graphics algorithms. We propose a managed volume processing system that incorporates the missing components. Its key modules are a task model, a workload scheduler with real-time capabilities, and a virtual memory management system, executed in tandem on the GPU and CPU. We will rely on the most recent hardware developments and use OpenCL as the standardized interface to access them.
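
To make the idea of a task model combined with a deadline-aware workload scheduler more concrete, the following host-side sketch shows one possible policy: run tasks in priority order and defer whatever no longer fits into the frame's time budget. It does not describe the system proposed here. The `Task` class, the `schedule_frame` function, the task names, and the cost estimates are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                          # lower value = more important
    name: str = field(compare=False)
    cost_ms: float = field(compare=False)  # assumed cost estimate


def schedule_frame(tasks, budget_ms):
    """Run tasks in priority order while they fit into the frame budget.

    Returns (executed, deferred) lists of task names. Deferred tasks
    would be retried in a later frame.
    """
    queue = list(tasks)
    heapq.heapify(queue)
    executed, deferred, used = [], [], 0.0
    while queue:
        task = heapq.heappop(queue)
        if used + task.cost_ms <= budget_ms:
            used += task.cost_ms
            executed.append(task.name)
        else:
            deferred.append(task.name)
    return executed, deferred


# Hypothetical workload for a single frame with a 16 ms budget.
tasks = [
    Task(2, "refine", 10.0),
    Task(0, "coarse", 5.0),
    Task(1, "shade", 8.0),
    Task(3, "extra", 2.0),
]
executed, deferred = schedule_frame(tasks, budget_ms=16.0)
print(executed, deferred)
```

In this example the coarse pass, shading, and the small extra task fit into the budget, while the expensive refinement is deferred. A real scheduler would also need measured rather than assumed costs and would run on the GPU alongside the memory manager.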