GPU Rendering Pipelines

Pipeline designs are important in many areas of computer graphics. They can consume streaming input data as it gradually becomes available, and individual pipeline stages can execute in parallel on different input elements. Moreover, pipeline stages are easy to understand, reuse and extend. Consequently, pipelines are used for real-time graphics (OpenGL/D3D), production rendering (Reyes), visualization, 3D printing and many more applications.

The graphics processing unit (GPU) uses a hardware pipeline, but it is designed for high efficiency rather than flexibility. Some stages allow programmable shaders, but the overall architecture is static: the order of stages cannot be changed, and several of the stages are fixed-function units with only minimal freedom for configuration. This inflexibility restricts research and development. New rendering architectures that do not follow the predefined pipeline can never compete with approaches that fit the hardware pipeline.

Rather than waiting for hardware vendors to change or open up their designs, we propose to build a flexible, configurable software rendering platform that runs on the programmable compute units of current graphics processors, targeting modern GPU compute languages such as CUDA. Our platform supports the generation of arbitrary pipeline designs with an unrestricted number of pipeline stages and freely configurable intermediate data and stage connections. This flexibility is achieved by modeling all stages as well as stage connectors in software. The key element for achieving high performance is scheduling the pipeline workload to the execution cores in an optimal manner. To achieve these goals, we will investigate new ways to schedule graphics workloads, distribute work between pipeline stages and support recursive pipelines with bounded memory. On top of the new rendering architecture, we will investigate advanced concepts such as frameless rendering and non-linear rasterization.

With the success of this project, we will provide the basis for novel research in the field of real-time rendering. We will help to envision new rendering pipelines, guide and inspire new hardware designs, and show that rendering techniques currently believed to be infeasible can run in real time on current hardware. As with our previous research results, we plan to release our code under a permissive open-source license to motivate future collaborations.
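
The idea of modeling stages and stage connectors in software can be illustrated with a small sketch. The following is not the project's implementation: it is a sequential, CPU-side toy model in Python, and every name in it (`Pipeline`, `add_stage`, `push`, `run`, the `split` and `shade` stages) is hypothetical. It assumes that each stage is a function, that each connector is a FIFO queue, and that the scheduler always serves the non-empty stage closest to the output. A real GPU version would need concurrent queues and would schedule work across many cores at once.

```python
from collections import deque


class Pipeline:
    """Toy model: stages are callables, connectors are FIFO queues."""

    def __init__(self):
        self.stages = {}   # stage name -> function(item, emit)
        self.queues = {}   # stage name -> queue of pending work items
        self.order = []    # order in which stages were added
        self.peak = {}     # stage name -> largest queue length observed

    def add_stage(self, name, fn):
        self.stages[name] = fn
        self.queues[name] = deque()
        self.order.append(name)
        self.peak[name] = 0

    def push(self, name, item):
        # Any stage may send work to any other stage, including itself.
        self.queues[name].append(item)
        self.peak[name] = max(self.peak[name], len(self.queues[name]))

    def run(self):
        # Scheduling policy (an assumption of this sketch): serve the
        # non-empty stage that was added last, i.e. the most downstream one.
        while True:
            ready = [n for n in reversed(self.order) if self.queues[n]]
            if not ready:
                return
            name = ready[0]
            item = self.queues[name].popleft()
            self.stages[name](item, self.push)


results = []


def split(interval, emit):
    # Recursive stage: halve an interval until it covers a single element.
    lo, hi = interval
    if hi - lo <= 1:
        emit("shade", lo)
    else:
        mid = (lo + hi) // 2
        emit("split", (lo, mid))
        emit("split", (mid, hi))


def shade(x, emit):
    # Final stage: produce one output value per element.
    results.append(x * x)


pipeline = Pipeline()
pipeline.add_stage("split", split)
pipeline.add_stage("shade", shade)
pipeline.push("split", (0, 8))
pipeline.run()
print(results)
print(pipeline.peak)
```

Because the scheduler in this sketch drains downstream work first, the queue in front of `shade` never holds more than one item, while the recursive `split` stage still grows its own queue. Bounding the memory of such recursive stages is exactly the kind of scheduling question the project description raises, and this toy policy does not solve it.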

Publications

Martin Winter, Daniel Mlakar, Mathias Parger, Markus Steinberger:
Ouroboros: virtualized queues for dynamic memory management on GPUs
International Conference on Supercomputing (ICS'20), 2020, pdf
Daniel Mlakar, Martin Winter, Pascal Stadlbauer, Hans-Peter Seidel, Markus Steinberger, Rhaleb Zayer:
Subdivision-Specialized Linear Algebra Kernels for Static and Dynamic Mesh Connectivity on the GPU
Eurographics '20 Best Paper Award
Computer Graphics Forum / Eurographics (EG'20), 2020, pdf
Wolfgang Tatzgern, Benedikt Mayr, Bernhard Kerbl, Markus Steinberger:
Stochastic Substitute Trees for Real-Time Global Illumination
Proceedings of Symposium on Interactive 3D Graphics and Games (I3D '20), 2020, pdf
Mathias Parger, Martin Winter, Daniel Mlakar, Markus Steinberger:
spECK: Accelerating GPU Sparse Matrix-Matrix Multiplication Through Lightweight Analysis
Proceedings of the 25th Symposium on Principles and Practice of Parallel Programming, 2020, pdf
Jozef Hladky, Hans-Peter Seidel, Markus Steinberger:
The Camera Offset Space: Real-time Potentially Visible Set Computations for Streaming Rendering
ACM Transactions on Graphics (SIGGRAPH Asia'19), 2019, pdf
Jozef Hladky, Hans-Peter Seidel, Markus Steinberger:
Tessellated Shading Streaming
Computer Graphics Forum / Eurographics Symposium on Rendering (EGSR'19), 2019, pdf
Mark Dokter, Jozef Hladky, Mathias Parger, Dieter Schmalstieg, Hans-Peter Seidel, Markus Steinberger:
Hierarchical Rasterization of Curved Primitives for Vector Graphics Rendering on the GPU
Computer Graphics Forum / Eurographics (EG'19), 2019, pdf
Dominic Tödling, Martin Winter, Markus Steinberger:
Breadth-First Search on Dynamic Graphs using Dynamic Parallelism on the GPU
High Performance Extreme Computing, 2019, pdf
Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger:
Adaptive sparse matrix-matrix multiplication on the GPU
Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, 2019, pdf
Michael Kenzel, Bernhard Kerbl, Dieter Schmalstieg, Markus Steinberger:
A High-Performance Software Graphics Pipeline Architecture for the GPU
ACM Transactions on Graphics (SIGGRAPH'18), 2018, pdf
Bernhard Kerbl, Michael Kenzel, Elena Ivanchenko, Dieter Schmalstieg, Markus Steinberger:
Revisiting The Vertex Cache: Understanding and Optimizing Vertex Processing on the modern GPU
Proceedings of the ACM on Computer Graphics and Interaction Techniques (HPG'18), 2018, pdf
Michael Kenzel, Bernhard Kerbl, Wolfgang Tatzgern, Elena Ivanchenko, Dieter Schmalstieg, Markus Steinberger:
On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing
Proceedings of the ACM on Computer Graphics and Interaction Techniques (HPG'18), 2018, pdf
Bernhard Kerbl, Joerg H. Mueller, Michael Kenzel, Dieter Schmalstieg, Markus Steinberger:
The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU
International Conference on Supercomputing (ICS'18), 2018, pdf
Bernhard Kerbl, Michael Kenzel, Joerg H. Mueller, Dieter Schmalstieg, Markus Steinberger:
A scalable queue for work distribution on GPUs
ACM SIGPLAN Notices (PPoPP'18), 2018
Martin Winter, Rhaleb Zayer, Markus Steinberger:
Autonomous, Independent Management of Dynamic Graphs on GPUs
HPEC '17 Best Student Paper
High Performance Extreme Computing, 2017, pdf
Bernhard Kerbl, Michael Kenzel, Dieter Schmalstieg, Markus Steinberger:
Effective static bin patterns for sort-middle rendering
High Performance Graphics (HPG '17), 2017, pdf
Andreas Derler, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger:
Dynamic scheduling for efficient hierarchical sparse matrix operations on the GPU
International Conference on Supercomputing (ICS'17), 2017, pdf

Acknowledgements

This research is supported by the German Research Foundation (DFG) grant STE 2565/1-1 and the Austrian Science Fund (FWF) grant I 3007.