Source code downloads of projects developed within the MVP project

Massively Parallel Queuing

This is a preview source code release of a set of GPU queuing tests. Feel free to investigate our queues. A more in-depth release, merged with the current version of Whippletree, will be available soon.

Queuing Testbed 0.1.0 preview CUDA source


We present Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU. We introduce a new programming model which offers the simplicity and expressiveness of task-based parallelism while retaining all aspects of the multi-level execution hierarchy essential to unlocking the full potential of a modern GPU. At the same time, our programming model lends itself to efficient implementation on the SIMD-based architecture typical of a current GPU. We demonstrate the practical utility of our model by providing a reference implementation on top of current CUDA hardware. Furthermore, we show that our model compares favorably to traditional approaches in terms of both performance and the range of applications that can be covered. We demonstrate the benefits of our model for recursive Reyes rendering, procedural geometry generation, and volume rendering with concurrent irradiance caching.

Whippletree 0.8.0 preview CUDA source (zip)


Softshell is a novel execution model for devices composed of multiple processing cores operating in a single instruction, multiple data fashion, such as graphics processing units (GPUs). The Softshell model is intuitive and more flexible than the kernel-based adaptation of the stream processing model, which is currently the dominant model for general purpose GPU computation. Using the Softshell model, algorithms with a relatively low local degree of parallelism can execute efficiently on massively parallel architectures. Softshell has the following distinct advantages: (1) work can be dynamically issued directly on the device, eliminating the need for synchronization with an external source, i.e., the CPU; (2) its three-tier dynamic scheduler supports arbitrary scheduling strategies, including dynamic priorities and real-time scheduling; and (3) the user can influence, pause, and cancel work already submitted for parallel execution. The Softshell processing model thus brings capabilities to GPU architectures that were previously only known from operating-system designs and reserved for CPU programming.
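
To make the three capabilities listed above more concrete, here is a minimal single-threaded sketch in host-side C++. It is not Softshell's API and it does not model the three-tier scheduler or any GPU execution; the class TinyScheduler and all of its member names are hypothetical and exist only for illustration. It shows work that issues further work while it runs, priorities that can change after submission, and pausing or cancelling work that has already been submitted.

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical illustration only; not part of the Softshell sources.
class TinyScheduler {
public:
    using WorkId = int;
    // Work receives the scheduler so it can issue follow-up work itself.
    using Work = std::function<void(TinyScheduler&)>;

    WorkId submit(int priority, Work fn) {
        WorkId id = next_id_++;
        pending_.emplace(id, Item{priority, false, std::move(fn)});
        return id;
    }

    // Cancel work that was submitted but has not started yet.
    bool cancel(WorkId id) { return pending_.erase(id) > 0; }

    bool set_paused(WorkId id, bool paused) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;
        it->second.paused = paused;
        return true;
    }

    // Dynamic priority: may be changed at any time before the item runs.
    bool set_priority(WorkId id, int priority) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;
        it->second.priority = priority;
        return true;
    }

    // Runs the highest-priority item that is not paused.
    // Returns false when nothing is runnable.
    bool run_next() {
        auto best = pending_.end();
        for (auto it = pending_.begin(); it != pending_.end(); ++it) {
            if (it->second.paused) continue;
            if (best == pending_.end() ||
                it->second.priority > best->second.priority) {
                best = it;
            }
        }
        if (best == pending_.end()) return false;
        Work fn = std::move(best->second.fn);
        pending_.erase(best);  // remove before running so fn may submit freely
        fn(*this);
        return true;
    }

private:
    struct Item {
        int priority;
        bool paused;
        Work fn;
    };
    std::map<WorkId, Item> pending_;
    WorkId next_id_ = 0;
};

// Usage example: returns the order in which the work items actually ran.
std::vector<int> demo_order() {
    std::vector<int> order;
    TinyScheduler s;
    s.submit(1, [&order](TinyScheduler&) { order.push_back(1); });
    int b = s.submit(5, [&order](TinyScheduler& sch) {
        order.push_back(5);
        // Work issued from inside running work, without an external driver.
        sch.submit(9, [&order](TinyScheduler&) { order.push_back(9); });
    });
    int c = s.submit(3, [&order](TinyScheduler&) { order.push_back(3); });
    s.cancel(c);            // item 3 never runs
    s.set_paused(b, true);  // item 5 is held back
    s.run_next();           // only item 1 is runnable
    s.set_paused(b, false);
    while (s.run_next()) {}
    return order;
}
```

In this sketch the paused item keeps its place and runs once it is resumed, while the cancelled item is dropped entirely. A linear scan is used to pick the next item because it keeps priority changes trivial; a real scheduler would need a concurrent data structure instead.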

SoftShell 1.0.3 preview CUDA source (zip)


ScatterAlloc is a dynamic memory allocator for the GPU, designed for the requirements of massively parallel execution. ScatterAlloc greatly reduces collisions and congestion by scattering memory requests based on hashing. It can handle thousands of GPU threads allocating memory concurrently, and its execution time is almost independent of the thread count. ScatterAlloc is open source and easy to use in your CUDA projects.
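
The idea of scattering requests by hashing can be sketched in a few lines of host-side C++. This is a simplified illustration under stated assumptions, not ScatterAlloc's implementation or interface: the class ScatterPool, its fixed-size slots, the hash function, and the linear probing are all assumptions made for this example. Each requester hashes its id to a starting slot and claims the first free slot from there with an atomic compare-and-swap, so concurrent requesters tend to start in different places instead of contending for one shared counter.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; not part of the ScatterAlloc sources.
class ScatterPool {
public:
    explicit ScatterPool(std::size_t slots) : used_(slots) {
        for (auto& u : used_) u.store(false);
    }

    // Claims one slot and returns its index, or -1 if the pool is full.
    long allocate(std::uint32_t requester_id) {
        const std::size_t n = used_.size();
        if (n == 0) return -1;
        // Scatter: the starting point depends on a hash of the requester.
        const std::size_t start = hash(requester_id) % n;
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t slot = (start + i) % n;
            bool expected = false;
            // Atomically claim the slot only if it is still free.
            if (used_[slot].compare_exchange_strong(expected, true)) {
                return static_cast<long>(slot);
            }
        }
        return -1;
    }

    void release(long slot) {
        used_[static_cast<std::size_t>(slot)].store(false);
    }

private:
    // Simple integer mixing function, chosen only for this example.
    static std::uint32_t hash(std::uint32_t x) {
        x ^= x >> 16;
        x *= 0x45d9f3bU;
        x ^= x >> 16;
        return x;
    }

    std::vector<std::atomic<bool>> used_;
};
```

A caller would construct a pool, for example ScatterPool pool(1024), call pool.allocate(thread_id) from each worker, and hand the slot back with pool.release(slot). The sketch manages slot indices only and omits everything that deals with variable request sizes.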

ScatterAlloc 1.0.1 CUDA source (zip)