GPGPU Patterns for Serial and Parallel Audio Effects

Travis Skare
DAFx-2020 - Vienna (virtual)
Modern commodity GPUs offer high numerical throughput per unit of cost, but often sit idle during audio workstation tasks. Prior research has shown that GPUs excel at tasks such as Finite-Difference Time-Domain simulation and wavefield synthesis, and concrete implementations of several such projects are available for use. Benchmarks and use cases generally concentrate on running a single project on a GPU; running multiple such projects simultaneously is less common, and reduces throughput. In this work we list some concerns that arise when running multiple heterogeneous tasks on the GPU. We apply optimization strategies detailed in developer documentation and commercial CUDA literature, and present results through the lens of real-time audio tasks. We benchmark two cases: (i) a homogeneous effect chain made of previously separate effects, and (ii) a synthesizer with distinct, parallelizable sound generators.