Binaural simulations using audio rate FDTD schemes and CUDA
Three-dimensional finite difference time domain (FDTD) schemes can be used as an approach to spatial audio simulation. By embedding a model of the human head in a 3D computational space, such simulations can emulate binaural sound localisation. This approach normally relies on using high sample rates to give finely detailed models, and is computationally intensive. This paper examines the use of head models within audio rate FDTD schemes, ranging from 176.4 kHz down to 44.1 kHz. Using GPU computing with Nvidia's CUDA architecture, simulations can be accelerated many times over a serial computation in C. This allows efficient, dynamic simulations to be produced where sounds can be moved around during the runtime. Sound examples have been generated by placing a personalised head model inside an anechoic cube. At the lowest sample rate, 44.1 kHz, localisation is clear in the horizontal plane but much less so in the other dimensions. At 176.4 kHz, there is far greater three-dimensional depth, with perceptible front-to-back and some vertical movement.
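The link between sample rate and model detail can be illustrated with a short calculation. The sketch below assumes the standard 7-point 3D scheme, whose stability (Courant) condition requires a grid spacing of at least sqrt(3) times the sound speed times the time step; the abstract does not state which scheme the paper uses. The sound speed of 344 m/s, the 0.15 m head width, and the function names are illustrative assumptions.

```python
import math

def grid_spacing(sample_rate_hz, sound_speed=344.0):
    """Smallest stable grid spacing (m) for the standard 7-point 3D scheme.

    Assumes the Courant condition h >= sqrt(3) * c * k, with k = 1 / fs.
    The sound speed default is an illustrative assumption.
    """
    time_step = 1.0 / sample_rate_hz
    return math.sqrt(3.0) * sound_speed * time_step

def nodes_across(width_m, sample_rate_hz):
    """Approximate number of grid nodes spanning an object of the given width."""
    return width_m / grid_spacing(sample_rate_hz)

if __name__ == "__main__":
    head_width_m = 0.15  # hypothetical head width, for illustration only
    for fs in (44100.0, 88200.0, 176400.0):
        h_mm = grid_spacing(fs) * 1000.0
        n = nodes_across(head_width_m, fs)
        print(f"{fs / 1000.0:.1f} kHz: spacing {h_mm:.2f} mm, "
              f"about {n:.0f} nodes across the head")
```

Under these assumptions the spacing is roughly 13.5 mm at 44.1 kHz and 3.4 mm at 176.4 kHz, so the head model is resolved with four times as many nodes per dimension at the highest rate.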
Timpani Drum Synthesis in 3D on GPGPUs
Physical modeling sound synthesis for systems in 3D is a computationally intensive undertaking; the number of degrees of freedom is very large, even for systems and spaces of modest physical dimensions. The recent emergence into the mainstream of highly parallel multicore hardware, such as general purpose graphical processing units (GPGPUs), has opened an avenue of approach to synthesis for such systems in a reasonable amount of time, without severe model simplification. In this context, new programming and algorithm design considerations appear, especially the ease with which a given algorithm may be parallelized. To this end, finite difference time domain methods operating over regular grids are explored, with regard to an interesting and non-trivial test problem, that of the timpani drum. The timpani is chosen here because its sounding mechanism relies on the coupling between a 2D resonator and a 3D acoustic space (an internal cavity). It is also of large physical dimensions, and thus simulation is of high computational cost. A timpani model is presented, followed by a brief presentation of finite difference time domain methods, a discussion of parallelization on GPGPU, and simulation results.
On the limits of real-time physical modelling synthesis with a modular environment
One goal of physical modelling synthesis is the creation of new virtual instruments. Modular approaches, whereby a set of basic primitive elements can be connected to form a more complex instrument, have a long history in audio synthesis. This paper examines such modular methods using finite difference schemes, within the constraints of real-time audio systems. Focusing on consumer hardware and the application of parallel programming techniques for CPUs, useable combinations of 1D and 2D objects are demonstrated. These can form the basis for a modular synthesis environment that is implemented in a standard plug-in architecture such as an Audio Unit, and controllable via a MIDI keyboard. Optimisation techniques such as vectorization and multi-threading are examined in order to maximise the performance of these computationally demanding systems.
Large stencil operations for GPU-based 3-D acoustics simulations
Stencil operations are often a key component when performing acoustics simulations, for which the specific choice of implementation can have a significant effect on both accuracy and computational performance. This paper presents a detailed investigation of computational performance for GPU-based stencil operations in two-step finite difference schemes, using stencils of varying shape and size (ranging from seven to more than 450 points in size). Using an Nvidia K20 GPU, it is found that as the stencil size increases, compute times increase less than would naively be expected from the number of computational operations alone, because performance is instead determined by data transfer times throughout the GPU memory architecture. With regard to the effects of stencil shape, performance obtained with stencils that are compact in space is mainly due to efficient use of the read-only data (texture) cache on the K20, and performance obtained with standard high-order stencils is due to increased memory bandwidth usage, compensating for lower cache hit rates. Also in this study, a brief comparison is made with performance results from a related, recent study that used a shared memory approach on a GTX 670 GPU device. It is found that by making efficient use of a GTX 660Ti GPU (whose computational performance is generally lower than that of a GTX 670), performance similar to or better than those results can be achieved without the use of shared memory.
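As a point of reference for the smallest stencil mentioned above, the following is a minimal sketch of a seven-point stencil used in a two-step (leapfrog) update of the 3D wave equation. It is plain Python written for readability, not the paper's GPU code; the function names, the fixed zero-valued boundary, and the choice of Courant number squared (`lam2 = 1/3`) are assumptions made for illustration. Larger stencils follow the same pattern with more neighbour reads per node, which is why memory traffic matters so much on a GPU.

```python
def make_grid(n):
    """Return an n x n x n grid of zeros as nested lists."""
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]

def step(u, u_prev, lam2):
    """One two-step update with a seven-point stencil.

    u, u_prev: current and previous pressure grids.
    lam2: squared Courant number (assumed <= 1/3 for stability).
    Boundary nodes are left at zero (a simplifying assumption).
    """
    n = len(u)
    u_next = make_grid(n)
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                # Six nearest neighbours plus the centre node: seven points.
                neighbours = (u[x + 1][y][z] + u[x - 1][y][z]
                              + u[x][y + 1][z] + u[x][y - 1][z]
                              + u[x][y][z + 1] + u[x][y][z - 1])
                u_next[x][y][z] = ((2.0 - 6.0 * lam2) * u[x][y][z]
                                   + lam2 * neighbours
                                   - u_prev[x][y][z])
    return u_next

if __name__ == "__main__":
    n = 5
    u_prev = make_grid(n)
    u = make_grid(n)
    u[2][2][2] = 1.0  # unit impulse at the centre node
    u_next = step(u, u_prev, 1.0 / 3.0)
    print(u_next[2][2][2], u_next[3][2][2])
```

In a GPU version each thread would typically compute one node of `u_next`, so the inner expression is the per-thread work and the seven reads of `u` plus one of `u_prev` are the per-thread memory accesses.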
Large-scale Real-time Modular Physical Modeling Sound Synthesis
Due to recent increases in computational power, physical modeling synthesis is now possible in real time even for relatively complex models. We present here a modular physical modeling instrument design, intended as a construction framework for string- and bar-based instruments, alongside a mechanical network allowing for arbitrary nonlinear interconnection. When multiple nonlinearities are present in a feedback setting, there are two major concerns. One is ensuring numerical stability, which can be approached using an energy-based framework. The other is coping with the computational cost associated with nonlinear solvers: standard iterative methods, such as Newton-Raphson, quickly become a computational bottleneck. Here, such iterative methods are sidestepped using an alternative energy conserving method, allowing for a great reduction in computational expense or, alternatively, for real-time performance in very large-scale nonlinear physical modeling synthesis. Simulation and benchmarking results are presented.
Real-Time Modal Synthesis of Nonlinearly Interconnected Networks
Modal methods are a long-established approach to physical modeling sound synthesis. Projecting the equation of motion of a linear, time-invariant system onto a basis of eigenfunctions yields a set of independent forced, lossy oscillators, which may be simulated efficiently and accurately by means of standard time-stepping methods. Extensions of modal techniques to nonlinear problems are possible, though often requiring the solution of densely coupled nonlinear time-dependent equations. Here, an application of recent results in numerical simulation design is employed, in which the nonlinear energy is first quadratised via a convenient auxiliary variable. The resulting equations may be updated in time explicitly, thus avoiding the need for expensive iterative solvers, dense linear system solutions, or matrix inversions. The case of a network of interconnected distributed elements is detailed, along with a real-time implementation as an audio plugin.
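The linear part of the modal picture described above can be sketched as a bank of independent lossy oscillators, each advanced by a two-step recursion. The example below is an illustration only: it covers the unforced, linear case, uses the exact recursion for a damped sinusoid rather than any particular scheme from the paper, and the mode frequencies, decay rates, amplitudes, and function name are hypothetical.

```python
import math

def simulate_modes(modes, num_samples, sample_rate):
    """Sum the outputs of independent lossy oscillators.

    modes: list of (frequency_hz, decay_rate_per_s, amplitude) tuples
           (hypothetical values, chosen by the caller).
    Each mode starts from an initial displacement equal to its amplitude
    and follows q[n+1] = b1 * q[n] - b2 * q[n-1].
    """
    k = 1.0 / sample_rate
    states = []
    for freq_hz, sigma, amp in modes:
        omega = 2.0 * math.pi * freq_hz
        omega_d = math.sqrt(omega * omega - sigma * sigma)  # damped frequency
        radius = math.exp(-sigma * k)                       # per-sample decay
        b1 = 2.0 * radius * math.cos(omega_d * k)
        b2 = radius * radius
        q_prev = amp                                        # q[0]
        q_now = amp * radius * math.cos(omega_d * k)        # q[1]
        states.append([b1, b2, q_prev, q_now])

    out = []
    for _ in range(num_samples):
        out.append(sum(s[2] for s in states))
        for s in states:
            q_next = s[0] * s[3] - s[1] * s[2]
            s[2], s[3] = s[3], q_next
    return out

if __name__ == "__main__":
    example_modes = [(220.0, 3.0, 1.0), (440.5, 5.0, 0.5)]
    signal = simulate_modes(example_modes, 1000, 44100.0)
    print(len(signal), signal[0])
```

Because the oscillators are independent, the cost per sample is a few multiplications per mode, which is what makes the linear modal approach efficient; the nonlinear coupling discussed in the abstract is what breaks this independence.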
Real-time Gong Synthesis
Physical modeling sound synthesis is notoriously computationally intensive. However, recent advances in algorithm efficiency, accompanied by increases in available computing power, have brought real-time performance within range for a variety of complex physical models. In this paper, the case of nonlinear plate vibration, used as a simple model for the synthesis of sounds from gongs, is considered. Such a model, derived from that of Föppl and von Kármán, includes a strong geometric nonlinearity, leading to a variety of perceptually salient effects, including pitch glides and crashes. Also discussed here are input excitation and scanned multichannel output. A numerical scheme is presented that mirrors the energetic and dissipative properties of a continuous model, allowing for control over numerical stability. Furthermore, the nonlinearity in the scheme can be solved explicitly, allowing for an efficient solution in real time. The solution relies on a quadratised expression for numerical energy, and is in line with recent work on invariant energy quadratisation and scalar auxiliary variable approaches to simulation. Implementation details, including appropriate perceptually relevant choices for parameter settings, are discussed. Numerical examples are presented, alongside timing results illustrating real-time performance on a typical CPU.
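The idea of a quadratised energy with an explicit update can be illustrated on a much simpler system than a plate. The sketch below applies a scalar-auxiliary-variable style scheme to a single lossless oscillator with a stiffening (quartic) potential; it is not the gong scheme from the paper. The potential, parameter values, the small shift `eps` that keeps the square root well defined, and all names are assumptions for illustration. The auxiliary variable `psi` stands in for sqrt(2V), so the numerical energy is a sum of two squares and the update for the next displacement needs only a scalar division, with no iterative solver.

```python
import math

def simulate_energy(num_steps, sample_rate=44100.0,
                    omega0=2.0 * math.pi * 200.0, gamma=4.0e6,
                    u0=1.0, eps=1.0e-12):
    """Run an explicit, energy-conserving scheme; return the energy history.

    Model (assumed): u'' = -V'(u), V(u) = omega0^2 u^2 / 2 + gamma u^4 / 4.
    psi approximates sqrt(2 V(u) + eps) on a half-step time grid.
    """
    k = 1.0 / sample_rate

    def potential(u):
        return 0.5 * omega0 * omega0 * u * u + 0.25 * gamma * u ** 4

    def d_potential(u):
        return omega0 * omega0 * u + gamma * u ** 3

    u_prev, u_now = u0, u0                      # start at rest, displaced
    psi = math.sqrt(2.0 * potential(u0) + eps)  # auxiliary variable
    energies = []
    for _ in range(num_steps):
        velocity = (u_now - u_prev) / k
        energies.append(0.5 * velocity * velocity + 0.5 * psi * psi)
        # Gradient of the auxiliary variable, evaluated at the known state.
        g = d_potential(u_now) / math.sqrt(2.0 * potential(u_now) + eps)
        a = 0.25 * k * k * g * g
        # Linear in u_next, so it is solved by a single division.
        u_next = (2.0 * u_now - (1.0 - a) * u_prev - k * k * g * psi) / (1.0 + a)
        psi = psi + 0.5 * g * (u_next - u_prev)
        u_prev, u_now = u_now, u_next
    return energies

if __name__ == "__main__":
    history = simulate_energy(2000)
    drift = max(abs(e - history[0]) for e in history) / history[0]
    print(f"relative energy drift: {drift:.2e}")
```

Since the stored energy is non-negative by construction and constant from step to step in exact arithmetic, the state stays bounded regardless of the strength of the nonlinearity, which is the kind of stability control the abstract refers to.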