DAFx Paper Archive - Search for machine learning in papers by Bilbao, S.

Differentiable grey-box modelling of phaser effects using frame-based spectral processing

Alistair Carson; Cassia Valentini-Botinhao; Simon King; Stefan Bilbao

DAFx-2023 - Copenhagen

Machine learning approaches to modelling analog audio effects have seen intensive investigation in recent years, particularly in the context of non-linear time-invariant effects such as guitar amplifiers. For modulation effects such as phasers, however, new challenges emerge due to the presence of the low-frequency oscillator which controls the slowly time-varying nature of the effect. Existing approaches have either required foreknowledge of this control signal, or have been non-causal in implementation. This work presents a differentiable digital signal processing approach to modelling phaser effects in which the underlying control signal and time-varying spectral response of the effect are jointly learned. The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain, with a transfer function based on typical analog phaser circuit topology. We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters. The frame duration is an important hyper-parameter of the proposed model, so an investigation was carried out into its effect on model accuracy. The optimal frame length depends on both the rate and transient decay-time of the target effect, but the frame length can be altered at inference time without a significant change in accuracy.
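
As a rough illustration of the frame-based idea described in this abstract, the sketch below filters audio in short overlapping frames by multiplying each frame's spectrum with a phaser-like transfer function whose coefficient follows a sinusoidal low-frequency oscillator. It is not the authors' model: the cascade of first-order all-pass sections, the fixed 50% mix, the LFO mapping `0.5 + 0.4*sin(...)`, and the names `phaser_response` and `process_frames` are all assumptions made for illustration, and nothing here is learned from data.

```python
import numpy as np

def phaser_response(n_fft, p, n_stages=4, mix=0.5):
    """Frequency response on the rfft bins of a hypothetical phaser:
    a dry path mixed with a cascade of first-order all-pass sections."""
    w = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft
    z_inv = np.exp(-1j * w)
    # First-order all-pass section with real coefficient p, |p| < 1.
    allpass = (p - z_inv) / (1 - p * z_inv)
    return (1 - mix) + mix * allpass ** n_stages

def process_frames(x, sample_rate, lfo_rate_hz, frame_len=1024):
    """Overlap-add processing with one transfer function per frame.
    The LFO shape and depth are assumed, not taken from the paper."""
    hop = frame_len // 2
    # Square-root periodic Hann window, applied at analysis and synthesis,
    # so that fully overlapped frames sum to unity gain.
    n = np.arange(frame_len)
    window = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len))
    x_pad = np.concatenate([x, np.zeros(frame_len)])
    y = np.zeros(len(x) + frame_len)
    for start in range(0, len(x), hop):
        t = start / sample_rate
        p = 0.5 + 0.4 * np.sin(2 * np.pi * lfo_rate_hz * t)  # assumed LFO mapping
        frame = x_pad[start:start + frame_len] * window
        spec = np.fft.rfft(frame) * phaser_response(frame_len, p)
        y[start:start + frame_len] += np.fft.irfft(spec, n=frame_len) * window
    # Note: the first half-frame is covered by a single window only,
    # and bin-wise multiplication implies circular convolution per frame.
    return y[:len(x)]

rng = np.random.default_rng(0)
noise = rng.standard_normal(8000)
wet = process_frames(noise, sample_rate=16000, lfo_rate_hz=1.0)
```

The frame length plays the role of the hyper-parameter discussed in the abstract: longer frames resolve the filter's response more finely but update the LFO-driven coefficient less often.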

Sample Rate Independent Recurrent Neural Networks for Audio Effects Processing

Alistair Carson; Alec Wright; Jatin Chowdhury; Vesa Välimäki; Stefan Bilbao

DAFx-2024 - Guildford

In recent years, machine learning approaches to modelling guitar amplifiers and effects pedals have been widely investigated and have become standard practice in some consumer products. In particular, recurrent neural networks (RNNs) are a popular choice for modelling non-linear devices such as vacuum tube amplifiers and distortion circuitry. One limitation of such models is that they are trained on audio at a specific sample rate and therefore give unreliable results when operating at another rate. Here, we investigate several methods of modifying RNN structures to make them approximately sample rate independent, with a focus on oversampling. In the case of integer oversampling, we demonstrate that a previously proposed delay-based approach provides high fidelity sample rate conversion whilst additionally reducing aliasing. For non-integer sample rate adjustment, we propose two novel methods and show that one of these, based on cubic Lagrange interpolation of a delay-line, provides a significant improvement over existing methods. To our knowledge, this work provides the first in-depth study into this problem.
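
The cubic Lagrange interpolation mentioned in this abstract can be illustrated in isolation with the textbook fractional-delay design below. This is only a sketch of the interpolation filter itself; how the paper applies it to the delay-line inside an RNN is not reproduced here, and the function name `lagrange_fractional_delay` is hypothetical.

```python
import numpy as np

def lagrange_fractional_delay(delay, order=3):
    """FIR coefficients h[n] = prod_{k != n} (delay - k) / (n - k).
    With order=3 this is cubic Lagrange interpolation."""
    n = np.arange(order + 1)
    h = np.ones(order + 1)
    for k in range(order + 1):
        mask = n != k
        h[mask] *= (delay - k) / (n[mask] - k)
    return h

# Delay a ramp by 1.5 samples using the four-tap cubic filter.
h = lagrange_fractional_delay(1.5)
ramp = np.arange(10.0)
delayed = np.convolve(ramp, h)[:len(ramp)]
```

Because a cubic Lagrange interpolator reproduces polynomials up to degree three exactly, the delayed ramp equals `n - 1.5` once all four taps overlap the signal (from `n = 3` onward).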

Learning Nonlinear Dynamics in Physical Modelling Synthesis Using Neural Ordinary Differential Equations

Victor Zheleznov; Stefan Bilbao; Alec Wright; Simon King

DAFx-2025 - Ancona

Modal synthesis methods are a long-standing approach for modelling distributed musical systems. In some cases extensions are possible in order to handle geometric nonlinearities. One such case is the high-amplitude vibration of a string, where geometric nonlinear effects lead to perceptually important effects including pitch glides and a dependence of brightness on striking amplitude. A modal decomposition leads to a coupled nonlinear system of ordinary differential equations. Recent applied machine learning approaches (in particular neural ordinary differential equations) have been used to model lumped dynamic systems such as electronic circuits automatically from data. In this work, we examine how modal decomposition can be combined with neural ordinary differential equations for modelling distributed musical systems. The proposed model leverages the analytical solution for linear vibration of the system's modes and employs a neural network to account for nonlinear dynamic behaviour. Physical parameters of a system remain easily accessible after the training without the need for a parameter encoder in the network architecture. As an initial proof of concept, we generate synthetic data for a nonlinear transverse string and show that the model can be trained to reproduce the nonlinear dynamics of the system. Sound examples are presented.

Differentiable All-Pole Filters for Time-Varying Audio Systems

Chin-Yun Yu; Christopher Mitcheltree; Alistair Carson; Stefan Bilbao; Joshua Reiss; György Fazekas

DAFx-2024 - Guildford

Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers. However, their recursive structure impedes end-to-end training of these systems using automatic differentiation. Although non-recursive filter approximations like frequency sampling and frame-based processing have been proposed and widely used in previous works, they cannot accurately reflect the gradient of the original system. We alleviate this difficulty by re-expressing a time-varying all-pole filter to backpropagate the gradients through itself, so the filter implementation is not bound to the technical limitations of automatic differentiation frameworks. This implementation can be employed within audio systems containing filters with poles for efficient gradient evaluation. We demonstrate its training efficiency and expressive capabilities for modelling real-world dynamic audio systems on a phaser, time-varying subtractive synthesiser, and feed-forward compressor. We make our code and audio samples available and provide the trained audio effect and synth models in a VST plugin.
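
For readers unfamiliar with the recursion this abstract refers to, the sketch below runs the forward pass of a time-varying all-pole filter sample by sample. It shows only the recursive structure that makes automatic differentiation awkward; the paper's re-expressed backpropagation is not reproduced, and the function name `time_varying_all_pole` and the coefficient layout are assumptions.

```python
import numpy as np

def time_varying_all_pole(x, a):
    """Forward recursion y[n] = x[n] - sum_{k=1..M} a[n, k-1] * y[n-k].
    `a` has shape (len(x), M): one set of coefficients per sample."""
    n_samples, order = a.shape
    y = np.zeros(n_samples)
    for n in range(n_samples):
        acc = x[n]
        for k in range(1, order + 1):
            if n - k >= 0:
                # Each output depends on earlier outputs, so the loop
                # cannot be vectorised over time directly.
                acc -= a[n, k - 1] * y[n - k]
        y[n] = acc
    return y

# One-pole example with a constant coefficient: the impulse response is 0.5**n.
impulse = np.zeros(8)
impulse[0] = 1.0
coeffs = np.full((8, 1), -0.5)
response = time_varying_all_pole(impulse, coeffs)
```

Letting the rows of `coeffs` change over time gives the time-varying case; the sample-by-sample dependence is what makes naive differentiation through long signals slow.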

Large-scale Real-time Modular Physical Modeling Sound Synthesis

Stefan Bilbao; Michele Ducceschi; Craig Webb

DAFx-2019 - Birmingham

Due to recent increases in computational power, physical modeling synthesis is now possible in real time even for relatively complex models. We present here a modular physical modeling instrument design, intended as a construction framework for string- and bar-based instruments, alongside a mechanical network allowing for arbitrary nonlinear interconnection. When multiple nonlinearities are present in a feedback setting, there are two major concerns. One is ensuring numerical stability, which can be approached using an energy-based framework. The other is coping with the computational cost associated with nonlinear solvers; standard iterative methods, such as Newton-Raphson, quickly become a computational bottleneck. Here, such iterative methods are sidestepped using an alternative energy-conserving method, allowing for a great reduction in computational expense or, alternatively, real-time performance for very large-scale nonlinear physical modeling synthesis. Simulation and benchmarking results are presented.

Timpani Drum Synthesis in 3D on GPGPUs

Stefan Bilbao; Craig Webb

DAFx-2012 - York

Physical modeling sound synthesis for systems in 3D is a computationally intensive undertaking; the number of degrees of freedom is very large, even for systems and spaces of modest physical dimensions. The recent emergence into the mainstream of highly parallel multicore hardware, such as general purpose graphical processing units (GPGPUs) has opened an avenue of approach to synthesis for such systems in a reasonable amount of time, without severe model simplification. In this context, new programming and algorithm design considerations appear, especially the ease with which a given algorithm may be parallelized. To this end finite difference time domain methods operating over regular grids are explored, with regard to an interesting and non-trivial test problem, that of the timpani drum. The timpani is chosen here because its sounding mechanism relies on the coupling between a 2D resonator and a 3D acoustic space (an internal cavity). It is also of large physical dimensions, and thus simulation is of high computational cost. A timpani model is presented, followed by a brief presentation of finite difference time domain methods, followed by a discussion of parallelization on GPGPU, and simulation results.

Two polarisation finite difference model of bowed strings with nonlinear contact and friction forces

Charlotte Desvages; Stefan Bilbao

DAFx-2015 - Trondheim

Recent bowed string sound synthesis has relied on physical modelling techniques; the achievable realism and flexibility of gestural control are appealing, and the heavier computational cost becomes less significant as technology improves. A bowed string is simulated in two polarisations by discretising the partial differential equations governing its behaviour, using the finite difference method; a globally energy balanced scheme is used, as a guarantee of numerical stability under highly nonlinear conditions. In one polarisation, a nonlinear contact model is used for the normal forces exerted by the dynamic bow hair, left hand fingers, and fingerboard. In the other polarisation, a force-velocity friction curve is used for the resulting tangential forces. The scheme update requires the solution of two nonlinear vector equations. Sound examples and video demonstrations are presented.

Proceedings of the International Conference on Digital Audio Effects (DAFx)
