DAFx Paper Archive - Search for neural, page 1 of 35

Real-Time Black-Box Modelling With Recurrent Neural Networks

Alec Wright; Eero-Pekka Damskägg; Vesa Välimäki

DAFx-2019 - Birmingham

This paper proposes to use a recurrent neural network for black-box modelling of nonlinear audio systems, such as tube amplifiers and distortion pedals. As a recurrent unit structure, we test both Long Short-Term Memory and a Gated Recurrent Unit. We compare the proposed neural network with a WaveNet-style deep neural network, which has been suggested previously for tube amplifier modelling. The neural networks are trained with several minutes of guitar and bass recordings, which have been passed through the devices to be modelled. A real-time audio plugin implementing the proposed networks has been developed in the JUCE framework. It is shown that the recurrent neural networks achieve similar accuracy to the WaveNet model, while requiring significantly less processing power to run. The Long Short-Term Memory recurrent unit is also found to outperform the Gated Recurrent Unit overall. The proposed neural network is an important step forward in computationally efficient yet accurate emulation of tube amplifiers and distortion pedals.

Neural Net Tube Models for Wave Digital Filters

Champ C. Darabundit; Dirk Roosenburg; Julius O. Smith III

DAFx-2022 - Vienna

Herein, we demonstrate the use of neural nets towards simulating multiport nonlinearities inside a wave digital filter. We introduce a resolved wave definition which allows us to extract features from a Kirchhoff domain dataset and train our neural networks directly in the wave domain. A hyperparameter search is performed to minimize error and runtime complexity. To illustrate the method, we model a tube amplifier circuit inspired by the preamplifier stage of the Fender Pro-Junior guitar amplifier. We analyze the performance of our neural net models by comparing their distortion characteristics and transconductances. Our results suggest that activation function selection has a significant effect on the distortion characteristic created by the neural net.

Neural Parametric Equalizer Matching Using Differentiable Biquads

Shahan Nercessian

DAFx-2020 - Vienna (virtual)

This paper proposes a neural network for carrying out parametric equalizer (EQ) matching. The novelty of this neural network solution is that it can be optimized directly in the frequency domain by means of differentiable biquads, rather than relying solely on a loss on parameter values which does not correlate directly with the system output. We compare the performance of the proposed neural network approach with that of a baseline algorithm based on a convex relaxation of the problem. It is observed that the neural network can provide better matching than the baseline approach because it directly attempts to solve the non-convex problem. Moreover, we show that the same network trained with only a parameter loss is insufficient for the task, despite the fact that it matches underlying EQ parameters better than one trained with a combination of spectral and parameter losses.
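The idea of a loss computed on the system output rather than on parameter values can be illustrated with a minimal sketch. It assumes a single biquad in the standard form H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2) and a mean-squared error on the magnitude response in dB. The function names and the choice of loss are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def biquad_magnitude_db(b, a, omega):
    """Magnitude in dB of a biquad at normalised angular frequency omega.

    b = (b0, b1, b2) are the numerator coefficients and a = (a1, a2) the
    denominator coefficients, with a0 assumed to be normalised to 1.
    """
    b0, b1, b2 = b
    a1, a2 = a
    z1 = cmath.exp(-1j * omega)  # z^-1 evaluated on the unit circle
    z2 = z1 * z1                 # z^-2
    h = (b0 + b1 * z1 + b2 * z2) / (1.0 + a1 * z1 + a2 * z2)
    # Floor the magnitude to avoid log10(0) at exact zeros of the response.
    return 20.0 * math.log10(max(abs(h), 1e-12))


def spectral_loss(b, a, target_db, omegas):
    """Mean squared error (in dB^2) between the biquad response and a target."""
    errors = [
        (biquad_magnitude_db(b, a, w) - t) ** 2
        for w, t in zip(omegas, target_db)
    ]
    return sum(errors) / len(errors)
```

Because the response is a closed-form function of the coefficients, the same expression written with the tensor operations of an automatic-differentiation framework would let gradients of such a loss reach the filter parameters.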

Neural Audio Processing on Android Phones

Jason Hoopes; Brooke Chalmers; Victor Zappi

DAFx-2024 - Guildford

This study investigates the potential of real-time inference of neural audio effects on Android smartphones, marking an initial step towards bridging the gap in neural audio processing for mobile devices. Focusing exclusively on processing rather than synthesis, we explore the performance of three open-source neural models across five Android phones released between 2014 and 2022, showcasing varied capabilities due to their generational differences. Through comparative analysis utilizing two C++ inference engines (ONNX Runtime and RTNeural), we aim to evaluate the computational efficiency and timing performance of these models, considering the varying computational loads and the hardware specifics of each device. Our work contributes insights into the feasibility of implementing neural audio processing in real-time on mobile platforms, highlighting challenges and opportunities for future advancements in this rapidly evolving field.

Speech Dereverberation Using Recurrent Neural Networks

Shahan Nercessian; Alexey Lukin

DAFx-2019 - Birmingham

Advances in deep learning have led to novel, state-of-the-art techniques for blind source separation, particularly for the application of non-stationary noise removal from speech. In this paper, we show how a simple reformulation allows us to adapt blind source separation techniques to the problem of speech dereverberation and, accordingly, train a bidirectional recurrent neural network (BRNN) for this task. We compare the performance of the proposed neural network approach with that of a baseline dereverberation algorithm based on spectral subtraction. We find that our trained neural network quantitatively and qualitatively outperforms the baseline approach.

Wave Digital Modeling of Circuits with Multiple One-Port Nonlinearities Based on Lipschitz-Bounded Neural Networks

Oliviero Massi; Edoardo Manino; Alberto Bernardini

DAFx-2024 - Guildford

Neural networks have found application within the Wave Digital Filters (WDFs) framework as data-driven input-output blocks for modeling single one-port or multi-port nonlinear devices in circuit systems. However, traditional neural networks lack predictable bounds for their output derivatives, which are essential to ensure convergence when simulating circuits with multiple nonlinear elements using fixed-point iterative methods, e.g., the Scattering Iterative Method (SIM). In this study, we address this issue by employing Lipschitz-bounded neural networks for regressing nonlinear WD scattering relations of one-port nonlinearities.

Decoding Sound Source Location From EEG: Preliminary Comparisons of Spatial Rendering and Location

Nils Marggraf-Turley; Lorenzo Picinali; Niels Pontoppidan; Martha Shiell; Drew Cappotto

DAFx-2024 - Guildford

Spatial auditory acuity is contingent on the quality of spatial cues presented during listening. Electroencephalography (EEG) shows promise for finding neural markers of such acuity present in recorded neural activity, potentially mitigating common challenges with behavioural assessment (e.g., sound source localisation tasks). This study presents findings from three preliminary experiments which investigated neural response variations to auditory stimuli under different spatial listening conditions: free-field (loudspeaker-based), individual Head-Related Transfer Functions (HRTFs), and non-individual HRTFs. Three participants, each participating in one experiment, were exposed to auditory stimuli from various spatial locations while neural activity was recorded via EEG. The resultant neural responses underwent a decoding protocol to assess how decoding accuracy varied between stimulus locations over time. Decoding accuracy was highest for free-field auditory stimuli, with significant but lower decoding accuracy between left and right hemisphere locations for individual and non-individual HRTF stimuli. A latency in significant decoding accuracy was observed between listening conditions for locations dominated by spectral cues. Furthermore, findings suggest that decoding accuracy between free-field and non-individual HRTF stimuli may reflect behavioural front-back confusion rates.

Fast Temporal Convolutions for Real-Time Audio Signal Processing

Stepan Miklanek; Jiri Schimmel

DAFx-2022 - Vienna

This paper introduces the possibilities of optimizing neural network convolutional layers for modeling nonlinear audio systems and effects. Enhanced methods for real-time dilated convolutions are presented to achieve faster signal processing times than in previous work. Due to the improved implementation of convolutional layers, a significant decrease in computational requirements was observed and validated on different configurations of single layers with dilated convolutions and WaveNet-style feedforward neural network models. In most cases, equivalent signal processing times were achieved to those using recurrent neural networks with Long Short-Term Memory units and Gated Recurrent Units, which are considered state-of-the-art in the field of black-box virtual analog modeling.
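For orientation, the operation being optimised can be written as a naive reference: a causal dilated convolution computes y[n] = sum over k of w[k] * x[n - k * d], where d is the dilation. The sketch below is this textbook definition only, with hypothetical names. It is not the enhanced method of the paper, which targets faster real-time execution.

```python
def causal_dilated_conv(x, weights, dilation):
    """Naive single-channel causal dilated convolution.

    x is the input sequence, weights[k] multiplies the sample delayed by
    k * dilation, and samples before the start of x are treated as zero.
    """
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = n - k * dilation  # index of the delayed input sample
            if idx >= 0:            # implicit zero padding on the left
                acc += w * x[idx]
        y.append(acc)
    return y
```

With `weights = [1.0, 1.0]` and `dilation = 2`, each output sample is the current input plus the input from two samples earlier.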

Aliasing Reduction in Neural Amp Modeling by Smoothing Activations

Ryota Sato; Julius O. Smith III

DAFx-2025 - Ancona

The increasing demand for high-quality digital emulations of analog audio hardware, such as vintage tube guitar amplifiers, led to numerous works on neural network-based black-box modeling, with deep learning architectures like WaveNet showing promising results. However, a key limitation in all of these models was the aliasing artifacts stemming from nonlinear activation functions in neural networks. In this paper, we investigated novel and modified activation functions aimed at mitigating aliasing within neural amplifier models. Supporting this, we introduced a novel metric, the Aliasing-to-Signal Ratio (ASR), which quantitatively assesses the level of aliasing with high accuracy. Measuring also the conventional Error-to-Signal Ratio (ESR), we conducted studies on a range of preexisting and modern activation functions with varying stretch factors. Our findings confirmed that activation functions with smoother curves tend to achieve lower ASR values, indicating a noticeable reduction in aliasing. Notably, this improvement in aliasing reduction was achievable without a substantial increase in ESR, demonstrating the potential for high modeling accuracy with reduced aliasing in neural amp models.
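The abstract does not define the Error-to-Signal Ratio. A minimal sketch under the definition commonly used in amplifier modelling work, the energy of the error divided by the energy of the target signal, is given below. The function name is hypothetical, and any pre-emphasis filtering that some works apply before computing the ratio is omitted.

```python
def error_to_signal_ratio(target, prediction):
    """Energy of (target - prediction) divided by energy of target.

    Assumes the common definition of ESR; 0.0 means a perfect match and
    1.0 is the value obtained by predicting silence.
    """
    if len(target) != len(prediction):
        raise ValueError("target and prediction must have the same length")
    error_energy = sum((t - p) ** 2 for t, p in zip(target, prediction))
    signal_energy = sum(t ** 2 for t in target)
    if signal_energy == 0.0:
        raise ValueError("target signal has zero energy")
    return error_energy / signal_energy
```

The ASR metric introduced in the paper is not sketched here, since the abstract gives no detail on how it is computed.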

Inference-Time Structured Pruning for Real-Time Neural Network Audio Effects

Christopher Johann Clarke; Jatin Chowdhury

DAFx-2025 - Ancona

Structured pruning is a technique for reducing the computational load and memory footprint of neural networks by removing structured subsets of parameters according to a predefined schedule or ranking criterion. This paper investigates the application of structured pruning to real-time neural network audio effects, focusing on both feedforward networks and recurrent architectures. We evaluate multiple pruning strategies at inference time, without retraining, and analyze their effects on model performance. To quantify the trade-off between parameter count and audio fidelity, we construct a theoretical model of the approximation error as a function of network architecture and pruning level. The resulting bounds establish a principled relationship between pruning-induced sparsity and functional error, enabling informed deployment of neural audio effects in constrained real-time environments.

Proceedings of the International Conference on Digital Audio Effects (DAFx)
