DAFx Paper Archive

Dynamic Pitch Warping for Expressive Vocal Retuning

Daniel Hernan Molina Villota; Christophe D'Alessandro; Olivier Perrotin

DAFx-2023 - Copenhagen

This work introduces the use of the Dynamic Pitch Warping (DPW) method for automatic pitch correction of singing voice audio signals. DPW is designed to dynamically tune any pitch trajectory to a predefined scale while preserving its expressive ornamentation. DPW has three degrees of freedom to modify the fundamental frequency (f0) signal: detection interval, critical time, and transition time. Together, these parameters allow us to define a pitch velocity condition that triggers an adaptive correction of the pitch trajectory (pitch warping). We compared our approach to Antares Autotune (the most commonly used software brand, abbreviated as ATA in this article). The pitch correction in ATA has two degrees of freedom: a triggering threshold (flextune) and the transition time (retune speed). The pitch trajectories that we compare were extracted from audio signals autotuned in ATA, and the DPW algorithm was applied to the f0 of the input audio tracks. We studied pitch correction for three typical shapes of f0 curves: staircase, vibrato, and free-path. For each case, we measured the proximity of the corrected pitch trajectories to the original ones and found that the DPW pitch correction method better preserves vibrato while keeping the f0 free path. In contrast, ATA is more effective in generating staircase curves, but fails for non-small vibratos and free-path curves. We have also implemented an off-line automatic pitch tuner using DPW.
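For orientation, the "staircase" behaviour mentioned in the abstract can be illustrated with a hard quantizer that snaps every f0 frame to the nearest equal-tempered semitone. This is a minimal sketch and not the DPW algorithm or the ATA implementation: it has no detection interval, critical time, or transition time, and the names `A4_HZ`, `snap_to_semitone`, and `retune` are hypothetical.

```python
import math

A4_HZ = 440.0  # reference tuning; an assumption of this sketch


def snap_to_semitone(f0_hz):
    """Return the equal-tempered semitone frequency nearest to f0_hz."""
    semitones_from_a4 = 12.0 * math.log2(f0_hz / A4_HZ)
    return A4_HZ * 2.0 ** (round(semitones_from_a4) / 12.0)


def retune(f0_track):
    """Hard-quantize voiced frames; unvoiced frames (f0 <= 0) pass through."""
    return [snap_to_semitone(f) if f > 0 else f for f in f0_track]


# Toy f0 track in Hz, with one unvoiced frame marked as 0.0
track = [438.0, 445.0, 0.0, 470.0]
corrected = retune(track)
```

Because every frame is corrected independently, a quantizer like this flattens vibrato and glides completely, which is the kind of loss of ornamentation that the abstract says DPW is designed to avoid.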

Digitizing the Schumann PLL Analog Harmonizer

Isaiah Farrell; Stefan Bilbao

DAFx-2024 - Guildford

The Schumann Electronics PLL is a guitar effect that uses hardware-based processing of one-bit digital signals, with op-amp saturation and CMOS control systems used to generate multiple square waves derived from the frequency of the input signal. The effect may be simulated in the digital domain by cascading stages of state-space virtual analog modeling and algorithmic approximations of CMOS integrated circuits. Phase-locked loops, decade counters, and Schmitt trigger inverters are modeled using logic algorithms, allowing for the comparable digital implementation of the Schumann PLL. Simulation results are presented.

Modeling the Frequency-Dependent Sound Energy Decay of Acoustic Environments with Differentiable Feedback Delay Networks

Alessandro Ilic Mezza; Riccardo Giampiccolo; Alberto Bernardini

DAFx-2024 - Guildford

Differentiable machine learning techniques have recently proved effective for finding the parameters of Feedback Delay Networks (FDNs) so that their output matches desired perceptual qualities of target room impulse responses. However, we show that existing methods tend to fail at modeling the frequency-dependent behavior of sound energy decay that characterizes real-world environments unless properly trained. In this paper, we introduce a novel perceptual loss function based on the mel-scale energy decay relief, which generalizes the well-known time-domain energy decay curve to multiple frequency bands. We also augment the prototype FDN by incorporating differentiable wideband attenuation and output filters, and train them via backpropagation along with the other model parameters. The proposed approach improves upon existing strategies for designing and training differentiable FDNs, making it more suitable for audio processing applications where realistic and controllable artificial reverberation is desirable, such as gaming, music production, and virtual reality.
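The time-domain energy decay curve (EDC) that the abstract refers to is commonly computed by Schroeder backward integration of the squared impulse response. The sketch below shows only that broadband curve, on a synthetic decaying signal; the mel-scale energy decay relief and the loss function proposed in the paper are not reproduced here, and the function name and toy parameters are assumptions.

```python
import math


def energy_decay_curve_db(h):
    """Schroeder backward integration: EDC[n] = sum of h[k]^2 for k >= n,
    returned in dB relative to the total energy."""
    edc = [0.0] * len(h)
    running = 0.0
    for n in range(len(h) - 1, -1, -1):
        running += h[n] * h[n]
        edc[n] = running
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0 else float("-inf")
            for e in edc]


# Toy impulse response: alternating-sign samples under an exponential
# envelope that drops by 60 dB after t60 seconds (assumed values).
fs = 8000
t60 = 0.5
decay = math.log(1000.0) / (t60 * fs)
h = [math.exp(-decay * n) * (1.0 if n % 2 == 0 else -1.0) for n in range(fs)]

edc_db = energy_decay_curve_db(h)
```

For this toy signal the curve starts at 0 dB and falls by roughly 30 dB over the first quarter second, consistent with the chosen decay time. A frequency-dependent version would apply the same integration per band of a time-frequency representation.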

Fast Differentiable Modal Simulation of Non-Linear Strings, Membranes, and Plates

Rodrigo Diaz; Mark Sandler

DAFx-2025 - Ancona

Modal methods for simulating vibrations of strings, membranes, and plates are widely used in acoustics and physically informed audio synthesis. However, traditional implementations, particularly for non-linear models like the von Kármán plate, are computationally demanding and lack differentiability, limiting inverse modelling and real-time applications. We introduce a fast, differentiable, GPU-accelerated modal framework built with the JAX library, providing efficient simulations and enabling gradient-based inverse modelling. Benchmarks show that our approach significantly outperforms CPU and GPU-based implementations, particularly for simulations with many modes. Inverse modelling experiments demonstrate that our approach can recover physical parameters, including tension, stiffness, and geometry, from both synthetic and experimental data. Although fitting physical parameters is more sensitive to initialisation compared to methods that fit abstract spectral parameters, it provides greater interpretability and more compact parameterisation. The code is released as open source to support future research and applications in differentiable physical modelling and sound synthesis.

Digital Morphophone Environment. Computer Rendering of a Pioneering Sound Processing Device

Daniel Scorranese

DAFx-2025 - Ancona

This paper introduces a digital reconstruction of the morphophone, a complex magnetophonic device developed in the 1950s within the laboratories of the GRM (Groupe de Recherches Musicales) in Paris. The analysis, design, and implementation methodologies underlying the Digital Morphophone Environment are discussed. Based on a detailed review of historical sources and limited documentation – including a small body of literature and, most notably, archival images – the core operational principles of the morphophone have been modeled within the MAX visual programming environment. The main goals of this work are, on the one hand, to study and make accessible a now obsolete and unavailable tool, and on the other, to provide the opportunity for new explorations in computer music and research.

Perceptual Decorrelator Based on Resonators

Jon Fagerström; Nils Meyer-Kahlen; Sebastian J. Schlecht; Vesa Välimäki

DAFx-2025 - Ancona

Decorrelation filters transform mono audio into multiple decorrelated copies. This paper introduces a novel decorrelation filter design based on a resonator bank, which produces a sum of over a thousand exponentially decaying sinusoids. A headphone listening test was used to identify the minimum inter-channel time delays that perceptually match ERB-filtered coherent noise to corresponding incoherent noise. The decay rate of each resonator is set based on a group delay profile determined by the listening test results at its corresponding frequency. Furthermore, the delays from the test are used to refine frequency-dependent windowing in coherence estimation, which we argue represents the perceptually most accurate way of assessing interaural coherence. This coherence measure then guides an optimization process that adjusts the initial phases of the sinusoids to minimize the coherence between two instances of the resonator-based decorrelator. The delay results establish the necessary group delay per ERB for effective decorrelation, revealing higher-than-expected values, particularly at higher frequencies. For comparison, the optimization is also performed using two previously proposed group-delay profiles: one based on the period of the ERB band center frequency and another based on the maximum group-delay limit before introducing smearing. The results indicate that the perceptually informed profile achieves equal decorrelation to the latter profile while smearing less at high frequencies. Overall, optimizing the phase response of the proposed decorrelator yields significantly lower coherence compared to using a random phase.

Stable Limit Cycles as Tunable Signal Sources

Wolfram E. Weingartner

DAFx-2025 - Ancona

This paper presents a method for synthesizing audio signals from nonlinear dynamical systems exhibiting stable limit cycles, with control over frequency and amplitude independent of changes to the system’s internal parameters. Using the van der Pol oscillator and the Brusselator as case studies, it is demonstrated how parameters are decoupled from frequency and amplitude by rescaling the angular frequency and normalizing amplitude extrema. Practical implementation considerations are discussed, as are the limits and challenges of this approach. The method’s validity is evaluated experimentally and synthesis examples show the application of tunable nonlinear oscillators in sound design, including the generation of transients in FM synthesis by means of a van der Pol oscillator and a Supersaw oscillator bank based on the Brusselator.
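The idea of rescaling the angular frequency and normalizing amplitude extrema can be illustrated with a van der Pol oscillator whose time axis is scaled by a target angular frequency. This is a rough sketch under stated assumptions, not the paper's method: it uses a fixed-step RK4 integrator, normalizes by a peak measured after the transient, and does not compensate for the dependence of the period on `mu`. All names and parameter values are hypothetical.

```python
import math


def van_der_pol(freq_hz, mu, sample_rate, num_samples):
    """Integrate x'' - mu*(1 - x^2)*x' + x = 0 with time scaled by
    omega = 2*pi*freq_hz, using classical fixed-step RK4."""
    omega = 2.0 * math.pi * freq_hz
    dt = 1.0 / sample_rate

    def deriv(x, y):
        # First-order form of the oscillator, with both rates scaled by omega
        return omega * y, omega * (mu * (1.0 - x * x) * y - x)

    x, y = 0.1, 0.0  # small initial displacement; the limit cycle attracts it
    out = []
    for _ in range(num_samples):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        y += dt / 6.0 * (k1y + 2.0 * k2y + 2.0 * k3y + k4y)
        out.append(x)
    return out


signal = van_der_pol(freq_hz=100.0, mu=0.5, sample_rate=8000, num_samples=8000)
tail = signal[4000:]                     # discard the start-up transient
peak = max(abs(s) for s in tail)         # amplitude extremum on the limit cycle
normalized = [s / peak for s in tail]    # unit-peak output
```

With a small `mu` the limit-cycle amplitude stays close to 2 and the oscillation frequency stays close to the requested value. At larger `mu` the period lengthens, so the frequency would need an additional correction to remain tunable.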

Towards an Objective Comparison of Panning Feature Algorithms for Unsupervised Learning

Richard Mitic; Andreas Rossholm

DAFx-2025 - Ancona

Estimations of panning attributes are an important feature to extract from a piece of recorded music, with downstream uses such as classification, quality assessment, and listening enhancement. While several algorithms exist in the literature, there is currently no comparison between them and no studies to suggest which one is most suitable for any particular task. This paper compares four algorithms for extracting amplitude panning features with respect to their suitability for unsupervised learning. It finds synchronicities between them and analyses their results on a small set of commercial music excerpts chosen for their distinct panning features. The ability of each algorithm to differentiate between the tracks is analysed. The results can be used in future work to either select the most appropriate panning feature algorithm or create a version customized for a particular task.

DDSP-Based Neural Waveform Synthesis of Polyphonic Guitar Performance From String-Wise MIDI Input

Nicolas Jonason; Xin Wang; Erica Cooper; Lauri Juvela; Bob L. T. Sturm; Junichi Yamagishi

DAFx-2024 - Guildford

We explore the use of neural synthesis for acoustic guitar from string-wise MIDI input. We propose four different systems and compare them with both objective metrics and subjective evaluation against natural audio and a sample-based baseline. We iteratively develop these four systems by making various considerations on the architecture and intermediate tasks, such as predicting pitch and loudness control features. We find that formulating the control feature prediction task as a classification task rather than a regression task yields better results. Furthermore, we find that our simplest proposed system, which directly predicts synthesis parameters from MIDI input, performs the best out of the four proposed systems. Audio examples and code are available.

Parameter Estimation of Frequency-Modulated Sinusoids with the Distribution Derivative Method

Marcelo Caetano

DAFx-2024 - Guildford

Frequency-modulated (FM) sinusoids are commonly used to model signals in several engineering applications, such as radar, sonar, communications, acoustics, and optics. The estimation of the parameters of FM sinusoids is a challenging problem with a long history in the literature. In this article, we use the distribution derivative method (DDM) to estimate the parameters of FM sinusoids in additive white Gaussian noise. Firstly, we derive the estimation of parameters of the model with DDM. Then, we compare the results of Monte-Carlo simulations (MCS) of DDM estimation of FM signals in additive white Gaussian noise against the state of the art (SOTA) and the Cramér-Rao lower bound (CRLB). DDM estimation of FM sinusoids showed performance comparable to the SOTA with less estimation bias. Additionally, DDM estimation of FM sinusoids is simple and straightforward to implement with the fast Fourier transform (FFT) relative to other approaches in the literature. Finally, DDM estimation has effectively the same computational complexity as the FFT.

Proceedings of the International Conference on Digital Audio Effects (DAFx)
