DAFx Paper Archive - Search for deep, page 15 of 25

DDSP-SFX: Acoustically-Guided Sound Effects Generation with Differentiable Digital Signal Processing

DAFx-2024 - Guildford

Controlling the variations of sound effects using neural audio synthesis models has been a challenging task. Differentiable digital signal processing (DDSP) provides a lightweight solution that achieves high-quality sound synthesis while enabling deterministic acoustic attribute control by incorporating pre-processed audio features and digital synthesizers. In this research, we introduce DDSP-SFX, a model based on the DDSP architecture capable of synthesizing high-quality sound effects while enabling users to control the timbre variations easily. We integrate a transient modelling algorithm in DDSP that achieves higher objective evaluation scores and subjective ratings over impulsive signals (footsteps, gunshots). We propose a novel method that achieves frame-level timbre variation control while also allowing deterministic attribute control. We further qualitatively show the timbre transfer performance using voice as the guiding sound.

Differentiable All-Pole Filters for Time-Varying Audio Systems

Chin-Yun Yu; Christopher Mitcheltree; Alistair Carson; Stefan Bilbao; Joshua Reiss; György Fazekas

DAFx-2024 - Guildford

Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers. However, their recursive structure impedes end-to-end training of these systems using automatic differentiation. Although non-recursive filter approximations like frequency sampling and frame-based processing have been proposed and widely used in previous works, they cannot accurately reflect the gradient of the original system. We alleviate this difficulty by re-expressing a time-varying all-pole filter to backpropagate the gradients through itself, so the filter implementation is not bound to the technical limitations of automatic differentiation frameworks. This implementation can be employed within audio systems containing filters with poles for efficient gradient evaluation. We demonstrate its training efficiency and expressive capabilities for modelling real-world dynamic audio systems on a phaser, time-varying subtractive synthesiser, and feed-forward compressor. We make our code and audio samples available and provide the trained audio effect and synth models in a VST plugin.
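
The idea of backpropagating through an all-pole recursion can be illustrated with a small pure-Python sketch. It is not the authors' released code. It assumes the filter form y[n] = x[n] - sum_k a_k[n] y[n-k] with zero initial state, and the function names `allpole_forward` and `allpole_input_grad` are hypothetical. Under that assumption, the chain rule gives the input gradient as the same kind of recursion run backward in time, with the coefficients taken at time n+k.

```python
def allpole_forward(x, a):
    """Time-varying all-pole filter (assumed form, zero initial state).

    x: list of input samples.
    a: list of per-sample coefficient lists; a[n][k-1] is a_k at time n.
    """
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a[n], start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y


def allpole_input_grad(grad_y, a):
    """Map dL/dy to dL/dx by running the recursion backward in time.

    Chain rule: dL/dx[n] = dL/dy[n] - sum_k a_k[n+k] * dL/dx[n+k].
    Assumes every a[n] has the same length (constant filter order).
    """
    n_samples = len(grad_y)
    grad_x = [0.0] * n_samples
    for n in range(n_samples - 1, -1, -1):
        acc = grad_y[n]
        for k in range(1, len(a[n]) + 1):
            if n + k < n_samples:
                acc -= a[n + k][k - 1] * grad_x[n + k]
        grad_x[n] = acc
    return grad_x
```

Because the system is linear in x, the result can be checked against finite differences of a weighted sum of the output samples.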

Towards Neural Emulation of Voltage-Controlled Oscillators

Riccardo Simionato; Stefano Fasciani

DAFx-2025 - Ancona

Machine learning models have become ubiquitous in modeling analog audio devices. Expanding on this line of research, our study focuses on Voltage-Controlled Oscillators of analog synthesizers. We employ black box autoregressive artificial neural networks to model the typical analog waveshapes, including triangle, square, and sawtooth. The models can be conditioned on wave frequency and type, enabling the generation of pitch envelopes and morphing across waveshapes. We conduct evaluations on both synthetic and analog datasets to assess the accuracy of various architectural variants. The LSTM variant performed better, although lower frequency ranges present particular challenges.

Chroma and MFCC Based Pattern Recognition in Audio Files Utilizing Hidden Markov Models And Dynamic Programming

Alexander Wankhammer; Peter Sciri; Alois Sontacchi

DAFx-2009 - Como

In this paper we present an algorithm to reveal the immanent musical structure within pieces of popular music. Our proposed model uses an estimate of the harmonic progression, which is obtained by calculating beat-synchronous chroma vectors and letting a Hidden Markov Model (HMM) decide the most probable sequence of chords. In addition, MFCC vectors are computed to retrieve basic timbral information that cannot be described by harmony. Subsequently, a dynamic programming algorithm is used to detect repetitive patterns in these feature sequences. Based on these patterns, a second dynamic programming stage tries to find and link corresponding patterns to larger segments that reflect the musical structure.
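
The abstract does not say how the most probable chord sequence is decoded. A common choice for HMMs is the Viterbi algorithm, sketched below as an assumption rather than as the paper's method. The function name `viterbi` and the layout of its arguments are hypothetical; each state would stand for a chord, and `log_emit[t][s]` for the log-likelihood of the chroma vector at beat t under chord s.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return the most probable state path of an HMM (log domain).

    log_init[s]     : log prior of state s.
    log_trans[r][s] : log probability of moving from state r to state s.
    log_emit[t][s]  : log-likelihood of frame t under state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With "sticky" transition probabilities, a single frame that weakly favours another chord does not flip the decoded sequence, which is the smoothing behaviour one usually wants from such a model.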

Estimation and Modeling of Pinna-Related Transfer Functions

Michele Geronazzo; Simone Spagnol; Federico Avanzini

DAFx-2010 - Graz

This paper considers the problem of modeling pinna-related transfer functions (PRTFs) for 3-D sound rendering. Following a structural modus operandi, we present an algorithm for the decomposition of PRTFs into ear resonances and frequency notches due to reflections over pinna cavities. This approach allows the evolution of each physical phenomenon to be controlled separately through the design of two distinct filter blocks during PRTF synthesis. The resulting model is suitable for future integration into a structural head-related transfer function model, and for parametrization over anthropometrical measurements of a wide range of subjects.

A Hilbert-Transformer Frequency Shifter for Audio

Scott Wardle

DAFx-1998 - Barcelona

In contrast to conventional pitch-shifting effects, which attempt to maintain harmonic relationships in the signal, a frequency shifter translates all the component frequencies of the input signal by an equal amount, disrupting the harmonic relationships and radically altering the sonic qualities of the signal. Ring modulation is equivalent to double-sideband suppressed-carrier modulation, whereas the frequency shifter is equivalent to a single-sideband modulator. Applications of the frequency shifter include the creation of bizarre distortions, phaser, and rotating speaker effects. An implementation is presented that is suitable for fixed-point digital hardware.
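
The single-sideband view of frequency shifting can be sketched offline in a few lines. This is not the fixed-point implementation presented in the paper: it assumes a block-based analytic signal computed with a naive O(N^2) DFT, and the function names `analytic_signal` and `frequency_shift` are hypothetical. The analytic signal keeps only the positive frequencies. Multiplying it by a complex exponential moves every component by the same number of hertz, and taking the real part gives the shifted signal.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive DFT (fine for short blocks)."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0  # keep DC and Nyquist as they are
        elif k < (n + 1) // 2:
            gain = 2.0  # double the positive frequencies
        else:
            gain = 0.0  # discard the negative frequencies
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def frequency_shift(x, shift_hz, sample_rate):
    """Shift every component of x by shift_hz (single-sideband modulation)."""
    z = analytic_signal(x)
    return [
        (zt * cmath.exp(2j * math.pi * shift_hz * t / sample_rate)).real
        for t, zt in enumerate(z)
    ]
```

For example, an input with partials at 4 Hz and 8 Hz shifted by 3 Hz ends up with partials at 7 Hz and 11 Hz, which are no longer in a 1:2 harmonic ratio.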

An Interdisciplinary Approach to Audio Effect Classification

Vincent Verfaille; Catherine Guastavino; Caroline Traube

DAFx-2006 - Montreal

The aim of this paper is to propose an interdisciplinary classification of digital audio effects to facilitate communication and collaborations between DSP programmers, sound engineers, composers, performers and musicologists. After reviewing classifications reflecting technological, technical and perceptual points of view, we introduce a transverse classification to link discipline-specific classifications into a single network containing various layers of descriptors, ranging from low-level features to high-level features. Simple tools using the interdisciplinary classification are introduced to facilitate the navigation between effects, underlying techniques, perceptual attributes and semantic descriptors. Finally, concluding remarks on implications for teaching purposes and for the development of audio effects user interfaces based on perceptual features rather than technical parameters are presented.

Physically Based Sound Synthesis and Control of Footsteps Sounds

Luca Turchet; Stefania Serafin; Smilen Dimitrov; Rolf Nordahl

DAFx-2010 - Graz

We describe a system to synthesize footstep sounds in real time. The sound engine is based on physical models and physically inspired models reproducing the act of walking on several surfaces. To control the real-time engine, three solutions are proposed. The first two solutions are based on floor microphones, while the third one is based on shoes enhanced with sensors. The different solutions proposed are discussed in the paper.

Adaptive Pitch-Shifting With Applications to Intonation Adjustment in a Cappella Recordings

Sebastian Rosenzweig; Simon Schwär; Jonathan Driedger; Meinard Müller

DAFx-2021 - Vienna (virtual)

A central challenge for a cappella singers is to adjust their intonation and to stay in tune relative to their fellow singers. During editing of a cappella recordings, one may want to adjust local intonation of individual singers or account for global intonation drifts over time. This requires applying a time-varying pitch-shift to the audio recording, which we refer to as adaptive pitch-shifting. In this context, existing (semi-)automatic approaches are either labor-intensive or face technical and musical limitations. In this work, we present automatic methods and tools for adaptive pitch-shifting with applications to intonation adjustment in a cappella recordings. To this end, we show how to incorporate time-varying information into existing pitch-shifting algorithms that are based on resampling and time-scale modification (TSM). Furthermore, we release an open-source Python toolbox, which includes a variety of TSM algorithms and an implementation of our method. Finally, we show the potential of our tools by two case studies on global and local intonation adjustment in a cappella recordings using a publicly available multitrack dataset of amateur choral singing.
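
The resampling half of such a scheme can be sketched as follows. This is not the toolbox's API: the function names `cents_to_ratio` and `variable_rate_resample` are hypothetical, and linear interpolation is assumed only for brevity. A shift of c cents corresponds to a frequency ratio of 2^(c/1200), and reading the input at a position that advances by that ratio per output sample raises or lowers the pitch accordingly. The duration changes as a side effect, which a TSM stage would then have to compensate.

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch shift given in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def variable_rate_resample(x, cents_per_output_sample):
    """Resample x with a time-varying pitch shift using linear interpolation.

    The read position advances by the current pitch ratio for every output
    sample, so positive cents shorten the signal and negative cents stretch it.
    """
    out = []
    pos = 0.0
    for cents in cents_per_output_sample:
        i = int(pos)
        if i + 1 >= len(x):
            break  # ran out of input samples to interpolate between
        frac = pos - i
        out.append((1.0 - frac) * x[i] + frac * x[i + 1])
        pos += cents_to_ratio(cents)
    return out
```

With a constant shift of +1200 cents, every second input sample is read, so the output has roughly half the length and sounds one octave higher when played at the original sample rate.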

The development of an online course in DSP eartraining

Øyvind Brandtsegg; Sigurd Saue; Victor Lazzarini; John Pål Inderberg; Axel Tidemann; Håkon Kvidal; Jan Tro; Jøran Rudi; Notto J. W. Thelle

DAFx-2012 - York

The authors present a collaborative effort on establishing an online course in DSP eartraining. The paper reports from a preliminary workshop that covered a large range of topics such as eartraining in music education, terminology for sound characterization, e-learning, automated tutoring, DSP techniques, music examples and audio programming. An initial design of the web application is presented as a rich content database with flexible views to allow customized online presentations. Technical risks have already been mitigated through prototyping.

Proceedings of the International Conference on Digital Audio Effects (DAFx)
