TorchFX: A Modern Approach to Audio DSP with PyTorch and GPU Acceleration
The increasing complexity and real-time processing demands of audio signals require optimized algorithms that exploit the computational power of Graphics Processing Units (GPUs). Existing Digital Signal Processing (DSP) libraries often lack the necessary efficiency and flexibility, particularly for integration with Artificial Intelligence (AI) models. In response, we introduce TorchFX: a GPU-accelerated Python library for DSP, engineered to facilitate sophisticated audio signal processing. Built on the PyTorch framework, TorchFX offers an object-oriented interface similar to torchaudio but enhances functionality with a novel pipe operator for intuitive filter chaining. The library provides a comprehensive suite of Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters, with a focus on multichannel audio, thereby facilitating the integration of DSP- and AI-based approaches. Our benchmarking results demonstrate significant efficiency gains over traditional libraries such as SciPy, particularly in multichannel contexts. While there are current limitations in GPU compatibility, ongoing development promises broader support and real-time processing capabilities. TorchFX aims to become a useful tool for the community, contributing to innovation in GPU-accelerated DSP. TorchFX is publicly available on GitHub at https://github.com/matteospanio/torchfx.
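The pipe-style filter chaining described in the abstract above could, in spirit, look like the following sketch. All class and function names here are hypothetical, and this is not TorchFX's actual API:

```python
import numpy as np

class FX:
    """Toy effect whose `|` operator chains processors, mimicking the
    pipe-style filter chaining described in the abstract.
    Illustrative only -- not TorchFX's actual API."""

    def __init__(self, fn):
        self.fns = [fn]

    def __or__(self, other):
        # Build a new FX whose stage list is the concatenation of both.
        chained = FX.__new__(FX)
        chained.fns = self.fns + other.fns
        return chained

    def __call__(self, x):
        for fn in self.fns:
            x = fn(x)
        return x

# Two toy stages: a 3-tap moving-average "lowpass" and a gain.
smooth = FX(lambda x: np.convolve(x, np.ones(3) / 3, mode="same"))
gain = FX(lambda x: 0.5 * x)

x = np.ones(8)
y = (smooth | gain)(x)  # stages applied left to right
```

The appeal of the operator form is that a chain reads in signal-flow order, rather than as nested function calls.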
GPGPU Patterns for Serial and Parallel Audio Effects
Modern commodity GPUs offer high numerical throughput per unit of cost, but often sit idle during audio workstation tasks. Research in the field has shown that GPUs excel at tasks such as Finite-Difference Time-Domain simulation and wavefield synthesis. Concrete implementations of several such projects are available for use. Benchmarks and use cases generally concentrate on running one project on a GPU. Running multiple such projects simultaneously is less common, and reduces throughput. In this work we list some concerns that arise when running multiple heterogeneous tasks on the GPU. We apply optimization strategies detailed in developer documentation and commercial CUDA literature, and show results through the lens of real-time audio tasks. We benchmark the cases of (i) a homogeneous effect chain made of previously separate effects, and (ii) a synthesizer with distinct, parallelizable sound generators.
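Pattern (i) above, the homogeneous effect chain, can be illustrated with a CPU-side sketch: when every slot in the chain runs the same FIR kernel, the per-channel launches collapse into one batched frequency-domain multiply, which is the shape of workload that maps well to a GPU. All names are illustrative; this is not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
channels, n = 16, 1024
x = rng.standard_normal((channels, n))
kernel = np.ones(4) / 4  # the same FIR in every chain slot

# (i) naive: one "launch" per channel, as separate effects would do
y_serial = np.stack([np.convolve(c, kernel, mode="full") for c in x])

# (ii) fused: a single batched FFT-domain multiply over all channels
m = n + kernel.size - 1  # full linear-convolution length
y_batched = np.fft.irfft(
    np.fft.rfft(x, m, axis=1) * np.fft.rfft(kernel, m), m, axis=1
)
```

Both paths compute the same linear convolution; the batched form simply exposes all the parallelism to the hardware at once.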
An Evaluation of Audio Feature Extraction Toolboxes
Audio feature extraction underpins a massive proportion of audio processing, music information retrieval, audio effect design and audio synthesis. Design, analysis, synthesis and evaluation often rely on audio features, but there is a large and diverse range of feature extraction tools available to the community. An evaluation of existing audio feature extraction libraries was undertaken. Ten libraries and toolboxes were evaluated with the Cranfield Model for the evaluation of information retrieval systems, reviewing the coverage, effort, presentation and time lag of a system. Comparisons of these tools are undertaken, and example use cases are presented as to when each toolbox is most suitable. This paper allows a software engineer or researcher to quickly and easily select a suitable audio feature extraction toolbox.
A Modeller-Simulator for Instrumental Playing of Virtual Musical Instruments
This paper presents a musician-oriented modelling and simulation environment for designing physically modelled virtual instruments and interacting with them via a high-performance haptic device. In particular, our system restores the physical coupling between the user and the manipulated virtual instrument, a key factor for expressive playing of traditional acoustical instruments that is absent from the vast majority of computer-based musical systems. We first analyse the various uses of haptic devices in Computer Music and introduce the technologies involved in our system. We then present the modeller and simulation environments, along with examples of musical virtual instruments created with this new environment.
A New Functional Framework for a Sound System for Realtime Flight Simulation
We present a new sound framework and concept for realistic flight simulation. Because the simulator deals with a highly complex network of mechanical systems that act as physical sound sources, the main focus is on a fully modular and extensible/scalable design. The prototype we developed is part of a fully functional Full Flight Simulator for pilot training.
Performing Expressive Rhythms with BillaBoop Voice-Driven Drum Generator
In previous work we presented a system for transcribing spoken rhythms into a symbolic score. The system was subsequently extended to process the vocal stream in real time so that a musician can use it as a voice-driven drum generator. The extensions to this work are the following. First, we conducted a study of the system's classification accuracy based on typical onomatopoeia used in western beatboxing, with the perspective of building a general supervised model for immediate use. Second, we want the user to be able to generate expressive rhythms beyond the symbolic drum representation. We therefore considered a class-specific mapping of continuous vocal stream descriptors to either effects or synthesis parameters of the drum generator. The extraction of the symbolic drum stream is implemented in the BillaBoop VST Core plug-in. The class-specific mapping and the sound synthesis are carried out in the Plogue Bidule framework. All these components are integrated into a low-latency application suitable for live performance.
Sonic Screwdrivers: Sound as a Sculptural Process
This paper discusses a Fine Art approach to the processes of digital audio. The author puts forward some ideas for a redefinition of digital audio software to embrace a wider audience and to promote the manipulation of sound as a sculptural process removed from, yet still related to, the assumed musical tradition. The author's artworks are introduced, and the impact of current research upon these artworks and upon the author's teaching is discussed.
Analysis and Trans-synthesis of Acoustic Bowed-String Instrument Recordings: a Case Study using Bach Cello Suites
In this paper, analysis and trans-synthesis of acoustic bowed-string instrument recordings with a new non-negative matrix factorization (NMF) procedure are presented. This work shows that more than one template may be required to represent a note, owing to the time-varying behaviour of timbre, especially for notes played by bowed-string instruments. The proposed method improves on the original NMF without requiring knowledge of the tone models or the number of templates in advance. The resulting NMF information is then converted into the synthesis parameters of a sinusoidal synthesizer. Bach cello suites recorded by Fournier and Starker are used in the experiments. Analysis and trans-synthesis examples of the recordings are also provided. Index Terms: trans-synthesis, non-negative matrix factorization, bowed-string instrument
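As a rough illustration of the NMF step described above, a minimal Euclidean-distance NMF with the standard Lee-Seung multiplicative updates might look as follows. This is a generic sketch on synthetic data, not the paper's improved procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
F, T, K = 64, 100, 3  # frequency bins, time frames, templates
V = np.abs(rng.standard_normal((F, T)))  # stand-in magnitude spectrogram

W = rng.random((F, K)) + 1e-3  # spectral templates
H = rng.random((K, T)) + 1e-3  # time-varying activations

err0 = np.linalg.norm(V - W @ H)  # error before fitting

# Multiplicative updates minimising ||V - WH||_F (Lee & Seung).
# Nonnegativity of W and H is preserved at every step.
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

err = np.linalg.norm(V - W @ H)  # error after fitting
```

In the paper's setting, each column of W is a note template and each row of H its activation over time; the trans-synthesis stage then maps these onto sinusoidal synthesis parameters.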
Identification of Time-Frequency Maps for Sound Timbre Discrimination
Gabor multipliers are signal operators that are diagonal in a time-frequency representation of signals and can be viewed as time-frequency transfer functions. If we estimate a Gabor mask between a note played by two instruments, we obtain a time-frequency representation of the difference in timbre between the two notes. By averaging the energy contained in the Gabor mask, we obtain a measure of this difference. In this context, our goal is to automatically localize the time-frequency regions responsible for such a timbre dissimilarity. This problem is addressed as a feature selection problem over the time-frequency coefficients of a labelled data set of sounds.
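The Gabor-mask measure described above can be sketched roughly: estimate a per-bin diagonal mask between two spectrogram magnitudes, then average its deviation from the identity mask, weighted by energy. This is an illustrative approximation, not the authors' exact estimator, and all parameters are assumptions:

```python
import numpy as np

def stft_mag(x, win=64, hop=32):
    """Magnitude of a simple Hann-windowed STFT (frequency x time)."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T

def gabor_dissimilarity(A, B):
    """Ridge-regularised per-bin least-squares mask m with B ~ m * A,
    then the energy-weighted deviation of m from the identity mask."""
    lam = 1e-3 * (A ** 2).max()
    m = (A * B) / (A ** 2 + lam)
    return np.average((m - 1.0) ** 2, weights=A ** 2 + B ** 2)

fs = 8000
t = np.arange(fs) / fs
note_a = np.sin(2 * np.pi * 220 * t)                 # pure tone
note_b = note_a + 0.5 * np.sin(2 * np.pi * 440 * t)  # added second partial

A, B = stft_mag(note_a), stft_mag(note_b)
d_same = gabor_dissimilarity(A, A)  # near zero: identical timbre
d_diff = gabor_dissimilarity(A, B)  # larger: the mask departs from identity
```

The bins where the estimated mask departs most from identity are exactly the time-frequency regions the paper seeks to localize automatically.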
Simulating Microphone Bleed and Tom-tom Resonance in Multisampled Drum Workstations
In recent years, multisampled drum workstations have become increasingly popular. They offer an alternative to recording a full drum kit when a producer, engineer or amateur lacks the equipment, money, space or knowledge to produce a quality recording. These drum workstations strive for realism, often recording up to a hundred different velocity hits of the same drum, including recordings from all microphones for each drum hit, and including the bleed between these microphones. This paper describes research undertaken to investigate whether it is possible to simulate the snare and kick drum bleed into the tom-tom microphones, and the subsequent resonance of the tom-tom that it causes, with the aim of reducing the amount of audio data that needs to be stored. A listening test was performed in which participants were asked to identify the real recording from a simulation. The results were not statistically significant enough to reject the hypothesis that subjects were unable to distinguish between the real and simulated recordings, suggesting that listeners were unable to identify the real recording in the majority of cases.
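A crude sketch of the kind of simulation investigated above: excite a damped two-pole resonator tuned to the tom's fundamental with an attenuated copy of the kick signal, and add the ringing back onto the bleed. All parameters here are hypothetical, and this is not the paper's actual model:

```python
import numpy as np

fs = 44100
f0, r = 110.0, 0.9995              # assumed tom fundamental and pole radius
w = 2 * np.pi * f0 / fs
a1, a2 = -2 * r * np.cos(w), r * r  # two-pole resonator denominator

rng = np.random.default_rng(2)
# Stand-in kick-drum hit: decaying noise burst.
kick = rng.standard_normal(2048) * np.exp(-np.arange(2048) / 300.0)

bleed = 0.1 * kick                  # attenuated bleed into the tom mic
tom = np.zeros_like(bleed)
for i in range(len(bleed)):         # y[n] = x[n] - a1*y[n-1] - a2*y[n-2]
    tom[i] = bleed[i]
    if i >= 1:
        tom[i] -= a1 * tom[i - 1]
    if i >= 2:
        tom[i] -= a2 * tom[i - 2]

mic = bleed + tom                   # what the tom microphone would capture
```

Because the resonator's poles sit just inside the unit circle, the tom rings long after the excitation decays, which is the effect the listening test asked participants to distinguish from real recordings.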