DAFx Paper Archive - Browse all papers by Rodà, A.

A Sound Localization based Interface for Real-Time Control of Audio Processing

Daniele Salvati; Sergio Canazza; Antonio Rodà

DAFx-2011 - Paris

This paper describes the implementation of an innovative musical interface based on the sound localization capability of a microphone array. Our proposal is to allow a musician to plan and conduct the expressivity of a performance by controlling an audio processing module in real time through the spatial movement of a sound source, such as the voice, traditional musical instruments, or sounding mobile devices. The proposed interface locates and tracks the sound accurately in a two-dimensional space, so that the x-y coordinates of the sound source can be used to control the processing parameters. In particular, the paper focuses on the localization and tracking of harmonic sound sources in real, moderately reverberant and noisy environments. To this end, we designed a system based on adaptive parameterized Generalized Cross-Correlation (GCC) with Phase Transform (PHAT) weighting and a Zero-Crossing Rate (ZCR) threshold, a Wiener filter to improve the Signal-to-Noise Ratio (SNR), and a Kalman filter to make the position estimation more robust and accurate. We developed Max/MSP external objects to test the system in a real scenario and to validate its usability.
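As an illustration of the core idea named in the abstract, the following is a minimal sketch of plain GCC-PHAT time-delay estimation between two microphone signals. It is not the authors' implementation: it omits the adaptive parameterization, the ZCR threshold, the Wiener filter, and the Kalman filter, and the function name `gcc_phat`, the sampling rate, and the synthetic test signal are assumptions made for this example.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper for illustration: plain GCC with PHAT weighting.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, whitened by its magnitude (PHAT weighting).
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


# Synthetic check: the second channel is the first one delayed by 5 samples.
fs = 16000  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref[:-5]))

tau = gcc_phat(sig, ref, fs, max_tau=0.001)
estimated_samples = round(tau * fs)
```

With delays estimated for two or more microphone pairs of known geometry, the x-y position of the source can then be obtained by intersecting the corresponding delay constraints; that step is not shown here.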


TorchFX: A Modern Approach to Audio DSP with PyTorch and GPU Acceleration

Matteo Spanio; Antonio Rodà

DAFx-2025 - Ancona

The increasing complexity and real-time processing demands of audio signals require optimized algorithms that utilize the computational power of Graphics Processing Units (GPUs). Existing Digital Signal Processing (DSP) libraries often do not provide the necessary efficiency and flexibility, particularly for integrating with Artificial Intelligence (AI) models. In response, we introduce TorchFX: a GPU-accelerated Python library for DSP, engineered to facilitate sophisticated audio signal processing. Built on the PyTorch framework, TorchFX offers an Object-Oriented interface similar to torchaudio but enhances functionality with a novel pipe operator for intuitive filter chaining. The library provides a comprehensive suite of Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters, with a focus on multichannel audio, thereby facilitating the integration of DSP and AI-based approaches. Our benchmarking results demonstrate significant efficiency gains over traditional libraries like SciPy, particularly in multichannel contexts. While there are current limitations in GPU compatibility, ongoing developments promise broader support and real-time processing capabilities. TorchFX aims to become a useful tool for the community, contributing to innovation in GPU-accelerated DSP. TorchFX is publicly available on GitHub at https://github.com/matteospanio/torchfx.
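To make the idea of chaining filters with a pipe operator concrete, here is a minimal sketch in plain NumPy. It does not use TorchFX and does not reproduce its API: the class names `Stage`, `Chain`, `Gain`, and `MovingAverage` are hypothetical, and the example runs on the CPU only. It shows one way the `|` operator can compose processing stages that act on a multichannel array of shape (channels, samples).

```python
import numpy as np


class Stage:
    """Hypothetical base class: a processing step that can be piped."""

    def __call__(self, x):
        raise NotImplementedError

    def __or__(self, other):
        # `a | b` builds a chain that applies a first, then b.
        return Chain(self, other)


class Chain(Stage):
    """Applies its stages in order."""

    def __init__(self, *stages):
        self.stages = stages

    def __call__(self, x):
        for stage in self.stages:
            x = stage(x)
        return x


class Gain(Stage):
    """Multiplies every sample by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class MovingAverage(Stage):
    """A very simple FIR filter: an equal-weight moving average."""

    def __init__(self, taps):
        self.h = np.ones(taps) / taps

    def __call__(self, x):
        # Filter each channel independently and keep the input length.
        n = x.shape[-1]
        return np.stack([np.convolve(ch, self.h)[:n] for ch in x])


# Two channels of eight samples each, all ones.
x = np.ones((2, 8))
chain = Gain(2.0) | MovingAverage(2)
y = chain(x)
```

Because `Chain` is itself a `Stage`, chains can be piped into further stages, for example `chain | Gain(0.5)`.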


Proceedings of the International Conference on Digital Audio Effects (DAFx)
