Download TorchFX: A Modern Approach to Audio DSP with PyTorch and GPU Acceleration The increasing complexity and real-time processing demands of
audio signals require optimized algorithms that utilize the computational power of Graphics Processing Units (GPUs).
Existing Digital Signal Processing (DSP) libraries often do not provide
the necessary efficiency and flexibility, particularly for integrating
with Artificial Intelligence (AI) models. In response, we introduce TorchFX: a GPU-accelerated Python library for DSP, engineered to facilitate sophisticated audio signal processing. Built on
the PyTorch framework, TorchFX offers an Object-Oriented interface similar to torchaudio but enhances functionality with a novel
pipe operator for intuitive filter chaining. The library provides a
comprehensive suite of Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters, with a focus on multichannel
audio, thereby facilitating the integration of DSP and AI-based
approaches.
Our benchmarking results demonstrate significant
efficiency gains over traditional libraries like SciPy, particularly
in multichannel contexts. While there are current limitations in
GPU compatibility, ongoing developments promise broader support and real-time processing capabilities. TorchFX aims to become a useful tool for the community, contributing to innovation
in GPU-accelerated DSP. TorchFX is publicly available on GitHub
at https://github.com/matteospanio/torchfx.
Download Towards Efficient Emulation of Nonlinear Analog Circuits for Audio Using Constraint Stabilization and Convex Quadratic Programming This paper introduces a computationally efficient method for
the emulation of nonlinear analog audio circuits by combining state-space representations, constraint stabilization, and convex quadratic programming (QP). Unlike traditional virtual analog (VA) modeling approaches or computationally demanding
SPICE-based simulations, our approach reformulates the nonlinear
differential-algebraic (DAE) systems that arise from analog circuit
analysis into numerically stable optimization problems. The proposed method efficiently addresses the numerical challenges posed
by nonlinear algebraic constraints via constraint stabilization techniques, significantly enhancing robustness and stability, suitable
for real-time simulations. A canonical diode clipper circuit is presented as a test case, demonstrating that our method achieves accurate and faster emulations compared to conventional state-space
methods. Furthermore, our method performs very well even at
substantially lower sampling rates. Preliminary numerical experiments confirm that the proposed approach offers improved numerical stability and real-time feasibility, positioning it as a practical
solution for high-fidelity audio applications.
Download Perceptual Decorrelator Based on Resonators Decorrelation filters transform mono audio into multiple decorrelated copies. This paper introduces a novel decorrelation filter design based on a resonator bank, which produces a sum of over a thousand exponentially decaying sinusoids. A headphone listening test was used to identify the minimum inter-channel time delays that perceptually match ERB-filtered coherent noise to corresponding incoherent noise. The decay rate of each resonator is set based on a group delay profile determined by the listening test results at its corresponding frequency. Furthermore, the delays from the test are used to refine frequency-dependent windowing in coherence estimation, which we argue represents the perceptually most accurate way of assessing interaural coherence. This coherence measure then guides an optimization process that adjusts the initial phases of the sinusoids to minimize the coherence between two instances of the resonator-based decorrelator. The delay results establish the necessary group delay per ERB for effective decorrelation, revealing higher-than-expected values, particularly at higher frequencies. For comparison, the optimization is also performed using two previously proposed group-delay profiles: one based on the period of the ERB band center frequency and another based on the maximum group-delay limit before introducing smearing. The results indicate that the perceptually informed profile achieves equal decorrelation to the latter profile while smearing less at high frequencies. Overall, optimizing the phase response of the proposed decorrelator yields significantly lower coherence compared to using a random phase.