Towards Efficient Emulation of Nonlinear Analog Circuits for Audio Using Constraint Stabilization and Convex Quadratic Programming
This paper introduces a computationally efficient method for
the emulation of nonlinear analog audio circuits by combining state-space representations, constraint stabilization, and convex quadratic programming (QP). Unlike traditional virtual analog (VA) modeling approaches or computationally demanding
SPICE-based simulations, our approach reformulates the nonlinear
differential-algebraic equation (DAE) systems that arise from analog circuit
analysis into numerically stable optimization problems. The proposed method efficiently addresses the numerical challenges posed
by nonlinear algebraic constraints via constraint stabilization techniques, significantly enhancing robustness and stability and making it suitable
for real-time simulation. A canonical diode clipper circuit is presented as a test case, demonstrating that our method achieves accurate emulation and runs faster than conventional state-space
methods. Furthermore, our method performs very well even at
substantially lower sampling rates. Preliminary numerical experiments confirm that the proposed approach offers improved numerical stability and real-time feasibility, positioning it as a practical
solution for high-fidelity audio applications.
Compression of Head-Related Transfer Functions Using Piecewise Cubic Hermite Interpolation
We present a spline-based method for compressing and reconstructing Head-Related Transfer Functions (HRTFs) that preserves perceptual quality. Our approach focuses on the magnitude response and consists of four stages: (1) acquiring minimum-phase head-related impulse responses (HRIRs), (2) transforming
them into the frequency domain and applying adaptive Wiener
filtering to preserve important spectral features, (3) extracting a
minimal set of control points using derivative-based methods to
identify local maxima and inflection points, and (4) reconstructing
the HRTF using piecewise cubic Hermite interpolation (PCHIP)
over the refined control points. Evaluation on 301 subjects demonstrates that our method achieves an average compression ratio of
4.7:1 with spectral distortion ≤ 1.0 dB in each Equivalent Rectangular Band (ERB). The method preserves binaural cues with a
mean absolute interaural level difference (ILD) error of 0.10 dB.
Our method achieves about three times the compression obtained
with a PCA-based method.
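
Stage (4) above can be illustrated with a small, dependency-free sketch of piecewise cubic Hermite interpolation over a handful of control points. This is not the authors' code: the function names (`pchip_slopes`, `pchip_eval`), the example frequencies and magnitudes, and the simplified one-sided end slopes are all assumptions made for illustration. Interior slopes use the usual weighted harmonic mean, which is set to zero at local extrema so that the curve should not overshoot the control points.

```python
import bisect

def pchip_slopes(x, y):
    """Slopes at the control points (x strictly increasing, len(x) >= 2)."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    d = [0.0] * n
    for i in range(1, n - 1):
        # Zero slope at a local extremum; otherwise a weighted harmonic
        # mean of the two neighbouring secant slopes.
        if delta[i - 1] * delta[i] > 0:
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            d[i] = (w1 + w2) / (w1 / delta[i - 1] + w2 / delta[i])
    # Assumption: simple one-sided end slopes (libraries often use a
    # three-point formula here instead).
    d[0] = delta[0]
    d[-1] = delta[-1]
    return d

def pchip_eval(x, y, d, xq):
    """Evaluate the cubic Hermite interpolant at a single query point xq."""
    i = bisect.bisect_right(x, xq) - 1
    i = min(max(i, 0), len(x) - 2)  # clamp to a valid interval
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = (1 + 2 * t) * (1 - t) ** 2
    h10 = t * (1 - t) ** 2
    h01 = t * t * (3 - 2 * t)
    h11 = t * t * (t - 1)
    return h00 * y[i] + h10 * h * d[i] + h01 * y[i + 1] + h11 * h * d[i + 1]

# Hypothetical control points: frequency in Hz, magnitude in dB.
freqs = [200.0, 1000.0, 4000.0, 8000.0, 16000.0]
mags_db = [0.0, 2.0, 10.0, -5.0, -3.0]
slopes = pchip_slopes(freqs, mags_db)
reconstructed = [pchip_eval(freqs, mags_db, slopes, f)
                 for f in range(200, 16001, 100)]
```

Only the control points need to be stored, and the dense magnitude response is rebuilt on demand. The compression ratio therefore depends on how few control points are kept.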
Simplifying Antiderivative Antialiasing with Lookup Table Integration
Antiderivative Antialiasing (ADAA) has become a pivotal method
for reducing aliasing when applying nonlinear functions at audio rate. However, its implementation requires analytical computation of the antiderivative of the nonlinear function, which in practical cases can be challenging without a symbolic solver. Moreover, when the nonlinear function is given by measurements, it
must be approximated to obtain a symbolic description. In this paper, we propose a simple approach to ADAA for practical applications that employs numerical integration of lookup tables (LUTs)
to approximate the antiderivative. This method eliminates the need
for closed-form solutions, streamlining the ADAA implementation
process in industrial applications. We analyze the trade-offs of this
approach, highlighting its computational efficiency and ease of implementation while discussing the potential impact of numerical
integration errors on aliasing performance. Experiments are conducted with static nonlinearities (tanh, a simple wavefolder and
the Buchla 259 wavefolding circuit) and a stateful nonlinear system (the diode clipper).
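
The general idea of LUT-based ADAA can be sketched as follows. This is a minimal illustration and not the paper's implementation. It assumes first-order ADAA, a uniformly sampled table, cumulative trapezoidal integration, linear table interpolation, and inputs that stay inside the table range `[lo, hi]`. All function names are hypothetical.

```python
import math

def build_tables(f, lo, hi, size):
    """Sample f on a uniform grid and integrate it with the trapezoidal rule."""
    step = (hi - lo) / (size - 1)
    f_tab = [f(lo + i * step) for i in range(size)]
    F_tab = [0.0]  # antiderivative, defined up to a constant
    for i in range(1, size):
        F_tab.append(F_tab[-1] + 0.5 * step * (f_tab[i - 1] + f_tab[i]))
    return f_tab, F_tab

def lookup(table, lo, hi, x):
    """Linear interpolation in a uniform table; x is clamped to [lo, hi]."""
    size = len(table)
    pos = (min(max(x, lo), hi) - lo) / (hi - lo) * (size - 1)
    i = min(int(pos), size - 2)
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])

def adaa1(signal, f_tab, F_tab, lo, hi, eps=1e-6):
    """First-order ADAA: y[n] = (F(x[n]) - F(x[n-1])) / (x[n] - x[n-1])."""
    out = []
    x_prev = 0.0
    for x in signal:
        dx = x - x_prev
        if abs(dx) < eps:
            # Ill-conditioned difference: fall back to f at the midpoint.
            y = lookup(f_tab, lo, hi, 0.5 * (x + x_prev))
        else:
            y = (lookup(F_tab, lo, hi, x) - lookup(F_tab, lo, hi, x_prev)) / dx
        out.append(y)
        x_prev = x
    return out

# Example: tanh sampled into a 4097-point table over [-5, 5].
f_tab, F_tab = build_tables(math.tanh, -5.0, 5.0, 4097)
sine = [2.0 * math.sin(2 * math.pi * 1000.0 * n / 48000.0) for n in range(480)]
shaped = adaa1(sine, f_tab, F_tab, -5.0, 5.0)
```

Because only `f_tab` is needed as input, the same code applies unchanged to a measured nonlinearity. Table size and interpolation order control the integration error that the abstract mentions as a trade-off.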
MorphDrive: Latent Conditioning for Cross-Circuit Effect Modeling and a Parametric Audio Dataset of Analog Overdrive Pedals
In this paper, we present an approach to the neural modeling of
overdrive guitar pedals with conditioning from a cross-circuit and
cross-setting latent space. The resulting network models the behavior of multiple overdrive pedals across different settings, offering continuous morphing between real configurations and hybrid
behaviors. Compact conditioning spaces are obtained through unsupervised training of a variational autoencoder with adversarial
training, resulting in accurate reconstruction performance across
different sets of pedals. We then compare three Hyper-Recurrent
architectures for processing, including dynamic and static HyperRNNs, and a smaller model for real-time processing. Additionally,
we present pOD-set, a new open dataset including recordings of
27 analog overdrive pedals, each with 36 gain and tone parameter combinations, totaling over 97 hours of recordings. Precise parameter setting was achieved through a custom-deployed recording
robot.
Antialiased Black-Box Modeling of Audio Distortion Circuits Using Real Linear Recurrent Units
In this paper, we propose the use of real-valued Linear Recurrent
Units (LRUs) for black-box modeling of audio circuits. A network architecture composed of real LRU blocks interleaved with
nonlinear processing stages is proposed. Two case studies are
presented: a second-order diode clipper and an overdrive distortion pedal. Furthermore, we show how to integrate the antiderivative antialiasing technique into the proposed method, effectively
lowering oversampling requirements. Our experiments show that
the proposed method generates models that accurately capture the
nonlinear dynamics of the examined devices and are highly efficient, which makes them suitable for real-time operation inside
Digital Audio Workstations.
Antialiasing in BBD Chips Using BLEP
Several methods exist in the literature to accurately simulate Bucket
Brigade Device (BBD) chips, which are widely used in analog
delay-based audio effects for their characteristic lo-fi sound, which
is affected by noise, nonlinearities and aliasing. The latter is a desired quality, being typical of those chips. However, when simulating BBDs in a discrete-time domain environment, additional aliasing components occur that need to be suppressed. In this work, we
propose a novel method that applies the Bandlimited Step (BLEP)
technique, effectively minimizing aliasing artifacts introduced by
the simulation. The paper provides some insights on the design
of a BBD simulation using interpolation at the input for clock rate
conversion and, most importantly, shows how BLEP can be effective in reducing unwanted aliasing artifacts. Interpolation is shown
to have minor importance in the reduction of spurious components.
Generative Latent Spaces for Neural Synthesis of Audio Textures
This paper investigates the synthesis of audio textures and the
structure of generative latent spaces using Variational Autoencoders (VAEs) within two paradigms of neural audio synthesis:
DSP-inspired and data-driven approaches. For each paradigm, we
propose VAE-based frameworks that allow fine-grained temporal
control. We introduce datasets across three categories of environmental sounds to support our investigations. We evaluate and compare the models’ reconstruction performance using objective metrics, and investigate their generative capabilities and latent space
structure through latent space interpolations.
Neural-Driven Multi-Band Processing for Automatic Equalization and Style Transfer
We present a Neural-Driven Multi-Band Processor (NDMP), a differentiable audio processing framework that augments a static six-band Parametric Equalizer (PEQ) with per-band dynamic range
compression. We optimize this processor using neural inference
for two tasks: Automatic Equalization (AutoEQ), which estimates
tonal and dynamic corrections without a reference, and Production
Style Transfer (NDMP-ST), which adapts the processing of an input signal to match the tonal and dynamic characteristics of a reference. We train NDMP using a self-supervised strategy, where the
model learns to recover a clean signal from inputs degraded with
randomly sampled NDMP parameters and gain adjustments. This
setup eliminates the need for paired input–target data and enables
end-to-end training with audio-domain loss functions. At inference time, AutoEQ enhances previously unseen inputs in a blind setting, while NDMP-ST performs style transfer by predicting task-specific processing parameters. We evaluate our approach on the
MUSDB18 dataset using both objective metrics (e.g., SI-SDR,
PESQ, STFT loss) and a listening test.
Our results show that
NDMP consistently outperforms traditional PEQ and a PEQ+DRC
(single-band) baseline, offering a robust neural framework for audio enhancement that combines learned spectral and dynamic control.
Wave Pulse Phase Modulation: Hybridising Phase Modulation and Phase Distortion
This paper introduces Wave Pulse Phase Modulation (WPPM), a
novel synthesis technique based on phase shaping. It combines
two classic digital synthesis techniques: Phase Modulation (PM)
and Phase Distortion (PD), aiming to overcome their respective
limitations while enabling the creation of new, interesting timbres.
It works by segmenting a phase signal into two regions, each independently driving the phase of a modulator waveform. This results
in two distinct pulses per period that together form the signal used
as the phase input to a carrier waveform, similar to PM, hence the
name Wave Pulse Phase Modulation. This method provides a minimal set of parameters that enable the creation of complex, evolving waveforms, and rich dynamic textures. By modulating these
parameters, WPPM can produce a wide range of interesting spectra, including those with formant-like resonant peaks. The paper
examines PM and PD in detail, exploring the modifications needed
to integrate them with WPPM, before presenting the full WPPM
algorithm alongside its parameters and creative possibilities. Finally, it discusses scope for further research and developments into
new similar phase shaping algorithms.
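
For orientation, the two classic techniques that WPPM builds on can be sketched in a few lines. This is not the WPPM algorithm itself, whose details are not given in the abstract. The function names and parameters are illustrative assumptions, and the phase distortion variant shown is the common single-breakpoint phase warp read through a cosine.

```python
import math

def phase_modulation(fc, fm, index, fs, n):
    """Classic PM: y = sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    out = []
    for k in range(n):
        t = k / fs
        mod = index * math.sin(2 * math.pi * fm * t)
        out.append(math.sin(2 * math.pi * fc * t + mod))
    return out

def phase_distortion(f0, d, fs, n):
    """Classic PD: warp a linear phase ramp with one breakpoint d in (0, 1),
    then read a cosine with the warped phase. d = 0.5 gives no distortion."""
    out = []
    phase = 0.0
    for _ in range(n):
        if phase < d:
            warped = 0.5 * phase / d
        else:
            warped = 0.5 + 0.5 * (phase - d) / (1.0 - d)
        out.append(-math.cos(2 * math.pi * warped))
        phase = (phase + f0 / fs) % 1.0
    return out

fs = 48000.0
pm_tone = phase_modulation(440.0, 110.0, 2.0, fs, 480)
pd_tone = phase_distortion(110.0, 0.1, fs, 480)
```

In PM the modulator is added to the carrier phase, while in PD the phase ramp itself is reshaped. Both are instances of phase shaping, which is the common ground the abstract refers to.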
A Statistics-Driven Differentiable Approach for Sound Texture Synthesis and Analysis
In this work, we introduce TexStat, a novel loss function specifically designed for the analysis and synthesis of texture sounds
characterized by stochastic structure and perceptual stationarity.
Drawing inspiration from the statistical and perceptual framework
of McDermott and Simoncelli, TexStat identifies similarities
between signals belonging to the same texture category without
relying on temporal structure. We also propose using TexStat
as a validation metric alongside Fréchet Audio Distance (FAD) to
evaluate texture sound synthesis models. In addition to TexStat,
we present TexEnv, an efficient, lightweight and differentiable
texture sound synthesizer that generates audio by imposing amplitude envelopes on filtered noise. We further integrate these components into TexDSP, a DDSP-inspired generative model tailored
for texture sounds. Through extensive experiments across various
texture sound types, we demonstrate that TexStat is perceptually meaningful, time-invariant, and robust to noise, features that
make it effective both as a loss function for generative tasks and as
a validation metric. All tools and code are provided as open-source
contributions, and our PyTorch implementations are efficient, differentiable, and highly configurable, enabling their use both in generative tasks and for perceptually grounded evaluation.