Searching for Music Mixing Graphs: A Pruning Approach
Music mixing is compositional: experts combine multiple audio processors to achieve a cohesive mix from dry source tracks. We propose a method to reverse engineer this process from the input and output audio. First, we create a mixing console that applies all available processors to every chain. Then, after the initial console parameter optimization, we alternate between removing redundant processors and fine-tuning. We achieve this through a differentiable implementation of both processors and pruning. Consequently, we find a sparse mixing graph that achieves nearly the same matching quality as the full mixing console. We apply this procedure to dry-mix pairs from various datasets and collect graphs that can also be used to train neural networks for music mixing applications.
Revisiting the Second-Order Accurate Non-Iterative Discretization Scheme
In the field of virtual analog modeling, a variety of methods have been proposed to systematically derive simulation models from circuit schematics. However, they typically rely on implicit numerical methods to transform the differential equations governing the circuit to difference equations suitable for simulation. For circuits with non-linear elements, this usually means that a non-linear equation has to be solved at run-time at high computational cost. As an alternative to fully-implicit numerical methods, a family of non-iterative discretization schemes has recently been proposed, allowing a significant reduction of the computational load. However, in the original presentation, several assumptions are made regarding the structure of the ODE, limiting the generality of these schemes. Here, we show that for the second-order accurate variant in particular, the method is applicable to general ODEs. Furthermore, we point out an interesting connection to the implicit midpoint method.
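As a point of reference for the implicit midpoint method mentioned above, here is a minimal sketch, not the paper's non-iterative scheme. It assumes the scalar linear test ODE x' = a·x, for which the implicit update can be solved in closed form; with a non-linear right-hand side that step would generally need an iterative solver, which is the run-time cost the abstract refers to. The function names and test values are illustrative only.

```python
import math

def implicit_midpoint_linear(a, x0, h, steps):
    """Integrate x' = a*x with the implicit midpoint rule.

    The implicit update
        x_next = x + h * a * (x + x_next) / 2
    is linear in x_next here, so it is solved in closed form.
    """
    gain = (1.0 + 0.5 * a * h) / (1.0 - 0.5 * a * h)
    x = x0
    for _ in range(steps):
        x *= gain
    return x

def global_error(a, x0, t_end, steps):
    """Absolute error at t_end against the exact solution x0*exp(a*t_end)."""
    h = t_end / steps
    approx = implicit_midpoint_linear(a, x0, h, steps)
    return abs(approx - x0 * math.exp(a * t_end))

# Halving the step size should cut the error by about 4 (second order).
err_coarse = global_error(-2.0, 1.0, 1.0, 100)
err_fine = global_error(-2.0, 1.0, 1.0, 200)
ratio = err_coarse / err_fine
```

The error ratio close to 4 when the step is halved is what "second-order accurate" means in practice.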
QUBX: Rust Library for Queue-Based Multithreaded Real-Time Parallel Audio Streams Processing and Management
The concurrent management of real-time audio streams poses an increasingly complex technical challenge within digital audio signal processing, necessitating efficient and intuitive solutions. Qubx endeavors to lead in tackling this obstacle with an architecture grounded in dynamic circular queues, tailored to optimize and synchronize the processing of parallel audio streams. It is a library written in Rust, a modern and powerful ecosystem with a still limited range of tools for digital signal processing and management. Additionally, Rust’s inherent security features and expressive type system bolster the resilience and stability of the proposed tool.
Digitizing the Schumann PLL Analog Harmonizer
The Schumann Electronics PLL is a guitar effect that uses hardware-based processing of one-bit digital signals, with op-amp saturation and CMOS control systems used to generate multiple square waves derived from the frequency of the input signal. The effect may be simulated in the digital domain by cascading stages of state-space virtual analog modeling and algorithmic approximations of CMOS integrated circuits. Phase-locked loops, decade counters, and Schmitt trigger inverters are modeled using logic algorithms, allowing for a comparable digital implementation of the Schumann PLL. Simulation results are presented.
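To illustrate what modeling CMOS parts with "logic algorithms" can look like, here is a minimal sketch, not the paper's implementation. It assumes a non-inverting Schmitt trigger with hypothetical thresholds of ±0.1 and a generic divide-by-n edge counter; a real decade counter IC exposes different outputs, and the phase-locked loop is not modeled at all.

```python
import math

def schmitt_trigger(samples, low=-0.1, high=0.1):
    """Turn a signal into a one-bit stream using hysteresis thresholds."""
    state = 0
    out = []
    for x in samples:
        if state == 0 and x > high:
            state = 1
        elif state == 1 and x < low:
            state = 0
        out.append(state)
    return out

def divide_by_n(bits, n):
    """Count rising edges modulo n; output is high for the upper half of the count."""
    count = 0
    prev = 0
    out = []
    for b in bits:
        if b == 1 and prev == 0:
            count = (count + 1) % n
        prev = b
        out.append(1 if 2 * count >= n else 0)
    return out

def count_rising_edges(bits):
    """Number of 0 -> 1 transitions between consecutive samples."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)

# One second of a 100 Hz sine at an assumed sample rate of 8 kHz.
fs, f0 = 8000, 100
sine = [math.sin(2.0 * math.pi * f0 * n / fs) for n in range(fs)]
square = schmitt_trigger(sine)
half = divide_by_n(square, 2)    # square wave one octave below the input
tenth = divide_by_n(square, 10)  # square wave at one tenth of the input frequency
```

Counting rising edges in `square`, `half`, and `tenth` gives a quick check that the derived square waves sit at the expected fractions of the input frequency.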
Modeling the Frequency-Dependent Sound Energy Decay of Acoustic Environments with Differentiable Feedback Delay Networks
Differentiable machine learning techniques have recently proved effective for finding the parameters of Feedback Delay Networks (FDNs) so that their output matches desired perceptual qualities of target room impulse responses. However, we show that existing methods tend to fail at modeling the frequency-dependent behavior of sound energy decay that characterizes real-world environments unless properly trained. In this paper, we introduce a novel perceptual loss function based on the mel-scale energy decay relief, which generalizes the well-known time-domain energy decay curve to multiple frequency bands. We also augment the prototype FDN by incorporating differentiable wideband attenuation and output filters, and train them via backpropagation along with the other model parameters. The proposed approach improves upon existing strategies for designing and training differentiable FDNs, making it more suitable for audio processing applications where realistic and controllable artificial reverberation is desirable, such as gaming, music production, and virtual reality.
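For readers unfamiliar with the time-domain energy decay curve that the proposed loss generalizes, here is a minimal sketch of the broadband curve computed by Schroeder backward integration. It is not the paper's mel-scale energy decay relief, which would apply the same idea per frequency band. The toy impulse response, sample rate, and reverberation time are assumptions for illustration.

```python
import math

def energy_decay_curve_db(impulse_response):
    """Backward-integrate squared samples; normalize to 0 dB at t = 0.

    Assumes the impulse response has non-zero total energy.
    """
    edc = [0.0] * len(impulse_response)
    running = 0.0
    for i in range(len(impulse_response) - 1, -1, -1):
        running += impulse_response[i] ** 2
        edc[i] = running
    total = edc[0]
    # Clamp to avoid log10(0) when the tail is exactly silent.
    return [10.0 * math.log10(max(e / total, 1e-30)) for e in edc]

# Toy impulse response: exponential decay with an assumed T60 of 0.5 s.
fs = 1000                          # samples per second (assumption)
t60 = 0.5                          # seconds for a 60 dB energy drop (assumption)
decay = 3.0 * math.log(10.0) / t60
ir = [math.exp(-decay * n / fs) for n in range(fs)]
edc_db = energy_decay_curve_db(ir)
```

For this idealized decay the curve falls linearly in dB, reaching about -30 dB at half the assumed T60 (sample 250).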
DDSP-Based Neural Waveform Synthesis of Polyphonic Guitar Performance From String-Wise MIDI Input
We explore the use of neural synthesis for acoustic guitar from string-wise MIDI input. We propose four different systems and compare them with both objective metrics and subjective evaluation against natural audio and a sample-based baseline. We iteratively develop these four systems by making various considerations on the architecture and intermediate tasks, such as predicting pitch and loudness control features. We find that formulating the control feature prediction task as a classification task rather than a regression task yields better results. Furthermore, we find that our simplest proposed system, which directly predicts synthesis parameters from MIDI input, performs the best out of the four proposed systems. Audio examples and code are available.
Parameter Estimation of Frequency-Modulated Sinusoids with the Distribution Derivative Method
Frequency-modulated (FM) sinusoids are commonly used to model signals in several engineering applications, such as radar, sonar, communications, acoustics, and optics. The estimation of the parameters of FM sinusoids is a challenging problem with a long history in the literature. In this article, we use the distribution derivative method (DDM) to estimate the parameters of FM sinusoids in additive white Gaussian noise. Firstly, we derive the estimation of parameters of the model with DDM. Then, we compare the results of Monte-Carlo simulations (MCS) of DDM estimation of FM signals in additive white Gaussian noise against the state of the art (SOTA) and the Cramér-Rao lower bound (CRLB). DDM estimation of FM sinusoids showed performance comparable to the SOTA with less estimation bias. Additionally, DDM estimation of FM sinusoids is simple and straightforward to implement with the fast Fourier transform (FFT) relative to other approaches in the literature. Finally, DDM estimation has effectively the same computational complexity as the FFT.
Topology-Preserving Deformations of Digital Audio
Topology provides global invariants for data as well as spaces of deformation. In this paper, we discuss the deformations of audio signals which preserve topological information specified by sublevel set persistent homology. It is well known that the topological information only changes at extrema. We introduce box snakes as a data structure that captures permissible editing and deformation of signals and preserves the extremal properties of the signal while allowing for monotone deformations between them. The resulting algorithm works on any ordered discrete data and hence can be applied to time- and frequency-domain finite-length audio signals.
Characterisation and Excursion Modelling of Audio Haptic Transducers
Statement and calculation of objective audio haptic transducer performance metrics facilitate optimisation of multi-sensory sound reproduction systems. Measurements of existing haptic transducers are applied to the calculation of a series of performance metrics to demonstrate a means of comparative objective analysis. The frequency response, transient response and moving mass excursion characteristics of each measured transducer are quantified using novel and previously defined metrics. Objective data drawn from a series of practical measurements shows that the proposed metrics and means of excursion modelling applied herein are appropriate for haptic transducer evaluation and protection against over-excursion respectively.
Automatic Equalization for Individual Instrument Tracks Using Convolutional Neural Networks
We propose a novel approach for the automatic equalization of individual musical instrument tracks. Our method begins by identifying the instrument present within a source recording in order to choose its corresponding ideal spectrum as a target. Next, the spectral difference between the recording and the target is calculated, and accordingly, an equalizer matching model is used to predict settings for a parametric equalizer. To this end, we build upon a differentiable parametric equalizer matching neural network, demonstrating improvements relative to the previously established state of the art. Unlike past approaches, we show how our system naturally allows real-world audio data to be leveraged during the training of our matching model, effectively generating suitably produced training targets in an automated manner mirroring conditions at inference time. Consequently, we illustrate how fine-tuning our matching model on such examples considerably improves parametric equalizer matching performance in real-world scenarios, decreasing mean absolute error by 24% relative to methods relying solely on random parameter sampling techniques as a self-supervised learning strategy. We perform listening tests, and demonstrate that our proposed automatic equalization solution subjectively enhances the tonal characteristics for recordings of common instrument types.