Onset-Informed Source Separation Using Non-Negative Matrix Factorization With Binary Masks
This paper describes a new onset-informed source separation method based on non-negative matrix factorization (NMF) with binary masks. Many previous approaches to separating a target instrument sound from polyphonic music have used side information about the target that is time-consuming to prepare. The proposed method leverages the onsets of the target instrument sound to facilitate separation. Onsets are useful information that users can easily generate by tapping while listening to the target in the music. To utilize onsets in NMF-based sound source separation, we introduce binary masks that represent the on/off states of the target sound. The binary masks are formulated as Markov chains based on the continuity of musical instrument sounds. Owing to the binary masks, an onset can be handled as a time frame in which the binary mask changes from the off state to the on state. The proposed model is inferred by Gibbs sampling, in which the target sound source can be sampled efficiently by using its onsets. We conducted experiments to separate the target melody instrument from recorded polyphonic music. Separation results showed an improvement of about 2 to 10 dB in the target source to residual noise ratio compared to the polyphonic sound. Even when some onsets were missed or deviated, the method remained effective for target sound source separation.
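The relation between binary masks and onsets described above can be illustrated with a minimal sketch. It only shows the off-to-on reading of an onset; it is not the paper's NMF model or its Gibbs sampler. The function name `onsets_from_mask` and the convention that a mask starting in the on state counts as an onset at frame 0 are assumptions made for illustration.

```python
def onsets_from_mask(mask):
    """Return the frame indices at which a binary mask switches from off (0) to on (1).

    Assumption for this sketch: the state before frame 0 is "off", so a mask
    that starts in the on state yields an onset at frame 0.
    """
    onsets = []
    prev = 0
    for t, state in enumerate(mask):
        if state == 1 and prev == 0:
            onsets.append(t)
        prev = state
    return onsets


# Hypothetical mask for one target source over nine time frames.
example_mask = [0, 0, 1, 1, 0, 1, 1, 1, 0]
example_onsets = onsets_from_mask(example_mask)  # frames 2 and 5
```

Read in the other direction, a list of tapped onset frames constrains which frames of the mask may switch from off to on.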
Differentiable IIR Filters for Machine Learning Applications
In this paper we present an approach to using traditional digital IIR filter structures inside deep-learning networks trained using backpropagation. We establish the link between such structures and recurrent neural networks. Three different differentiable IIR filter topologies are presented and compared against each other and an established baseline. Additionally, a simple Wiener-Hammerstein model using differentiable IIRs as its filtering component is presented and trained on a guitar signal played through a Boss DS-1 guitar pedal.
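The link between IIR filters and recurrent networks can be seen in a one-pole filter, whose previous output plays the role of a hidden state. The following is a minimal sketch under that assumption, not one of the three topologies from the paper: `one_pole` is a hypothetical helper that runs the recurrence y[n] = b·x[n] + a·y[n-1] and also propagates the sensitivity of each output sample to the feedback coefficient `a`, which is the quantity a gradient-based trainer would need.

```python
def one_pole(x, a, b):
    """Run y[n] = b*x[n] + a*y[n-1] and track d y[n] / d a alongside it.

    The previous output y[n-1] acts like the hidden state of a recurrent cell.
    Differentiating the recurrence with respect to a gives
        d y[n]/d a = y[n-1] + a * d y[n-1]/d a,
    which is propagated through time in the same loop.
    """
    y, dy_da = [], []
    y_prev, g_prev = 0.0, 0.0  # zero initial state (assumption)
    for xn in x:
        g = y_prev + a * g_prev
        yn = b * xn + a * y_prev
        y.append(yn)
        dy_da.append(g)
        y_prev, g_prev = yn, g
    return y, dy_da


# Impulse response with a = 0.5, b = 1 is a**n; its derivative is n * a**(n-1).
impulse_y, impulse_grad = one_pole([1.0, 0.0, 0.0, 0.0], a=0.5, b=1.0)
```

In a deep-learning framework the same recurrence would be written with tensors and the gradient obtained by automatic differentiation; the hand-written sensitivity here only makes the recurrent structure explicit.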
TIV.lib: An Open-Source Library for the Tonal Description of Musical Audio
In this paper, we present TIV.lib, an open-source library for the content-based tonal description of musical audio signals. Its main novelty relies on the perceptually-inspired Tonal Interval Vector space based on the Discrete Fourier transform, from which multiple instantaneous and global representations, descriptors and metrics are computed—e.g., harmonic change, dissonance, diatonicity, and musical key. The library is cross-platform, implemented in Python and the graphical programming language Pure Data, and can be used in both online and offline scenarios. Of note is its potential for enhanced Music Information Retrieval, where tonal descriptors sit at the core of numerous methods and applications.
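As a rough illustration of a DFT-based tonal representation, the sketch below takes a 12-bin chroma vector, normalizes it by its total energy, and keeps DFT coefficients 1 to 6. It does not use TIV.lib's API: the function name `tonal_interval_vector` is hypothetical, and the default unit weights are a placeholder, since the perceptual weights of the Tonal Interval Vector space are not given in the abstract.

```python
import cmath


def tonal_interval_vector(chroma, weights=(1.0, 1.0, 1.0, 1.0, 1.0, 1.0)):
    """Return weighted DFT coefficients k = 1..6 of an energy-normalized chroma vector.

    `weights` is a placeholder (all ones); a perceptually motivated weighting
    would be substituted here.
    """
    total = sum(chroma)
    if total == 0:
        return [0j] * 6
    n = len(chroma)
    tiv = []
    for k in range(1, 7):
        coeff = sum(
            c * cmath.exp(-2j * cmath.pi * k * p / n) for p, c in enumerate(chroma)
        )
        tiv.append(weights[k - 1] * coeff / total)
    return tiv


# A single pitch class has magnitude 1 in every coefficient (with unit weights);
# a flat chroma vector, carrying no tonal information, maps to the origin.
single_note = tonal_interval_vector([1.0] + [0.0] * 11)
flat_chroma = tonal_interval_vector([1.0] * 12)
```

Distances and magnitudes in such a space are what descriptors like harmonic change or dissonance would be built on.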
Recognizing Guitar Effects and Their Parameter Settings
Guitar effects are commonly used in popular music to shape the guitar sound to fit specific genres or to create more variety within musical compositions. The sound is not only determined by the choice of the guitar effect, but also depends heavily on the parameter settings of the effect. This paper introduces a method to estimate the parameter settings of guitar effects, which makes it possible to reconstruct the effect and its settings from an audio recording of a guitar. The method utilizes audio feature extraction and shallow neural networks, which are trained on data created specifically for this task. The results show that the method is generally suited for this task, with average estimation errors between ±5% and ±16% of the respective parameter scales, and could potentially perform near the level of a human expert.
Diet Deep Generative Audio Models With Structured Lottery
Deep learning models have provided extremely successful solutions in most audio application fields. However, the high accuracy of these models comes at the expense of a tremendous computation cost. This aspect is almost always overlooked when evaluating the quality of proposed models, yet models should not be evaluated without taking their complexity into account. This is especially critical in audio applications, which rely heavily on specialized embedded hardware with real-time constraints. In this paper, we build on recent observations that deep models are highly overparameterized by studying the lottery ticket hypothesis on deep generative audio models. This hypothesis states that extremely efficient small sub-networks exist in deep models and would provide higher accuracy than larger models if trained in isolation. However, lottery tickets are found by relying on unstructured masking, which means that the resulting models do not provide any gain in either disk size or inference time. Instead, we develop here a method aimed at performing structured trimming. We show that this requires relying on global selection and introduce a specific criterion based on mutual information. First, we confirm the surprising result that smaller models provide higher accuracy than their large counterparts. We further show that we can remove up to 95% of the model weights without significant degradation in accuracy. Hence, we can obtain very light models for generative audio across popular methods such as Wavenet, SING or DDSP that are up to 100 times smaller with commensurate accuracy. We study the theoretical bounds for embedding these models on Raspberry Pi and Arduino, and show that we can obtain generative models on CPU with quality equivalent to that of large GPU models. Finally, we discuss the possibility of implementing deep generative audio models on embedded platforms.
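The difference between unstructured masking and structured trimming can be sketched on a single weight matrix. This is an illustration under stated assumptions, not the paper's method: the selection criterion here is plain weight magnitude (per weight, or L1 norm per row) rather than the mutual-information criterion, selection is done within one layer rather than globally, and the function names are hypothetical.

```python
def unstructured_mask(weights, keep_ratio):
    """Zero the smallest-magnitude weights. The matrix keeps its shape,
    so storage and dense inference cost are unchanged."""
    flat = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, int(round(keep_ratio * len(flat))))
    threshold = flat[n_keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def structured_prune(weights, keep_ratio):
    """Drop whole rows (one row = one unit) with the smallest L1 norm.
    The matrix itself becomes smaller."""
    scores = [sum(abs(w) for w in row) for row in weights]
    n_keep = max(1, int(round(keep_ratio * len(weights))))
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_keep])  # preserve the original unit order
    return [weights[i] for i in keep]


# Hypothetical 4x2 layer, keeping half of the weights or half of the units.
layer = [[0.9, -0.1], [0.05, 0.02], [-0.7, 0.6], [0.01, 0.3]]
masked = unstructured_mask(layer, 0.5)   # still 4 rows, half the entries zeroed
trimmed = structured_prune(layer, 0.5)   # only 2 rows remain
```

The masked matrix still occupies the full 4x2 storage, which is why unstructured masking alone brings no gain in disk size or inference time, whereas the trimmed matrix is genuinely smaller.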
Identification of Nonlinear Circuits as Port-Hamiltonian Systems
This paper addresses identification of nonlinear circuits for power-balanced virtual analog modeling and simulation. The proposed method combines a port-Hamiltonian system formulation with kernel-based methods to retrieve model laws from measurements. This combination allows for the estimated model to retain physical properties that are crucial for the accuracy of simulations, while representing a variety of nonlinear behaviors. As an illustration, the method is used to identify a nonlinear passive peaking EQ.
Arbitrary-Order IIR Antiderivative Antialiasing
Nonlinear digital circuits and waveshaping are active areas of study, particularly with regard to numerical and aliasing issues. In the past, an effective method was proposed to discretize nonlinear static functions with reduced aliasing based on the antiderivative of the nonlinear function. Such a method is based on continuous-time convolution with an FIR antialiasing filter kernel, such as a rectangular kernel. These kernels, however, are far from optimal for the reduction of aliasing. In this paper we introduce the use of arbitrary IIR rational transfer functions that allow a closer approximation of the ideal antialiasing filter, required in the fictitious continuous-time domain before sampling the nonlinear function output. These allow a higher degree of aliasing reduction and can be flexibly adjusted to balance performance and computational cost.
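For context, the earlier antiderivative method with a rectangular kernel can be sketched for a tanh waveshaper. This is the first-order baseline that the abstract refers to as prior work, not the IIR extension proposed in the paper. With F(x) = log(cosh(x)) as the antiderivative of tanh, the output is the difference quotient (F(x[n]) - F(x[n-1])) / (x[n] - x[n-1]). The function names, the zero initial state, and the tolerance `eps` are assumptions of this sketch.

```python
import math


def log_cosh(x):
    """Antiderivative of tanh, written so that it does not overflow for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def adaa1_tanh(x, eps=1e-6):
    """First-order antiderivative antialiasing of tanh (rectangular kernel).

    When consecutive samples are nearly equal the difference quotient is
    ill-conditioned, so tanh is evaluated at the midpoint instead.
    """
    y = []
    x_prev = 0.0  # assumed initial state
    f_prev = log_cosh(x_prev)
    for xn in x:
        f_n = log_cosh(xn)
        dx = xn - x_prev
        if abs(dx) > eps:
            yn = (f_n - f_prev) / dx
        else:
            yn = math.tanh(0.5 * (xn + x_prev))
        y.append(yn)
        x_prev, f_prev = xn, f_n
    return y


# Hypothetical test signal: a step from 0 to 1 that then holds.
step_response = adaa1_tanh([0.0, 1.0, 1.0])
```

The difference quotient equals the average of tanh over the straight line joining two consecutive samples, which corresponds to the rectangular-kernel filtering mentioned in the abstract.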
An Equivalent Circuit Interpretation of Antiderivative Antialiasing
The recently proposed antiderivative antialiasing (ADAA) technique for stateful systems involves two key features: 1) replacing a nonlinearity in a physical model or virtual analog simulation with an antialiased nonlinear system involving antiderivatives of the nonlinearity and time delays and 2) introducing a digital filter in cascade with each original delay in the system. Both of these features introduce the same delay, which is compensated by adjusting the sampling period. The result is a simulation with reduced aliasing distortion. In this paper, we study ADAA using equivalent circuits, answering the question: “Which electrical circuit, discretized using the bilinear transform, yields the ADAA system?” This gives us a new way of looking at the stability of ADAA and how introducing extra filtering distorts a system’s response. We focus on the Wave Digital Filter (WDF) version of this technique.
Non-Iterative Schemes for the Simulation of Nonlinear Audio Circuits
In this work, a number of numerical schemes are presented in the context of virtual-analog simulation. The schemes are linearly implicit in character, and hence directly solvable without iterative methods. Schemes of increasing order of accuracy are constructed, and convergence and stability conditions are proven formally. The schemes are able to handle stiff problems very efficiently, because of their fast update, and can be run at higher sample rates to reduce aliasing. The cases of the diode clipper and ring modulator are investigated in detail, including several numerical examples.
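To show what "linearly implicit" means in practice, the sketch below applies a linearly implicit Euler step (a one-stage Rosenbrock-type update) to a common diode clipper model, dv/dt = (u - v)/(RC) - (2·Is/C)·sinh(v/Vt). This is a generic textbook scheme and not necessarily any of the schemes constructed in the paper. The component values and the function name are illustrative assumptions. Each step solves one linear equation in the state increment, so no Newton iteration is needed.

```python
import math


def diode_clipper_linearly_implicit(u, fs=48000.0, R=1e3, C=33e-9, Is=2.52e-9, Vt=25.85e-3):
    """Simulate a diode clipper with a linearly implicit Euler update.

    Per sample: (1 - k*J(v_n)) * (v_{n+1} - v_n) = k * f(v_n, u_n),
    where J is the derivative of f with respect to v. J is always negative
    here, so the division below is always well defined.
    Component values are illustrative assumptions, not taken from the paper.
    """
    k = 1.0 / fs
    v = 0.0  # assumed initial capacitor voltage
    out = []
    for un in u:
        f = (un - v) / (R * C) - (2.0 * Is / C) * math.sinh(v / Vt)
        J = -1.0 / (R * C) - (2.0 * Is / (C * Vt)) * math.cosh(v / Vt)
        v = v + k * f / (1.0 - k * J)  # closed-form solve, no iteration
        out.append(v)
    return out


# Hypothetical inputs: silence, and a 1 V DC level held for 2000 samples.
silent = diode_clipper_linearly_implicit([0.0] * 16)
clipped = diode_clipper_linearly_implicit([1.0] * 2000)
```

With the values assumed here, the 1 V input settles to a fraction of a volt, as expected from the diode pair limiting the output.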
Applications of Port Hamiltonian Methods to Non-Iterative Stable Simulations of the Korg35 and Moog 4-Pole VCF
This paper presents an application of the port Hamiltonian formalism to the nonlinear simulation of the OTA-based Korg35 filter circuit and the Moog 4-pole ladder filter circuit. Lyapunov analysis is used with their state-space representations to guarantee zero-input stability over the range of parameters consistent with the actual circuits. A zero-input stable non-iterative discrete-time scheme based on a discrete gradient and a change of state variables is shown along with numerical simulations. Simulations show behavior consistent with the actual operation of the circuits, e.g., self-oscillation, and are found to be stable and have lower computational cost compared to iterative methods.