Onset-Informed Source Separation Using Non-Negative Matrix Factorization With Binary Masks
This paper describes a new onset-informed source separation
method based on non-negative matrix factorization (NMF) with binary masks. Many previous approaches to separating a target instrument sound from polyphonic music have used side information about
the target that is time-consuming to prepare. The proposed method
leverages the onsets of the target instrument sound to facilitate separation. Onsets are useful information that users can easily generate by tapping while listening to the target in music. To utilize
onsets in NMF-based sound source separation, we introduce binary masks that represent on/off states of the target sound. Binary
masks are formulated as Markov chains based on continuity of musical instrument sound. Owing to the binary masks, onsets can be
handled as a time frame in which the binary masks change from
off to on state. The proposed model is inferred by Gibbs sampling, in which the target sound source can be sampled efficiently
by using its onsets. We conducted experiments to separate the target melody instrument from recorded polyphonic music. Separation results showed an improvement of about 2 to 10 dB in the target-source-to-residual-noise ratio compared to the polyphonic sound. Even when
some onsets were missed or deviated from their true positions, the method remained effective
for target sound source separation.
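
The role of the binary masks can be illustrated with a minimal sketch. It assumes that a mask gates each NMF activation element-wise, so that the spectrogram is approximated as X[f][t] = sum_k W[f][k] * H[k][t] * S[k][t]; this gating form and the function names are illustrative assumptions, and the Markov-chain prior and Gibbs sampler of the paper are not reproduced here. The first function shows how onsets fall out of a mask as off-to-on transitions.

```python
def onsets_from_mask(mask):
    """Return the frame indices where a binary mask switches from off (0) to on (1)."""
    onsets = []
    previous = 0  # assumption: the target is silent before the first frame
    for t, state in enumerate(mask):
        if previous == 0 and state == 1:
            onsets.append(t)
        previous = state
    return onsets


def masked_nmf_reconstruction(W, H, S):
    """Reconstruct X[f][t] = sum_k W[f][k] * H[k][t] * S[k][t].

    W: F x K basis spectra, H: K x T activations, S: K x T binary masks.
    """
    n_bins, n_bases, n_frames = len(W), len(H), len(H[0])
    return [
        [sum(W[f][k] * H[k][t] * S[k][t] for k in range(n_bases))
         for t in range(n_frames)]
        for f in range(n_bins)
    ]


mask = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
print(onsets_from_mask(mask))  # [2, 7]
```

Read in the other direction, a user-tapped onset at frame t constrains the mask to be off at t-1 and on at t, which is the information the sampler can exploit.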

Differentiable IIR Filters for Machine Learning Applications
In this paper we present an approach to using traditional digital IIR
filter structures inside deep-learning networks trained using backpropagation. We establish the link between such structures and
recurrent neural networks. Three different differentiable IIR filter
topologies are presented and compared against each other and an
established baseline. Additionally, a simple Wiener-Hammerstein
model using differentiable IIRs as its filtering component is presented and trained on a guitar signal played through a Boss DS-1
guitar pedal.
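
The link between IIR structures and recurrent networks can be seen in a minimal sketch of a one-pole filter, y[n] = b0 * x[n] - a1 * y[n-1], whose state update is a linear recurrent cell. The gradients below are obtained by differentiating the recurrence sample by sample (forward accumulation), which yields the same values that backpropagation through time would. The function names and the mean-squared-error loss are assumptions made for illustration, not the topologies compared in the paper.

```python
def one_pole(x, b0, a1):
    """One-pole IIR filter: y[n] = b0 * x[n] - a1 * y[n-1], with y[-1] = 0."""
    y, state = [], 0.0
    for sample in x:
        state = b0 * sample - a1 * state
        y.append(state)
    return y


def loss_and_grads(x, target, b0, a1):
    """Mean squared error of the filter output and its gradients w.r.t. b0 and a1."""
    n = len(x)
    y_prev = dy_db0 = dy_da1 = 0.0
    loss = grad_b0 = grad_a1 = 0.0
    for sample, wanted in zip(x, target):
        # Chain rule through the recurrent state.
        new_dy_db0 = sample - a1 * dy_db0
        new_dy_da1 = -y_prev - a1 * dy_da1
        y = b0 * sample - a1 * y_prev
        err = y - wanted
        loss += err * err / n
        grad_b0 += 2.0 * err * new_dy_db0 / n
        grad_a1 += 2.0 * err * new_dy_da1 / n
        y_prev, dy_db0, dy_da1 = y, new_dy_db0, new_dy_da1
    return loss, grad_b0, grad_a1


x = [1.0, 0.5, -0.3, 0.8, 0.0, -1.0]
target = one_pole(x, 0.5, -0.4)          # output of a hypothetical reference filter
print(loss_and_grads(x, target, 0.3, -0.2))
```

These gradients can drive any gradient-based optimizer; in a deep-learning framework the same recurrence would simply be written with differentiable tensor operations.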

TIV.lib: An Open-Source Library for the Tonal Description of Musical Audio
In this paper, we present TIV.lib, an open-source library for the
content-based tonal description of musical audio signals. Its main
novelty lies in the perceptually inspired Tonal Interval Vector
space based on the discrete Fourier transform, from which multiple instantaneous and global representations, descriptors, and metrics are computed, such as harmonic change, dissonance, diatonicity, and musical key. The library is cross-platform, implemented
in Python and the graphical programming language Pure Data, and
can be used in both online and offline scenarios. Of note is its
potential for enhanced Music Information Retrieval, where tonal
descriptors sit at the core of numerous methods and applications.
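
As a rough illustration of a DFT-based tonal space, the sketch below maps a 12-bin chroma vector to six weighted DFT coefficients (k = 1 to 6) after normalizing the chroma by its sum. The function names are hypothetical and do not reflect the TIV.lib API, and the unit weights are placeholders: the perceptually derived weights of the Tonal Interval Vector space are not given in the abstract and are not reproduced here.

```python
import cmath


def tonal_interval_vector(chroma, weights=(1.0,) * 6):
    """Weighted DFT coefficients k = 1..6 of a sum-normalized 12-bin chroma vector."""
    total = sum(chroma)
    if total == 0:
        return [0j] * 6  # silence maps to the origin of the space
    normalized = [c / total for c in chroma]
    return [
        weights[k - 1] * sum(
            c * cmath.exp(-2j * cmath.pi * k * n / 12)
            for n, c in enumerate(normalized)
        )
        for k in range(1, 7)
    ]


def tiv_distance(a, b):
    """Euclidean distance between two vectors in the six-dimensional complex space."""
    return sum(abs(p - q) ** 2 for p, q in zip(a, b)) ** 0.5


c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]   # pitch classes C, E, G
g_major = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]   # pitch classes G, B, D
print(tiv_distance(tonal_interval_vector(c_major), tonal_interval_vector(g_major)))
```

Transposing a chroma vector only rotates the phases of the coefficients, so their magnitudes are transposition-invariant, while distances between vectors depend on the phase relations.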

Recognizing Guitar Effects and Their Parameter Settings
Guitar effects are commonly used in popular music to shape the
guitar sound to fit specific genres or to create more variety within
musical compositions. The sound is not only determined by the
choice of the guitar effect, but also heavily depends on the parameter settings of the effect. This paper introduces a method to
estimate the parameter settings of guitar effects, which makes it
possible to reconstruct the effect and its settings from an audio
recording of a guitar. The method utilizes audio feature extraction and shallow neural networks, which are trained on data created specifically for this task. The results show that the method
is generally suited for this task, with average estimation errors ranging from ±5% to ±16% of the respective parameter scales, and could potentially
perform near the level of a human expert.

Diet Deep Generative Audio Models With Structured Lottery
Deep learning models have provided extremely successful solutions in most audio application fields. However, the high accuracy
of these models comes at the expense of a tremendous computational cost. This aspect is almost always overlooked in evaluating the
quality of proposed models, even though models should not be evaluated without taking their complexity into account. It
is especially critical in audio applications, which rely heavily on
specialized embedded hardware with real-time constraints.
In this paper, we build on recent observations that deep models are highly overparameterized, by studying the lottery ticket hypothesis on deep generative audio models. This hypothesis states
that extremely efficient small sub-networks exist in deep models
and would provide higher accuracy than larger models if trained in
isolation. However, lottery tickets are found by relying on unstructured masking, which means that resulting models do not provide
any gain in either disk size or inference time. Instead, we develop
here a method aimed at performing structured trimming. We show
that this requires global selection, and we introduce a specific criterion based on mutual information.
First, we confirm the surprising result that smaller models provide higher accuracy than their large counterparts. We further
show that we can remove up to 95% of the model weights without significant degradation in accuracy. Hence, we can obtain very
light generative audio models across popular methods such as
WaveNet, SING, or DDSP that are up to 100 times smaller with
commensurate accuracy. We study the theoretical bounds for embedding these models on Raspberry Pi and Arduino, and show that
we can obtain generative models on CPU with equivalent quality
as large GPU models. Finally, we discuss the possibility of implementing deep generative audio models on embedded platforms.
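
The difference between unstructured masking and structured trimming can be sketched as follows. Unstructured masking zeroes individual weights and leaves every matrix at its original shape, whereas structured trimming removes whole units (rows), ranked globally across layers, so the matrices actually shrink. The mean absolute weight used as the ranking score is a simple stand-in chosen for illustration; the criterion introduced in the paper is based on mutual information and is not reproduced here. In a real network, removing a unit also requires removing the matching input column of the following layer, which this sketch omits.

```python
def unstructured_mask(weights, keep_ratio):
    """Zero out the smallest-magnitude weights; the matrix shape is unchanged."""
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, round(keep_ratio * len(magnitudes)))
    threshold = magnitudes[n_keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def unit_scores(layer):
    """Stand-in criterion: mean absolute weight of each unit (row)."""
    return [sum(abs(w) for w in row) / len(row) for row in layer]


def structured_trim(layers, keep_ratio):
    """Keep the best-scoring units, selected globally over all layers."""
    ranked = sorted(
        ((score, li, ui)
         for li, layer in enumerate(layers)
         for ui, score in enumerate(unit_scores(layer))),
        reverse=True,
    )
    n_keep = max(1, round(keep_ratio * len(ranked)))
    kept = {(li, ui) for _, li, ui in ranked[:n_keep]}
    return [
        [row for ui, row in enumerate(layer) if (li, ui) in kept]
        for li, layer in enumerate(layers)
    ]


layers = [
    [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4]],   # layer 0: three units
    [[0.03, 0.01], [1.0, -1.2]],               # layer 1: two units
]
print(structured_trim(layers, 0.6))
```

Because the selection is global, layers are not forced to keep the same fraction of units: here layer 0 keeps two of three units and layer 1 keeps one of two.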

Identification of Nonlinear Circuits as Port-Hamiltonian Systems
This paper addresses identification of nonlinear circuits for
power-balanced virtual analog modeling and simulation. The proposed method combines a port-Hamiltonian system formulation
with kernel-based methods to retrieve model laws from measurements. This combination allows for the estimated model to retain
physical properties that are crucial for the accuracy of simulations,
while representing a variety of nonlinear behaviors. As an illustration, the method is used to identify a nonlinear passive peaking
EQ.

Arbitrary-Order IIR Antiderivative Antialiasing
Nonlinear digital circuits and waveshaping are active areas of study,
particularly with regard to numerical and aliasing issues. In
the past, an effective method was proposed to discretize nonlinear
static functions with reduced aliasing based on the antiderivative of
the nonlinear function. Such a method is based on the continuous-time convolution with an FIR antialiasing filter kernel, such as a
rectangular kernel. These kernels, however, are far from optimal
for the reduction of aliasing. In this paper we introduce the use
of arbitrary IIR rational transfer functions that allow a closer approximation of the ideal antialiasing filter, required in the fictitious continuous-time domain before sampling the nonlinear function output. These allow a higher degree of aliasing reduction and
can be flexibly adjusted to balance performance and computational
cost.
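
For context, the earlier rectangular-kernel method mentioned above can be sketched for a tanh waveshaper. With a rectangular kernel, each output sample is the difference quotient of the antiderivative F(x) = log(cosh(x)) between consecutive input samples, with a fallback to the nonlinearity itself when the samples nearly coincide. This is a minimal sketch of that first-order FIR case only; the IIR kernels introduced in the paper are not reproduced, and the function names and tolerance value are assumptions.

```python
import math


def tanh_antiderivative(x):
    """F(x) = log(cosh(x)), written to avoid overflow for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def adaa1_tanh(x, tolerance=1e-6):
    """First-order antiderivative antialiasing of tanh (rectangular kernel)."""
    y = []
    x_prev = 0.0  # assumption: the input is zero before the first sample
    f_prev = tanh_antiderivative(x_prev)
    for x_now in x:
        f_now = tanh_antiderivative(x_now)
        diff = x_now - x_prev
        if abs(diff) < tolerance:
            # The quotient is ill-conditioned here; evaluate tanh at the midpoint.
            y.append(math.tanh(0.5 * (x_now + x_prev)))
        else:
            y.append((f_now - f_prev) / diff)
        x_prev, f_prev = x_now, f_now
    return y


print(adaa1_tanh([0.2, 1.5, -0.7, -0.7]))
```

By the mean value theorem, each output lies between the tanh values of the two samples it spans, which is why the result behaves like a smoothed version of the plain waveshaper. The rectangular kernel also introduces a half-sample delay.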

An Equivalent Circuit Interpretation of Antiderivative Antialiasing
The recently proposed antiderivative antialiasing (ADAA) technique for stateful systems involves two key features: 1) replacing a nonlinearity in a physical model or virtual analog simulation
with an antialiased nonlinear system involving antiderivatives of
the nonlinearity and time delays and 2) introducing a digital filter
in cascade with each original delay in the system. Both of these
features introduce the same delay, which is compensated by adjusting the sampling period. The result is a simulation with reduced
aliasing distortion. In this paper, we study ADAA using equivalent
circuits, answering the question: “Which electrical circuit, discretized using the bilinear transform, yields the ADAA system?”
This gives us a new way of looking at the stability of ADAA and
how introducing extra filtering distorts a system’s response. We
focus on the Wave Digital Filter (WDF) version of this technique.

Non-Iterative Schemes for the Simulation of Nonlinear Audio Circuits
In this work, a number of numerical schemes are presented in the
context of virtual-analog simulation. The schemes are linearly implicit in character, and hence directly solvable without iterative
methods. Schemes of increasing order of accuracy are constructed,
and convergence and stability conditions are proven formally. The
schemes are able to handle stiff problems very efficiently, because
of their fast update, and can be run at higher sample rates to reduce
aliasing. The cases of the diode clipper and ring modulator are
investigated in detail, including several numerical examples.
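
To illustrate what "linearly implicit" means in practice, the sketch below applies a generic linearised backward Euler step to a common first-order diode clipper model, dx/dt = (u - x)/(RC) - (2 Is/C) sinh(x/Vt). Each sample requires one division rather than an iterative solve. This is a textbook first-order scheme used as a stand-in, not one of the schemes constructed in the paper, and the component values are typical assumptions rather than values taken from it.

```python
import math


def diode_clipper(u, fs=48000.0, R=1e3, C=33e-9, Is=2.52e-9, Vt=25.85e-3):
    """Diode clipper simulated with a linearised backward Euler step per sample.

    Backward Euler reads x_next = x + k * f(x_next). Replacing f(x_next) by
    f(x) + J * (x_next - x) gives (1 - k * J) * (x_next - x) = k * f(x),
    which is solved directly. Assumes inputs of a few volts at most; very
    large inputs can overflow sinh in this simple form.
    """
    k = 1.0 / fs
    x = 0.0
    out = []
    for u_n in u:
        f = (u_n - x) / (R * C) - (2.0 * Is / C) * math.sinh(x / Vt)
        jac = -1.0 / (R * C) - (2.0 * Is / (C * Vt)) * math.cosh(x / Vt)
        x = x + k * f / (1.0 - k * jac)  # jac < 0, so the divisor exceeds 1
        out.append(x)
    return out


# 1 kHz sine input with 1 V amplitude, 10 ms long
signal = [math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) for n in range(480)]
clipped = diode_clipper(signal)
print(max(clipped), min(clipped))
```

Since the Jacobian of this circuit is always negative, the divisor never approaches zero, which is what makes the update well defined at every step without iteration.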

Applications of Port Hamiltonian Methods to Non-Iterative Stable Simulations of the Korg35 and Moog 4-Pole VCF
This paper presents an application of the port Hamiltonian formalism to the nonlinear simulation of the OTA-based Korg35 filter circuit and the Moog 4-pole ladder filter circuit. Lyapunov analysis is
used with their state-space representations to guarantee zero-input
stability over the range of parameters consistent with the actual
circuits. A zero-input stable non-iterative discrete-time scheme
based on a discrete gradient and a change of state variables is
shown along with numerical simulations. Simulations show behavior consistent with the actual operation of the circuits, e.g.,
self-oscillation, and are found to be stable and have lower computational cost compared to iterative methods.