Damped Chirp Mixture Estimation via Nonlinear Bayesian Regression
Estimating mixtures of damped chirp sinusoids in noise is a
problem that affects audio analysis, coding, and synthesis applications. Phase-based non-stationary parameter estimators assume
that sinusoids can be resolved in the Fourier transform domain,
whereas high-resolution methods estimate superimposed components with accuracy close to the theoretical limits, but only for
sinusoids with constant frequencies. We present a new method
for estimating the parameters of superimposed damped chirps that
has an accuracy competitive with existing non-stationary estimators and a high resolution like that of subspace techniques. After providing the analytical expression for a Gaussian-windowed
damped chirp signal’s Fourier transform, we propose an efficient
variational EM algorithm for nonlinear Bayesian regression that
jointly estimates the amplitudes, phases, frequencies, chirp rates,
and decay rates of multiple non-stationary components that may be
obfuscated under the same local maximum in the frequency spectrum. Quantitative results show that the new method not only has
an estimation accuracy that is close to the Cramér-Rao bound, but
also a high resolution that outperforms the state-of-the-art.
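To make the signal model in the abstract above concrete, the following is a minimal sketch of a noiseless mixture of damped chirps. The parameterization (amplitude, initial phase, frequency in Hz, chirp rate in Hz/s, decay rate in 1/s) and the function names `damped_chirp` and `mixture` are assumptions chosen for illustration; the paper may define the model differently, and the sketch does not implement the variational EM estimator.

```python
import cmath
import math

def damped_chirp(n_samples, amp, phase, freq, chirp_rate, decay, fs=1.0):
    """Return complex samples of one damped chirp.

    Assumed model (illustrative, not taken from the paper):
    amp * exp(-decay * t) * exp(j * (phase + 2*pi*(freq*t + 0.5*chirp_rate*t**2)))
    """
    out = []
    for n in range(n_samples):
        t = n / fs
        inst_phase = phase + 2 * math.pi * (freq * t + 0.5 * chirp_rate * t * t)
        out.append(amp * math.exp(-decay * t) * cmath.exp(1j * inst_phase))
    return out

def mixture(n_samples, components, fs=1.0):
    """Sum several damped chirps sample by sample."""
    total = [0j] * n_samples
    for params in components:
        for n, value in enumerate(damped_chirp(n_samples, fs=fs, **params)):
            total[n] += value
    return total

# Two components only 5 Hz apart, so they would likely share one spectral peak.
components = [
    dict(amp=1.0, phase=0.0, freq=440.0, chirp_rate=50.0, decay=3.0),
    dict(amp=0.5, phase=1.0, freq=445.0, chirp_rate=-30.0, decay=5.0),
]
signal = mixture(1024, components, fs=8000.0)
```

An estimator of the kind described in the abstract would take a noisy version of `signal` and recover the five parameters of each component.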
An Audio-Visual Fusion Piano Transcription Approach Based on Strategy
Piano transcription is a fundamental problem in the field of music
information retrieval. At present, a large number of transcription
studies are based mainly on audio or video alone, while only a small
number discuss audio-visual fusion. In this paper,
a piano transcription model based on strategy fusion is proposed,
in which the transcription results of the video model are used to assist audio transcription. Because datasets for
audio-visual fusion are currently lacking, the OMAPS dataset is also proposed in this paper. Our strategy fusion model achieves an F1
score of 92.07% on the OMAPS dataset. The transcription model based on feature fusion is also compared with the one based on strategy fusion.
The experimental results show that the transcription model based on
strategy fusion achieves better results than the one based on feature
fusion.
Spherical Decomposition of Arbitrary Scattering Geometries for Virtual Acoustic Environments
A method is proposed to encode the acoustic scattering of objects for virtual acoustic applications through a multiple-input and
multiple-output framework. The scattering is encoded as a matrix in the spherical harmonic domain, and can be re-used and
manipulated (rotated, scaled and translated) to synthesize various
sound scenes. The proposed method is applied and validated using
Boundary Element Method simulations, which show close agreement between reference and synthesized results. The method is compatible
with existing frameworks such as Ambisonics and image source
methods.
Interacting With Digital Audio Effects Through a Haptic Knob With Programmable Resistance
Live music performances and music production often involve the
manipulation of several parameters during sound generation, processing, and mixing. In hardware layouts, those parameters are
usually controlled using knobs, sliders and buttons. When these
layouts are virtualized, the use of physical (e.g. MIDI) controllers
can make interaction easier and reduce the cognitive load associated with sound manipulation. The addition of haptic feedback can
further improve such interaction by facilitating the detection of the
nature (continuous / discrete) and value of a parameter. To this
end, we have realized an endless-knob controller prototype with
programmable resistance to rotation, able to render various haptic effects. Ten subjects assessed the effectiveness of the provided
haptic feedback in a target-matching task where either visual-only
or visual-haptic feedback was provided; the experiment reported
significantly lower errors in the presence of haptic feedback. Finally,
the knob was configured as a multi-parametric controller for a
real-time audio effect software written in Python, simulating the
voltage-controlled filter aboard the EMS VCS3. The integration
of the sound algorithm and the haptic knob is discussed, together
with various haptic feedback effects in response to control actions.
Quality Diversity for Synthesizer Sound Matching
It is difficult to adjust the parameters of a complex synthesizer to
create the desired sound. As such, sound matching, the estimation of synthesis parameters that can replicate a certain sound, is
a task that has often been researched, utilizing optimization methods such as genetic algorithms (GAs). In this paper, we introduce a
novelty-based objective for GA-based sound matching. Our contribution is two-fold. First, we show that the novelty objective is
able to improve the quality of sound matching by maintaining phenotypic diversity in the population. Second, we introduce a quality diversity approach to the problem of sound matching, aiming
to find a diverse set of matching sounds. We show that the novelty objective is effective in producing high-performing solutions
that are diverse in terms of specified audio features. This approach
allows for a new way of discovering sounds and exploring the capabilities of a synthesizer.
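A novelty objective of the kind mentioned in the abstract above is commonly computed as the mean distance from a candidate to its k nearest neighbours in a feature space. The sketch below follows that common formulation; the function name `novelty`, the choice of Euclidean distance, and the example feature vectors are assumptions for illustration, not details taken from the paper.

```python
import math

def novelty(candidate, others, k=3):
    """Mean Euclidean distance from `candidate` to its k nearest neighbours.

    `candidate` and each element of `others` are equal-length feature vectors
    (for example, audio descriptors of a rendered synthesizer patch).
    """
    if not others:
        raise ValueError("need at least one other individual to compare against")
    dists = sorted(math.dist(candidate, other) for other in others)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

# Hypothetical 2-D feature vectors for a small population.
population = [(3.0, 4.0), (0.0, 1.0), (1.0, 0.0)]
score = novelty((0.0, 0.0), population, k=2)
```

In a GA, a score like this would typically be combined with, or optimized alongside, the usual sound-matching error so that the population does not collapse onto near-identical patches.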
Bio-Inspired Optimization of Parametric Onset Detectors
Onset detectors are used to recognize the beginning of musical
events in audio signals. Manual parameter tuning for onset detectors is a time-consuming task, while existing automated approaches often maximize only a single performance metric. These
automated approaches cannot be used to optimize detector algorithms for complex scenarios, such as real-time onset detection
where an optimization process must consider both detection accuracy and latency. For this reason, a flexible optimization algorithm
should account for more than one performance metric in a multi-objective manner. This paper presents a generalized procedure for
automated optimization of parametric onset detectors. Our procedure employs a bio-inspired evolutionary computation algorithm
to replace manual parameter tuning, followed by the computation
of the Pareto frontier for multi-objective optimization. The proposed approach was evaluated on all the onset detection methods
of the Aubio library, using a dataset of monophonic acoustic guitar
recordings. Results show that the proposed solution is effective in
reducing the human effort required in the optimization process: it
replaced more than two days of manual parameter tuning with 13
hours and 34 minutes of automated computation. Moreover, the
resulting performance was comparable to that obtained by manual
optimization.
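The Pareto frontier step mentioned in the abstract above can be illustrated for the two objectives it names, detection accuracy and latency. This is a minimal brute-force sketch; the function name `pareto_front`, the tuple layout, and the example values are hypothetical and are not results from the paper.

```python
def pareto_front(points):
    """Return the non-dominated (accuracy, latency) pairs.

    Accuracy is maximized and latency is minimized. A point is dominated if
    another point is at least as good on both objectives and strictly better
    on at least one.
    """
    front = []
    for i, (acc_i, lat_i) in enumerate(points):
        dominated = any(
            (acc_j >= acc_i and lat_j <= lat_i) and (acc_j > acc_i or lat_j < lat_i)
            for j, (acc_j, lat_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((acc_i, lat_i))
    return front

# Hypothetical (F-measure, latency in ms) pairs for four parameter settings.
candidates = [(0.90, 20), (0.85, 10), (0.80, 15), (0.90, 25)]
front = pareto_front(candidates)
```

Each point on the front represents a parameter setting for which no other setting is both more accurate and faster, so the final choice depends on how much latency the application tolerates.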
Adaptive Pitch-Shifting With Applications to Intonation Adjustment in a Cappella Recordings
A central challenge for a cappella singers is to adjust their intonation and to stay in tune relative to their fellow singers. During
editing of a cappella recordings, one may want to adjust local intonation of individual singers or account for global intonation drifts
over time. This requires applying a time-varying pitch-shift to the
audio recording, which we refer to as adaptive pitch-shifting. In
this context, existing (semi-)automatic approaches are either labor-intensive or face technical and musical limitations. In this work,
we present automatic methods and tools for adaptive pitch-shifting
with applications to intonation adjustment in a cappella recordings. To this end, we show how to incorporate time-varying information into existing pitch-shifting algorithms that are based on
resampling and time-scale modification (TSM). Furthermore, we
release an open-source Python toolbox, which includes a variety
of TSM algorithms and an implementation of our method. Finally,
we show the potential of our tools through two case studies on global
and local intonation adjustment in a cappella recordings using a
publicly available multitrack dataset of amateur choral singing.
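The abstract above describes pitch-shifting built from resampling and time-scale modification. The sketch below illustrates only the resampling half with a time-varying shift given in cents: it reads through the input at a variable speed using linear interpolation. The function names are hypothetical, this is not the authors' toolbox, and the duration change it introduces is what a TSM stage would have to compensate for.

```python
def cents_to_ratio(cents):
    """Convert a pitch shift in cents to a frequency ratio (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)

def varispeed(samples, cents_per_sample):
    """Resample `samples` with a time-varying pitch shift.

    `cents_per_sample[n]` is the shift applied when producing output sample n.
    The read position advances by the corresponding ratio, so positive shifts
    shorten the output and negative shifts lengthen it.
    """
    out = []
    pos = 0.0
    for cents in cents_per_sample:
        i = int(pos)
        if i + 1 >= len(samples):
            break  # ran out of input to interpolate between
        frac = pos - i
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        pos += cents_to_ratio(cents)
    return out

# A constant +1200 cent shift reads every second input sample.
ramp = [0, 1, 2, 3, 4, 5, 6, 7]
shifted = varispeed(ramp, [1200] * 10)
```

For intonation adjustment, `cents_per_sample` would come from the difference between a singer's measured pitch trajectory and the target trajectory.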
Combining Zeroth and First-Order Analysis With Lagrange Polynomials to Reduce Artefacts in Live Concatenative Granulation
This paper presents a technique addressing signal discontinuity and concatenation artefacts in real-time granular processing
with rectangular windowing. By combining zero-crossing synchronicity, first-order derivative analysis, and Lagrange polynomials, we can generate streams of uncorrelated and non-overlapping
sonic fragments with minimal discontinuities in the low-order derivatives. The resulting open-source algorithm, implemented in the
Faust language, provides versatile real-time software for dynamical looping, wavetable oscillation, and granulation with reduced artefacts due to rectangular windowing and no artefacts
from overlap-add-to-one techniques commonly deployed in granular processing.
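Two of the ingredients named in the abstract above, zero-crossing synchronicity and first-order derivative analysis, can be illustrated with a small offline sketch. It is written in Python rather than Faust, the function names are hypothetical, and it omits the Lagrange polynomial step, so it should be read as an illustration of the idea and not as the published algorithm.

```python
def upward_zero_crossings(x):
    """Indices n where the signal crosses zero going upward between n and n+1."""
    return [n for n in range(len(x) - 1) if x[n] < 0.0 <= x[n + 1]]

def best_splice(x, end_index, candidates):
    """Pick the candidate grain start whose slope best matches the slope at `end_index`.

    The slope is approximated by the first-order difference x[n+1] - x[n].
    Matching it keeps the first derivative nearly continuous across the splice.
    """
    end_slope = x[end_index + 1] - x[end_index]
    return min(candidates, key=lambda n: abs((x[n + 1] - x[n]) - end_slope))

# Toy signal with three upward zero crossings of different steepness.
x = [-1.0, 1.0, -1.0, -0.5, 0.5, -0.2, 0.1]
crossings = upward_zero_crossings(x)
start = best_splice(x, end_index=3, candidates=[0, 5])
```

Cutting grains at zero crossings addresses the zeroth-order discontinuity, and selecting crossings with similar slopes addresses the first-order one.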
The Role of Modal Excitation in Colorless Reverberation
A perceptual study revealing a novel connection between modal
properties of feedback delay networks (FDNs) and colorless reverberation is presented. The coloration of the reverberation tail
is quantified by the modal excitation distribution derived from the
modal decomposition of the FDN. A homogeneously decaying allpass FDN is designed to be colorless such that the corresponding narrow modal excitation distribution leads to a high perceived
modal density. Synthetic modal excitation distributions are generated to match modal excitations of FDNs. Three listening tests
were conducted to demonstrate the correlation between the modal
excitation distribution and the perceived degree of coloration. A
fourth test shows a significant reduction of coloration by the colorless FDN compared to other FDN designs. The novel connection of modal excitation, allpass FDNs, and perceived coloration
presents a beneficial design criterion for colorless artificial reverberation.
Object-Based Synthesis of Scraping and Rolling Sounds Based on Non-Linear Physical Constraints
Sustained contact interactions like scraping and rolling produce a
wide variety of sounds. Previous studies have explored ways to
synthesize these sounds efficiently and intuitively but could not
fully mimic the rich structure of real instances of these sounds.
We present a novel source-filter model for realistic synthesis of
scraping and rolling sounds with physically and perceptually relevant controllable parameters constrained by principles of mechanics. Key features of our model include non-linearities to constrain
the contact force, naturalistic normal force variation for different
motions, and a method for morphing impulse responses within a
material to achieve location-dependence. Perceptual experiments
show that the presented model is able to synthesize realistic scraping and rolling sounds while conveying physical information similar to that in recorded sounds.