A Database of Partial Tracks for Evaluation of Sinusoidal Models
This paper presents a database of partial tracks extracted from synthetic as well as pre-recorded musical signals, designed to serve as an ancillary tool for the evaluation of sinusoidal analysis algorithms. To this end, the database requirements have been carefully specified, and a semi-automatic analysis methodology has been employed to ensure that the track parameters are precisely estimated. The overall methodology is validated by applying performance tests to the synthetic source signals.
Parametric Spatial Audio Effects Based on the Multi-Directional Decomposition of Ambisonic Sound Scenes
Decomposing a sound-field into its individual components and respective parameters can represent a convenient first step towards offering the user an intuitive means of controlling spatial audio effects and sound-field modification tools. The majority of such tools available today, however, are instead limited to linear combinations of signals or employ a basic single-source parametric model. Therefore, the purpose of this paper is to present a parametric framework, which seeks to overcome these limitations by first dividing the sound-field into its multi-source and ambient components based on estimated spatial parameters. It is then demonstrated that by manipulating the spatial parameters prior to reproducing the scene, a number of sound-field modification and spatial audio effects may be realised, including directional warping, listener translation, sound source tracking, spatial editing workflows and spatial side-chaining. Many of the effects described have also been implemented as real-time audio plug-ins, in order to demonstrate how a user may interact with such tools in practice.
A supervised learning approach to ambience extraction from mono recordings for blind upmixing
A supervised learning approach to ambience extraction from one-channel audio signals is presented. The extracted ambient signals are used for the blind upmixing of musical audio recordings to surround sound formats. The input signal is processed by means of short-term spectral attenuation. The spectral weights are computed using a low-level feature extraction process and a neural network regression method. The multi-channel audio signal is generated by feeding the computed ambient signal into the rear channels of a surround sound system.
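The signal flow described in this abstract (short-term spectral attenuation, then routing the ambient signal to the rear channels) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectral weights are taken as given rather than computed by feature extraction and neural network regression, a naive DFT over a single frame stands in for a windowed short-term transform, and all function names as well as the front-channel routing are hypothetical.

```python
import cmath


def dft(frame):
    """Naive discrete Fourier transform of one short frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; only the real part is kept."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def extract_ambience(frame, weights):
    """Attenuate each frequency bin by a weight in [0, 1].

    `weights` is a placeholder for the output of the regression stage;
    it should be symmetric around the Nyquist bin for a real-valued result.
    """
    spectrum = dft(frame)
    return idft([w * s for w, s in zip(weights, spectrum)])


def upmix_frame(frame, weights):
    """Feed the ambient estimate to the rear channels.

    Passing the unprocessed input to the front channels is an assumption
    made for this example only.
    """
    ambient = extract_ambience(frame, weights)
    return {
        "front_left": list(frame),
        "front_right": list(frame),
        "rear_left": ambient,
        "rear_right": list(ambient),
    }
```

With all weights equal to 1 the ambient estimate reproduces the input frame, and with all weights equal to 0 the rear channels are silent; a trained model would produce weights between these extremes for each bin and frame.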
The Role of Modal Excitation in Colorless Reverberation
A perceptual study revealing a novel connection between modal properties of feedback delay networks (FDNs) and colorless reverberation is presented. The coloration of the reverberation tail is quantified by the modal excitation distribution derived from the modal decomposition of the FDN. A homogeneously decaying allpass FDN is designed to be colorless such that the corresponding narrow modal excitation distribution leads to a high perceived modal density. Synthetic modal excitation distributions are generated to match modal excitations of FDNs. Three listening tests were conducted to demonstrate the correlation between the modal excitation distribution and the perceived degree of coloration. A fourth test shows a significant reduction of coloration by the colorless FDN compared to other FDN designs. The novel connection of modal excitation, allpass FDNs, and perceived coloration presents a beneficial design criterion for colorless artificial reverberation.
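For readers unfamiliar with the FDN structure this abstract refers to, the sketch below shows a generic four-line feedback delay network with homogeneous decay. It is only an illustration of the basic structure: it does not implement the allpass design or the modal decomposition studied in the paper, and the scaled Hadamard feedback matrix, the function name, and the parameter names are assumptions chosen for the example.

```python
def fdn_impulse_response(delays, rt60_s, fs, n_samples):
    """Impulse response of a 4-line FDN with homogeneous decay.

    delays  : four delay lengths in samples (hypothetical values)
    rt60_s  : target reverberation time in seconds
    fs      : sample rate in Hz
    """
    hadamard = [
        [1, 1, 1, 1],
        [1, -1, 1, -1],
        [1, 1, -1, -1],
        [1, -1, -1, 1],
    ]
    # Scaling by 1/2 makes the 4x4 matrix orthogonal (lossless feedback).
    feedback = [[0.5 * v for v in row] for row in hadamard]

    # Per-sample gain giving -60 dB after rt60_s seconds; raising it to
    # each delay length makes every line decay at the same rate.
    g = 10.0 ** (-3.0 / (rt60_s * fs))
    gains = [g ** m for m in delays]

    buffers = [[0.0] * m for m in delays]  # one circular buffer per line
    pos = [0] * 4
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        # Read the oldest sample of each line and apply its decay gain.
        s = [buffers[i][pos[i]] * gains[i] for i in range(4)]
        out.append(sum(s))
        # Mix the line outputs through the feedback matrix.
        fb = [sum(feedback[i][j] * s[j] for j in range(4)) for i in range(4)]
        for i in range(4):
            buffers[i][pos[i]] = x + fb[i]
            pos[i] = (pos[i] + 1) % delays[i]
    return out
```

Because each line's gain is the per-sample gain raised to its delay length, all modes of this network share the same decay rate, which is what "homogeneously decaying" refers to; the distribution of modal excitations discussed in the abstract is not computed here.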
Making Sounds with Numbers: A Tutorial on Music Software Dedicated to Digital Audio
A (partial) taxonomy of software applications devoted to sounds is presented. For each category of software applications, an abstract model is proposed and actual implementations are evaluated with respect to this model.
Blind Upmix for Applause-like Signals Based on Perceptual Plausibility Criteria
Applause is the result of many individuals rhythmically clapping their hands. Applause recordings exhibit a certain temporal, timbral and spatial structure: claps originating from a distinct direction (i.e., from a particular person) usually have a similar timbre and occur in a quasi-periodic repetition. Traditional approaches to blind mono-to-stereo upmix do not consider these properties and may therefore produce an output whose suboptimal perceptual quality can be attributed to a lack of plausibility. In this paper, we propose a blind upmixing approach for applause-like signals which aims at preserving the natural structure of applause signals by incorporating the periodicity and timbral similarity of claps into the upmix process, thereby supporting the plausibility of the artificially generated spatial scene. The proposed upmix approach is evaluated by means of a subjective preference listening test.
Wave Pulse Phase Modulation: Hybridising Phase Modulation and Phase Distortion
This paper introduces Wave Pulse Phase Modulation (WPPM), a novel synthesis technique based on phase shaping. It combines two classic digital synthesis techniques, Phase Modulation (PM) and Phase Distortion (PD), aiming to overcome their respective limitations while enabling the creation of new, interesting timbres. It works by segmenting a phase signal into two regions, each independently driving the phase of a modulator waveform. This results in two distinct pulses per period that together form the signal used as the phase input to a carrier waveform, similar to PM, hence the name Wave Pulse Phase Modulation. This method provides a minimal set of parameters that enable the creation of complex, evolving waveforms and rich dynamic textures. By modulating these parameters, WPPM can produce a wide range of interesting spectra, including those with formant-like resonant peaks. The paper examines PM and PD in detail, exploring the modifications needed to integrate them with WPPM, before presenting the full WPPM algorithm alongside its parameters and creative possibilities. Finally, it discusses the scope for further research into, and development of, similar new phase-shaping algorithms.
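As background for the two classic techniques this abstract names, the sketch below shows textbook forms of phase modulation and of phase distortion with a single-knee piecewise-linear phase warp. It does not implement WPPM, whose details are not given in the abstract; the function names, the `knee` parameter, and the choice of a negated cosine as the PD read-out waveform are assumptions made for illustration.

```python
import math


def pm_sample(carrier_phase, modulator_phase, index):
    """Classic phase modulation: a sine modulator offsets the carrier phase.

    Phases are normalised to [0, 1); `index` is the modulation index
    in radians.
    """
    modulation = index * math.sin(2.0 * math.pi * modulator_phase)
    return math.sin(2.0 * math.pi * carrier_phase + modulation)


def pd_warp(phase, knee):
    """Piecewise-linear phase warp with one knee point, 0 < knee < 1.

    The first half of the output phase is traversed while the input phase
    moves from 0 to `knee`, the second half from `knee` to 1.
    """
    if phase < knee:
        return 0.5 * phase / knee
    return 0.5 + 0.5 * (phase - knee) / (1.0 - knee)


def pd_sample(phase, knee):
    """Phase distortion: read a cosine with the warped phase."""
    return -math.cos(2.0 * math.pi * pd_warp(phase, knee))


def render_period(sample_fn, n_points):
    """Evaluate `sample_fn(phase)` at n_points evenly spaced phases."""
    return [sample_fn(i / n_points) for i in range(n_points)]
```

With `knee = 0.5` the warp is the identity and `pd_sample` yields a plain negated cosine; moving the knee towards 0 compresses the first half-cycle and brightens the spectrum. Setting `index = 0` in `pm_sample` likewise reduces it to an unmodulated sine.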
Equalizing Loudspeakers in Reverberant Environments Using Deep Convolutive Dereverberation
Loudspeaker equalization is an established topic in the literature, and currently many techniques are available to address most practical use cases. However, most of these rely on accurate measurements of the loudspeaker in an anechoic environment, which in some cases is not feasible. This is the case, for example, with custom digital organs, whose loudspeakers are built into a large and geometrically complex piece of furniture. Such an instrument may be too heavy and large to be transported to a measurement room, or may require a very large one, making traditional impulse response measurements impractical for most users. In this work we propose a method to find the inverse of the sound emission system in a reverberant environment, based on a Deep Learning dereverberation algorithm. The method is agnostic to the room characteristics and can thus be conducted in an automated fashion in any environment. A real use case is discussed and results are provided, showing the effectiveness of the approach in designing filters that closely match the magnitude response of the ideal inverting filters.
Balancing Error and Latency of Black-Box Models for Audio Effects Using Hardware-Aware Neural Architecture Search
In this paper, we address automating and systematizing the process of finding black-box models for virtual analogue audio effects with an optimal balance between error and latency. We introduce a multi-objective optimization approach based on hardware-aware neural architecture search, which allows the balance between model error and latency to be specified according to the requirements of the application. By using a regularized evolutionary algorithm, the approach is able to navigate a huge search space systematically. Additionally, we propose a search space for modelling non-linear dynamic audio effects consisting of over 41 trillion different WaveNet-style architectures. We evaluate the performance and usefulness of the approach by showing that it yields highly effective architectures, either up to 18× faster or with a test loss up to 56% lower than the best performing models of the related work, while still showing a favourable trade-off. We conclude that hardware-aware neural architecture search is a valuable tool that can help researchers and engineers develop virtual analogue models by automating the architecture design and saving time by avoiding manual search and evaluation through trial and error.
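The regularized evolutionary search mentioned in this abstract can be illustrated with a generic aging-evolution loop: sample a few candidates, mutate the best of the sample, add the child, and discard the oldest member. The sketch below is not the paper's method or search space. The toy architecture (a pair of layer and channel counts), the stand-in error and latency formulas, and the weighted-sum objective controlled by `error_weight` are all assumptions made so that the example runs without training any model.

```python
import collections
import random


def regularized_evolution(evaluate, random_arch, mutate,
                          population_size=20, sample_size=5,
                          cycles=200, seed=0):
    """Aging evolution: the oldest member is removed each cycle.

    Returns the (score, architecture) pair with the lowest score seen.
    """
    rng = random.Random(seed)
    population = collections.deque()
    history = []
    while len(population) < population_size:
        arch = random_arch(rng)
        item = (evaluate(arch), arch)
        population.append(item)
        history.append(item)
    for _ in range(cycles):
        sample = rng.sample(list(population), sample_size)
        parent = min(sample, key=lambda it: it[0])
        child = mutate(parent[1], rng)
        item = (evaluate(child), child)
        population.append(item)
        population.popleft()  # drop the oldest, not the worst
        history.append(item)
    return min(history, key=lambda it: it[0])


def toy_random_arch(rng):
    """Hypothetical architecture: (layers, channels)."""
    return (rng.randint(1, 10), rng.randint(1, 32))


def toy_mutate(arch, rng):
    """Change one dimension by one step, clamped to the toy bounds."""
    layers, channels = arch
    if rng.random() < 0.5:
        layers = min(10, max(1, layers + rng.choice((-1, 1))))
    else:
        channels = min(32, max(1, channels + rng.choice((-1, 1))))
    return (layers, channels)


def make_toy_score(error_weight):
    """Weighted sum of stand-in error and latency terms (assumed form)."""
    def score(arch):
        size = arch[0] * arch[1]
        error = 1.0 / size      # larger toy models fit better
        latency = 0.01 * size   # larger toy models run slower
        return error_weight * error + (1.0 - error_weight) * latency
    return score
```

Calling `regularized_evolution(make_toy_score(0.8), toy_random_arch, toy_mutate)` favours low error, while a smaller `error_weight` shifts the result towards low latency. In a hardware-aware setting the latency term would come from measurements on the target device rather than from a formula.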