Spherical Decomposition of Arbitrary Scattering Geometries for Virtual Acoustic Environments

A method is proposed to encode the acoustic scattering of objects for virtual acoustic applications through a multiple-input, multiple-output framework. The scattering is encoded as a matrix in the spherical harmonic domain and can be re-used and manipulated (rotated, scaled, and translated) to synthesize various sound scenes. The proposed method is applied and validated using Boundary Element Method simulations, which show close agreement between reference and synthesized results. The method is compatible with existing frameworks such as Ambisonics and image source methods.
The Role of Modal Excitation in Colorless Reverberation

A perceptual study revealing a novel connection between the modal properties of feedback delay networks (FDNs) and colorless reverberation is presented. The coloration of the reverberation tail is quantified by the modal excitation distribution derived from the modal decomposition of the FDN. A homogeneously decaying allpass FDN is designed to be colorless, such that the corresponding narrow modal excitation distribution leads to a high perceived modal density. Synthetic modal excitation distributions are generated to match the modal excitations of FDNs. Three listening tests were conducted to demonstrate the correlation between the modal excitation distribution and the perceived degree of coloration. A fourth test shows a significant reduction of coloration by the colorless FDN compared to other FDN designs. The novel connection between modal excitation, allpass FDNs, and perceived coloration provides a beneficial design criterion for colorless artificial reverberation.
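To make the FDN structure concrete, the following is a minimal sketch of a homogeneously decaying feedback delay network: several delay lines coupled by an orthogonal (Householder) feedback matrix, with a single uniform loop gain so that all modes decay at the same rate. The delay lengths and gain are illustrative choices, not the parameters used in the paper.

```python
import numpy as np

def fdn_impulse_response(delays, gain, n_samples):
    """Minimal feedback delay network: N delay lines coupled by an
    orthogonal Householder feedback matrix. A single loop gain < 1
    applied after the energy-preserving mixing gives every mode the
    same decay rate (homogeneous decay)."""
    n = len(delays)
    # Householder matrix: orthogonal, so the mixing itself preserves energy.
    A = np.eye(n) - (2.0 / n) * np.ones((n, n))
    buffers = [np.zeros(d) for d in delays]
    ptrs = [0] * n
    out = np.zeros(n_samples)
    for t in range(n_samples):
        taps = np.array([buffers[i][ptrs[i]] for i in range(n)])
        out[t] = taps.sum()
        x = 1.0 if t == 0 else 0.0  # unit impulse fed to every line
        fed = gain * (A @ taps) + x
        for i in range(n):
            buffers[i][ptrs[i]] = fed[i]
            ptrs[i] = (ptrs[i] + 1) % len(buffers[i])
    return out

# Mutually prime delay lengths (illustrative) and a uniform decay gain.
ir = fdn_impulse_response([149, 211, 263, 293], gain=0.97, n_samples=8000)
```

Because the feedback matrix is orthogonal, the only energy loss per loop pass comes from the scalar gain, which is what makes the decay homogeneous across modes.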
Parametric Spatial Audio Effects Based on the Multi-Directional Decomposition of Ambisonic Sound Scenes

Decomposing a sound-field into its individual components and respective parameters can represent a convenient first step towards offering the user an intuitive means of controlling spatial audio effects and sound-field modification tools. The majority of such tools available today, however, are instead limited to linear combinations of signals or employ a basic single-source parametric model. Therefore, the purpose of this paper is to present a parametric framework which seeks to overcome these limitations by first dividing the sound-field into its multi-source and ambient components based on estimated spatial parameters. It is then demonstrated that by manipulating the spatial parameters prior to reproducing the scene, a number of sound-field modifications and spatial audio effects may be realised, including directional warping, listener translation, sound source tracking, spatial editing workflows, and spatial side-chaining. Many of the effects described have also been implemented as real-time audio plug-ins, in order to demonstrate how a user may interact with such tools in practice.
One Billion Audio Sounds From GPU-Enabled Modular Synthesis

We release synth1B1, a multi-modal audio corpus consisting of one billion 4-second synthesized sounds, paired with the synthesis parameters used to generate them. The dataset is 100x larger than any audio dataset in the literature. We also introduce torchsynth, an open-source modular synthesizer that generates the synth1B1 samples on-the-fly at 16200x faster than real-time (714MHz) on a single GPU. We additionally release two new audio datasets: FM synth timbre and subtractive synth pitch. Using these datasets, we demonstrate new rank-based evaluation criteria for existing audio representations. Finally, we propose a novel approach to synthesizer hyperparameter optimization.
A Generative Model for Raw Audio Using Transformer Architectures

This paper proposes a novel way of doing audio synthesis at the waveform level using Transformer architectures. We propose a deep neural network for generating waveforms, similar to WaveNet. It is fully probabilistic, auto-regressive, and causal, i.e. each generated sample depends only on the previously observed samples. Our approach outperforms a widely used WaveNet architecture by up to 9% on a similar dataset for next-step prediction. Using the attention mechanism, we enable the architecture to learn which audio samples are important for predicting the future sample. We show how causal Transformer generative models can be used for raw waveform synthesis. We also show that this performance can be improved by a further 2% by conditioning samples over a wider context. The flexibility of the current model to synthesize audio from latent representations suggests a large number of potential applications. The novel approach of using generative Transformer architectures for raw audio synthesis is, however, still far from generating any meaningful music comparable to WaveNet without using latent codes/meta-data to aid the generation process.
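The causality property described above is enforced in a Transformer by masking the attention scores so that position t can only attend to positions up to t. A minimal single-head sketch of that mechanism (not the paper's network, whose architecture and dimensions are not given here) looks like this:

```python
import numpy as np

def causal_attention(Q, K, V):
    """Single-head scaled dot-product attention with a causal mask:
    position t may only attend to positions <= t, which is what makes
    an autoregressive sample-level model valid. Illustrative sketch."""
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # future positions
    scores[mask] = -np.inf                            # excluded from softmax
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))   # 6 time steps, 4-dim features
Y = causal_attention(X, X, X)
```

With the mask in place, the first output position can only attend to itself, so its output equals its own value vector; this is the property that lets generation proceed one sample at a time.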
Combining Zeroth and First-Order Analysis With Lagrange Polynomials to Reduce Artefacts in Live Concatenative Granulation

This paper presents a technique addressing signal discontinuity and concatenation artefacts in real-time granular processing with rectangular windowing. By combining zero-crossing synchronicity, first-order derivative analysis, and Lagrange polynomials, we can generate streams of uncorrelated and non-overlapping sonic fragments with minimal low-order derivative discontinuities. The resulting open-source algorithm, implemented in the Faust language, provides versatile real-time software for dynamic looping, wavetable oscillation, and granulation, with reduced artefacts due to rectangular windowing and no artefacts from the overlap-add-to-one techniques commonly deployed in granular processing.
Alloy Sounds: Non-Repeating Sound Textures With Probabilistic Cellular Automata

Contemporary musicians commonly face the challenge of finding new, characteristic sounds that can make their compositions more distinct. They often resort to computers and algorithms, which can significantly aid creative processes by generating unexpected material through controlled probabilistic processes. In particular, algorithms that exhibit emergent behaviors, like genetic algorithms and cellular automata, have fostered a broad diversity of musical explorations. This article proposes an original technique for the computer-assisted creation and manipulation of sound textures. The technique uses Probabilistic Cellular Automata, which are as yet seldom explored in the music domain, to blend two audio tracks into a third, different one. The proposed blending process works by dividing the source tracks into frequency bands and then associating each of the automaton's cells with a frequency band. Only one source, chosen by the cell's state, is active within each band. The resulting track has a non-repeating textural pattern that follows the changes in the cellular automaton. This blending process allows the musician to choose the original material and the blend granularity, significantly changing the resulting blends. We demonstrate how to use the proposed blending process in sound design and its application in experimental and popular music.
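The band-per-cell blending idea can be sketched as follows: each cell of a one-dimensional probabilistic automaton owns a frequency band and selects, per frame, which source is audible there. The update rule below (copy a random neighbour with some probability) is an illustrative stand-in, not the paper's actual rule, and the band/frame counts are arbitrary choices:

```python
import numpy as np

def pca_blend(track_a, track_b, n_bands=8, frame=1024, p_flip=0.2, seed=0):
    """Blend two tracks via a probabilistic cellular automaton: each
    cell controls one frequency band and selects which source fills
    that band in each frame. Illustrative sketch of the idea, not the
    paper's exact automaton."""
    rng = np.random.default_rng(seed)
    n = min(len(track_a), len(track_b)) // frame * frame
    a = track_a[:n].reshape(-1, frame)
    b = track_b[:n].reshape(-1, frame)
    state = rng.integers(0, 2, n_bands)   # 0 -> source A, 1 -> source B
    out_frames = []
    for fa, fb in zip(a, b):
        A, B = np.fft.rfft(fa), np.fft.rfft(fb)
        edges = np.linspace(0, len(A), n_bands + 1).astype(int)
        C = np.empty_like(A)
        for k in range(n_bands):          # fill each band from one source only
            src = B if state[k] else A
            C[edges[k]:edges[k + 1]] = src[edges[k]:edges[k + 1]]
        out_frames.append(np.fft.irfft(C, frame))
        # Probabilistic update: each cell copies a neighbour with prob. p_flip.
        flips = rng.random(n_bands) < p_flip
        state = np.where(flips, np.roll(state, rng.choice([-1, 1])), state)
    return np.concatenate(out_frames)

sr = 8000
t = np.arange(2 * sr) / sr
mix = pca_blend(np.sin(2 * np.pi * 330 * t), np.sin(2 * np.pi * 523 * t))
```

Because the automaton never settles into a fixed cycle, the band assignment keeps drifting, which is what produces the non-repeating texture described in the abstract.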
Graph-Based Audio Looping and Granulation

In this paper we describe similarity graphs computed from time-frequency analysis as a guide for audio playback, with the aim of extending the content of fixed recordings in creative applications. We explain the creation of the graph from the distance between spectral frames, as well as several features computed from the graph, such as methods for onset detection, beat detection, and cluster analysis. Several playback algorithms can be devised based on conditional pruning of the graph using these methods. We describe examples for looping, granulation, and automatic montage.
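The graph construction can be sketched directly: compute a magnitude spectrum per frame, take pairwise distances, and link frames closer than a threshold; playback may then jump along any edge. The frame size, distance metric, and threshold below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def similarity_graph(signal, frame=512, threshold=0.1):
    """Build a frame-similarity graph: an edge links two frames whose
    normalized magnitude spectra are closer than a threshold, so a
    playback cursor may jump between them without an audible break.
    Illustrative sketch of the construction."""
    n = len(signal) // frame
    frames = signal[:n * frame].reshape(n, frame)
    mags = np.abs(np.fft.rfft(frames, axis=1))
    # Normalize so the distance ignores overall frame energy.
    mags /= np.linalg.norm(mags, axis=1, keepdims=True) + 1e-12
    dist = np.linalg.norm(mags[:, None, :] - mags[None, :, :], axis=2)
    return (dist < threshold) & ~np.eye(n, dtype=bool)   # adjacency matrix

sr, frame = 8000, 512
f = 28 * sr / frame            # integer cycles per frame -> identical frames
t = np.arange(sr) / sr
adj = similarity_graph(np.sin(2 * np.pi * f * t), frame)
```

For a steady tone every frame matches every other, so the graph is fully connected and any jump loops seamlessly; on real material, pruning edges (e.g. by onset or cluster membership, as the paper proposes) shapes which jumps are allowed.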
Topologizing Sound Synthesis via Sheaves

In recent years, a range of topological methods have emerged for processing digital signals. In this paper we show how the construction of topological filters via sheaves can be used to topologize existing sound synthesis methods. We illustrate this process on two classes of synthesis approaches: (1) those based on linear time-invariant digital filters and (2) those based on oscillators defined on a circle. We use the computationally friendly approach of modeling topologies via a simplicial complex, and we attach our classical synthesis methods to them via sheaves. In particular, we explore examples of simplicial topologies that mimic sampled lines and loops. Over these spaces we realize concrete examples of simple discrete harmonic oscillators (resonant filters) and simple comb-filter-based algorithms (such as Karplus-Strong), as well as frequency modulation.
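For reference, the classical (untopologized) Karplus-Strong algorithm that the paper attaches to simplicial loops is a noise-filled delay line with a two-point averaging filter in the feedback path. A minimal sketch:

```python
import numpy as np

def karplus_strong(freq, sr, dur, seed=0):
    """Classical Karplus-Strong plucked string: a noise burst circulates
    in a delay line whose feedback applies a two-point average, damping
    high frequencies a little on every pass. This is the plain version,
    not the sheaf-theoretic construction described in the paper."""
    rng = np.random.default_rng(seed)
    delay = int(sr / freq)                 # delay length sets the pitch
    buf = rng.uniform(-1, 1, delay)        # initial noise excitation
    n = int(sr * dur)
    out = np.empty(n)
    for i in range(n):
        out[i] = buf[i % delay]
        # Averaging lowpass in the loop: the tone darkens as it decays.
        buf[i % delay] = 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
    return out

tone = karplus_strong(220, 8000, 1.0)
```

Viewed through the paper's lens, the delay line is a sampled loop (a circular simplicial complex) and the comb-filter recursion is the data attached to it.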
Bio-Inspired Optimization of Parametric Onset Detectors

Onset detectors are used to recognize the beginning of musical events in audio signals. Manual parameter tuning for onset detectors is a time-consuming task, while existing automated approaches often maximize only a single performance metric. These automated approaches cannot be used to optimize detector algorithms for complex scenarios, such as real-time onset detection, where an optimization process must consider both detection accuracy and latency. For this reason, a flexible optimization algorithm should account for more than one performance metric in a multi-objective manner. This paper presents a generalized procedure for the automated optimization of parametric onset detectors. Our procedure employs a bio-inspired evolutionary computation algorithm to replace manual parameter tuning, followed by the computation of the Pareto frontier for multi-objective optimization. The proposed approach was evaluated on all the onset detection methods of the Aubio library, using a dataset of monophonic acoustic guitar recordings. Results show that the proposed solution is effective in reducing the human effort required in the optimization process: it replaced more than two days of manual parameter tuning with 13 hours and 34 minutes of automated computation. Moreover, the resulting performance was comparable to that obtained by manual optimization.
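The Pareto-frontier step mentioned above is straightforward to state: among candidate parameter settings scored on several objectives (say, detection error and latency, both to be minimized), keep only the settings that no other setting beats on every objective. A minimal sketch, with made-up example scores:

```python
import numpy as np

def pareto_front(points):
    """Return the indices of Pareto-optimal points when every objective
    is minimized: a point survives if no other point is at least as
    good in all objectives and strictly better in at least one."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(
            np.all(pts <= p, axis=1) & np.any(pts < p, axis=1)
        )
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical (detection error, latency in ms) per parameter setting.
front = pareto_front([(0.10, 30), (0.05, 50), (0.20, 20), (0.12, 40)])
```

Here the last setting is dominated (the first is better on both objectives), while the first three trade error against latency and all survive; the user then picks from the frontier according to the scenario's accuracy/latency priorities.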