A Model for Adaptive Reduced-Dimensionality Equalisation We present a method for mapping between the input space of a parametric equaliser and a lower-dimensional representation, whilst preserving the effect’s dependency on the incoming audio signal. The model consists of a parameter weighting stage, in which the parameters are scaled to spectral features of the audio signal, followed by a mapping process, in which the equaliser’s 13 inputs are converted to (x, y) coordinates. The model is trained with parameter space data representing two timbral adjectives (warm and bright), measured across a range of musical instrument samples, allowing users to impose a semantically meaningful timbral modification using the lower-dimensional interface. We test 10 mapping techniques, comprising dimensionality reduction and reconstruction methods, and show that a stacked autoencoder algorithm exhibits the lowest parameter reconstruction variance, thus providing an accurate map between the input and output space. We demonstrate that the model provides an intuitive method for controlling the audio effect’s parameter space, whilst accurately reconstructing the trajectories of each parameter and adapting to the incoming audio spectrum.
High-level musical control paradigms for Digital Signal Processing No matter how complex DSP algorithms are or how rich the sonic processes they produce, the issue of their control arises as soon as they are used by musicians, regardless of their knowledge of the underlying mathematics or their degree of familiarity with the design of digital instruments. This text analyzes the problem of controlling DSP modules from a compositional standpoint. An implementation of some paradigms in a Lisp-based environment (omChroma) is also briefly discussed.
Characterisation of Acoustic Scenes Using a Temporally-constrained Shift-invariant Model In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used for extracting time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated into the SIPLCA formulation to enforce temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are carried out using a corpus of train station recordings, classified into 6 scene classes. Results show that the proposed model is able to model salient events within a scene and outperforms the non-negative matrix factorization algorithm on the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.
Bio-Inspired Optimization of Parametric Onset Detectors Onset detectors are used to recognize the beginning of musical events in audio signals. Manual parameter tuning for onset detectors is a time-consuming task, while existing automated approaches often maximize only a single performance metric. These automated approaches cannot be used to optimize detector algorithms for complex scenarios, such as real-time onset detection, where an optimization process must consider both detection accuracy and latency. For this reason, a flexible optimization algorithm should account for more than one performance metric in a multi-objective manner. This paper presents a generalized procedure for automated optimization of parametric onset detectors. Our procedure employs a bio-inspired evolutionary computation algorithm to replace manual parameter tuning, followed by the computation of the Pareto frontier for multi-objective optimization. The proposed approach was evaluated on all the onset detection methods of the Aubio library, using a dataset of monophonic acoustic guitar recordings. Results show that the proposed solution is effective in reducing the human effort required in the optimization process: it replaced more than two days of manual parameter tuning with 13 hours and 34 minutes of automated computation. Moreover, the resulting performance was comparable to that obtained by manual optimization.
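The Pareto-frontier step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two metrics (an F-measure to maximize and a latency in milliseconds to minimize), the function names, and the example scores are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Each candidate is (f_measure, latency_ms): higher F-measure is better,
# lower latency is better. Both metric choices are illustrative assumptions.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, sorted by increasing latency."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(set(front), key=lambda c: c[1])


if __name__ == "__main__":
    # Made-up scores for five hypothetical parameter settings.
    scores = [(0.80, 12.0), (0.85, 20.0), (0.78, 15.0), (0.90, 35.0), (0.84, 25.0)]
    print(pareto_front(scores))
```

Every setting left on the front is a defensible trade-off: no other setting is both more accurate and faster. Choosing among them remains a matter of application requirements.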
Rumbator: a Flamenco Rumba Cover Version Generator Based on Audio Processing at Note Level In this article, a scheme to automatically generate polyphonic flamenco rumba versions from monophonic melodies is presented. First, we provide an analysis of the parameters that define the flamenco rumba, and then we propose a method for transforming a generic monophonic audio signal into that style. Our method first transcribes the monophonic audio signal into a symbolic representation, and then applies a set of note-level audio transformations based on music theory to the monophonic audio signal in order to transform it into the polyphonic flamenco rumba style. Some audio examples produced by this transformation software are also provided.
A Direct Microdynamics Adjusting Processor with Matching Paradigm and Differentiable Implementation In this paper, we propose a new processor capable of directly changing the microdynamics of an audio signal primarily via a single dedicated user-facing parameter. The novelty of our processor is that it has built into it a measure of relative level, a short-term signal strength measurement which is robust to changes in signal macrodynamics. The resulting dynamic range processing is signal level-independent in nature, and attempts to directly alter the observed relative level measurements. The inclusion of such a meter within our proposed processor also gives rise to a natural solution to the dynamics matching problem, where we attempt to transfer the microdynamic characteristics of one audio recording to another by estimating appropriate settings for the processor. We suggest a means of providing a reasonable initial guess for processor settings, followed by an efficient iterative algorithm to refine our estimates. Additionally, we implement the processor as a differentiable recurrent layer and show its effectiveness when wrapped around a gradient descent optimizer within a deep learning framework. Moreover, we illustrate that the proposed processor has more favorable gradient characteristics relative to a conventional dynamic range compressor. Throughout, we consider extensions of the processor, matching algorithm, and differentiable implementation for the multiband case.
Controlling a Non Linear Friction Model for Evocative Sound Synthesis Applications In this paper, a flexible strategy to control a synthesis model of sounds produced by non linear friction phenomena is proposed for guidance or musical purposes. It makes it possible to synthesize different types of sounds, such as a creaky door, a singing glass or a squeaking wet plate. This approach is based on the action/object paradigm, which allows a synthesis strategy built on classical linear filtering techniques (source/resonance approach) that provide an efficient implementation. Within this paradigm, a sound can be considered as the result of an action (e.g. impacting, rubbing, ...) on an object (plate, bowl, ...). However, in the case of non linear friction phenomena, simulating the physical coupling between the action and the object with a completely decoupled source/resonance model is a real and relevant challenge. To meet this challenge, we propose to use a synthesis model of the source that is tuned on recorded sounds according to physical and spectral observations. This model makes it possible to synthesize many types of non linear behaviors. A control strategy for the model is then proposed by defining a flexible, physically informed mapping between a descriptor and the non linear synthesis behavior. Finally, potential applications to the remediation of motor diseases are presented. In all sections, video and audio materials are available at the following URL: http://www.lma.cnrs-mrs.fr/~kronland/thoretDAFx2013/
Fluently Remixing Musical Objects with Higher-Order Functions Soon after the Echo Nest Remix API was made publicly available and open source, the primary author began aggressively enhancing the Python framework for re-editing music based on perceptually-based musical analyses. The basic principles of this API (integrating content-based metadata with the underlying signal) are described in the paper, followed by the authors’ enhancements. The libraries moved from supporting an imperative coding style to incorporating influences from functional programming and domain specific languages, allowing a much more fluent, terse coding style in which users can concentrate on the functions needed to find the interesting portions of a song and modify them. The paper then goes on to describe enhancements involving mixing multiple sources with one another and enabling user-created and user-modifiable effects that are controlled by direct manipulation of the objects that represent the sound. At the end of the paper, revelations that the Remix API does not need to be as integrated as it currently is point to future directions for the API.
Implementing Real-Time Partitioned Convolution Algorithms on Conventional Operating Systems We describe techniques for implementing real-time partitioned convolution algorithms on conventional operating systems using two different scheduling paradigms: time-distributed (cooperative) and multi-threaded (preemptive). We discuss the optimizations applied to both implementations and present measurements of their performance for a range of impulse response lengths on a recent high-end desktop machine. We find that while the time-distributed implementation is better suited for use as a plugin within a host audio application, the preemptive version was easier to implement and significantly outperforms the time-distributed version despite the overhead of frequent context switches.
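The core idea of partitioned convolution can be shown with a minimal offline sketch: the impulse response is split into blocks, each block is convolved with the input separately, and the partial results are summed at each block's delay. This is not the paper's implementation. It ignores real-time scheduling entirely, and it uses direct time-domain convolution per partition only to stay dependency-free, whereas practical implementations typically use FFTs. All function names are hypothetical.

```python
from typing import List


def direct_convolve(x: List[float], h: List[float]) -> List[float]:
    """Plain time-domain convolution, used here as the reference result."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def partitioned_convolve(x: List[float], h: List[float], block: int) -> List[float]:
    """Convolve x with h by splitting h into partitions of `block` samples.

    Each partition is convolved with x on its own, and its output is added
    into the result starting at the partition's offset (its delay).
    """
    if block <= 0:
        raise ValueError("block must be positive")
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(h), block):
        partial = direct_convolve(x, h[start:start + block])
        for n, value in enumerate(partial):
            y[start + n] += value
    return y


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    impulse = [1.0, 0.0, -1.0, 2.0, 0.5, 3.0, 1.0]
    print(partitioned_convolve(signal, impulse, block=3))
    print(direct_convolve(signal, impulse))
```

Because convolution is linear, the partitioned result matches the direct one up to floating-point rounding. What partitioning buys in a real-time setting is that later partitions can be computed with larger blocks and looser deadlines, which is where the choice of scheduling paradigm becomes relevant.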