DAFx Paper Archive - Search for machine learning, page 24 of 32

A Model for Adaptive Reduced-Dimensionality Equalisation

Spyridon Stasis; Ryan Stables; Jason Hockman

DAFx-2015 - Trondheim

We present a method for mapping between the input space of a parametric equaliser and a lower-dimensional representation, whilst preserving the effect’s dependency on the incoming audio signal. The model consists of a parameter weighting stage in which the parameters are scaled to spectral features of the audio signal, followed by a mapping process, in which the equaliser’s 13 inputs are converted to (x, y) coordinates. The model is trained with parameter space data representing two timbral adjectives (warm and bright), measured across a range of musical instrument samples, allowing users to impose a semantically-meaningful timbral modification using the lower-dimensional interface. We test 10 mapping techniques, comprising dimensionality reduction and reconstruction methods, and show that a stacked autoencoder algorithm exhibits the lowest parameter reconstruction variance, thus providing an accurate map between the input and output space. We demonstrate that the model provides an intuitive method for controlling the audio effect’s parameter space, whilst accurately reconstructing the trajectories of each parameter and adapting to the incoming audio spectrum.


High-level musical control paradigms for Digital Signal Processing

Stroppa M.

DAFx-2000 - Verona

No matter how complex DSP algorithms are and how rich the sonic processes they produce, the issue of their control immediately arises when they are used by musicians, independently of their knowledge of the underlying mathematics or their degree of familiarity with the design of digital instruments. This text will analyze the problem of the control of DSP modules from a compositional standpoint. An implementation of some paradigms in a Lisp-based environment (omChroma) will also be concisely discussed.


Characterisation of Acoustic Scenes Using a Temporally-constrained Shift-invariant Model

Emmanouil Benetos; Mathieu Lagrange; Simon Dixon

DAFx-2012 - York

In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used for extracting time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated into the SIPLCA formulation for enforcing temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are made using a corpus of train station recordings, classified into 6 scene classes. Results show that the proposed model is able to model salient events within a scene and outperforms the non-negative matrix factorization algorithm for the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.


Bio-Inspired Optimization of Parametric Onset Detectors

Domenico Stefani; Luca Turchet

DAFx-2021 - Vienna (virtual)

Onset detectors are used to recognize the beginning of musical events in audio signals. Manual parameter tuning for onset detectors is a time-consuming task, while existing automated approaches often maximize only a single performance metric. These automated approaches cannot be used to optimize detector algorithms for complex scenarios, such as real-time onset detection where an optimization process must consider both detection accuracy and latency. For this reason, a flexible optimization algorithm should account for more than one performance metric in a multi-objective manner. This paper presents a generalized procedure for automated optimization of parametric onset detectors. Our procedure employs a bio-inspired evolutionary computation algorithm to replace manual parameter tuning, followed by the computation of the Pareto frontier for multi-objective optimization. The proposed approach was evaluated on all the onset detection methods of the Aubio library, using a dataset of monophonic acoustic guitar recordings. Results show that the proposed solution is effective in reducing the human effort required in the optimization process: it replaced more than two days of manual parameter tuning with 13 hours and 34 minutes of automated computation. Moreover, the resulting performance was comparable to that obtained by manual optimization.


Rumbator: a Flamenco Rumba Cover Version Generator Based on Audio Processing at Note Level

Carles Roig; Isabel Barbancho; Emilio Molina; Lorenzo J. Tardón; Ana María Barbancho

DAFx-2013 - Maynooth

In this article, a scheme to automatically generate polyphonic flamenco rumba versions from monophonic melodies is presented. First, we provide an analysis of the parameters that define the flamenco rumba, and then we propose a method for transforming a generic monophonic audio signal into such a style. Our method first transcribes the monophonic audio signal into a symbolic representation, and then a set of note-level audio transformations based on music theory is applied to the monophonic audio signal in order to transform it to the polyphonic flamenco rumba style. Some audio examples of this transformation software are also provided.


A Direct Microdynamics Adjusting Processor with Matching Paradigm and Differentiable Implementation

Shahan Nercessian; Russell McClellan; Alexey Lukin

DAFx-2022 - Vienna

In this paper, we propose a new processor capable of directly changing the microdynamics of an audio signal primarily via a single dedicated user-facing parameter. The novelty of our processor is that it has built into it a measure of relative level, a short-term signal strength measurement which is robust to changes in signal macrodynamics. Consequent dynamic range processing is signal level-independent in its nature, and attempts to directly alter its observed relative level measurements. The inclusion of such a meter within our proposed processor also gives rise to a natural solution to the dynamics matching problem, where we attempt to transfer the microdynamic characteristics of one audio recording to another by means of estimating appropriate settings for the processor. We suggest a means of providing a reasonable initial guess for processor settings, followed by an efficient iterative algorithm to refine our estimates. Additionally, we implement the processor as a differentiable recurrent layer and show its effectiveness when wrapped around a gradient descent optimizer within a deep learning framework. Moreover, we illustrate that the proposed processor has more favorable gradient characteristics relative to a conventional dynamic range compressor. Throughout, we consider extensions of the processor, matching algorithm, and differentiable implementation for the multiband case.


Composing Musical Spaces By Means of Decorrelation of Audio Signals

Vaggione H.

DAFx-2001 - Limerick


Controlling a Non Linear Friction Model for Evocative Sound Synthesis Applications

Etienne Thoret; Mitsuko Aramaki; Charles Gondre; Richard Kronland-Martinet; Sølvi Ystad

DAFx-2013 - Maynooth

In this paper, a flexible strategy to control a synthesis model of sounds produced by non linear friction phenomena is proposed for guidance or musical purposes. It enables the synthesis of different types of sounds, such as a creaky door, a singing glass or a squeaking wet plate. This approach is based on the action/object paradigm, which makes it possible to propose a synthesis strategy using classical linear filtering techniques (source/resonance approach) which provide an efficient implementation. Within this paradigm, a sound can be considered as the result of an action (e.g. impacting, rubbing, ...) on an object (plate, bowl, ...). However, in the case of non linear friction phenomena, simulating the physical coupling between the action and the object with a completely decoupled source/resonance model is a real and relevant challenge. To meet this challenge, we propose to use a synthesis model of the source that is tuned on recorded sounds according to physical and spectral observations. This model enables the synthesis of many types of non linear behaviors. A control strategy of the model is then proposed by defining a flexible physically informed mapping between a descriptor and the non linear synthesis behavior. Finally, potential applications to the remediation of motor diseases are presented. In all sections, video and audio materials are available at the following URL: http://www.lma.cnrs-mrs.fr/~kronland/thoretDAFx2013/


Fluently Remixing Musical Objects with Higher-Order Functions

Adam Lindsay; David Hutchison

DAFx-2009 - Como

Soon after the Echo Nest Remix API was made publicly available and open source, the primary author began aggressively enhancing the Python framework for re-editing music based on perceptually-based musical analyses. The basic principles of this API – integrating content-based metadata with the underlying signal – are described in the paper, then the authors’ enhancements are described. The libraries moved from supporting an imperative coding style to incorporating influences from functional programming and domain specific languages to allow for a much more fluent, terse coding style, allowing users to concentrate on the functions needed to find the portions of the song that are interesting and modify them. The paper then goes on to describe enhancements involving mixing multiple sources with one another and enabling user-created and user-modifiable effects that are controlled by direct manipulation of the objects that represent the sound. Revelations that the Remix API does not need to be as integrated as it currently is point to future directions for the API at the end of the paper.


Implementing Real-Time Partitioned Convolution Algorithms on Conventional Operating Systems

Eric Battenberg; Rimas Avizienis

DAFx-2011 - Paris

We describe techniques for implementing real-time partitioned convolution algorithms on conventional operating systems using two different scheduling paradigms: time-distributed (cooperative) and multi-threaded (preemptive). We discuss the optimizations applied to both implementations and present measurements of their performance for a range of impulse response lengths on a recent high-end desktop machine. We find that while the time-distributed implementation is better suited for use as a plugin within a host audio application, the preemptive version was easier to implement and significantly outperforms the time-distributed version despite the overhead of frequent context switches.


Proceedings of the International Conference on Digital Audio Effects (DAFx)
