Re-targeting Expressive Musical Style Using a Machine-Learning Method
Expressive musical performing style involves more than what is represented on the score. Performers imprint their personal style on each performance based on their musical understanding. Expressive performing style makes the music come alive by shaping it through continuous variation. It is observed that musical style can be represented by appropriate numerical parameters, most of which are related to dynamics. It is also observed that performers tend to perform music sections and motives of similar shape in similar ways, and that these sections and motives can be identified by an automatic phrasing algorithm. An experiment is proposed for producing expressive music from raw quantized music files using machine-learning methods such as Support Vector Machines. Experimental results show that it is possible to induce some of a performer’s style by using the music parameters extracted from audio recordings of their real performances.
Reservoir Computing: a powerful Framework for Nonlinear Audio Processing
This paper proposes reservoir computing as a general framework for nonlinear audio processing. Reservoir computing is a novel approach to recurrent neural network training with the advantage of a very simple, linear learning algorithm. It can in theory approximate arbitrary nonlinear dynamical systems with arbitrary precision, has an inherent temporal processing capability, and is therefore well suited to many nonlinear audio processing problems. Reservoir computing can be applied whenever nonlinear relationships are present in the data and time information is crucial. Examples from three application areas are presented: nonlinear system identification of a tube amplifier emulator algorithm; nonlinear audio prediction, as needed in wireless audio transmission where dropouts may occur; and automatic melody transcription from a polyphonic audio stream, as one example from the broad field of music information retrieval. Reservoir computing outperformed state-of-the-art alternative models in all studied tasks.
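The "very simple, linear learning algorithm" mentioned above can be illustrated with a minimal echo state network sketch. This is not the paper's implementation: the reservoir size, the row scaling of the recurrent weights, the ridge value, the sine test signal, and all function names (`make_reservoir`, `run_reservoir`, `train_readout`, `solve_linear`) are assumptions chosen for illustration. The point it shows is that the random recurrent weights stay fixed and only a linear readout is fitted, here by ridge regression.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def make_reservoir(n_units, row_scale=0.9, seed=0):
    """Random input and recurrent weights; these are never trained."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    w = []
    for _ in range(n_units):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
        norm = sum(abs(v) for v in row)
        # Each row's absolute sum is row_scale < 1, a simple sufficient
        # condition for the state to forget old inputs (assumed choice).
        w.append([row_scale * v / norm for v in row])
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Collect states x[t] = tanh(w_in * u[t] + w x[t-1])."""
    n = len(w_in)
    x = [0.0] * n
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u + sum(w[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        states.append(x)
    return states


def train_readout(states, targets, ridge=1e-3):
    """Ridge regression for the readout: (S^T S + ridge I) w = S^T y."""
    n = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(n)]
    return solve_linear(gram, rhs)


# One-step-ahead prediction of a sine wave (toy stand-in for audio).
signal = [math.sin(0.2 * t) for t in range(301)]
inputs, targets = signal[:-1], signal[1:]
w_in, w = make_reservoir(n_units=8)
states = run_reservoir(w_in, w, inputs)
w_out = train_readout(states, targets)
predictions = [sum(wo * s for wo, s in zip(w_out, state)) for state in states]
mse = sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)
signal_power = sum(y * y for y in targets) / len(targets)
```

Comparing `mse` with `signal_power` gives a rough sense of how much of the signal the linear readout explains on the training data; a real evaluation would use held-out audio.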
Towards Morphological Sound Description using segmental models
We present an approach to modeling the temporal evolution of audio descriptors using Segmental Models (SMs). This method segments a signal into a sequence of primitives, each drawn from a set of user-defined trajectories. Contrary to standard Hidden Markov Models, this makes it possible to consider specific primitive shapes, to model their duration, and to take into account the time dependence between successive signal frames. We applied this approach to a database of violin playing in which various types of glissando and dynamics variations were specifically recorded. The results show that our approach using Segmental Models provides a segmentation that can be easily interpreted. Quantitatively, the Segmental Models performed better than a standard implementation of Hidden Markov Models.
Music Genre visualization and Classification Exploiting a Small set of High-level Semantic Features
In this paper a system for continuous analysis, visualization and classification of musical streams is proposed. The system performs the visualization and classification tasks by means of three high-level semantic features, extracted by computing a reduction of a multidimensional low-level feature vector using Gaussian Mixture Models. The visualization of the semantic characteristics of the audio stream has been implemented by mapping the values of the high-level features onto a triangular plot and by assigning a primary color to each feature. In this manner, besides representing the musical evolution of the signal, we also obtain representative colors for each musical part of the analyzed streams. The classification exploits a set of one-against-one three-dimensional Support Vector Machines trained on some target genres. The results obtained on the visualization and classification tasks are very encouraging: our tests on heterogeneous genre streams have shown the validity of the proposed approach.
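One way to picture the triangular plot and color assignment described above is the sketch below. It is an illustration under assumptions not stated in the abstract: the three features are taken to be non-negative, they are normalized to sum to one and used as barycentric weights on an equilateral triangle, and the three primary colors are taken to be red, green and blue. The function name `to_triangle` and the vertex layout are hypothetical.

```python
import math

# Triangle corners, one per high-level feature (assumed layout).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def to_triangle(features):
    """Map three non-negative feature values to a 2-D point and an RGB color."""
    total = sum(features)
    if total <= 0:
        raise ValueError("at least one feature must be positive")
    # Normalized features act as barycentric weights.
    weights = [f / total for f in features]
    x = sum(wt * vx for wt, (vx, _) in zip(weights, VERTICES))
    y = sum(wt * vy for wt, (_, vy) in zip(weights, VERTICES))
    # Each feature drives one color channel (assumed: red, green, blue).
    rgb = tuple(round(255 * wt) for wt in weights)
    return (x, y), rgb


corner_point, corner_color = to_triangle([1.0, 0.0, 0.0])
center_point, center_color = to_triangle([2.0, 2.0, 2.0])
```

A frame dominated by one feature lands at that feature's corner with a pure color, while a balanced frame lands near the centroid with a gray tone.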
Beat-Marker Location using a Probabilistic Framework and Linear Discriminant Analysis
This paper deals with the problem of beat-tracking in an audio file. Taking time-variable tempo and meter estimates as input, we study two beat-tracking approaches. The first is based on an adaptation of a method used in speech processing for locating Glottal Closure Instants. The results obtained with this first approach allow us to derive a set of requirements for a more robust approach. The second approach is based on a probabilistic framework. In this approach the beat-tracking problem is formulated as an “inverse” Viterbi decoding problem in which we decode times over beat-numbers according to observation and transition probabilities. A beat-template is used to derive the observation probabilities from the signal. For this task, we propose the use of a machine-learning method, Linear Discriminant Analysis, to estimate the most discriminative beat-template. We finally propose a set of measures to evaluate the performance of a beat-tracking algorithm and perform a large-scale evaluation of the two approaches on four different test-sets.
KRONOS ‐ A Vectorizing Compiler for Music DSP
This paper introduces Kronos, a vectorizing just-in-time compiler designed for musical programming systems. Its purpose is to translate abstract mathematical expressions into high-performance computer code. Design criteria for musical programming systems are considered and a three-tier model of abstraction is presented. The low-level expression metalanguage used in Kronos is described, along with the design choices that facilitate powerful, yet transparent vectorization of the machine code.
Local Key estimation Based on Harmonic and Metric Structures
In this paper, we present a method for estimating the local keys of an audio signal. We propose to address the problem of local key finding by investigating the possible combination and extension of different previously proposed global key estimation approaches. The specificity of our approach is that we introduce key dependency on the harmonic and the metric structures. In this work, we focus on the relationship between the chord progression and the local key progression in a piece of music. A contribution of our work is that we address the problem of finding a good analysis window length for local key estimation by introducing information related to the metric structure into our model. Key estimation is performed not on segments of empirically chosen length but on segments that are adapted to the analyzed piece and independent of the tempo. We evaluate and analyze our results on a new database composed of classical music pieces.
Automatic Target Mixing using Least-Squares Optimization of Gains and Equalization Settings
The proposed automatic target mixing algorithm determines the gains and the equalization settings for the mixing of a multi-track recording using a least-squares optimization. These parameters are estimated using a single-channel target mix, that is, a signal which contains the same audio tracks as the multi-track recording but which has been previously mixed using some unknown settings. Several tests have been carried out to evaluate the performance of two different approaches to the optimization, namely the sub-band estimator and the FIR filter estimator. The results show that, using the latter technique, the proposed algorithm is able to retrieve the parameters originally applied to the target mix. This can be useful for remastering applications, where both the original recording sessions and the final mix are available, but the mixing parameters originally applied to the various audio tracks need to be retrieved.
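The least-squares idea above can be sketched for the simplest case, where each track receives only a broadband gain and no equalization. This is a deliberately reduced illustration, not the sub-band or FIR filter estimator evaluated in the paper; the function names `estimate_gains` and `solve_linear` and the synthetic test tracks are assumptions. The gains are found by solving the normal equations of the problem "target ≈ sum of gain times track".

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("tracks are (nearly) linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def estimate_gains(tracks, target):
    """Gains g minimizing sum_t (target[t] - sum_i g[i] * tracks[i][t])**2."""
    n = len(tracks)
    # Normal equations: (A^T A) g = A^T target, with tracks as columns of A.
    gram = [[sum(x * y for x, y in zip(tracks[i], tracks[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(x * y for x, y in zip(tracks[i], target)) for i in range(n)]
    return solve_linear(gram, rhs)


# Two synthetic "tracks" mixed with known gains, then recovered.
tracks = [
    [math.sin(0.1 * t) for t in range(200)],
    [math.cos(0.37 * t) for t in range(200)],
]
true_gains = [0.8, 0.25]
target = [sum(g * tr[t] for g, tr in zip(true_gains, tracks))
          for t in range(200)]
gains = estimate_gains(tracks, target)
```

Because the synthetic target is an exact linear mix, `gains` should match `true_gains` up to rounding error; real mixes with equalization would need a filter per track rather than a single scalar.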
Informed Selection of Frames for Music Similarity Computation
In this paper we present a new method to compute frame-based audio similarities, based on nearest-neighbour density estimation. We do not recommend it as a practical method for large collections because of its high runtime. Rather, we use this new method for a detailed analysis to gain deeper insight into how a bag-of-frames (BOF) approach determines similarities among songs and, in particular, to identify those audio frames that make two songs similar from a machine’s point of view. Our analysis reveals that audio frames of very low energy, which are of course not the most salient with respect to human perception, have a surprisingly large influence on current similarity measures. Based on this observation we propose to remove these low-energy frames before computing song models and show, via classification experiments, that the proposed frame selection strategy improves the audio similarity measure.
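The frame selection step proposed above can be sketched as follows. The abstract does not give the exact selection criterion, so the rule used here (keep frames whose mean squared amplitude is at least a fixed fraction of the loudest frame) is an assumption, as are the names `frame_energies` and `select_frames`, the frame length, and the threshold value.

```python
def frame_energies(signal, frame_len, hop):
    """Mean squared amplitude of each complete frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def select_frames(energies, rel_threshold=0.01):
    """Indices of frames with energy >= rel_threshold * loudest frame."""
    peak = max(energies)
    return [i for i, e in enumerate(energies) if e >= rel_threshold * peak]


# Toy signal: loud passage, near-silence, loud passage.
loud = [0.5, -0.5] * 4
quiet = [0.001] * 8
signal = loud + quiet + loud
energies = frame_energies(signal, frame_len=4, hop=4)
kept = select_frames(energies)
```

Only the frames listed in `kept` would then be passed on to the song model, so the near-silent middle section no longer contributes to the similarity computation.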
Novel methods in Information Management for Advanced Audio Workflows
This paper discusses architectural aspects of a software library for unified metadata management in audio processing applications. The data incorporates editorial, production, acoustical and musicological features for a variety of use cases, ranging from adaptive audio effects to alternative metadata-based visualisation. Our system is designed to capture information as prescribed by a modular ontology schema. This supports the development of intelligent user interfaces and advanced media workflows in music production environments. In an effort to reach these goals, we argue for the need for modularity and interoperable semantics in representing information. We discuss the advantages of extensible Semantic Web ontologies as opposed to specialised but disharmonious metadata formats. Concepts and techniques permitting seamless integration with existing audio production software are described in detail.