Download A Source-Filter Model for Quasi-Harmonic Instruments
In this paper we propose a new method for building a generalized model that represents the time-varying spectral characteristics of quasi-harmonic instruments. The approach comprises a linear source-filter model, a parameter estimation method, and a model evaluation based on the prototype’s variance. The source-filter model is composed of an excitation source generating sinusoidal parameter trajectories and a modeling resonance filter, with basis splines (B-splines) used to model continuous trajectories. To estimate the model parameters we apply a gradient descent method to a training database, and the prototype’s variance is estimated on a test database. Such a model could later be used as a priori knowledge for polyphonic instrument recognition, polyphonic transcription, and source separation algorithms, as well as for resynthesis.
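As a rough illustration of the two ingredients named in this abstract, the sketch below fits the weights of a B-spline curve to a toy resonance envelope by gradient descent. It is not the paper's estimator: the knot vector, the synthetic target envelope, the learning rate, and all function names (`bspline_basis`, `fit_weights`) are assumptions chosen for the example.

```python
import math

def bspline_basis(i, degree, x, knots):
    """Cox-de Boor recursion for the i-th B-spline basis function."""
    if degree == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + degree] - knots[i]
    right_den = knots[i + degree + 1] - knots[i + 1]
    left = 0.0
    if left_den > 0:
        left = (x - knots[i]) / left_den * bspline_basis(i, degree - 1, x, knots)
    right = 0.0
    if right_den > 0:
        right = ((knots[i + degree + 1] - x) / right_den
                 * bspline_basis(i + 1, degree - 1, x, knots))
    return left + right

def mean_squared_error(weights, basis, targets):
    total = 0.0
    for row, y in zip(basis, targets):
        prediction = sum(w * b for w, b in zip(weights, row))
        total += (prediction - y) ** 2
    return total / len(targets)

def fit_weights(basis, targets, steps=500, rate=0.5):
    """Plain gradient descent on the mean squared error (assumed settings)."""
    weights = [0.0] * len(basis[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, y in zip(basis, targets):
            err = sum(w * b for w, b in zip(weights, row)) - y
            for i, b in enumerate(row):
                grad[i] += 2.0 * err * b / len(targets)
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical setup: clamped cubic knots on a normalized frequency axis [0, 1).
DEGREE = 3
KNOTS = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]
NUM_BASIS = len(KNOTS) - DEGREE - 1  # 7 basis functions

xs = [j / 50 for j in range(50)]  # sample points, excluding the end point 1.0
targets = [10.0 * math.exp(-((x - 0.3) / 0.1) ** 2) for x in xs]  # toy resonance (dB)
basis = [[bspline_basis(i, DEGREE, x, KNOTS) for i in range(NUM_BASIS)] for x in xs]

initial_loss = mean_squared_error([0.0] * NUM_BASIS, basis, targets)
weights = fit_weights(basis, targets)
final_loss = mean_squared_error(weights, basis, targets)
```

Because the basis functions are non-negative and sum to one at every sample point, a step size of 0.5 keeps this particular descent stable; a real training database would likely call for a tuned step size or a line search.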
Download A Segmental Spectro-Temporal Model of Musical Timbre
We propose a new statistical model of musical timbre that handles the different segments of the temporal envelope (attack, sustain and release) separately in order to account for their different spectral and temporal behaviors. The model is based on a reduced-dimensionality representation of the spectro-temporal envelope. Temporal coefficients corresponding to the attack and release segments are subjected to explicit trajectory modeling based on a non-stationary Gaussian Process. Coefficients corresponding to the sustain phase are modeled as a multivariate Gaussian. A compound similarity measure associated with the segmental model is proposed and successfully tested in instrument classification experiments. Apart from its use in a statistical framework, the modeling method allows intuitive and informative visualizations of the characteristics of musical timbre.
Download Between Physics and Perception: Signal Models for High Level Audio Processing
The use of signal models is one of the key factors enabling us to establish high-quality signal transformation algorithms with intuitive high-level control parameters. In the present article we will discuss signal models, and the signal transformation algorithms that are based on these models, in relation to the physical properties of the sound source and the properties of human sound perception. We will argue that the implementation of perceptually intuitive, high-quality signal transformation algorithms requires strong links between the signal models and the perceptually relevant physical properties of the sound source. We will present an overview of the history of two sound models that are used for sound transformation and will show how the past and future evolution of sound transformation algorithms is driven by our understanding of the physical world.
Download A Shape-Invariant Phase Vocoder for Speech Transformation
This paper proposes a new method for shape-invariant real-time modification of speech signals. The method can be understood as a frequency-domain SOLA algorithm that uses the phase vocoder algorithm for phase synchronization. Compared to time-domain SOLA, the new implementation provides improved time synchronization during overlap-add and improved quality of the noise components of the transformed speech signals. The algorithm has been compared in two perceptual tests with recent implementations of PSOLA and HNM algorithms, demonstrating a very satisfying performance. Because the quality of the transformed signals stays constant over a wide range of transformation parameters, the algorithm is well suited for real-time gender and age transformations.
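For context, the synchronization step of the time-domain SOLA baseline mentioned in this abstract can be sketched as a search for the lag that maximizes the normalized cross-correlation between the tail of the output and the next input segment. This shows only the conventional time-domain idea, not the frequency-domain method the paper proposes; the function name `best_sola_lag` and the test signal are assumptions made for illustration.

```python
import math

def best_sola_lag(ref, seg, max_lag):
    """Return the lag in [0, max_lag] at which seg best matches ref.

    ref: samples at the end of the already synthesized output
    seg: candidate input samples, at least len(ref) + max_lag long
    """
    def normalized_correlation(lag):
        cand = seg[lag:lag + len(ref)]
        num = sum(r * c for r, c in zip(ref, cand))
        den = math.sqrt(sum(r * r for r in ref) * sum(c * c for c in cand))
        return num / den if den > 0 else 0.0

    return max(range(max_lag + 1), key=normalized_correlation)

# Toy example: a sinusoid with a 20-sample period, reference cut at offset 7.
seg = [math.sin(2 * math.pi * n / 20) for n in range(60)]
ref = seg[7:47]
lag = best_sola_lag(ref, seg, max_lag=15)
```

The search range is kept below one period here so that the best lag is unique; for real speech the range would typically be tied to the expected pitch period.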
Download A Reduced Multiple Gabor Frame for Local Time Adaptation of the Spectrogram
In this paper we propose a method for automatic local time adaptation of the spectrogram of an audio signal, based on its decomposition within a Gabor multi-frame. The sparsity of the analyses within each individual frame is evaluated through Rényi entropy measures. According to the sparsity of the decompositions, an optimal resolution and a reduced multi-frame are determined, defining an adapted spectrogram with variable resolution and hop size. The composition of such a reduced multi-frame allows an immediate definition of a dual frame: re-synthesis techniques for this adapted analysis are easily derived from the traditional phase vocoder scheme.
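The sparsity criterion described in this abstract can be illustrated with a small sketch that computes the Rényi entropy of spectrogram energies for several window sizes and keeps the size with the lowest entropy. This is a simplified, global version under stated assumptions (periodic Hann window, 50% overlap, entropy order 3, one decision for the whole signal), whereas the paper adapts the resolution locally in time; the function names are hypothetical.

```python
import cmath
import math

def renyi_entropy(energies, alpha=3.0):
    """Rényi entropy (in bits) of energies normalized to sum to one; alpha != 1."""
    total = sum(energies)
    probs = [e / total for e in energies]
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)

def spectrogram_energies(signal, size):
    """Squared STFT magnitudes, computed with a naive DFT for clarity."""
    hop = size // 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]
    energies = []
    for start in range(0, len(signal) - size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(size)]
        for k in range(size // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                        for n in range(size))
            energies.append(abs(coeff) ** 2)
    return energies

def sparsest_window(signal, sizes, alpha=3.0):
    """Pick the window size whose spectrogram has the lowest Rényi entropy."""
    return min(sizes,
               key=lambda s: renyi_entropy(spectrogram_energies(signal, s), alpha))

# Toy signals: a stationary tone and a single click.
tone = [math.sin(2 * math.pi * n / 8) for n in range(256)]
click = [1.0 if n == 100 else 0.0 for n in range(256)]

tone_choice = sparsest_window(tone, [16, 64])
click_choice = sparsest_window(click, [16, 64])
```

In this toy setting the stationary tone favors the long window and the click favors the short one, which is the behavior an entropy-based sparsity measure is meant to capture.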