Download Vocal synthesis and graphical representation of the phonetic gestures underlying guitar timbre description
The guitar is an instrument that gives the player great control over timbre. Different plucking techniques involve varying the finger position along the string, the inclination between the finger and the string, the inclination between the hand and the string and the degree of relaxation of the plucking finger. Guitarists perceive subtle variations of these parameters and they have developed a very rich vocabulary to describe the brightness, the colour, the shape and the texture of the sounds they produce on their instrument. Dark, bright, chocolatey, transparent, muddy, wooly, glassy, buttery, and metallic are just a few of those adjectives. The aim of this research is to design a computer tool that synthesizes the vocal imitation and graphically represents the phonetic gestures underlying the description of the timbre of the classical guitar, as a function of the instrumental gesture parameters (mainly the plucking angle and distance from the bridge) and based on perceptual analogies between guitar and speech sounds. As in the traditional teaching of tabla, which uses onomatopoeia to designate the different strokes, vocal imitation of guitar timbres could provide a common language for guitar performers, complementary to the mental imagery they commonly use to communicate about timbre, for example in a pedagogical context.
Download Improving Sinusoidal Frequency Estimation Using a Trigonometric Approach
Estimating the frequency of sinusoidal components is the first part of the sinusoidal analysis chain. Among the numerous frequency estimators presented in the literature, we study the estimator introduced in [1], known as the derivative algorithm. Thanks to a trigonometric interpretation of this frequency estimator, we are able to propose a new estimator which improves estimation performance for frequencies close to the Nyquist frequency without any additional computational cost.
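As a rough illustration of how a difference-based estimator can be read trigonometrically, the sketch below uses the fact that the first difference x[n] - x[n-1] has gain 2·sin(ω/2) at angular frequency ω, so the ratio of the two DFT magnitudes at the spectral peak can be inverted with an arcsine. This is a minimal sketch under assumptions: the function names, the Hann window and the peak-picking step are illustrative choices, and it is not claimed to be the exact estimator of [1] or of the paper.

```python
import cmath
import math


def hann(n_samples):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_samples) for n in range(n_samples)]


def dft_bin(frame, k):
    """Single DFT coefficient of `frame` at bin k (naive evaluation)."""
    step = -2j * math.pi * k / len(frame)
    return sum(v * cmath.exp(step * n) for n, v in enumerate(frame))


def estimate_frequency(x, fs):
    """Estimate the dominant frequency (Hz) of x sampled at fs.

    Uses |DFT(first difference)| / |DFT(signal)| = 2*sin(omega/2) at the peak bin.
    """
    diff = [x[n] - x[n - 1] for n in range(1, len(x))]  # first difference
    sig = x[1:]                                         # time-aligned with diff
    win = hann(len(sig))
    sig_w = [a * b for a, b in zip(sig, win)]
    diff_w = [a * b for a, b in zip(diff, win)]

    # Locate the magnitude peak of the signal spectrum (bins 1 .. N/2 - 1).
    mags = [abs(dft_bin(sig_w, k)) for k in range(1, len(sig) // 2)]
    peak = 1 + max(range(len(mags)), key=mags.__getitem__)

    ratio = abs(dft_bin(diff_w, peak)) / mags[peak - 1]
    ratio = min(ratio, 2.0)  # keep the arcsine argument in its domain
    return fs / math.pi * math.asin(ratio / 2.0)
```

Because the arcsine inverts the difference filter's gain exactly, the estimate does not degrade systematically as the frequency approaches Nyquist, which a small-angle approximation of the same ratio would.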
Download An Efficient Algorithm for Real-Time Spectrogram Inversion
We present a computationally efficient real-time algorithm for constructing audio signals from spectrograms. Spectrograms consist of a time sequence of short-time Fourier transform magnitude (STFTM) spectra. During the audio signal construction process, phases are derived for the individual frequency components so that the spectrogram of the constructed signal is as close as possible to the target spectrogram given real-time constraints. The algorithm is a variation of the classic Griffin and Lim [1] technique, modified to be computable in real time. We discuss the application of the algorithm to time-scale modification of audio signals such as speech and music, and performance is compared with other methods. The new algorithm generates comparable or better results with significantly less computation. The phase consistency between adjacent frames produces excellent subjective sound quality with minimal frame transition artifacts.
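For orientation, the classic offline Griffin and Lim iteration that the abstract refers to can be sketched as follows: starting from the target magnitudes, repeatedly resynthesize a signal by weighted overlap-add, re-analyze it, keep the new phases and restore the target magnitudes. This is a minimal sketch of the classic batch technique only, not the real-time variant described above; the function names, the naive DFT and the zero-phase initialization are assumptions made for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real or complex frame."""
    n = len(frame)
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(frame))
            for k in range(n)]


def idft_real(spec):
    """Real part of the inverse DFT."""
    n = len(spec)
    return [sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(spec)).real / n
            for i in range(n)]


def stft(x, win, hop):
    """Windowed frames of x, transformed with the DFT (no padding)."""
    size = len(win)
    return [dft([x[s + i] * win[i] for i in range(size)])
            for s in range(0, len(x) - size + 1, hop)]


def istft(frames, win, hop, length):
    """Least-squares overlap-add: sum(w * y) / sum(w^2)."""
    num = [0.0] * length
    den = [0.0] * length
    for t, spec in enumerate(frames):
        for i, v in enumerate(idft_real(spec)):
            num[t * hop + i] += win[i] * v
            den[t * hop + i] += win[i] ** 2
    return [a / b if b > 1e-12 else 0.0 for a, b in zip(num, den)]


def spectral_error(x, target_mag, win, hop):
    """Squared distance between the STFT magnitudes of x and the target."""
    return sum((abs(c) - m) ** 2
               for spec, mags in zip(stft(x, win, hop), target_mag)
               for c, m in zip(spec, mags))


def griffin_lim(target_mag, win, hop, length, iterations):
    """Estimate a signal whose STFT magnitude approximates target_mag."""
    # Zero-phase start (an arbitrary choice for this sketch).
    x = istft([[complex(m) for m in mags] for mags in target_mag], win, hop, length)
    for _ in range(iterations):
        frames = stft(x, win, hop)
        # Keep the current phases, impose the target magnitudes.
        combined = [[m * cmath.exp(1j * cmath.phase(c)) for c, m in zip(spec, mags)]
                    for spec, mags in zip(frames, target_mag)]
        x = istft(combined, win, hop, length)
    return x
```

Each pass needs the whole spectrogram, which is what makes the classic form unsuitable for streaming use and motivates a frame-by-frame variant.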
Download GABOR, multi-representation real-time analysis/synthesis
This article describes a set of modules for Max/MSP for real-time sound analysis and synthesis combining various models, representations and timing paradigms. Gabor provides a unified framework for granular synthesis, PSOLA, phase vocoder, additive synthesis and other STFT techniques. Gabor’s processing scheme allows for the treatment of atomic sound particles at arbitrary rates and instants. Gabor is based on FTM, an extension of Max/MSP, introducing complex data structures such as matrices and sequences to the Max data flow programming paradigm. Most of the signal processing operators of the Gabor modules handle vector and matrix representations closely related to SDIF sound description formats.
Download Speech/music discrimination based on a new warped LPC-based feature and linear discriminant analysis
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents a low-complexity but effective approach, which exploits only one simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC). Comparison between WLPC-SC and the classical features proposed in [9] is performed, aiming to assess the good discriminatory power of the proposed feature. The length of the vector for describing the proposed psychoacoustic-based feature is reduced to a few statistical values (mean, variance and skewness), which are then transformed to a new feature space by applying LDA with the aim of increasing the classification accuracy. The classification task is performed by applying SVM to the features in the transformed space. The classification results for different types of music and speech show the good discriminating power of the proposed approach.
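The reduction of a per-frame feature trajectory to mean, variance and skewness can be sketched as below. This is only an illustration under assumptions: it uses a plain magnitude-spectrum centroid rather than the warped LPC-based centroid of the paper, and the function names `spectral_centroid` and `summarize` are hypothetical.

```python
def spectral_centroid(mags, fs):
    """Centroid (Hz) of a magnitude spectrum covering bins 0 .. N/2 at sample rate fs."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = fs / (2 * (len(mags) - 1))  # spacing between adjacent bins
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def summarize(values):
    """Mean, population variance and skewness of a feature trajectory."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return mean, var, 0.0
    skew = sum((v - mean) ** 3 for v in values) / n / var ** 1.5
    return mean, var, skew
```

In a pipeline of the kind described, the three summary values per analysis segment would then be passed to the transformation and classification stages.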
Download Hidden Markov Models for spectral similarity of songs
Hidden Markov Models (HMM) are compared to Gaussian Mixture Models (GMM) for describing spectral similarity of songs. Contrary to previous work we make a direct comparison based on the log-likelihood of songs given an HMM or GMM. Whereas the direct comparison of log-likelihoods clearly favors HMMs, this advantage in terms of modeling power does not allow for any gain in genre classification accuracy.
Download Adaptive Network-Based Fuzzy Inference System for Automatic Speech/Music Discrimination
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents an effective approach based on an Adaptive Network-Based Fuzzy Inference System (ANFIS) for the classification stage required in a speech/music discrimination system. A new simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC), is also proposed. Comparison between WLPC-SC and some of the classical features proposed in [11] is performed, aiming to assess the good discriminatory power of the proposed feature. The length of the vector for describing the proposed psychoacoustic-based feature is reduced to a few statistical values (mean, variance and skewness). To evaluate the performance of the ANFIS system for speech/music discrimination, comparison to other commonly used classifiers is reported. The classification results for different types of music and speech show the good discriminating power of the proposed approach.
Download Blind Source Separation Using Repetitive Structure
Blind source separation algorithms typically involve decorrelating time-aligned mixture signals. The usual assumption is that all sources are active at all times. However, if this is not the case, we show that the unique pattern of source activity/inactivity helps separation. Music is the most obvious example of sources exhibiting repetitive structure because it is carefully constructed. We present a novel source separation algorithm based on spatial time-time distributions that capture the repetitive structure in audio. Our method outperforms time-frequency source separation when source spectra are highly overlapping.
Download Intermodulation Effects Analysis using Complex Bandpass Filterbanks
The objective of this paper is to show the ability of complex bandpass filterbanks to extract the intermodulation information that appears when two audio signals interact inside the same analysis band. To perform the analysis, a sinusoidal model of the signals has been assumed. Three kinds of signals have been analyzed: a sum of two cosines, a sum of two linear chirps and a sum of two exponential chirps. The complex bandpass filtering of the signals is carried out using a new algorithm based on the Complex Continuous Wavelet Transform. The developed algorithm has been validated by comparing the practical results with the theoretical instantaneous amplitude and instantaneous phase of the obtained model of the signals. With the appropriate width, the complex bandpass filters show the same behaviour as our perceptual ability to discriminate interacting tones when they fall inside a critical band of the human ear.
Download Enhanced digital models for analog modulation effects
This paper presents digital models for analog phaser and flanger/chorus effects. The structure of analog phasers is reviewed. The operation of two phaser implementations is analyzed and nonlinear digital models are presented for them. The models are based on cascades of one-pole filters with embedded nonlinearities and are suitable for real-time implementation. Modifications to the standard digital flanger/chorus effect are also presented. A method to warp the delay time to more closely resemble the behavior of bucket brigade delays is presented. A simple model for the companders used in such analog effect units is also presented.
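The general shape of a phaser built from a cascade of first-order sections can be sketched as follows. This is a generic textbook structure under assumptions, not the models from the paper: each stage is a first-order allpass y[n] = -a·x[n] + x[n-1] + a·y[n-1], and the optional `shaper` applied to the fed-back state is a hypothetical stand-in for an embedded nonlinearity (identity by default, so the stage is then exactly linear).

```python
import math


def allpass_stage(x, a, shaper=lambda v: v):
    """First-order allpass section; `shaper` (hypothetical) distorts the fed-back state."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for v in x:
        out = -a * v + x_prev + a * shaper(y_prev)
        y.append(out)
        x_prev, y_prev = v, out
    return y


def phaser(x, coeffs, mix=0.5, shaper=lambda v: v):
    """Mix the dry signal with the output of cascaded allpass stages."""
    wet = list(x)
    for a in coeffs:
        wet = allpass_stage(wet, a, shaper)
    return [(1.0 - mix) * d + mix * w for d, w in zip(x, wet)]


# Example: four identical stages, with a tanh shaper as an illustrative nonlinearity.
impulse = [1.0] + [0.0] * 63
response = phaser(impulse, [0.6] * 4, mix=0.5, shaper=math.tanh)
```

Mixing dry and phase-shifted signals creates notches where the cascade's phase shift reaches odd multiples of π; sweeping the coefficients over time, for example with a low-frequency oscillator, moves those notches.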