Frequency-Dependent Characteristics and Perceptual Validation of the Interaural Thresholded Level Distribution
The interaural thresholded level distribution (ITLD) is a novel metric of auditory source width (ASW), derived from the psychophysical processes and structures of the inner ear. While several of the ITLD’s objective properties have been presented in previous work, its frequency-dependent characteristics and perceptual relationship with ASW have not previously been explored. This paper investigates these properties of the ITLD, which exhibits pronounced variation in band-limited behaviour as the octave-band centre frequency is increased. Additionally, a very strong correlation was found between [1 – ITLD] and normalised values of ASW, collected from a semantic differential listening test based on the Multiple Stimulus with Hidden Reference and Anchor (MUSHRA) framework. Perceptual relationships between various ITLD-derived quantities were also investigated, showing that the low-pass filter intrinsic to ITLD calculation strengthened the relationship between [1 – ITLD] and ASW. A subsequent test using transient stimuli, as well as investigations into other psychoacoustic properties of the metric such as its just-noticeable difference, are outlined as subjects for future research, to gain a deeper understanding of the subjective properties of the ITLD.
Extended Source-Filter Model for Harmonic Instruments for Expressive Control of Sound Synthesis and Transformation
In this paper we present a revised and improved version of a recently proposed extended source-filter model for sound synthesis, transformation and hybridization of harmonic instruments. This extension focuses mainly on impulsively excited instruments such as piano or guitar, but also improves synthesis results for continuously driven instruments, including their hybrids. The technique comprises an extensive analysis of an instrument's sound database, followed by the estimation of a generalized instrument model that reflects timbre variations according to selected control parameters. Such an instrument model allows for natural-sounding transformations and expressive control of instrument sounds through these control parameters.
Combining classifications based on local and global features: application to singer identification
In this paper we investigate the problem of singer identification on a cappella recordings of isolated notes. Most studies on singer identification describe the content of singing-voice signals with features related to timbre (such as MFCC or LPC). These features aim to describe the behavior of frequencies at a given instant of time (local features). In this paper, we propose to describe a sung tone by the temporal variations of the fundamental frequency (and its harmonics) of the note. The periodic and continuous variations of the frequency trajectories are analyzed over the whole note (global features), and the features obtained reflect expressive and intonative elements of singing such as vibrato, tremolo and portamento. The experiments, conducted on two distinct data sets (lyric and pop-rock singers), show that the new set of features captures part of the singer's identity. However, these features are less accurate than timbre-based features. We propose to increase the recognition rate of singer identification by combining the information conveyed by local and global descriptions of notes. The proposed method, which shows good results, can be adapted to classification problems involving a large number of classes, or used to combine classifications with different levels of performance.
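As a rough illustration of what a global, note-level feature of this kind might look like, the sketch below estimates a vibrato rate and extent from a fundamental-frequency trajectory. It is a minimal sketch under stated assumptions, not the feature set used in the paper: the function name `vibrato_features`, the zero-crossing rate estimate, and the choice of cents as the deviation unit are all hypothetical.

```python
import math


def vibrato_features(f0_hz, frame_rate):
    """Estimate vibrato rate (Hz) and extent (cents) from an f0 trajectory.

    f0_hz      : fundamental-frequency values in Hz, one per analysis frame
    frame_rate : number of f0 frames per second
    Hypothetical helper: the zero-crossing approach is an assumption,
    not the analysis described in the abstract.
    """
    # Convert to cents so that deviations are measured on a musical scale.
    cents = [1200.0 * math.log2(f) for f in f0_hz]
    mean = sum(cents) / len(cents)
    dev = [c - mean for c in cents]

    # Each vibrato cycle crosses the mean pitch twice.
    crossings = sum(1 for a, b in zip(dev, dev[1:]) if a * b < 0)
    duration = len(dev) / frame_rate
    rate_hz = crossings / (2.0 * duration)

    # Extent: half of the peak-to-peak deviation around the mean pitch.
    extent_cents = (max(dev) - min(dev)) / 2.0
    return rate_hz, extent_cents


if __name__ == "__main__":
    # Synthetic note: 440 Hz with a 6 Hz vibrato of +/- 50 cents, 2 seconds.
    f0 = [440.0 * 2.0 ** (50.0 / 1200.0 * math.cos(2 * math.pi * 6 * i / 100))
          for i in range(200)]
    print(vibrato_features(f0, frame_rate=100))
```

A zero-crossing count only makes sense for a reasonably clean, sustained trajectory; a real system would likely need smoothing and a more robust periodicity estimate.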
State of the Art in Sound Texture Synthesis
The synthesis of sound textures, such as rain, wind, or crowds, is an important application for cinema, multimedia creation, games and installations. However, despite the clearly defined requirements of naturalness and flexibility, no automatic method has yet found widespread use. After clarifying the definition, terminology, and usages of sound texture synthesis, we will give an overview of the many existing methods and approaches, and the few available software implementations, and classify them by the synthesis model they are based on, such as subtractive or additive synthesis, granular synthesis, corpus-based concatenative synthesis, wavelets, or physical modeling. Additionally, an overview is given of the analysis methods used for sound texture synthesis, such as segmentation, statistical modeling, timbral analysis, and modeling of transitions.
Signal Reconstruction from STFT Magnitude: A State of the Art
This paper presents a review of techniques for signal reconstruction without phase, i.e. when only the spectrogram (the squared magnitude of the Short-Time Fourier Transform) of the signal is known. The now-standard Griffin and Lim algorithm is presented and compared to more recent blind techniques. Two important issues are raised and discussed: first, the definition of relevant criteria for evaluating the performance of different algorithms, and second, the question of the uniqueness of the solution. Some ways of reducing the complexity of the problem are presented, based on injecting additional information into the reconstruction. Finally, issues that prevent optimal reconstruction are examined, leading to a discussion of what seem to be the most promising approaches for future research.
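The Griffin and Lim iteration mentioned above alternates between imposing the known magnitude and projecting back onto the set of valid STFTs. The following is a minimal NumPy sketch of that idea, not the implementation reviewed in the paper. The helper names, the Hann window, the frame size, the hop and the random initial phase are all assumptions made for illustration. It takes the STFT magnitude, i.e. the square root of the spectrogram as defined above.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (n_frames, n_fft//2 + 1)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(X, n_fft=256, hop=64):
    """Least-squares inverse STFT: windowed overlap-add, normalised by sum of win**2."""
    win = np.hanning(n_fft)
    length = n_fft + hop * (X.shape[0] - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    for i in range(X.shape[0]):
        x[i * hop:i * hop + n_fft] += frames[i] * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-12)  # guard samples no window covers


def griffin_lim(mag, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a signal whose STFT magnitude approximates `mag`.

    Returns the signal and, per iteration, the relative distance between
    `mag` and the STFT magnitude of the current estimate.
    """
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    errors = []
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)      # impose magnitude, go to time domain
        X = stft(x, n_fft, hop)                 # re-analyse: a consistent STFT
        errors.append(np.linalg.norm(np.abs(X) - mag) / np.linalg.norm(mag))
        phase = np.exp(1j * np.angle(X))        # keep only the new phase
    return istft(mag * phase, n_fft, hop), errors


if __name__ == "__main__":
    t = np.arange(256 + 64 * 30) / 8000.0
    x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1320 * t)
    y, errors = griffin_lim(np.abs(stft(x)), n_iter=30)
    print(errors[0], errors[-1])
```

The per-iteration error returned here is one possible evaluation criterion of the kind the abstract discusses. A small value only shows that the magnitudes agree, not that the waveform matches the original.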
ACOUSTIC SIGNAL PROCESSING FOR NEXT-GENERATION HUMAN/MACHINE INTERFACES
In this paper, we first define the scenario of a generic acoustic human/machine interface and then formulate the corresponding fundamental signal processing problems. For signal reproduction, the requirements for ideal solutions are stated and some examples of the state of the technology are briefly reviewed. For signal acquisition, the fundamental problems call for acoustic echo cancellation, desired source extraction, and source localization. After illustrating to what extent acoustic echo cancellation is already a solved problem, we present recent results for separation, dereverberation and localization of multiple source signals. As an underlying motivation for this synoptic treatment, we demonstrate that the considered subproblems (except localization) can be directly interpreted as signal separation or system identification problems with varying degrees of difficulty, which in turn determines the effectiveness of the known solutions.
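To make the system-identification view of acoustic echo cancellation concrete, the sketch below adapts an FIR filter with the normalised LMS (NLMS) update so that it matches a short simulated echo path. NLMS is swapped in here as a familiar textbook adaptive filter. The abstract does not say which algorithms the paper uses, and the function names, the four-tap echo path and the noise-free microphone signal are assumptions made for illustration.

```python
import random


def simulate_echo(far_end, echo_path):
    """Convolve the loudspeaker (far-end) signal with a hypothetical echo path."""
    return [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
            for n in range(len(far_end))]


def nlms_identify(far_end, mic, n_taps, mu=0.5, eps=1e-8):
    """Identify the echo path with NLMS; returns (filter taps, residual echo)."""
    w = [0.0] * n_taps          # current estimate of the echo path
    buf = [0.0] * n_taps        # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # predicted echo
        e = d - y                                    # echo-cancelled output
        power = sum(b * b for b in buf) + eps        # input power normalisation
        w = [wi + mu * e * bi / power for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual


if __name__ == "__main__":
    rng = random.Random(0)
    far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    true_path = [0.6, -0.3, 0.1, 0.05]   # made-up echo path for the demo
    mic = simulate_echo(far, true_path)
    w, residual = nlms_identify(far, mic, n_taps=4)
    print([round(v, 4) for v in w])
```

With a white far-end signal and no near-end speech or noise, the taps converge quickly to the simulated path. Real rooms involve far longer impulse responses and double-talk, which is where the varying degrees of difficulty mentioned above come in.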