Download Sinusoid Extraction and Salience Function Design for Predominant Melody Estimation In this paper we evaluate some of the alternative methods commonly applied in the first stages of the signal processing chain of automatic melody extraction systems. Namely, the first two stages are studied – the extraction of sinusoidal components and the computation of a time-pitch salience function, with the goal of determining the benefits and caveats of each approach under the specific context of predominant melody estimation. The approaches are evaluated on a data-set of polyphonic music containing several musical genres with different singing/playing styles, using metrics specifically designed for measuring the usefulness of each step for melody extraction. The results suggest that equal loudness filtering and frequency/amplitude correction methods provide significant improvements, whilst using a multi-resolution spectral transform results in only a marginal improvement compared to the standard STFT. The effect of key parameters in the computation of the salience function is also studied and discussed.
Download Audio De-Thumping using Huang s Empirical Mode Decomposition In the context of audio restoration, sound transfer of broken disks usually produces audio signals corrupted with long pulses of low-frequency content, also called thumps. This paper presents a method for audio de-thumping based on Huang’s Empirical Mode Decomposition (EMD), provided the pulse locations are known beforehand. Thus, the EMD is used as a means to obtain pulse estimates to be subtracted from the degraded signals. Despite its simplicity, the method is demonstrated to tackle well the challenging problem of superimposed pulses. Performance assessment against selected competing solutions reveals that the proposed solution tends to produce superior de-thumping results.
Download Vivos Voco: A survey of recent research on voice transformations at IRCAM IRCAM has a long experience in analysis, synthesis and transformation of voice. Natural voice transformations are of great interest for many applications and can be combine with text-to-speech system, leading to a powerful creation tool. We present research conducted at IRCAM on voice transformations for the last few years. Transformations can be achieved in a global way by modifying pitch, spectral envelope, durations etc. While it sacrifices the possibility to attain a specific target voice, the approach allows the production of new voices of a high degree of naturalness with different gender and age, modified vocal quality, or another speech style. These transformations can be applied in realtime using ircamTools TR A X.Transformation can also be done in a more specific way in order to transform a voice towards the voice of a target speaker. Finally, we present some recent research on the transformation of expressivity.