On the Application of RLS Adaptive Filtering for Voice Pitch Modification
This paper presents a pitch modification scheme for speech and singing voice signals based on the recursive least-squares (RLS) adaptive algorithm. The RLS filter estimates the linear prediction (LP) model on a sample-by-sample basis, as opposed to the LP-coding (LPC) method, which operates on a block basis. An RLS-based approach is therefore able to preserve the natural subtle variations in the vocal tract model, avoiding discontinuities in the synthesized signal and the inherent frame delay associated with classic methods. The LP residual is modified in the synthesis stage in order to generate the output signal. Listening tests verify the overall quality of the signal synthesized with the RLS approach, indicating that this technique is suitable for real-time applications.
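As an illustration of sample-by-sample LP estimation, the following is a minimal sketch of a textbook exponentially weighted RLS linear predictor. It is not the authors' implementation: the function names, the forgetting factor `lam`, the initialization constant `delta`, and the synthetic AR(2) test signal are all assumptions chosen for the example, and the pitch modification of the residual is not shown.

```python
import random


def rls_linear_prediction(x, order=2, lam=0.999, delta=100.0):
    """Textbook RLS linear predictor, updated once per input sample.

    Returns the final LP coefficients and the a priori prediction
    error (LP residual) for every sample. `lam` is the forgetting
    factor and `delta` scales the initial inverse correlation matrix;
    both values are illustrative assumptions.
    """
    w = [0.0] * order
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    residual = []
    for n in range(len(x)):
        # Regressor of past samples x[n-1], ..., x[n-order] (zeros before the start).
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(1, order + 1)]
        # A priori prediction error: this is the LP residual sample.
        e = x[n] - sum(wi * ui for wi, ui in zip(w, u))
        # Gain vector k = P u / (lam + u^T P u).
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(ui * pui for ui, pui in zip(u, Pu))
        k = [pui / denom for pui in Pu]
        # Coefficient update.
        w = [wi + ki * e for wi, ki in zip(w, k)]
        # Inverse correlation update; P is symmetric, so u^T P equals (P u)^T.
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
        residual.append(e)
    return w, residual


def synth_ar2(n, a1=1.5, a2=-0.8, seed=0):
    """Hypothetical test signal: a stable AR(2) process driven by white noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x


if __name__ == "__main__":
    coeffs, res = rls_linear_prediction(synth_ar2(3000), order=2)
    print("estimated LP coefficients:", coeffs)
```

Because the coefficients are refreshed at every sample, there is no frame boundary at which the model changes abruptly, which is the property the abstract contrasts with block-based LPC.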
A Database of Partial Tracks for Evaluation of Sinusoidal Models
This paper presents a database of partial tracks extracted from synthetic as well as pre-recorded musical signals, designed to serve as an ancillary tool for the evaluation of sinusoidal analysis algorithms. To accomplish this goal, the database requirements have been carefully specified, and a semi-automatic analysis methodology has been employed to ensure that the track parameters are precisely estimated. The overall methodology is validated by performance tests on the synthetic source signals.
High-Definition Time-Frequency Representation Based on Adaptive Combination of Fan-Chirp Transforms via Structure Tensor
This paper presents a novel technique for producing high-definition time-frequency representations by combining different instances of short-time fan-chirp transforms. The proposed method uses directional information provided by an image processing technique known as the structure tensor, applied to a spectrogram of the input signal. This information indicates the best analysis window size and chirp parameter for each time-frequency bin, and feeds a simple interpolation procedure that produces the final representation. The method allows the proper representation of more than one sound source simultaneously via fan-chirp transforms with different resolutions, and provides a precise reproduction of transient information. Experiments on both synthetic and real audio illustrate the performance of the proposed system.
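To make the structure tensor step more concrete, here is a minimal sketch of the generic image processing technique applied to a small 2-D magnitude array standing in for a spectrogram. It is not the authors' implementation: the function name, the box smoothing window `radius`, the edge handling, and the anisotropy measure are assumptions, and the mapping from orientation to window size and chirp parameter described in the abstract is not reproduced.

```python
import math


def structure_tensor_orientation(img, radius=1):
    """Per-bin dominant gradient orientation and anisotropy of a 2-D list.

    Rows are treated as frequency bins and columns as time frames.
    The returned angle (radians) is the direction of strongest change;
    a spectral ridge runs perpendicular to it.
    """
    rows, cols = len(img), len(img[0])

    def at(r, c):
        # Replicate edge values outside the array.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    # Central differences along time (gt) and frequency (gf).
    gt = [[(at(r, c + 1) - at(r, c - 1)) / 2.0 for c in range(cols)]
          for r in range(rows)]
    gf = [[(at(r + 1, c) - at(r - 1, c)) / 2.0 for c in range(cols)]
          for r in range(rows)]

    def smooth(a, b, r, c):
        # Box average of the product a*b over a (2*radius+1)^2 neighbourhood.
        total, count = 0.0, 0
        for i in range(max(r - radius, 0), min(r + radius, rows - 1) + 1):
            for j in range(max(c - radius, 0), min(c + radius, cols - 1) + 1):
                total += a[i][j] * b[i][j]
                count += 1
        return total / count

    angle = [[0.0] * cols for _ in range(rows)]
    aniso = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            jtt = smooth(gt, gt, r, c)
            jff = smooth(gf, gf, r, c)
            jtf = smooth(gt, gf, r, c)
            # Orientation of the principal eigenvector of [[jtt, jtf], [jtf, jff]].
            angle[r][c] = 0.5 * math.atan2(2.0 * jtf, jtt - jff)
            # (lambda1 - lambda2) / (lambda1 + lambda2): 1 for a single
            # dominant direction, 0 for an isotropic neighbourhood.
            trace = jtt + jff
            spread = math.hypot(jtt - jff, 2.0 * jtf)
            aniso[r][c] = spread / trace if trace > 0 else 0.0
    return angle, aniso
```

In this sketch, a stationary partial (constant along time) yields a gradient orientation close to 90 degrees from the time axis, while a linearly rising pattern yields an intermediate angle; a per-bin angle of this kind is the sort of directional information that could then be used to select among analysis parameters.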