On the Application of RLS Adaptive Filtering for Voice Pitch Modification
This paper presents a pitch modification scheme for speech and singing voice signals, based on the recursive least-squares (RLS) adaptive algorithm. The RLS filter estimates the linear prediction (LP) model on a sample-by-sample basis, as opposed to the LP-coding (LPC) method, which operates on a block basis. An RLS-based approach is therefore able to preserve the natural subtle variations in the vocal tract model, avoiding discontinuities in the synthesized signal and the inherent frame delay associated with classic methods. The LP residual is modified in the synthesis stage in order to generate the output signal. Listening tests verify the overall quality of the synthesized signal using the RLS approach, indicating that this technique is suitable for real-time applications.
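To make the sample-by-sample idea concrete, the following is a minimal sketch of a textbook exponentially weighted RLS linear predictor. It is not the authors' implementation: the function name `rls_linear_prediction`, the forgetting factor `lam`, and the initialization constant `delta` are assumptions chosen for illustration, and the residual modification used for pitch shifting is not shown.

```python
import math

def rls_linear_prediction(x, order=2, lam=0.99, delta=100.0):
    """Update LP coefficients at every sample with standard RLS.

    Returns (w, residual): the final coefficient vector and the a priori
    prediction error (LP residual) for each input sample.
    All parameter defaults are illustrative assumptions.
    """
    w = [0.0] * order
    # P tracks the inverse of the exponentially weighted autocorrelation matrix.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    residual = []
    for n in range(len(x)):
        # Regressor: the previous `order` samples, most recent first.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(1, order + 1)]
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(u[i] * Pu[i] for i in range(order))
        gain = [v / denom for v in Pu]
        # A priori error: current sample minus its prediction.
        e = x[n] - sum(w[i] * u[i] for i in range(order))
        w = [w[i] + gain[i] * e for i in range(order)]
        # P is symmetric, so u^T P equals Pu transposed.
        P = [[(P[i][j] - gain[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
        residual.append(e)
    return w, residual

# A pure tone obeys x[n] = 2*cos(w0)*x[n-1] - x[n-2], so a second-order
# predictor should approach the coefficients [2*cos(w0), -1].
w0 = 1.0
tone = [math.sin(w0 * n) for n in range(500)]
coeffs, res = rls_linear_prediction(tone)
```

Because the coefficients are refreshed at every sample, no frame buffer is needed, which is the property the abstract relies on when it contrasts RLS with block-based LPC.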
A Database of Partial Tracks for Evaluation of Sinusoidal Models
This paper presents a database of partial tracks extracted from synthetic as well as pre-recorded musical signals, designed to serve as an ancillary tool for the evaluation of sinusoidal analysis algorithms. To accomplish this goal, the database requirements have been carefully specified, and a semi-automatic analysis methodology has been employed to ensure that the track parameters are precisely estimated. The overall methodology is validated by applying performance tests to the synthetic source signals.
High-Definition Time-Frequency Representation Based on Adaptive Combination of Fan-Chirp Transforms via Structure Tensor
This paper presents a novel technique for producing high-definition time-frequency representations by combining different instances of short-time fan-chirp transforms. The proposed method uses directional information provided by an image processing technique known as the structure tensor, applied to a spectrogram of the input signal. This information indicates the best analysis window size and chirp parameter for each time-frequency bin, and feeds a simple interpolation procedure that produces the final representation. The method allows the proper representation of more than one sound source simultaneously via fan-chirp transforms with different resolutions, and provides a precise reproduction of transient information. Experiments on both synthetic and real audio illustrate the performance of the proposed system.
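As a rough illustration of how a structure tensor yields directional information from a spectrogram-like image, the sketch below estimates the dominant ridge orientation of a small patch. It is a simplification under stated assumptions, not the paper's procedure: the tensor is summed over the whole patch rather than smoothed locally per time-frequency bin, and the helper names `synthetic_chirp_patch` and `ridge_angle` are hypothetical.

```python
import math

def synthetic_chirp_patch(slope, size=64, f0=16.0, sigma=4.0):
    """Toy magnitude image[f][t] with one Gaussian ridge along f = f0 + slope * t."""
    return [[math.exp(-((f - (f0 + slope * t)) ** 2) / (2.0 * sigma ** 2))
             for t in range(size)]
            for f in range(size)]

def ridge_angle(image):
    """Angle (radians) of the dominant ridge direction in the (t, f) plane."""
    n_f, n_t = len(image), len(image[0])
    j_tt = j_ff = j_tf = 0.0
    for f in range(1, n_f - 1):
        for t in range(1, n_t - 1):
            # Central-difference gradients along time and frequency.
            d_t = 0.5 * (image[f][t + 1] - image[f][t - 1])
            d_f = 0.5 * (image[f + 1][t] - image[f - 1][t])
            # Accumulate the three distinct structure tensor entries.
            j_tt += d_t * d_t
            j_ff += d_f * d_f
            j_tf += d_t * d_f
    # Orientation of the dominant gradient; the ridge runs perpendicular to it.
    grad_angle = 0.5 * math.atan2(2.0 * j_tf, j_tt - j_ff)
    return grad_angle + math.pi / 2.0

patch = synthetic_chirp_patch(slope=0.5)
estimated_slope = math.tan(ridge_angle(patch))  # close to 0.5, in bins per frame
```

In a per-bin variant, the sums would be replaced by a local weighted average around each bin, so that every bin gets its own orientation estimate from which a chirp parameter could be derived.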
Low-cost Numerical Approximation of HRTFs: a Non-Linear Frequency Sampling Approach
Head-related transfer functions (HRTFs) describe filters that model the scattering effect of the human body on sound waves. In their discrete-time form, they are used in acoustic simulations for virtual reality (VR) or augmented reality (AR), and since HRTFs are listener-specific, individualized HRTFs yield more realistic perceptual results. One way to produce individualized HRTFs is to estimate the sound field around the subjects’ 3D representations (meshes) via numerical simulations, which compute discrete complex pressure values in the frequency domain at regular frequency steps. Despite the advances in the area, the computational resources required for this process are still considerably high and increase with frequency. The goal of this paper is to tackle the high computational cost associated with this task by sampling the frequency domain using a hybrid linear-logarithmic frequency resolution. The results attained in simulations performed using 23 real 3D meshes suggest that the proposed strategy is able to reduce the computational cost while still providing remarkably low spectral distortion, even in simulations that require as little as 11.2% of the original total processing time.
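One plausible reading of a hybrid linear-logarithmic grid is sketched below: uniform steps up to a crossover frequency, then a fixed number of points per octave above it. The abstract does not give the actual grid design, so the crossover frequency, step size, points per octave, and the function name `hybrid_frequency_grid` are all assumptions made for illustration.

```python
def hybrid_frequency_grid(f_max=20000.0, lin_step=100.0, f_cross=2000.0,
                          points_per_octave=12):
    """Frequencies (Hz) spaced linearly up to f_cross, logarithmically above.

    All default values are hypothetical; they are not taken from the paper.
    """
    # Linear region: lin_step, 2 * lin_step, ..., up to f_cross.
    n_lin = int(f_cross // lin_step)
    freqs = [lin_step * (k + 1) for k in range(n_lin)]
    # Logarithmic region: constant frequency ratio between neighbours.
    ratio = 2.0 ** (1.0 / points_per_octave)
    k = 1
    while freqs[n_lin - 1] * ratio ** k <= f_max:
        freqs.append(freqs[n_lin - 1] * ratio ** k)
        k += 1
    return freqs

grid = hybrid_frequency_grid()
uniform_count = int(20000.0 // 100.0)  # points in a fully linear 100 Hz grid
print(len(grid), "hybrid points versus", uniform_count, "uniform points")
```

With these assumed defaults the hybrid grid has far fewer points than the uniform one, and the savings fall in the high-frequency region, where the abstract notes that simulation cost is greatest. The point count alone does not predict processing time, since the cost per frequency is not constant.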