Frequency-domain techniques for high-quality voice modification
This paper presents new frequency-domain voice modification techniques that combine the high-quality usually obtained by timedomain techniques such as TD-PSOLA with the flexibility provided by the frequency-domain representation. The technique only works for monophonic sources (single-speaker), and relies on a (possibly online) pitch detection. Based on the pitch, and according to the desired pitch and formant modifications, individual harmonics are selected and shifted to new locations in the spectrum. The harmonic phases are updated according to a pitchbased method that aims to achieve time-domain shape-invariance, thereby reducing or eliminating the usual artifacts associated with frequency-domain and sinusoidal-based voice modification techniques. The result is a fairly inexpensive, flexible algorithm which is able to match the quality of time-domain techniques, but provides vastly improved flexibility in the array of available modifications.