Pure Data External for Reactive HMM-Based Speech and Singing Synthesis
In this paper, we present recent progress in the MAGE project. MAGE is a library for reactive HMM-based speech and singing synthesis. Here, it is integrated as a Pure Data external, called mage~, which provides reactive manipulation of voice quality, prosody and identity, combined with contextual control. mage~ brings together the high-quality, natural and expressive speech of HMM-based speech synthesis with high flexibility and reactive control at the speech production level. Such an object provides a basis for further research in gesturally-controlled speech synthesis: it can “listen” and reactively adjust itself to its environment. Further in this work, we build different interfaces and controllers on top of mage~ in order to explore the real-time, expressive and interactive nature of speech.
Audio Time-Scaling for Slow Motion Sports Videos
Slow motion videos are frequently featured during broadcasts of sports events. However, these videos do not feature any audio channel, apart from the live ambience and the commentary of sports presenters. Standard audio time-scaling methods were not developed with such noisy signals in mind, and they do not always yield acceptable acoustic quality. In this work, we present a new approach that creates a high-quality time-stretched version of sports audio recordings while preserving all their transient events.