Radial Basis Function Networks for conversion of sound spectra
In many high-level signal processing tasks, such as pitch shifting, voice conversion or sound synthesis, accurate spectral processing is required. Here, the use of Radial Basis Function Networks (RBFNs) is proposed for modeling the spectral changes (or conversions) related to the control of important sound parameters, such as pitch or intensity. The identification of such conversion functions relies on a procedure that learns the shape of the conversion from a few pairs of target spectra taken from a data set. The generalization properties of RBFNs provide interpolation across the pitch range. In the construction of the training set, mel-cepstral encoding of the spectrum is used to capture the perceptually most relevant spectral changes. The RBFN conversion functions introduced here are characterized by a perceptually based, fast training procedure, desirable interpolation properties and computational efficiency.
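The following is a minimal sketch of the general idea, not the paper's implementation: a Gaussian RBF network is fit by least squares to map mel-cepstral vectors of source spectra to those of target spectra, with training points used as centers. All function names, shapes and the width parameter are illustrative assumptions.

```python
# Hypothetical sketch of an RBF network mapping mel-cepstral vectors of a
# source spectrum to those of a target spectrum; names and sizes are illustrative.
import numpy as np

def train_rbfn(X, Y, sigma=1.0):
    """Fit linear output weights W so that Phi(X) @ W approximates Y."""
    # Pairwise squared distances between training inputs (centers = training inputs).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian basis activations
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)   # least-squares output weights
    return X, W, sigma                            # centers, weights, width

def apply_rbfn(model, x_new):
    """Convert new mel-cepstral vectors with the trained network."""
    C, W, sigma = model
    d2 = ((x_new[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2)) @ W

# Toy usage: a few (source, target) mel-cepstral pairs, e.g. at different pitches.
X = np.random.randn(8, 20)                 # 8 training spectra, 20 coefficients each
Y = X + 0.1 * np.random.randn(8, 20)       # synthetic "target" encodings
model = train_rbfn(X, Y, sigma=2.0)
converted = apply_rbfn(model, np.random.randn(3, 20))
```

Because the network is linear in its output weights, training reduces to one least-squares solve, which is consistent with the fast training and smooth interpolation properties mentioned above.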
Physics-Based and Spike-Guided Tools for Sound Design
In this paper we present graphical tools and parameter search algorithms for timbre space exploration and the design of complex sounds generated by physical modeling synthesis. The tools are built around a sparse representation of sounds based on Gammatone functions and provide the designer with both graphical and auditory insight. The auditory representation of a number of reference sounds, placed as landmarks in a 2D sound design space, gives the designer an effective aid for directing the search for new sounds. The sonic landmarks can either be synthetic sounds chosen by the user or be derived automatically by parameter search and clustering algorithms. The probabilistic method proposed in this paper uses the sparse representations to model the distance between sparsely represented sounds. A subsequent optimization step minimizes these distances to estimate the parameters that generate the landmark sounds on the given auditory landscape.
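As a hedged illustration of the distance-plus-optimization idea only (the actual Gammatone decomposition, probabilistic distance and clustering of the paper are not reproduced), the sketch below defines a toy distance between two sparse atom sets and searches the parameters of a stand-in synthesis model so that its atoms match a landmark's atoms. The synthesis function, atom layout and parameter names are all assumptions made for the example.

```python
# Illustrative only: toy distance between sparse (time, frequency, amplitude)
# atom sets, and a parameter search minimizing that distance to a landmark sound.
import numpy as np
from scipy.optimize import minimize

def atom_set_distance(A, B):
    """Greedy nearest-atom cost between two (n_atoms, 3) arrays of atoms."""
    cost = 0.0
    for a in A:
        cost += np.min(np.linalg.norm(B - a, axis=1))
    return cost / len(A)

def toy_synth_atoms(params, n_atoms=16):
    """Stand-in for 'synthesis model -> sparse atoms': two parameters control
    a frequency offset and a decay rate of the atom amplitudes."""
    freq, decay = params
    t = np.linspace(0.0, 1.0, n_atoms)
    return np.stack([t, freq + 0.1 * t, np.exp(-decay * t)], axis=1)

target = toy_synth_atoms([2.0, 3.0])       # atoms of a landmark sound
res = minimize(lambda p: atom_set_distance(toy_synth_atoms(p), target),
               x0=[1.0, 1.0], method="Nelder-Mead")
print(res.x)                               # ideally close to [2.0, 3.0]
```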
Symbolic and audio processing to change the expressive intention of a recorded music performance
A framework for real-time expressive modification of audio musical performances is presented. An expressiveness model computes the deviations of the musical parameters that are relevant to controlling the expressive intention. The modifications are then realized by integrating the model with a sound processing engine.
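A minimal sketch of this pipeline, under assumed names and scaling factors (none of which come from the paper): an expressiveness model maps a position in a 2D expressive-control space to multiplicative deviations of tempo, dynamics and articulation, which are then applied to the note events that drive the processing engine.

```python
# Hypothetical expressiveness model: control point -> parameter deviations -> notes.
from dataclasses import dataclass

@dataclass
class Note:
    onset: float      # seconds
    duration: float   # seconds
    velocity: float   # 0..1

def expressive_deviations(x, y):
    """Map an expressive-space position (x, y) to multiplicative deviations."""
    return {
        "tempo": 1.0 + 0.3 * x,          # faster toward positive x
        "dynamics": 1.0 + 0.4 * y,       # louder toward positive y
        "articulation": 1.0 - 0.2 * x,   # shorter notes toward positive x
    }

def apply_deviations(notes, dev):
    """Rescale onsets, durations and velocities according to the deviations."""
    return [Note(onset=n.onset / dev["tempo"],
                 duration=n.duration * dev["articulation"] / dev["tempo"],
                 velocity=min(1.0, n.velocity * dev["dynamics"]))
            for n in notes]

performance = [Note(0.0, 0.5, 0.7), Note(0.5, 0.5, 0.8)]
brighter = apply_deviations(performance, expressive_deviations(0.8, 0.5))
```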
Model-based synthesis and transformation of voiced sounds
In this work a glottal model loosely based on the Ishizaka and Flanagan model is proposed, in which the number of parameters is drastically reduced. First, the glottal excitation waveform is estimated, together with the vocal tract filter parameters, using inverse filtering techniques. The estimated waveform is then used to identify the nonlinear glottal model, represented by a closed-loop configuration of two blocks: a second-order resonant filter, tuned with respect to the signal pitch, and a regressor-based functional, whose coefficients are estimated via nonlinear identification techniques. The results show that an accurate identification of real data can be achieved with fewer than 10 regressors of the nonlinear functional, and that intuitive control of fundamental features, such as pitch and intensity, is possible by acting on the physically informed parameters of the model.
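A sketch of the identification step only, assuming a synthetic stand-in for the inverse-filtered glottal waveform: a second-order resonant filter tuned to the pitch produces a carrier signal, a small set of polynomial regressors is built from it, and their coefficients are estimated by least squares against the target waveform. The regressor choices, signals and filter settings here are illustrative, not those identified in the paper.

```python
# Hedged sketch: least-squares identification of a regressor-based functional
# driven by a second-order resonant filter tuned to the pitch.
import numpy as np
from scipy.signal import lfilter

fs, f0 = 16000, 120                                         # sample rate, pitch (Hz)
n = np.arange(4000)
g = np.maximum(0.0, np.sin(2 * np.pi * f0 * n / fs)) ** 2   # toy "glottal" target waveform

# Second-order resonant filter tuned to f0 (pole radius close to the unit circle),
# driven by an impulse train at the pitch period.
r = 0.98
a = [1.0, -2 * r * np.cos(2 * np.pi * f0 / fs), r * r]
imp = np.zeros_like(g)
imp[::int(fs / f0)] = 1.0
x = lfilter([1.0], a, imp)                                  # resonator output near f0

# Regressor matrix: current and delayed samples plus simple polynomial terms.
x1 = np.concatenate(([0.0], x[:-1]))                        # one-sample delay
X = np.column_stack([x, x1, x ** 2, x ** 3, x * x1])        # 5 regressors (< 10)
theta, *_ = np.linalg.lstsq(X, g, rcond=None)               # identified coefficients
g_hat = X @ theta                                           # model reconstruction of g
```

Since the functional is linear in its regressor coefficients, the identification reduces to a single least-squares problem once the resonator is fixed by the estimated pitch.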