Dispersive and Pitch-Synchronous Processing of Sounds
This paper presents results on the digital processing of sounds by means of both dispersive delay lines and pitch-synchronous transforms in a unified framework. The background on frequency warping is detailed, and applications of this technique are pointed out with reference to the existing literature, including transient extraction, pitch shifting, harmonic detuning and auditory modeling.
Time-Varying Frequency Warping: Results and Experiments
Dispersive tapped delay lines are attractive structures for altering the frequency content of a signal. In previous papers we showed that, in the case of a homogeneous line with first-order all-pass sections, the signal formed by the output samples of the chain of delays at a given time is equivalent to computing the Laguerre transform of the input signal. However, most musical signals require a time-varying frequency modification in order to be properly processed. Vibrato in musical instruments or voice intonation in the case of vocal sounds may be modeled as small and slow pitch variations. Simulations of these effects require techniques for time-varying pitch and/or brightness modification that are very useful for sound processing. In our experiments the basis for time-varying frequency warping is a time-varying version of the Laguerre transformation. The corresponding implementation structure is obtained as a dispersive tapped delay line, where each frequency-dependent delay element has its own phase response. Thus, time-varying warping results in a space-varying, inhomogeneous propagation structure. We show that time-varying frequency warping may be associated with expansions over biorthogonal sets generalizing the discrete Laguerre basis. Slowly time-varying characteristics lead to slowly varying parameter sequences. The corresponding sound transformation does not suffer from the discontinuities typical of delay lines based on unit delays.
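A minimal sketch of the dispersive tapped delay line described above: a chain of identical first-order all-pass sections whose outputs, read across the chain at a fixed time, approximate the Laguerre expansion of the input. This is an illustrative simplification (the normalizing front-end filter of the exact Laguerre transform is omitted, and the warping parameter b is kept constant, i.e., the time-invariant case).

```python
import numpy as np

def allpass1(x, b):
    """One first-order all-pass section A(z) = (-b + z^-1) / (1 - b z^-1),
    i.e. y[n] = -b*x[n] + x[n-1] + b*y[n-1]. With b = 0 it reduces to a
    plain unit delay, recovering an ordinary (non-dispersive) delay line."""
    y = np.zeros(len(x))
    x_prev = 0.0
    y_prev = 0.0
    for n in range(len(x)):
        y[n] = -b * x[n] + x_prev + b * y_prev
        x_prev = x[n]
        y_prev = y[n]
    return y

def allpass_chain_taps(x, b, n_taps):
    """Propagate x through a homogeneous chain of first-order all-pass
    sections and collect every tap output (rows = taps, cols = time).
    Reading a column gives the (unnormalized) Laguerre-like coefficients
    of the input at that time instant."""
    taps = [np.asarray(x, dtype=float)]
    for _ in range(n_taps - 1):
        taps.append(allpass1(taps[-1], b))
    return np.array(taps)
```

Making b a sequence b[n], with one value per section, would turn this homogeneous line into the space-varying, inhomogeneous structure the paper associates with time-varying warping.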
Dynamic Models of Pseudo-Periodicity
Voiced musical sounds have non-zero energy in the sidebands of the frequency partials. Our work is based on the assumption, often experimentally verified, that the energy distribution of the sidebands is shaped as powers of the inverse of the distance from the closest partial. The power spectrum of these pseudo-periodic processes is modeled by means of a superposition of modulated 1/f components, i.e., by a pseudo-periodic 1/f-like process. Due to the fundamental self-similar character of the wavelet transform, 1/f processes can be fruitfully analyzed and synthesized by means of wavelets, obtaining a set of very loosely correlated coefficients at each scale level that can be well approximated by white noise in the synthesis process. Our computational scheme is based on an orthogonal P-band filter bank and a dyadic wavelet transform per channel. The P channels are tuned to the left and right sidebands of the harmonics so that the sidebands are mutually independent. The structure computes the expansion coefficients of a new orthogonal and complete set of Harmonic Wavelets. The main point of our scheme is that we need only one parameter in order to model the stochastic fluctuation of sounds from purely periodic behavior.
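The wavelet-domain synthesis of 1/f-like noise mentioned above can be sketched as follows: draw white noise for each wavelet scale with variance decaying toward finer scales, then run an inverse wavelet transform. This is a hand-rolled inverse Haar transform used only to illustrate the principle; the paper's full scheme additionally places a P-band filter bank in front of the per-channel wavelet stage, which is not shown here.

```python
import numpy as np

def synth_one_over_f(n_levels, gamma, rng):
    """Approximate 1/f^gamma noise: fill wavelet scale m with white
    noise of variance 2^(-gamma*m) (m = 1 coarsest detail, m = n_levels
    finest), then invert with Haar synthesis steps. With gamma = 1 this
    yields roughly equal energy per octave, i.e. a 1/f spectrum."""
    approx = rng.standard_normal(1)  # single coarsest approximation coefficient
    for m in range(1, n_levels + 1):
        detail = rng.standard_normal(len(approx)) * 2.0 ** (-gamma * m / 2.0)
        up = np.empty(2 * len(approx))  # one inverse orthonormal Haar step
        up[0::2] = (approx + detail) / np.sqrt(2.0)
        up[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = up
    return approx
```

Because the synthesis coefficients at each scale are plain white noise, the per-scale variance decay rate gamma is the single parameter modeling the stochastic fluctuation, as the abstract points out.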
Real-Time Time-Varying Frequency Warping via Short-Time Laguerre Transform
In this paper we address the problem of the real-time implementation of time-varying frequency warping. Frequency warping based on a one-parameter family of one-to-one warping maps can be realized by means of the Laguerre transform and implemented in a non-causal structure. This structure is not directly suited for real-time implementation since each output sample is formed by combining all of the input samples. Similarly, the recently proposed time-varying Laguerre transform has the same drawback. Furthermore, long frequency dependent delays destroy the time organization or macrostructure of the sound event. Recently, the author has introduced the Short-Time Laguerre Transform for the approximate real-time implementation of frequency warping. In this transform the short-time spectrum rather than the overall frequency spectrum is frequency warped. The input is subdivided into frames that are tapered by a suitably selected window. By careful design, the output frames correspond to warped versions of the input frames modulated by a stretched version of the window. It is then possible to overlap-add these frames without introducing audible distortion. The overlap-add technique can be generalized to time-varying warping. However, several issues concerning the design of the window and the selection of the overlap parameters need to be addressed. In this paper we discuss solutions for the overlap of the frames when the Laguerre parameter is kept constant but distinct in each frame and solutions for the computation of full time-varying frequency warping when the Laguerre parameter is changing within each frame.
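The frame-based strategy above can be sketched in code. The fragment below is not the paper's Short-Time Laguerre Transform: it replaces the exact Laguerre computation with an FFT-domain approximation that resamples each frame's spectrum along the one-parameter warping map (the phase function of the first-order all-pass section), and uses a plain Hann overlap-add without the stretched-window design the paper develops.

```python
import numpy as np

def laguerre_map(omega, b):
    """One-parameter warping map theta(omega) associated with the
    first-order all-pass section; maps [0, pi] one-to-one onto itself."""
    return omega + 2.0 * np.arctan2(b * np.sin(omega), 1.0 - b * np.cos(omega))

def warp_frame(frame, b):
    """Warp one windowed frame by sampling its spectrum at the warped
    frequencies (linear interpolation of real/imaginary parts)."""
    n = len(frame)
    spec = np.fft.rfft(frame)
    omega = np.linspace(0.0, np.pi, len(spec))  # rfft bin frequencies
    warped = laguerre_map(omega, b)
    re = np.interp(warped, omega, spec.real)
    im = np.interp(warped, omega, spec.imag)
    return np.fft.irfft(re + 1j * im, n)

def stlt_overlap_add(x, frame_len, hop, b):
    """Frame, warp, overlap-add. Letting b differ from frame to frame
    would give the piecewise-constant time-varying case discussed above."""
    window = np.hanning(frame_len)
    y = np.zeros(len(x) + frame_len)
    for start in range(0, len(x) - frame_len + 1, hop):
        y[start:start + frame_len] += warp_frame(x[start:start + frame_len] * window, b)
    return y[:len(x)]
```

With b = 0 the map is the identity and each frame passes through unchanged, which is a convenient sanity check on the structure.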
Harmonic-Band Wavelet Coefficient Modeling for Pseudo-Periodic Sounds Processing
In previous papers [1], [2] we introduced a model for pseudo-periodic sounds based on Wornell's results [3] concerning the synthesis of 1/f noise by means of the wavelet transform (WT). This method provided a good model for representing not only the harmonic part of real-life sounds but also the stochastic components. The latter are of fundamental importance from a perceptual point of view since they contain all the information related to the natural dynamics of musical timbres. In this paper we introduce a refinement of the method, making the spectral-model technique more flexible and the resynthesis coefficient model more accurate. In this way we obtain a powerful tool for sound processing and cross-synthesis.
Sound Source Separation: Preprocessing for Hearing Aids and Structured Audio Coding
In this paper we consider the problem of separating different sound sources in multichannel audio signals. Different approaches to the problem of Blind Source Separation (BSS), e.g. the Independent Component Analysis (ICA) originally proposed by Herault and Jutten, and extensions of it that include delays, work well for artificially mixed signals. However, the quality of the separated signals is severely degraded for real sound recordings when there is reverberation. We consider a system with two sources and two sensors, and show how we can improve the quality of the separation by a simple model of the audio scene. More specifically, we estimate the delays between the sensor signals and put constraints on the deconvolution coefficients.
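The delay-estimation step mentioned above can be sketched minimally as an integer-lag cross-correlation peak search between the two sensor signals. This is only the simplest possible version of that step; the paper's contribution of constraining the deconvolution coefficients with the estimated delays is not shown.

```python
import numpy as np

def estimate_delay(s1, s2, max_lag):
    """Return the integer lag (in samples) maximizing the
    cross-correlation sum_n s1[n + lag] * s2[n], searched over
    lags in [-max_lag, max_lag]. Positive result means s1 is a
    delayed version of s2."""
    best_lag, best_val = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            v = np.dot(s1[lag:], s2[:len(s2) - lag])
        else:
            v = np.dot(s1[:lag], s2[-lag:])
        if v > best_val:
            best_val, best_lag = v, lag
    return best_lag
```

In practice one would normalize the correlation or work in the frequency domain for efficiency and sub-sample accuracy; the direct form above is kept for clarity.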
Multiresolution Sinusoidal/Stochastic Model for Voiced Sounds
The goal of this paper is to introduce a complete analysis/resynthesis method for the stationary part of voiced sounds. The method is based on a new class of wavelets, the Harmonic-Band Wavelets (HBWT). Wavelets have been widely employed in signal processing [1, 2]. In the context of sound processing they provided very interesting results in their first harmonic version: the Pitch-Synchronous Wavelet Transform (PSWT) [3]. We introduced the Harmonic-Band Wavelets in a previous edition of the DAFx [4]. The HBWT, with respect to the PSWT, allows one to manipulate the analysis coefficients of each harmonic independently. Furthermore, one is able to group the analysis coefficients according to a finer subdivision of the spectrum of each harmonic, due to the multiresolution analysis of the wavelets. This allows one to separate the deterministic components of voiced sounds, corresponding to the harmonic peaks, from the noisy/stochastic components. A first result was the development of a parametric representation of the HBWT analysis coefficients corresponding to the stochastic components [5, 7]. In this paper we present the results concerning a parametric representation of the HBWT analysis coefficients of the deterministic components. The method recalls the sinusoidal models, where one models time-varying amplitudes and time-varying phases [8, 9]. This method provides a new and interesting technique for sound synthesis and sound processing, integrating a parametric representation of both the deterministic and the stochastic components of sounds. At the same time it can be seen as a tool for parametric sound representation and data compression.
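The per-harmonic channelization that the HBWT provides can be illustrated with a much cruder stand-in: heterodyning each harmonic of a known, integer pitch period down to baseband and smoothing over one period. This is not the paper's orthogonal P-band filter bank with dyadic wavelets per channel; it only shows the idea of obtaining one independently manipulable channel per harmonic.

```python
import numpy as np

def harmonic_channels(x, period, n_harmonics):
    """Per-harmonic baseband channels of a pseudo-periodic signal with
    known integer period (in samples). Each harmonic k is demodulated by
    exp(-j*2*pi*k*n/period) and averaged over one period, which rejects
    the other harmonics exactly for stationary components."""
    n = np.arange(len(x))
    kernel = np.ones(period) / period
    channels = []
    for k in range(1, n_harmonics + 1):
        demod = x * np.exp(-2j * np.pi * k * n / period)
        channels.append(np.convolve(demod, kernel, mode="same"))
    return np.array(channels)
```

In the HBWT proper, each such channel (in fact, each sideband) is further decomposed by a dyadic wavelet transform, whose coefficients are then modeled sinusoidally (deterministic part) or stochastically (noise part).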
An Extension for Source Separation Techniques Avoiding Beats
The problem of separating individual sound sources from a mixture of these, known as Source Separation or Computational Auditory Scene Analysis (CASA), has become popular in recent decades. A number of methods have emerged from the study of this problem, some of which perform very well for certain types of audio sources, e.g. speech. For the separation of instruments in music, however, these methods have several shortcomings. In general, when instruments play together they are not independent of each other. More specifically, the time-frequency distributions of the different sources will overlap. Harmonic instruments in particular have a high probability of overlapping partials. If these overlapping partials are not separated properly, the separated signals will have a different sensation of roughness, and the separation quality degrades. In this paper we present a method to separate overlapping partials in stereo signals. The method looks at the shapes of partial envelopes and minimizes the difference between such shapes in order to demix overlapping partials. It can be applied to enhance existing methods for source separation, e.g. blind source separation techniques, model-based techniques, and spatial separation techniques. We also discuss other, simpler methods that can work with mono signals.
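The envelope-shape idea can be sketched as a least-squares fit: take a reference envelope shape from a non-overlapping partial of one source, find the scale that best matches it inside the mixed envelope, and attribute the residual to the other source. This is a simplified single-channel reading of the approach, not the paper's full stereo procedure.

```python
import numpy as np

def demix_overlapping_partial(mix_env, ref_shape):
    """Split the envelope of an overlapped partial. ref_shape is the
    envelope of a non-overlapping partial of the same source; the scale
    a minimizing ||mix_env - a*ref_shape||^2 is a = <mix,ref>/<ref,ref>.
    Returns (fitted component, residual attributed to the other source)."""
    a = np.dot(mix_env, ref_shape) / np.dot(ref_shape, ref_shape)
    part1 = a * ref_shape
    return part1, mix_env - part1
```

Restoring each source's share of the overlapped partial in this way avoids the altered beating/roughness that results from assigning the whole overlap region to one source.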
Inharmonic Sound Spectral Modeling by Means of Fractal Additive Synthesis
In previous editions of the DAFX [1, 2] we presented a method for the analysis and resynthesis of voiced sounds, i.e., of sounds with well-defined pitch and harmonic-peak spectra. In a subsequent paper [3] we called the method Fractal Additive Synthesis (FAS). The main point of the FAS is to provide two different models for representing the deterministic and the stochastic components of voiced sounds, respectively. This allows one to represent and reproduce voiced sounds without losing the noisy components and stochastic elements present in real-life sounds. These components are important in order to perceive a synthetic sound as a natural one. The topic of this paper is the extension of the technique to inharmonic sounds. We can apply the method to sounds produced by percussion instruments such as gongs, timpani or tubular bells, as well as to sounds with expanded quasi-harmonic spectra, such as piano sounds.
On the Use of Spatial Cues to Improve Binaural Source Separation
Motivated by the human hearing sense, we devise a computational model suitable for the localization of many sources in stereo signals, and apply it to the separation of sound sources. The method employs spatial cues in order to resolve high-frequency phase ambiguities. More specifically, we use relationships between the short-time Fourier transforms (STFT) of the two signals in order to estimate the two most important spatial cues, namely the time differences (TD) and level differences (LD) between the sensors. By using models of both free-field wave propagation and head-related transfer functions (HRTF), these cues are combined to form estimates of spatial parameters such as the directions of arrival (DOA). The theory is validated with the help of the experimental results presented in the paper.
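The basic per-bin cue computation can be sketched as follows: take STFTs of the stereo pair, read level differences from the magnitude ratio and time differences from the phase difference divided by frequency. This is a bare-bones illustration, not the paper's estimator; in particular it makes visible the high-frequency ambiguity the abstract refers to, since the phase difference is only known modulo 2*pi.

```python
import numpy as np

def binaural_cues(xl, xr, n_fft=512, hop=256):
    """Per-bin level differences (dB) and time differences (samples)
    between right and left channels. Phase-derived TDs wrap once
    |TD|*omega exceeds pi, so they are ambiguous at high frequencies;
    combining them with LDs and an HRTF/free-field model (as in the
    paper) is what resolves that ambiguity."""
    win = np.hanning(n_fft)
    frames_l, frames_r = [], []
    for s in range(0, len(xl) - n_fft + 1, hop):
        frames_l.append(np.fft.rfft(xl[s:s + n_fft] * win))
        frames_r.append(np.fft.rfft(xr[s:s + n_fft] * win))
    L, R = np.array(frames_l), np.array(frames_r)
    eps = 1e-12
    ld_db = 20.0 * np.log10((np.abs(R) + eps) / (np.abs(L) + eps))
    dphi = np.angle(R * np.conj(L))                      # wrapped to (-pi, pi]
    omega = 2.0 * np.pi * np.arange(L.shape[1]) / n_fft  # rad/sample per bin
    td = np.where(omega > 0, dphi / np.maximum(omega, eps), 0.0)
    return ld_db, td
```

For a source delayed by d samples in the right channel, low-frequency bins report a TD near -d while bins above the wrap limit do not, which is exactly where the spatial-cue model must take over.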