Inverting dynamics compression with minimal side information
Dynamics processing is a widespread technique at both the music production and diffusion stages. In particular, dynamic compression is often used in such a way that the “average” listener can best enjoy the music. However, this may lead to an excessive use of compression, especially with respect to listeners in quiet listening conditions. This paper presents estimates of the amount of extra data that is needed to invert the effects of such non-linear processing, using simple blind identification techniques. We present two simple test cases: first, the case where perfect reconstruction is needed, and second, the case where the ancillary data rate is constrained, leading to an approximate reconstruction.
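To make the idea of inverting compression concrete, here is a minimal sketch under strong assumptions that are not taken from the paper: the compressor is static and memoryless, and its `threshold` and `ratio` (hypothetical parameter names) are known to the decoder as side information. Real compressors apply a time-varying gain with attack and release smoothing, which is why identification techniques or extra data are needed in practice.

```python
import math

def compress(x, threshold=0.25, ratio=4.0):
    """Static, memoryless compressor acting on each sample's magnitude.

    threshold and ratio are illustrative values, not taken from the paper.
    """
    out = []
    for v in x:
        mag = abs(v)
        if mag > threshold:
            # Above the threshold, level grows 'ratio' times more slowly.
            mag = threshold * (mag / threshold) ** (1.0 / ratio)
        out.append(math.copysign(mag, v))
    return out

def expand(y, threshold=0.25, ratio=4.0):
    """Inverse of compress() when threshold and ratio are known exactly."""
    out = []
    for v in y:
        mag = abs(v)
        if mag > threshold:
            mag = threshold * (mag / threshold) ** ratio
        out.append(math.copysign(mag, v))
    return out

# A 5-cycle sine wave with peak amplitude 0.9, well above the threshold.
signal = [0.9 * math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
compressed = compress(signal)
restored = expand(compressed)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Because this mapping is strictly monotonic in magnitude, two numbers are enough to undo it up to floating-point rounding. Once the gain depends on the signal's recent history, that is no longer the case.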
Time and pitch scale modification: A real-time framework and tutorial
A framework is presented which is designed to address the issues related to the real-time implementation of time-scale and pitch-scale modification algorithms. This framework can be used as the basis for the development of applications which allow for a seamless real-time transition between continually varying time-scale and pitch-scale parameters, which arise as a result of manual or automatic intervention.
Improved hidden Markov model partial tracking through time-frequency analysis
In this article we propose a modification to the combinatorial hidden Markov model developed in [1] for tracking partial frequency trajectories. We employ the Wigner-Ville distribution and Hough transform in order to (re)estimate the frequency and chirp rate of partials in each analysis frame. We estimate the initial phase and amplitude of each partial by minimizing the squared error in the time domain. We then formulate a new scoring criterion for the hidden Markov model which makes the tracker more robust for non-stationary and noisy signals. We achieve good performance when tracking crossing linear chirps and crossing FM signals in white noise, as well as real instrument recordings.
Two-step modal identification for increased resolution analysis of percussive sounds
Modal synthesis is a practical and efficient way to model sounding structures with strong resonances. In order to create realistic sounds, one has to be able to extract the parameters of this model from recorded sounds produced by the physical system of interest. Many methods are available to achieve this goal, and most of them require a careful parametrization and a post-selection of the modes to guarantee a good quality/complexity trade-off. This paper introduces a two-step analysis method aiming at an automatic and reliable identification of the modes. The first step is performed at a global level with few assumptions about the spectro/temporal content of the considered signal. From the knowledge gained with this global analysis, one can focus on specific frequency regions and perform a local analysis with strong assumptions. The gains of such a two-step approach are a better estimate of the number of modal components as well as a better estimate of their parameters.
Binaural partial tracking
Partial tracking in sinusoidal models has been studied for over twenty years and has been refined to the point where it is precise and useful for analysing noiseless harmonic sounds. However, such tools have always been used in a monophonic (single-channel) context. A method is thus proposed to adapt partial tracking to the case of binaural signals. This gives a tool to perform spectral analysis of such signals, keeping relevant information from both left and right channels. Moreover, azimuth (position in the horizontal plane) information for each partial is gained using interaural cues, such as interaural time differences (ITDs) and interaural level differences (ILDs). The azimuth information can then be used as an attribute or as a constraint in the binaural partial tracking algorithm. Finally, some classification results using the azimuth of partials are presented.
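As an illustration of the interaural cues mentioned above, the following sketch estimates the ILD and ITD of a single stationary partial of known frequency from its complex amplitude in each channel. It is an assumption-laden toy, not the method of the paper: the function names are hypothetical, the partial's frequency is taken as given, and the phase-based ITD is only unambiguous while the interaural phase difference stays below half a period.

```python
import cmath
import math

def complex_amplitude(x, freq_hz, fs_hz):
    """Complex amplitude of the component at freq_hz, by correlation."""
    n = len(x)
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
              for i, v in enumerate(x))
    return 2.0 * acc / n

def interaural_cues(left, right, freq_hz, fs_hz):
    """Return (ILD in dB, ITD in seconds) for one partial.

    Convention assumed here: positive values mean the left channel
    is louder (ILD) or leads in time (ITD).
    """
    a_l = complex_amplitude(left, freq_hz, fs_hz)
    a_r = complex_amplitude(right, freq_hz, fs_hz)
    ild_db = 20.0 * math.log10(abs(a_l) / abs(a_r))
    phase_diff = cmath.phase(a_l * a_r.conjugate())  # wrapped to (-pi, pi]
    itd_s = phase_diff / (2.0 * math.pi * freq_hz)
    return ild_db, itd_s

# Synthetic binaural partial: the right ear is quieter and delayed by 0.3 ms.
fs, f0, n = 16000, 500.0, 1600   # 1600 samples hold exactly 50 cycles
delay_s = 0.0003
left = [math.cos(2 * math.pi * f0 * i / fs) for i in range(n)]
right = [0.5 * math.cos(2 * math.pi * f0 * (i / fs - delay_s)) for i in range(n)]

ild_db, itd_s = interaural_cues(left, right, f0, fs)
```

Turning these cues into an azimuth requires a head model or measured transfer functions, which this sketch deliberately leaves out.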
Hybrid room impulse response synthesis in digital waveguide mesh based room acoustics simulation
The digital waveguide mesh (DWM) and related finite difference time domain techniques offer significant promise for room acoustics simulation problems. However, high-resolution 3-D DWMs of large spaces remain beyond the capabilities of current desktop-based computers, due to prohibitively long run-times and large memory requirements. This paper examines how hybrid room impulse response synthesis might be used to better enable virtual environment simulation through the use of otherwise computationally expensive DWM models. This is facilitated through the introduction of the RenderAIR virtual environment simulation system and comparison with both real-world measurements and more established modelling techniques. Results demonstrate good performance against acoustic benchmarks and significant computational savings when a 2-D DWM is used as part of an appropriate hybridization strategy.
A supervised learning approach to ambience extraction from mono recordings for blind upmixing
A supervised learning approach to ambience extraction from one-channel audio signals is presented. The extracted ambient signals are used for the blind upmixing of musical audio recordings to surround sound formats. The input signal is processed by means of short-term spectral attenuation. The spectral weights are computed using a low-level feature extraction process and a neural network regression method. The multi-channel audio signal is generated by feeding the computed ambient signal into the rear channels of a surround sound system.
Direct simulation for wind instrument synthesis
There are now a number of methods available for generating synthetic sound based on physical models of wind instruments, including digital waveguides, wave digital filters, impedance-based methods and those involving impulse responses. Normally such methods are used to simulate the behaviour of the resonator, and the coupling to the excitation mechanism is carried out by making use of simple lumped finite difference schemes or digital filter structures. In almost all cases, a traveling wave, frequency-domain, or impulse response description of the resonator is used as a starting point; efficient structures may be arrived at when the bore is of a particularly simple form, such as a cylinder or cone. In recent years, however, due to the great computing power available, efficiency has become less of a concern; this is especially the case for musical instruments which may be well-modelled in 1D, such as wind instruments. In this paper, a fully time-space discrete algorithm for the simulation and synthesis of woodwind instrument sounds is presented; such a method, though somewhat more computationally intensive than an efficient waveguide structure, is still well within the realm of real-time performance. The main benefits of such a method are its generality (it is no longer necessary to make any assumptions about bore profile, which may be handled in an almost trivial manner), extensibility (i.e., the model may be generalized to handle nonlinear phenomena directly), ease of programming, and the possibility of direct proofs of numerical stability without invoking frequency-domain concepts. Simulation results, sound examples and a graphical user interface in the Matlab programming language are also presented.
Energy-stable modelling of contacting modal objects with piece-wise linear interaction force
In discrete-time digital models of contact between vibrating objects, stability, and therefore control over system energy, is an important issue. While numerical approximation is problematic in this context, digital algorithms may meet this challenge when based on an exact mathematical solution of the underlying equation. The latter may generally be possible under certain conditions of linearity. While a system of contacting solid objects is non-linear by definition, piece-wise linear models may be used. Here, however, the aspect of “switching” between different linear phases is crucial. An approach is presented for exact preservation of system energy when passing between different phases of contact. One basic principle used may be pictured as inserting appropriate ideal, massless and perfectly stiff “connection rods” at discrete moments of phase switching. Theoretical foundations are introduced, and the general technique is explained and tested on two simple examples.
Improvement of band extension technique for G.711 telephony speech based on full wave rectification
This study investigates a band extension technique for narrow-band speech encoded with G.711, the most common codec for digital speech communications such as VoIP. The proposed technique is based on full-wave rectification, which generates high-band harmonics by nonlinear processing. In order to improve the conventional technique, this study focuses on parameter control according to the characteristics of the speech data. The subjective evaluation indicates that the proposed technique may potentially outperform the conventional technique.
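The harmonic-generating effect of full-wave rectification can be checked with a small sketch. It is not the proposed technique itself (no parameter control, filtering or gain shaping is included), and the test tone, sample rate and helper names are assumptions chosen for illustration: a 3 kHz tone inside the telephone band, already resampled to 16 kHz, is rectified, and the energy appearing at 6 kHz in the extended band is measured.

```python
import math

def full_wave_rectify(x):
    """Memoryless nonlinearity: replace each sample by its absolute value."""
    return [abs(v) for v in x]

def tone_magnitude(x, freq_hz, fs_hz):
    """Amplitude of the component at freq_hz, measured by correlation."""
    n = len(x)
    re = sum(v * math.cos(2 * math.pi * freq_hz * i / fs_hz)
             for i, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq_hz * i / fs_hz)
             for i, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / n

fs = 16000   # assumed wide-band sample rate after upsampling
f0 = 3000    # tone inside the narrow band
n = 1600     # 0.1 s, an integer number of cycles of f0 and 2*f0

narrow = [math.sin(2 * math.pi * f0 * i / fs) for i in range(n)]
rectified = full_wave_rectify(narrow)

high_before = tone_magnitude(narrow, 2 * f0, fs)     # essentially zero
high_after = tone_magnitude(rectified, 2 * f0, fs)   # new 6 kHz harmonic
```

For an ideal sine, the second-harmonic amplitude of the rectified signal is 4/(3π), about 0.42 of the input amplitude. The sampled result deviates slightly because higher harmonics alias back into the band. Rectification also adds a DC offset and in-band components, so a practical system would follow it with filtering.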