Two-step modal identification for increased resolution analysis of percussive sounds

Modal synthesis is a practical and efficient way to model sounding structures with strong resonances. In order to create realistic sounds, one has to be able to extract the parameters of this model from recorded sounds produced by the physical system of interest. Many methods are available to achieve this goal, and most of them require careful parametrization and a post-selection of the modes to guarantee a good quality/complexity trade-off. This paper introduces a two-step analysis method aiming at an automatic and reliable identification of the modes. The first step is performed at a global level, with few assumptions about the spectro-temporal content of the considered signal. From the knowledge gained with this global analysis, one can then focus on specific frequency regions and perform a local analysis under strong assumptions. The gains of such a two-step approach are a better estimate of the number of modal components as well as better estimates of their parameters.
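A minimal sketch of such a two-step pipeline, assuming global FFT peak picking for step one and a band-limited analytic-signal fit for step two (the function names, band width, and estimators here are illustrative assumptions, not the authors' exact method):

```python
import numpy as np
from scipy.signal import find_peaks, butter, sosfiltfilt, hilbert

def global_analysis(x, sr, prominence_db=20.0):
    """Step 1: coarse scan of the whole spectrum, with few assumptions,
    to locate candidate modal frequencies."""
    win = np.hanning(len(x))
    mag_db = 20 * np.log10(np.abs(np.fft.rfft(x * win)) + 1e-12)
    peaks, _ = find_peaks(mag_db, prominence=prominence_db)
    return peaks * sr / len(x)  # candidate frequencies in Hz

def local_analysis(x, sr, f0, half_band=50.0):
    """Step 2: strong single-mode assumption inside a narrow band:
    isolate the band around f0, then read amplitude, damping and
    frequency off the analytic signal."""
    sos = butter(4, [f0 - half_band, f0 + half_band],
                 btype="bandpass", fs=sr, output="sos")
    analytic = hilbert(sosfiltfilt(sos, x))
    t = np.arange(len(x)) / sr
    env = np.abs(analytic)
    # Exponential decay => log-envelope is linear; its slope is -damping.
    slope, log_a0 = np.polyfit(t, np.log(env + 1e-12), 1)
    # Instantaneous frequency = derivative of the unwrapped phase.
    inst_f = np.diff(np.unwrap(np.angle(analytic))) * sr / (2 * np.pi)
    return {"freq": float(np.median(inst_f)),
            "damping": float(-slope),
            "amp": float(np.exp(log_a0))}

# Example: analyze each candidate mode of a percussive recording x
# (candidates must stay inside (half_band, sr/2 - half_band)).
# modes = [local_analysis(x, sr, f) for f in global_analysis(x, sr)]
```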
On the control of the phase of resonant filters with applications to percussive sound modeling

Source-filter models are widely used in numerous audio processing fields, from speech processing to percussive/contact sound synthesis. The design of filters for these models, be it from scratch or from spectral analysis, usually involves tuning frequency and damping parameters and/or providing an all-pole model of the resonant part of the filter. In this context, and for the modeling of percussive (non-sustained) sounds, a source signal can be estimated from a filtered sound through a time-domain deconvolution process. The result can be plagued with artifacts when resonances exhibit very low bandwidth and lie very close in frequency. We propose in this paper a method that noticeably reduces the artifacts of the deconvolution process through an inter-resonance phase synchronization. Results show that the proposed method is able to design filters inducing fewer artifacts at the expense of a higher dynamic range.
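To make the deconvolution step concrete, here is a small sketch, assuming the resonant part is an all-pole filter built from modal frequencies and dampings; its FIR inverse A(z) recovers a source estimate, and narrow, near-coincident poles are exactly the ill-conditioned case where the artifacts discussed above appear. This illustrates the baseline process only, not the proposed phase-synchronization method:

```python
import numpy as np
from scipy.signal import lfilter

def allpole_from_modes(freqs_hz, dampings, sr):
    """Cascade two-pole resonators into one all-pole denominator A(z).
    Each mode contributes a conjugate pole pair at radius r = exp(-d/sr)."""
    a = np.array([1.0])
    for f, d in zip(freqs_hz, dampings):
        r = np.exp(-d / sr)
        section = [1.0, -2.0 * r * np.cos(2 * np.pi * f / sr), r * r]
        a = np.convolve(a, section)
    return a

def estimate_source(y, a):
    """Time-domain deconvolution: the inverse of H(z) = 1/A(z) is the
    FIR filter A(z), so filtering y by A(z) yields the source estimate."""
    return lfilter(a, [1.0], y)

# Two close, narrow resonances -> near-coincident poles -> a noisy,
# artifact-prone source estimate; synchronizing the phases across the
# resonances is what the paper proposes to mitigate this.
sr = 44100
a = allpole_from_modes([440.0, 443.0], [2.0, 2.0], sr)
```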
Additive Synthesis Of Sound By Taking Advantage Of Psychoacoustics

In this paper we present an original technique designed to speed up additive synthesis. The technique takes psychoacoustic phenomena (the threshold of hearing and masking) into account in order to ignore inaudible partials during synthesis, saving a substantial amount of computation time. Our algorithm relies on a specific data structure called a “skip list” and has proven to be very efficient in practice. As a consequence, we are now able to synthesize a very large number of spectral sounds in real time without overloading the processor.
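A minimal sketch of the pruning idea, assuming Terhardt's classic approximation of the absolute threshold of hearing (the paper's actual thresholds, its masking model, and its skip-list bookkeeping are not reproduced here):

```python
import numpy as np

def hearing_threshold_db(f_hz):
    """Terhardt's approximation of the absolute threshold of hearing,
    in dB SPL, as a function of frequency."""
    f = np.asarray(f_hz) / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def audible_partials(freqs, amps_db):
    """Keep only partials whose level exceeds the threshold of hearing
    (levels are assumed calibrated to dB SPL). A full implementation
    would also drop partials masked by stronger neighbours, and keep
    the candidates in a skip list so per-frame updates stay cheap."""
    amps = np.asarray(amps_db)
    keep = amps > hearing_threshold_db(freqs)
    return [(f, a) for f, a, k in zip(freqs, amps, keep) if k]

def synthesize(partials, sr, dur):
    """Plain additive synthesis over the surviving partials."""
    t = np.arange(int(sr * dur)) / sr
    out = np.zeros_like(t)
    for f, a_db in partials:
        out += 10 ** (a_db / 20.0) * np.sin(2 * np.pi * f * t)
    return out
```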
Differentiable Time–frequency Scattering on GPU

Joint time–frequency scattering (JTFS) is a convolutional operator in the time–frequency domain which extracts spectrotemporal modulations at various rates and scales. It offers an idealized model of spectrotemporal receptive fields (STRF) in the primary auditory cortex, and thus may serve as a biologically plausible surrogate for human perceptual judgments at the scale of isolated audio events. Yet, prior implementations of JTFS and STRF have remained outside the standard toolkit of perceptual similarity measures and evaluation methods for audio generation. We trace this issue to three limitations: differentiability, speed, and flexibility. In this paper, we present an implementation of time–frequency scattering in Python. Unlike prior implementations, ours accommodates NumPy, PyTorch, and TensorFlow as backends and thus runs on both CPU and GPU. We demonstrate the usefulness of JTFS via three applications: unsupervised manifold learning of spectrotemporal modulations, supervised classification of musical instruments, and texture resynthesis of bioacoustic sounds.
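The following is a heavily simplified, self-contained PyTorch sketch of the underlying idea: a first-order filterbank whose modulus gives a scalogram, followed by joint 2D filtering over the time–frequency plane. It is meant only to illustrate how spectrotemporal modulation extraction can be made differentiable; it is not the paper's implementation, and the filter designs here are placeholder assumptions:

```python
import torch
import torch.nn.functional as F

def gabor_bank(n_filters, size):
    """Crude bank of band-pass (Gabor-like) kernels; a stand-in for a
    proper wavelet filterbank."""
    t = torch.linspace(-1, 1, size)
    freqs = torch.logspace(-1, 0, n_filters) * size / 4
    env = torch.exp(-(t ** 2) / 0.1)
    return env * torch.cos(2 * torch.pi * freqs[:, None] * t)

def jtfs_like(x, n_freq=32, n_mod=8, k=129):
    """First order: convolve the waveform with a filterbank and take the
    modulus, yielding a scalogram. Second order: 2D convolutions over
    the (log-frequency, time) plane extract joint spectrotemporal
    modulations at various rates and scales."""
    bank1 = gabor_bank(n_freq, k).unsqueeze(1)           # (F, 1, k)
    scalogram = F.conv1d(x[None, None, :], bank1, padding=k // 2).abs()
    bank2 = torch.randn(n_mod, 1, 7, 7)                  # placeholder 2D modulation filters
    joint = F.conv2d(scalogram.unsqueeze(1), bank2, padding=3).abs()
    return joint.mean(dim=-1)                            # time-averaged coefficients

# Differentiability: gradients flow back to the input waveform, so such
# a representation can serve as a loss for audio generation.
x = torch.randn(4096, requires_grad=True)
loss = jtfs_like(x).sum()
loss.backward()
print(x.grad.shape)
```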
The DESAM Toolbox: Spectral Analysis of Musical Audio

This paper presents the DESAM Toolbox, a set of Matlab functions dedicated to the estimation of widely used spectral models for music signals. Although those models can be used in Music Information Retrieval (MIR) tasks, the core functions of the toolbox do not focus on any specific application. Rather, it aims to provide a range of state-of-the-art signal processing tools that decompose music files according to different signal models, giving rise to different “mid-level” representations. After motivating the need for such a toolbox, the paper gives an overview of its organization and describes all available functionalities.