Phase-Change based Tuning for Automatic Chord Recognition
This paper focuses on the automatic extraction of acoustic chord sequences from a piece of music. First, a set of windowing methods for the Discrete Fourier Transform is evaluated in terms of efficiency. Then, a new tuning solution is introduced, based on a method previously developed for the phase vocoder. Pitch class profile vectors, which represent harmonic information, are extracted from the given audio signal. The resulting chord sequence is obtained by running a Viterbi decoder on trained hidden Markov models. We performed several experiments using the proposed technique. Results obtained on 175 manually labeled songs show an accuracy comparable to the state of the art.
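As a rough illustration of the decoding step only, the sketch below runs a generic Viterbi decoder over a toy two-chord hidden Markov model. The state names, the quantized observation symbols, and all probabilities are hypothetical values chosen for the example; they are not the trained models or the pitch class profile features used in the paper.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for the given observations."""
    # best[t][s] is the log-probability of the best path ending in s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s] is the predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: two chords, observations reduced to one symbol per frame.
STATES = ["C", "G"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {
    "C": {"C": math.log(0.9), "G": math.log(0.1)},
    "G": {"C": math.log(0.1), "G": math.log(0.9)},
}
LOG_EMIT = {
    "C": {"c": math.log(0.8), "g": math.log(0.2)},
    "G": {"c": math.log(0.2), "g": math.log(0.8)},
}

frames = ["c", "c", "g", "c", "c", "g", "g", "g"]
decoded = viterbi(frames, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
```

In this toy setting the high self-transition probability makes the decoder treat the isolated "g" frame as noise rather than as a chord change, which is the smoothing effect that motivates HMM decoding over frame-by-frame classification.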
An iterative Segmentation Algorithm for Audio Signal Spectra Depending on Local Centers of Gravity
Modern music production and sound generation often rely on the manipulation of pre-recorded pieces of audio, so-called samples, taken from a huge database. Consequently, there is an increasing demand to adapt these samples extensively and flexibly to any new musical context. For this purpose, advanced digital signal processing is needed to realize audio effects such as pitch shifting, time stretching or harmonization. A key part of these processing methods is often a signal-adaptive, block-based spectral segmentation operation. Hence, we propose a novel algorithm for such a spectral segmentation based on local centers of gravity (COG). The method was originally developed as part of a multiband modulation decomposition for audio signals. Nevertheless, the algorithm can also be used in the more general context of improved vocoder-related applications.
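To make the idea of a local center of gravity concrete, here is a minimal sketch. It assumes the COG of a band is the magnitude-weighted mean bin index, and it uses a simple fixed-width window that is re-centered until it stops moving. The function names, the window rule, and the stopping criterion are illustrative assumptions, not the segmentation algorithm proposed in the paper.

```python
def center_of_gravity(magnitudes, lo, hi):
    """Magnitude-weighted mean bin index over bins lo .. hi-1."""
    total = sum(magnitudes[lo:hi])
    if total == 0.0:
        # No energy in the band: fall back to the geometric middle.
        return (lo + hi - 1) / 2.0
    return sum(k * magnitudes[k] for k in range(lo, hi)) / total


def refine_center(magnitudes, start, half_width, max_iter=50):
    """Move a band center to the local COG until it no longer changes."""
    center = start
    for _ in range(max_iter):
        lo = max(0, center - half_width)
        hi = min(len(magnitudes), center + half_width + 1)
        new_center = int(round(center_of_gravity(magnitudes, lo, hi)))
        if new_center == center:
            break
        center = new_center
    return center


# Hypothetical magnitude spectrum with a single peak around bin 10.
spectrum = [0.0] * 20
spectrum[8:13] = [1.0, 2.0, 4.0, 2.0, 1.0]
settled = refine_center(spectrum, start=8, half_width=2)
```

Starting two bins below the peak, the candidate center drifts toward the spectral peak over successive iterations, which is the behavior a COG-driven segmentation relies on to align band boundaries with the signal.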
Finding Latent Sources in Recorded Music with a Shift-invariant HDP
We present the Shift-Invariant Hierarchical Dirichlet Process (SIHDP), a nonparametric Bayesian model for modeling multiple songs in terms of a shared vocabulary of latent sound sources. The SIHDP is an extension of the Hierarchical Dirichlet Process (HDP) that explicitly models the times at which each latent component appears in each song. This extension allows us to model how sound sources evolve over time, which is critical to the human ability to recognize and interpret sounds. To make inference on large datasets possible, we develop an exact distributed Gibbs sampling algorithm for posterior inference. We evaluate the SIHDP’s ability to model audio using a dataset of real popular music, and measure its ability to accurately find patterns in music using a set of synthesized drum loops. Ultimately, our model produces a rich representation of a set of songs, consisting of a set of short sound sources and the times at which they appear in each song.
Automatic Target Mixing using Least-Squares Optimization of Gains and Equalization Settings
The proposed automatic target mixing algorithm determines the gains and the equalization settings for the mixing of a multi-track recording using a least-squares optimization. These parameters are estimated using a single-channel target mix, that is, a signal that contains the same audio tracks as the multi-track recording but has previously been mixed with unknown settings. Several tests were carried out to evaluate the performance of two approaches to the optimization, namely the sub-band estimator and the FIR filters estimator. The results show that, using the latter technique, the proposed algorithm is able to retrieve the parameters originally applied to the target mix. This can be useful for remastering applications in which both the original recording sessions and the final mix are available, but the mixing parameters originally applied to the individual audio tracks need to be retrieved.
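The simplest instance of this idea, estimating one broadband gain per track and no equalization, can be sketched as an ordinary least-squares problem solved through the normal equations. The function name and the toy data below are hypothetical, and the sub-band and FIR filter estimators evaluated in the paper are not reproduced here.

```python
def estimate_gains(tracks, target):
    """Least-squares gains g minimizing ||target - sum_i g[i] * tracks[i]||^2."""
    n = len(tracks)

    def dot(a, b):
        return sum(u * v for u, v in zip(a, b))

    # Augmented normal equations: [ <x_i, x_j> | <x_i, y> ].
    aug = [
        [dot(tracks[i], tracks[j]) for j in range(n)] + [dot(tracks[i], target)]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes the tracks are linearly independent (non-singular system).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical two-track session mixed with gains 0.5 and 2.0.
track_a = [1.0, 0.0, 2.0, -1.0, 3.0]
track_b = [0.0, 1.0, 1.0, 2.0, -1.0]
mix = [0.5 * a + 2.0 * b for a, b in zip(track_a, track_b)]
gains = estimate_gains([track_a, track_b], mix)
```

When the target really is a gain-only mix of the given tracks, the recovered gains match the ones used to create it up to rounding error; recovering equalization as well requires a richer parameterization per track.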
Self-Authentication of Audio signals by Chirp Coding
This paper discusses a new approach to ‘watermarking’ digital signals using linear frequency modulated or ‘chirp’ coding. The principles underlying this approach are based on the use of a matched filter to provide a reconstruction of a chirped code that is uniquely robust in the case of signals with very low signal-to-noise ratios. Chirp coding for authenticating data is generic in the sense that it can be used for a range of data types and applications (the authentication of speech and audio signals, for example). The theoretical and computational aspects of the matched filter and the properties of a chirp are revisited to provide the essential background to the method. Signal code generating schemes are then addressed, and details of the coding and decoding techniques are considered. Finally, the paper briefly describes an example application, available online, for readers who are interested in using the approach to authenticate audio data in either WAV or MP3 files.
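The role of the matched filter can be illustrated with a small sketch: a linear chirp is buried in Gaussian noise at a known offset and then located by correlating the noisy signal with the chirp template. The chirp parameters, noise level, and helper names are assumptions made for the example, and the coding and decoding scheme of the paper is not reproduced.

```python
import math
import random


def linear_chirp(n, f0, f1):
    """n samples of a linear chirp sweeping from f0 to f1 (cycles per sample)."""
    rate = (f1 - f0) / n
    return [math.sin(2.0 * math.pi * (f0 * t + 0.5 * rate * t * t)) for t in range(n)]


def embed_in_noise(template, offset, total, sigma, seed):
    """Gaussian noise of length `total` with the template added at `offset`."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, sigma) for _ in range(total)]
    for i, value in enumerate(template):
        signal[offset + i] += value
    return signal


def matched_filter_peak(signal, template):
    """Return (lag, score) of the maximum correlation with the template."""
    best_lag, best_score = 0, float("-inf")
    width = len(template)
    for lag in range(len(signal) - width + 1):
        score = sum(s * c for s, c in zip(signal[lag:lag + width], template))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


chirp = linear_chirp(512, 0.01, 0.4)
noisy = embed_in_noise(chirp, offset=300, total=1200, sigma=1.0, seed=0)
lag, score = matched_filter_peak(noisy, chirp)
```

Because the correlation sums the chirp energy coherently over all 512 samples while the noise adds incoherently, the peak stands out even though each individual sample is dominated by noise.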
Adaptive Phase Distortion Synthesis
This article discusses Phase Distortion synthesis and its application to arbitrary input signals. The main elements that compose the technique are presented. Its similarities to Phase Modulation are discussed and the equivalence between the two techniques is explored. Two alternative methods of distorting the phase of an arbitrary signal are presented. The first is based on the audio-rate modulation of a first-order allpass filter coefficient. The other method relies on a re-casting of the Phase Modulation equation, which leads to a heterodyned form of waveshaping. The relationship of these implementations to the original technique is explored in detail. Complementing the article, a number of examples are discussed, demonstrating the application of the technique as an interesting digital audio effect.
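For readers unfamiliar with the underlying technique, the sketch below shows classic oscillator-based Phase Distortion only: a sinusoid is read with a piecewise-linear phase map instead of a linear ramp. The shape of the map, the `knee` parameter, and the function names are illustrative assumptions. The adaptive methods described in the article, which operate on arbitrary input signals through allpass coefficient modulation or heterodyned waveshaping, are not shown.

```python
import math


def distort_phase(phase, knee):
    """Piecewise-linear map of [0, 1) onto [0, 1) that reaches 0.5 at `knee`.

    `knee` must lie strictly between 0 and 1; knee = 0.5 is the identity map.
    """
    if phase < knee:
        return 0.5 * phase / knee
    return 0.5 + 0.5 * (phase - knee) / (1.0 - knee)


def pd_oscillator(freq, sample_rate, n, knee):
    """n samples of an inverted cosine read through the distorted phase."""
    out = []
    phase = 0.0
    step = freq / sample_rate
    for _ in range(n):
        out.append(-math.cos(2.0 * math.pi * distort_phase(phase, knee)))
        phase = (phase + step) % 1.0
    return out


plain = pd_oscillator(440.0, 44100.0, 64, knee=0.5)   # undistorted sinusoid
bright = pd_oscillator(440.0, 44100.0, 64, knee=0.1)  # steep first half-cycle
```

Moving the knee away from 0.5 speeds up one half of the cycle and slows down the other, which adds harmonics to the output while the fundamental period stays unchanged.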
A Modular Percussion Synthesis Environment
The construction of new virtual instruments is one long-term goal of physical modeling synthesis. A common strategy across physical modeling methodologies, including lumped network models, modal synthesis and scattering-based methods, is to provide a canonical set of basic elements and allow the user to build an instrument via certain specified connection rules. Such an environment may be described as modular. Percussion instruments form a good test-bed for the development of modular synthesis techniques: the basic components are bars and plates, which may be accompanied by connection elements with a nonlinear character. Modular synthesis has been approached using all of the techniques mentioned above, but time-domain finite difference schemes are an alternative that allows many problems inherent in those methods, including computability, large memory and precomputation requirements, and lack of extensibility to more complex systems, to be circumvented. One such network model is presented here along with the associated difference schemes, followed by a discussion of implementation details, the issues of excitation and output, and a description of various instrument configurations. The article concludes with a presentation of simulation results, generated in the Matlab prototyping language.
Vocoders: we have Loved, Loathed and Laughed at
Five Variations on a Feedback Theme
This is a study of a set of feedback amplitude modulation oscillator equations. It is based on a very simple and inexpensive algorithm that is capable of generating a complex spectrum from a sinusoidal input. We examine the original algorithm and five variations on it, discussing the details of each synthesis method. These include the addition of extra delay terms, waveshaping of the feedback signal, further heterodyning and increasing the loop delay. In addition, we provide a software implementation of these algorithms as a practical example of their application and as a demonstration of their potential for synthesis instrument design.
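A minimal sketch of a feedback amplitude modulation oscillator follows. It assumes the basic recursion y[n] = cos(w n) (1 + beta y[n-1]), where the output, delayed by one sample, modulates the amplitude of a cosine carrier. This form and the parameter name `beta` are assumptions, since the abstract does not state the equations, and none of the five variations is implemented.

```python
import math


def fbam(freq, sample_rate, n, beta):
    """n samples of y[k] = cos(w * k) * (1 + beta * y[k - 1]), with y[-1] = 0."""
    w = 2.0 * math.pi * freq / sample_rate
    out = []
    prev = 0.0
    for k in range(n):
        y = math.cos(w * k) * (1.0 + beta * prev)
        out.append(y)
        prev = y
    return out


carrier_only = fbam(440.0, 44100.0, 32, beta=0.0)  # feedback off: plain cosine
with_feedback = fbam(440.0, 44100.0, 32, beta=0.5)
```

With `beta` set to zero the recursion reduces to the cosine carrier, and raising it feeds more of the previous output back into the amplitude, so a single parameter controls how far the spectrum departs from a pure sinusoid.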
Physically-based synthesis of nonlinear circular membranes
This paper investigates the properties of a recently proposed physical model of nonlinear tension modulation effects in a struck circular membrane. The model simulates dynamic variations of tension (and consequently of partial frequencies) due to membrane stretching during oscillation, and is based on a more general theory of geometric nonlinearities in elastic plates. The ability of the nonlinear membrane model to simulate real-world acoustic phenomena is assessed here through resynthesis of recorded membrane (rototom) sounds. The effects of air loading and tension modulation in the recorded sounds are analyzed, and model parameters for resynthesis are estimated accordingly. The examples reported in the paper show that the model is able to accurately simulate the analyzed rototom sounds.