Generalised Prior Subspace Analysis for Polyphonic Pitch Transcription
A reformulation of Prior Subspace Analysis (PSA) is presented, which restates the problem as that of fitting an undercomplete signal dictionary to a spectrogram. Further, a generalisation of PSA is derived which allows the transcription of polyphonic pitched instruments. This involves translating a single frequency prior subspace of a note to approximate other notes, which removes the need for a separate basis function for each note played by an instrument. Examples then demonstrate the utility of the generalised PSA algorithm for polyphonic pitch transcription.
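As a rough illustration of the translation idea, the sketch below builds one harmonic template on a log-frequency axis, shifts it to stand in for other notes, and fits the resulting undercomplete dictionary to a spectrogram by ordinary least squares. The 12-bins-per-octave axis, the 1/h partial amplitudes, and all function names are assumptions made for illustration; this is not the algorithm from the paper.

```python
import numpy as np

BINS_PER_OCTAVE = 12   # assumed log-frequency resolution (one bin per semitone)
N_HARMONICS = 6        # assumed number of partials in the note template
N_NOTES = 24           # number of translated copies (two octaves)

def note_template():
    """Hypothetical prior for one note: partial h has amplitude 1/h."""
    offsets = [int(round(BINS_PER_OCTAVE * np.log2(h)))
               for h in range(1, N_HARMONICS + 1)]
    template = np.zeros(max(offsets) + 1)
    for h, off in enumerate(offsets, start=1):
        template[off] += 1.0 / h
    return template

def shifted_dictionary(template, n_notes):
    """Column k is the single template translated up by k log-frequency bins."""
    W = np.zeros((len(template) + n_notes - 1, n_notes))
    for k in range(n_notes):
        W[k:k + len(template), k] = template
    return W

def fit_activations(W, spectrogram):
    """Least-squares note activations, clipped to be non-negative."""
    H, *_ = np.linalg.lstsq(W, spectrogram, rcond=None)
    return np.clip(H, 0.0, None)

W = shifted_dictionary(note_template(), N_NOTES)

# Synthetic two-frame "spectrogram": notes 0 and 7 together, then note 4 alone.
truth = np.zeros((N_NOTES, 2))
truth[[0, 7], 0] = 1.0
truth[4, 1] = 0.5
H = fit_activations(W, W @ truth)

# Notes whose activation exceeds an arbitrary threshold in each frame.
active = [np.flatnonzero(H[:, t] > 0.1).tolist() for t in range(2)]
print(active)
```

Because the dictionary has fewer columns (notes) than rows (frequency bins), the least-squares fit is well determined, which is the practical appeal of the undercomplete formulation.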
Transforming Singing Voice Expression - The Sweetness Effect
We propose a real-time system aimed at music production in the context of vocal recordings. The aim is to transform the singer’s voice characteristics in order to achieve a sweet-sounding voice. The system combines three transformations: SubHarmonic Component Reduction (reduction of sub-harmonics, which are found in voices with vocal disorders), Vocal Tract Excitation Modification (to achieve a change in loudness) and Intonation Modification (to achieve smoother transitions in pitch). The transformations are performed in the frequency domain, based on an enhanced phase-locked vocoder. The Expression Adaptive Control estimates the amount of vocal disorder present in the singer’s voice. This estimate automatically controls the amount of SubHarmonic Component Reduction to ensure a natural-sounding transformation.
Modal analysis of impact sounds with ESPRIT in Gabor transforms
Identifying the acoustical modes of a resonant object can be achieved by expanding a recorded impact sound as a sum of damped sinusoids. High-resolution methods, e.g. the ESPRIT algorithm, can be used, but the length of the signal often requires a sub-band decomposition. Thanks to sub-sampling, this ensures that the signal is analysed over a significant duration, so that the damping coefficient of each mode is estimated properly and no frequency band is neglected. In this article, we show that the ESPRIT algorithm can be efficiently applied in a Gabor transform (similar to a sub-sampled short-time Fourier transform). The combined use of a time-frequency transform and a high-resolution analysis allows selective and sharp analysis over selected areas of the time-frequency plane. Finally, we show that this method produces high-quality resynthesized impact sounds which are perceptually very close to the original sounds.
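For readers unfamiliar with ESPRIT, the following is a minimal full-band sketch of the textbook algorithm applied to a synthetic sum of damped sinusoids. It leaves out the Gabor-transform sub-band stage that the paper describes; the function name, the Hankel window size, and the test signal are assumptions chosen for illustration.

```python
import numpy as np

def esprit(x, order, rows=None):
    """Estimate `order` complex poles z_k of x[n] = sum_k a_k * z_k**n."""
    x = np.asarray(x, dtype=complex)
    rows = rows or len(x) // 2
    cols = len(x) - rows + 1
    # Hankel data matrix: hankel[i, j] = x[i + j]
    hankel = np.array([x[j:j + rows] for j in range(cols)]).T
    # Signal subspace: leading left singular vectors
    u, _, _ = np.linalg.svd(hankel, full_matrices=False)
    us = u[:, :order]
    # Rotational invariance: us[1:] ~= us[:-1] @ phi, eig(phi) = poles
    phi = np.linalg.pinv(us[:-1]) @ us[1:]
    return np.linalg.eigvals(phi)

# Synthetic "impact sound": two damped modes (made-up values).
fs = 8000.0
n = np.arange(512)
modes = [(440.0, 8.0, 1.0), (1230.0, 20.0, 0.6)]  # (Hz, damping in 1/s, amplitude)
x = sum(a * np.exp(-d * n / fs) * np.cos(2 * np.pi * f * n / fs)
        for f, d, a in modes)

# A real damped sinusoid contributes a conjugate pair of poles.
poles = esprit(x, order=2 * len(modes))
poles = poles[np.imag(poles) > 0]              # keep positive frequencies
poles = poles[np.argsort(np.angle(poles))]
freqs_hz = np.angle(poles) * fs / (2 * np.pi)  # modal frequencies
dampings = -np.log(np.abs(poles)) * fs         # damping coefficients in 1/s
print(freqs_hz, dampings)
```

The SVD of the Hankel matrix grows quickly with signal length, which is one reason long recordings motivate the sub-sampled, sub-band analysis described in the abstract.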
Object Coding of Harmonic Sounds Using Sparse and Structured Representations
Object coding allows audio compression at extremely low bit-rates, provided that the objects are correctly modelled and identified. In this study, a codec has been implemented on the basis of a sparse decomposition of the signal with a dictionary of Instrument-Specific Harmonic atoms. The decomposition algorithm extracts “molecules”, i.e. linear combinations of such atoms, which are treated as note-like objects and can therefore be coded efficiently using note-specific strategies. For signals containing only harmonic sounds, the resulting bit-rates are very low, typically around 2 kbit/s, and informal listening tests against a standard sinusoidal coder show promising performance.
Audio-Based Gesture Extraction on the ESitar Controller
Using sensors to extract gestural information for control parameters of digital audio effects is common practice. There has also been research using machine learning techniques to classify specific gestures based on audio feature analysis. In this paper, we describe our experiments in training a computer to map audio-based features onto values that resemble sensor data, in order to potentially eliminate the need for sensors. Specifically, we present experiments using the ESitar, a digitally enhanced, sensor-based controller modeled after the traditional North Indian sitar. We use multivariate linear regression to map continuous audio features to continuous gestural data.
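The regression step can be sketched with ordinary least squares. The data below are synthetic, and the feature and gesture dimensions are placeholders rather than the ESitar's actual features or sensors.

```python
import numpy as np

def add_bias(features):
    """Append a constant column so the fit includes an intercept."""
    return np.hstack([features, np.ones((len(features), 1))])

def fit_linear_map(features, gestures):
    """Least-squares weights W such that add_bias(features) @ W ~= gestures."""
    W, *_ = np.linalg.lstsq(add_bias(features), gestures, rcond=None)
    return W

def predict(W, features):
    return add_bias(features) @ W

# Synthetic stand-in data: 3 audio features per frame, 2 gesture channels.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
W_true = rng.normal(size=(4, 2))  # 3 features + bias -> 2 outputs
gestures = add_bias(features) @ W_true + 0.01 * rng.normal(size=(200, 2))

# Train on the first 150 frames, evaluate on the remaining 50.
W = fit_linear_map(features[:150], gestures[:150])
rmse = np.sqrt(np.mean((predict(W, features[150:]) - gestures[150:]) ** 2))
print(rmse)
```

Holding out frames for evaluation, as done here, is a simple way to check whether the audio features predict the sensor values on unseen material.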
Inverting dynamics compression with minimal side information
Dynamics processing is a widespread technique at both the music production and diffusion stages. In particular, dynamic compression is often used so that the “average” listener can best enjoy the music. However, this may lead to excessive compression, especially for listeners in quiet listening conditions. This paper presents estimates of the amount of extra data needed to invert the effects of such non-linear processing, using simple blind identification techniques. We present two simple test cases: the first where perfect reconstruction is required, and the second where the ancillary data rate is constrained, leading to an approximate reconstruction.
A General Use Circuit for Audio Signal Distortion Exploiting Any Non-Linear Electron Device
In this paper, we propose the use of the transimpedance amplifier configuration as a simple generic circuit for electron device-based audio distortion. The goal is to take advantage of the non-linearities in the transfer curves of any device, such as a diode, JFET, or MOSFET, and to control the level and type of harmonic distortion only through bias voltages and signal amplitude. An nMOSFET is taken as a case study, revealing a rich dependence of the generated harmonics on the region of operation (linear to saturation, and weak to strong inversion). A continuous, analytical Lambert-W based model was used for simulations of harmonic distortion, which were verified through measurements.
A Method of Generic Programming for High Performance DSP
This paper presents some key concepts for a new just-in-time programming language designed for high-performance DSP. The language is primarily intended to implement an updated version of PWGLSynth, the synthesis extension to the visual musical programming environment PWGL. However, the system is suitable for use as a backend for any DSP platform. A flow control mechanism based on generic programming, polymorphism and functional programming practices is presented, which we believe is much better suited to visual programming than the traditional loop constructs found in textual languages.
A Fast Mellin Transform with Applications in DAFx
Many digital audio effects rely on transformations performed in the Fourier-transformed (frequency) domain. However, other transforms and domains exist and could be exploited. We propose to use the Mellin transform for a class of sound transformations. We present a fast implementation of the Mellin transform (more precisely, a Fast Scale Transform), and we provide some examples of how it could be used in digital audio effects.
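For reference, the scale transform is commonly written in the following textbook form. The symbols f, c and τ are notation chosen here and are not necessarily those used in the paper.

```latex
D_f(c) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} f(t)\, t^{-jc - 1/2}\, dt
       = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(e^{\tau})\, e^{\tau/2}\, e^{-jc\tau}\, d\tau ,
\qquad t = e^{\tau}.
```

The second form is an ordinary Fourier transform of the exponentially resampled signal weighted by e^{τ/2}, which is what makes an FFT-based implementation plausible. For a time-scaled signal g(t) = √a f(at) with a > 0, the substitution u = at gives D_g(c) = a^{jc} D_f(c), so the magnitude of the transform is unchanged by scaling.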
Audio Analysis, Visualization, and Transformation with the Matching Pursuit Algorithm
The matching pursuit (MP) algorithm decomposes audio data into a collection of thousands of constituent sound particles, or gaborets. These particles correspond to the “quantum” or granular model of sound posited by Dennis Gabor. This robust, high-resolution analysis technique creates new possibilities for sound visualization and transformation. This paper presents an account of a first round of experiments with MP-based visualization and transformation techniques.
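The greedy decomposition can be sketched in a few lines. The tiny dictionary of Gaussian-windowed cosines below, its parameter grid, and the function names are assumptions for illustration; practical MP analyses use far larger dictionaries and many more iterations.

```python
import numpy as np

def gabor_atom(n, center, freq, width=16.0):
    """Unit-norm Gaussian-windowed cosine (freq in cycles per sample)."""
    t = np.arange(n) - center
    atom = np.exp(-0.5 * (t / width) ** 2) * np.cos(2 * np.pi * freq * t)
    return atom / np.linalg.norm(atom)

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP: repeatedly subtract the best-correlated unit-norm atom."""
    residual = np.array(signal, dtype=float)
    picks = []
    for _ in range(n_iter):
        corr = dictionary @ residual          # inner product with every atom
        k = int(np.argmax(np.abs(corr)))      # best-matching atom
        picks.append((k, float(corr[k])))
        residual -= corr[k] * dictionary[k]   # remove its contribution
    return picks, residual

N = 256
# Hypothetical parameter grid: (center sample, frequency) for each atom.
params = [(c, f) for c in (64, 128, 192) for f in (0.05, 0.1, 0.2)]
dictionary = np.array([gabor_atom(N, c, f) for c, f in params])

# Test signal made of two dictionary atoms that barely overlap in time.
signal = 2.0 * gabor_atom(N, 64, 0.1) - 0.5 * gabor_atom(N, 192, 0.2)
picks, residual = matching_pursuit(signal, dictionary, n_iter=2)
found = [(params[k], round(coef, 3)) for k, coef in picks]
print(found)
```

Each pick is a (time, frequency, amplitude) particle, so the list of picks can be plotted directly as a time-frequency visualization or edited before resynthesis.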