A Comparison of Analysis and Resynthesis Methods for Directional Segmentation of Stereo Audio
A comparison of analysis and resynthesis methods for use with a system for dividing time-coincident stereo audio signals into directional segments is presented. The purpose of such a system is to give greater flexibility in the presentation of spatial information when two-channel audio is reproduced. Example applications include up-mixing and converting amplitude-based panning to time-delay-based panning. The methods compared include the dual-tree complex wavelet transform and wavelet packet decomposition with best-basis search. The directional segmentation system and the analysis and resynthesis methods are briefly described, with reference to the relevant underlying theory. Figures of merit are presented for each method applied to three stereo mixtures of contrasting material, and the subjective quality of the output (with links to all audio examples) is discussed.
FAUST-STK: a set of linear and nonlinear physical models for the FAUST programming language
The FAUST Synthesis ToolKit is a set of virtual musical instruments written in the FAUST programming language and based on waveguide algorithms and modal synthesis. Most of them were inspired by instruments implemented in the Synthesis ToolKit (STK) and the program SynthBuilder. Part of our attention has been focused on the pedagogical aspect of the implemented objects: we tried to make the FAUST code of each object as optimized and as expressive as possible. Some of the instruments in the FAUST-STK use nonlinear allpass filters to create interesting new behaviors. A few of them were also modified to use gesture data to control the performance; this kind of use is demonstrated in Pure Data. Finally, the results of some performance tests of the generated C++ code are presented.
Analysis and Simulation of an Analog Guitar Compressor
The digital modeling of guitar effect units requires a high physical similarity between the model and the analog reference. The famous MXR DynaComp is used to sustain the guitar sound. In this work, its complex circuit is analyzed and simulated using state-space representations. The equations for calculating important parameters within the circuit are derived in detail, and a mathematical description of the operational transconductance amplifier is given. In addition, the digital model is compared to the original unit.
A Single-Azimuth Pinna-Related Transfer Function Database
Pinna-Related Transfer Functions (PRTFs) reflect the modifications undergone by an acoustic signal as it interacts with the listener’s outer ear. These can be seen as the pinna contribution to the Head-Related Transfer Function (HRTF). This paper describes a database of PRTFs collected from measurements performed at the Department of Signal Processing and Acoustics, Aalto University. Median-plane PRTFs at 61 different elevation angles from 25 subjects are included. Such data collection falls into a broader project in which evidence of the correspondence between PRTF features and anthropometry is being investigated.
FAUST Architectures Design and OSC Support
FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. It consists of a compiler that translates a FAUST program into an equivalent C++ program, taking care to generate the most efficient code. The FAUST environment also includes various architecture files, which provide the glue between the FAUST C++ output and the host audio and GUI environments. The combination of architecture files and FAUST output gives ready-to-run applications or plugins for various systems, which makes a single FAUST specification available on different platforms and environments at no additional cost. This article presents the overall design of the architecture files and gives more details on the recent OSC architecture.
A Grammar for Analyzing and Optimizing Audio Graphs
This paper presents a formal grammar for discussing data flows and dependencies in audio processing graphs. A graph is a highly general representation of an algorithm, applicable to most DSP processes. To demonstrate and exercise the grammar, three central problems in audio graph processing are examined: the grammar is used to exhaustively analyze the scheduling of the graph's processing nodes, and to examine automatic parallelization and signal rate inference. The grammar is presented in terms of mathematical set theory, independent of and thus applicable to any conceivable software platform.
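As one concrete instance of the scheduling problem mentioned in this abstract, a processing order that respects data dependencies can be obtained by topological sorting. The sketch below uses Kahn's algorithm, a standard technique and not the grammar proposed in the paper; the dictionary format and the node names (`osc`, `lfo`, `filter`, `out`) are hypothetical.

```python
from collections import deque


def schedule(graph):
    """Return a processing order in which every node follows all of its inputs.

    `graph` maps each node name to the list of nodes it feeds
    (an assumed, illustrative format).
    """
    # Count incoming connections for every node that appears in the graph.
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1

    # Nodes without inputs can be processed first (sorted for determinism).
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in graph.get(node, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    # Any node left unscheduled is part of a dependency cycle.
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle and cannot be scheduled")
    return order


example = {"osc": ["filter"], "lfo": ["filter"], "filter": ["out"], "out": []}
order = schedule(example)
print(order)
```

Nodes that become ready at the same time (here `lfo` and `osc`) have no mutual dependency, which is also the kind of information an automatic parallelization step could exploit.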
State of the Art in Sound Texture Synthesis
The synthesis of sound textures, such as rain, wind, or crowds, is an important application for cinema, multimedia creation, games and installations. However, despite the clearly defined requirements of naturalness and flexibility, no automatic method has yet found widespread use. After clarifying the definition, terminology, and usages of sound texture synthesis, we will give an overview of the many existing methods and approaches, and the few available software implementations, and classify them by the synthesis model they are based on, such as subtractive or additive synthesis, granular synthesis, corpus-based concatenative synthesis, wavelets, or physical modeling. Additionally, an overview is given of analysis methods used for sound texture synthesis, such as segmentation, statistical modeling, timbral analysis, and modeling of transitions.
Vector Phaseshaping Synthesis
This paper introduces the Vector Phaseshaping (VPS) synthesis technique, which extends the classic Phase Distortion method by providing flexible means to distort the phase of a sinusoidal oscillator. This is achieved by describing the phase distortion function using one or more breakpoint vectors, which are then manipulated in two dimensions to produce waveshape modulation at control and audio rates. The synthesis parameters and their effects are explained, and the spectral description of the method is derived. Certain synthesis parameter combinations result in audible aliasing, which can be reduced with a novel aliasing suppression algorithm described in the paper. The extension is capable of producing a variety of interesting harmonic and inharmonic spectra, including, for instance, formant peaks, while the two-dimensional form of the control parameters is expressive and well suited for interactive applications.
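The idea of shaping an oscillator's phase with a breakpoint can be illustrated with a minimal sketch. It assumes a single breakpoint `(d, v)`, a piecewise-linear phase map through `(0, 0)`, `(d, v)` and `(1, 1)`, and a negative-cosine readout; the function names and default values are illustrative, and the sketch includes neither multiple breakpoints nor the aliasing suppression described in the paper.

```python
import math


def phaseshape(x, d, v):
    """Piecewise-linear phase map through (0, 0), (d, v) and (1, 1).

    x is the linear phase in [0, 1); d must satisfy 0 < d < 1.
    """
    if x <= d:
        return v * x / d
    return v + (1.0 - v) * (x - d) / (1.0 - d)


def vps_oscillator(freq, sample_rate, num_samples, d=0.5, v=0.5):
    """Generate samples of a cosine oscillator read through the shaped phase."""
    out = []
    phase = 0.0
    increment = freq / sample_rate
    for _ in range(num_samples):
        shaped = phaseshape(phase, d, v)
        out.append(-math.cos(2.0 * math.pi * shaped))
        phase = (phase + increment) % 1.0  # wrap the linear phase to [0, 1)
    return out


# With d = v = 0.5 the phase map is the identity, so the output is a plain cosine.
plain = vps_oscillator(100.0, 1000.0, 8)
# Moving the breakpoint bends the phase and changes the waveshape.
shaped = vps_oscillator(100.0, 1000.0, 8, d=0.2, v=0.5)
```

Under these assumptions, moving `d` horizontally and `v` vertically over time corresponds to the two-dimensional control mentioned in the abstract.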
Non-Parallel Singing-Voice Conversion by Phoneme-based Mapping and Covariance Approximation
In this work we present an approach to perform voice timbre conversion from unpaired data. Voice conversion strategies are commonly restricted to the use of parallel speech corpora. Our proposal is based on two main concepts: the modeling of the timbre space based on phonetic information, and a simple approximation of the cross-covariance of source-target features. Experimental results obtained with this strategy on singing-voice data from the VOCALOID synthesizer showed a conversion performance comparable to that obtained by Maximum-Likelihood, thereby allowing us to achieve singer-timbre conversion from real singing performances.
Application of non-negative matrix factorization to signal-adaptive audio effects
This paper proposes novel audio effects based on manipulating an audio signal in a representation domain provided by non-negative matrix factorization (NMF). Critical-band magnitude spectrograms Y of sounds are first factorized into a product of two lower-rank matrices so that Y ≈ BG. The parameter matrices B and G are then processed in order to achieve the desired effect. Three classes of effects were investigated: 1) dynamic range compression (or expansion) of the component spectra or gains, 2) effects based on rank-ordering the components (columns of B and the corresponding rows of G) according to acoustic features extracted from them, and then weighting each component according to its rank, and 3) distortion effects based on controlling the number of components (and thus the reconstruction error) in the above linear approximation. The subjective quality of the effects was assessed in a listening test.
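The factorization Y ≈ BG and the first class of effects can be sketched as follows. This is a minimal illustration assuming NumPy, the standard multiplicative updates for the Euclidean cost, and a simple power-law compression of the gains; the abstract does not state which cost function or compression law the paper uses, and the function names, parameters and synthetic data are hypothetical.

```python
import numpy as np


def nmf(Y, rank, iterations=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix Y into B (spectra) and G (gains), Y ≈ B @ G."""
    rng = np.random.default_rng(seed)
    B = rng.random((Y.shape[0], rank)) + eps
    G = rng.random((rank, Y.shape[1])) + eps
    for _ in range(iterations):
        # Multiplicative updates keep both factors non-negative.
        G *= (B.T @ Y) / (B.T @ B @ G + eps)
        B *= (Y @ G.T) / (B @ G @ G.T + eps)
    return B, G


def compress_gains(G, exponent=0.5):
    """Power-law compression of each gain row, preserving the row's peak value."""
    peak = G.max(axis=1, keepdims=True) + 1e-12
    return peak * (G / peak) ** exponent


# Synthetic stand-in for a magnitude spectrogram (8 bands, 20 frames).
rng = np.random.default_rng(1)
Y = rng.random((8, 2)) @ rng.random((2, 20))

B, G = nmf(Y, rank=2)
Y_effect = B @ compress_gains(G)  # processed spectrogram after gain compression
```

An exponent below 1 raises the quiet portions of each component's gain relative to its peak, while an exponent above 1 would act as an expander. Reducing `rank` coarsens the approximation, which is the mechanism behind the third class of effects.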