Sound Morphing by Audio Descriptors and Parameter Interpolation
We present a strategy for static morphing that relies on the sophisticated interpolation of the parameters of the signal model and the independent control of high-level audio features. The source and target signals are decomposed into deterministic, quasi-deterministic and stochastic parts, and are processed separately according to sinusoidal modeling and spectral envelope estimation. We gain further intuitive control over the morphing process by altering the interpolated spectrum according to target values of audio descriptors through an optimization process. The proposed approach leads to convincing morphing results in the case of sustained or percussive, harmonic and inharmonic sounds of possibly different durations.
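The core of such a scheme, interpolating matched partial parameters between a source and a target sound, can be sketched as follows (illustrative Python; the function name and the choice of geometric frequency interpolation and dB-domain amplitude interpolation are our assumptions, not necessarily the paper's):

```python
import numpy as np

def morph_partials(src_freqs, src_amps, tgt_freqs, tgt_amps, alpha):
    """Interpolate matched sinusoidal partials between source and target.
    alpha = 0 gives the source, alpha = 1 the target.  Frequencies are
    interpolated geometrically (perceptually natural for pitch) and
    amplitudes linearly in dB -- both choices are assumptions here."""
    src_freqs = np.asarray(src_freqs, float)
    tgt_freqs = np.asarray(tgt_freqs, float)
    src_amps = np.asarray(src_amps, float)
    tgt_amps = np.asarray(tgt_amps, float)
    freqs = src_freqs ** (1.0 - alpha) * tgt_freqs ** alpha
    amps_db = (1.0 - alpha) * 20.0 * np.log10(src_amps) \
              + alpha * 20.0 * np.log10(tgt_amps)
    return freqs, 10.0 ** (amps_db / 20.0)
```

The descriptor-driven step of the paper would then further adjust the interpolated spectrum by optimization toward target audio-descriptor values, which is not reproduced in this sketch.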
Real-Time Force-Based Sound Synthesis Using GPU Parallel Computing
In this paper we propose a real-time sound synthesis method that uses a force-based algorithm to control sinusoidal partials. The method can generate a variety of sounds, from musical tones to noise, with three kinds of intuitive parameters: attractive force, repulsive force, and resistance. However, implementing this method in real time is difficult because of the large volume of calculations required to manipulate thousands of partials or more. To resolve these difficulties, we utilize GPU-based parallel computing together with precalculation. Since GPUs enable massive simultaneous parallel processing, they make the synthesis method considerably more efficient. Furthermore, by using familiar musical features, including MIDI input for playing the synthesizer and ADSR envelope generators for time-varying parameters, an intuitive controller for the synthesis method is achieved.
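A minimal sketch of one force-based update step for partial frequencies, assuming an explicit Euler integrator and hypothetical attraction, repulsion, and resistance terms (the paper's exact force laws are not specified here):

```python
import numpy as np

def step_partials(freqs, vels, anchors, attract, repulse, resist, dt=1e-3):
    """One explicit Euler step of a force-based update of partial
    frequencies: attraction toward per-partial anchor frequencies,
    inverse-distance repulsion between partials, and velocity damping.
    All names and force laws here are illustrative assumptions."""
    freqs = np.asarray(freqs, float)
    vels = np.asarray(vels, float)
    anchors = np.asarray(anchors, float)
    force = attract * (anchors - freqs)            # attractive force
    diff = freqs[:, None] - freqs[None, :]
    np.fill_diagonal(diff, np.inf)                 # ignore self-interaction
    force += repulse * np.sum(np.sign(diff) / np.abs(diff), axis=1)
    vels = vels + dt * (force - resist * vels)     # resistance damps motion
    return freqs + dt * vels, vels
```

Because the update is independent per partial apart from one pairwise term, it maps naturally onto the one-thread-per-partial GPU parallelism the paper describes.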
A Physical String Model with Adjustable Boundary Conditions
The vibration of strings in musical instruments depends not only on their geometry and material but also on how they are fixed at their ends. In physical terms this is described by impedance boundary conditions. This contribution presents a functional transformation model for a vibrating string coupled to an external boundary circuit. Delay-free loops in the synthesis algorithm are avoided by a state-space formulation. The value of the boundary impedance can be adjusted without altering the core synthesis algorithm.
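The way a state-space formulation avoids delay-free loops can be illustrated for a single damped string mode (a toy sketch under our own discretisation assumptions, not the paper's functional transformation model):

```python
import numpy as np

def make_mode_statespace(f0, decay, fs):
    """2x2 state-space update for one damped string mode: a rotation by
    the mode's angular frequency per sample, scaled by a per-sample
    decay factor.  The new state depends only on the previous state,
    so no delay-free loop arises in the update."""
    w = 2.0 * np.pi * f0 / fs
    r = np.exp(-decay / fs)
    return r * np.array([[np.cos(w), -np.sin(w)],
                         [np.sin(w),  np.cos(w)]])

def run_mode(A, x0, n):
    """Iterate the state-space update, collecting the first state entry."""
    x = np.array(x0, float)
    out = []
    for _ in range(n):
        out.append(x[0])
        x = A @ x
    return out
```

Changing the boundary impedance would amount to changing the entries of the update matrix while the structure of the iteration, and hence the core synthesis algorithm, stays fixed.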
A Modal Approach to the Numerical Simulation of a String Vibrating Against an Obstacle: Applications to Sound Synthesis
A number of musical instruments (electric basses, tanpuras, sitars...) have a particular timbre due to the contact between a vibrating string and an obstacle. In order to simulate the motion of such a string with the purpose of sound synthesis, various technical issues have to be resolved. First, the contact phenomenon, inherently nonlinear and producing high frequency components, must be described in a numerical manner that ensures stability. Second, as a key ingredient for sound perception, a fine-grained frequency-dependent description of losses is necessary. In this study, a new conservative scheme based on a modal representation of the displacement is presented, allowing the simulation of a stiff, damped string vibrating against an obstacle with an arbitrary geometry. In this context, damping parameters together with eigenfrequencies of the system can be adjusted individually, allowing for complete control over loss characteristics. Two cases are then numerically investigated: a point obstacle located in the vicinity of the boundary, mimicking the sound of the tanpura, and then a parabolic obstacle for the sound synthesis of the sitar.
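The kind of nonlinearity involved can be illustrated by a toy version of the problem: a single mode vibrating against a rigid barrier, with the contact modelled as a one-sided penalty stiffness (a simplification we introduce for illustration; the paper uses a conservative modal scheme, not a penalty method):

```python
import numpy as np

def mass_barrier_sim(f0, barrier, k_contact, fs, n, u0=1.0):
    """Toy single-mode oscillator vibrating against a rigid barrier at
    u = barrier (barrier < u0).  The contact force is a one-sided
    penalty stiffness, active only on penetration."""
    dt = 1.0 / fs
    w2 = (2.0 * np.pi * f0) ** 2
    u, v = u0, 0.0
    out = []
    for _ in range(n):
        pen = max(barrier - u, 0.0)        # penetration depth below barrier
        a = -w2 * u + k_contact * pen      # linear restoring + contact force
        v += dt * a                        # symplectic Euler: velocity first
        u += dt * v
        out.append(u)
    return out
```

With a stiff contact, the displacement is clipped near the barrier instead of completing its free sinusoidal excursion, which is the source of the high-frequency components the abstract mentions.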
A Real-Time Synthesis Oriented Tanpura Model
Physics-based synthesis of tanpura drones requires accurate simulation of stiff, lossy string vibrations while incorporating sustained contact with the bridge and a cotton thread. Several challenges arise from this when seeking efficient and stable algorithms for real-time sound synthesis. The approach proposed here combines modal expansion of the string dynamics with strategic simplifications of the string-bridge and string-thread contact, resulting in an efficient and provably stable time-stepping scheme with exact modal parameters. Attention is also given to the physical characterisation of the system, including string damping behaviour, body radiation characteristics, and the determination of appropriate contact parameters. Simulation results are presented exemplifying the key features of the model.
Assessing Applause Density Perception Using Synthesized Layered Applause Signals
Applause signals are the sound of many people gathered in one place clapping their hands, and they are a prominent part of live music recordings. Usually, applause is recorded alongside the live performance and serves to evoke a feeling of participation in a real event in the listener. Applause signals can be very different in character, depending on the audience size, location, event type, and many other factors. To characterize different types of applause signals, the attribute of ‘density’ appears to be suitable. This paper reports first investigations into whether density is an adequate perceptual attribute for describing different types of applause. We describe the design of a listening test assessing density and the synthesis of suitable, strictly controlled stimuli for the test. Finally, we provide results, on both strictly controlled and naturally recorded stimuli, that confirm the suitability of the attribute density for describing important aspects of the perception of different applause signal characteristics.
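Strictly controlled, layered applause stimuli with an adjustable density can be sketched by summing Poisson-distributed trains of short noise bursts, one train per virtual clapper (illustrative only; the parameter names and the Poisson clap model are our assumptions, not the paper's stimulus design):

```python
import numpy as np

def synth_applause(n_clappers, rate_hz, dur_s, fs=16000, seed=0):
    """Layered applause sketch: each virtual clapper emits a Poisson
    train of 10 ms decaying noise bursts; all layers are summed.
    Density is controlled via the number of clappers and the
    per-clapper clap rate."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    out = np.zeros(n)
    burst_len = int(0.01 * fs)                    # 10 ms clap burst
    env = np.exp(-np.linspace(0.0, 6.0, burst_len))  # decaying envelope
    for _ in range(n_clappers):
        t = 0.0
        while True:
            t += rng.exponential(1.0 / rate_hz)   # Poisson inter-clap gap
            i = int(t * fs)
            if i + burst_len > n:
                break
            out[i:i + burst_len] += env * rng.standard_normal(burst_len)
    return out / max(1, np.sqrt(n_clappers))
```

Sweeping `n_clappers` or `rate_hz` then yields a stimulus continuum from sparse, individually audible claps to a dense applause texture.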
Time Domain Aspects of Artifact Reduction in Positioning Algorithm using Differential Head-Related Transfer Function
This paper focuses on the consequences of artifact reduction in a virtual sound source positioning method based on the Differential Head-Related Transfer Function (DHRTF). Previous experiments showed the spatial performance of this experimental method to be very promising. However, under specific circumstances, artifacts may occur in the virtually positioned sound. Effective methods for artifact reduction were introduced previously. This work examines the impact of the reduction algorithm in the time domain in order to understand the phenomena occurring in the process. The artifacts are caused by narrow-band peaks in the DHRTF magnitude, which give the impulse response a periodic character in the time domain.
Detection of Clicks in Analog Records Using Peripheral-Ear Model
This study describes a system that detects clicks (audible degradations) in sound. The system is based on a computational model of the peripheral ear. In order to train and verify the system, a listening test was conducted using 89 short samples of analog (vinyl) records. The samples contained singing voice, music (rock’n’roll), or both. We randomly chose 30 samples from the set and used them to train the system; then we tested the system on the 59 remaining samples. The system performance, expressed as a percentage of correct detections (78.1%) and false alarms (3.9%), is promising.
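For contrast with the peripheral-ear model, a naive signal-domain click detector can be written in a few lines (this is a baseline sketch of our own, not the paper's method):

```python
import numpy as np

def detect_clicks(x, threshold=5.0):
    """Flag samples whose absolute second difference exceeds `threshold`
    times the signal's median absolute second difference.  Clicks show
    up as isolated discontinuities far above the smooth-signal level."""
    d2 = np.abs(np.diff(np.asarray(x, float), n=2))
    med = np.median(d2) + 1e-12       # robust noise-floor estimate
    return np.flatnonzero(d2 > threshold * med) + 1  # shift back to x indices
```

Such amplitude-based detectors flag any discontinuity, whereas the ear-model approach of the paper aims to flag only degradations that are actually audible.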
Perceptual Audio Source Culling for Virtual Environments
Existing game engines and virtual reality software use various techniques to render spatial audio. One such technique, binaural synthesis, is achieved through the use of head-related transfer functions in conjunction with artificial reverberators. For virtual environments that contain a large number of concurrent sound sources, binaural synthesis is computationally costly. The work presented in this paper develops a methodology that improves overall performance by culling inaudible and perceptually less prominent sound sources. The proposed algorithm is benchmarked and compared with a distance-based volumetric culling method. A subjective evaluation of the perceptual performance of the proposed algorithm for acoustic scenes with different compositions is also provided.
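A minimal culling pass can use distance-attenuated level as a crude prominence proxy, dropping inaudible sources and keeping only the loudest within a rendering budget (illustrative sketch; the paper's perceptual criterion is more sophisticated than gain over distance):

```python
import numpy as np

def cull_sources(positions, gains, listener, budget, floor_db=-60.0):
    """Keep at most `budget` source indices: distance-attenuate each
    source, drop those below an audibility floor, and rank the rest
    by level.  Level here is a crude proxy (gain / distance), not a
    perceptual loudness model."""
    positions = np.asarray(positions, float)
    gains = np.asarray(gains, float)
    dist = np.linalg.norm(positions - np.asarray(listener, float), axis=1)
    level_db = 20.0 * np.log10(gains / np.maximum(dist, 1e-3))
    audible = np.flatnonzero(level_db > floor_db)        # audibility cull
    order = audible[np.argsort(level_db[audible])[::-1]] # loudest first
    return order[:budget]
```

Only the surviving indices would then be passed to the expensive HRTF convolution stage, which is where the computational savings come from.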
Automatic Violin Synthesis Using Expressive Musical Term Features
The control of interpretational properties such as duration, vibrato, and dynamics is important in music performance. Musicians continuously manipulate such properties to achieve different expressive intentions. This paper presents a synthesis system that automatically converts a mechanical, deadpan interpretation into distinct expressions by controlling these expressive factors. Extending prior work on expressive musical term (EMT) analysis, we derive a subset of essential features as the control parameters, such as the relative time position of the energy peak within a note and the mean temporal length of the notes. An algorithm is proposed to manipulate the energy contour (i.e., dynamics) of a note. The intended expressions of the synthesized sounds are evaluated by the recognition ability of the machine model developed in the prior work. Ten musical expressions such as Risoluto and Maestoso are considered, and the evaluation is performed on held-out music pieces. Our evaluations show that it is easier for the machine to recognize the expressions of the synthetic version than those of real recordings of an amateur student. While a listening test is being prepared as a next step for further validation, this work represents, to the best of our knowledge, the first attempt to build and quantitatively evaluate a system for EMT analysis/synthesis.
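The energy-contour control can be illustrated by rescaling a note's amplitude envelope so that its peak lands at a chosen relative position (a hypothetical simplification with a triangular target envelope; the paper's algorithm is not reproduced here):

```python
import numpy as np

def reshape_energy(note, peak_pos):
    """Rescale a note's amplitude envelope so that its energy peak lands
    at relative position `peak_pos` in (0, 1).  The triangular target
    envelope is an illustrative assumption."""
    note = np.asarray(note, float)
    n = len(note)
    env = np.abs(note) + 1e-9                      # instantaneous envelope
    p = int(peak_pos * (n - 1))                    # sample index of new peak
    target = np.concatenate([np.linspace(0.1, 1.0, p + 1),
                             np.linspace(1.0, 0.1, n - p)[1:]])
    return note / env * target * env.max()
```

Setting `peak_pos` early or late in the note is one way such a system could realize, for instance, an accented versus a swelling dynamic shape.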