Iterative Structured Shrinkage Algorithms for Stationary/Transient Audio Separation

In this paper, we present novel strategies for stationary/transient signal separation in audio signals that exploit the basic observation that stationary components are sparse in frequency and persistent over time, whereas transients are sparse in time and persistent across frequency. We utilize a multi-resolution STFT approach which allows us to define structured shrinkage operators tuned to the characteristic spectro-temporal shapes of the stationary and transient signal layers. Structure is incorporated by considering the energy of time-frequency neighbourhoods or modulation spectrum regions instead of individual STFT coefficients, and the shrinkage operators are employed in a dual-layered Iterated Shrinkage/Thresholding Algorithm (ISTA) framework. We further propose a novel iterative scheme, Iterative Cross-Shrinkage (ICS). In experiments using artificial test signals, ICS clearly outperforms the dual-layered ISTA and yields particularly good results in conjunction with a dynamic update of the shrinkage thresholds. The application of the novel algorithms to recordings of acoustic musical instruments provides perceptually convincing separation of transients.
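To make the structured-shrinkage idea concrete, the sketch below applies a neighbourhood-energy soft-threshold to an STFT coefficient matrix inside a plain ISTA-style update. The neighbourhood shapes, threshold value, and function names are illustrative assumptions, not the authors' exact operators.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def structured_soft_threshold(X, thresh, nbhd=(1, 9)):
    """Shrink STFT coefficients X based on local neighbourhood energy.

    A horizontally elongated neighbourhood (1 x 9 time frames) favours
    tonal/stationary structure; a vertical one (e.g. 9 x 1 frequency bins)
    would favour transients. Shapes and threshold are illustrative only.
    """
    # Smoothed local energy around each time-frequency bin
    energy = uniform_filter(np.abs(X) ** 2, size=nbhd)
    # Shrinkage gain derived from neighbourhood energy rather than the
    # individual coefficient magnitude
    gain = np.maximum(1.0 - thresh / np.sqrt(energy + 1e-12), 0.0)
    return gain * X

def ista_step(X_obs, S, step, thresh, nbhd):
    """One illustrative iterated shrinkage/thresholding update for one layer."""
    residual = X_obs - S  # gradient step for a simple identity observation model
    return structured_soft_threshold(S + step * residual, thresh, nbhd)
```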
An Explorative String-Bridge-Plate Model with Tunable Parameters

The virtual exploration of the domain of mechano-acoustically produced sound and music is a long-held aspiration of physical modelling. A physics-based algorithm developed for this purpose, combined with an interface, can be referred to as a virtual-acoustic instrument; its design, formulation, implementation, and control are subject to a mix of technical and aesthetic criteria, including sonic complexity, versatility, modal accuracy, and computational efficiency. This paper reports on the development of one such system, based on simulating the vibrations of a string and a plate coupled via a (nonlinear) bridge element. Attention is given to formulating and implementing the numerical algorithm such that any of its parameters can be adjusted in real time, thus facilitating musician-friendly exploration of the parameter space and offering novel possibilities for gestural control. Simulation results are presented exemplifying the sonic potential of the string-bridge-plate model (including bridge rattling and buzzing), and details regarding efficiency, real-time implementation and control interface development are discussed.
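As a rough illustration of the kind of time-stepping scheme such a model is built from, the sketch below advances a lossy 1D string by one explicit finite-difference step. The scheme, parameter names, and loss model are standard textbook choices, not the specific string-bridge-plate algorithm described in the paper.

```python
import numpy as np

def string_fd_step(u, u_prev, c, gamma, dt, dx):
    """One explicit finite-difference update for a lossy 1D wave equation.

    u, u_prev : displacement at the current and previous time step
    c         : wave speed (a tunable parameter, like tension or density)
    gamma     : simple frequency-independent loss coefficient
    Stability requires the CFL condition c * dt / dx <= 1.
    """
    lam2 = (c * dt / dx) ** 2
    u_next = np.zeros_like(u)
    # Interior update: centred second differences in time and space,
    # with a centred first-order loss term
    u_next[1:-1] = (2.0 * u[1:-1] - (1.0 - gamma * dt) * u_prev[1:-1]
                    + lam2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
    u_next[1:-1] /= (1.0 + gamma * dt)
    # Fixed (Dirichlet) ends; the paper instead couples the string to a
    # plate via a nonlinear bridge element
    u_next[0] = u_next[-1] = 0.0
    return u_next
```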
Modal Based Tanpura Simulation: Combining Tension Modulation and Distributed Bridge Interaction

Techniques for the simulation of the tanpura have advanced significantly in recent years, allowing numerically stable inclusion of bridge contact. In this paper, tension modulation is added to a tanpura model containing a stiff, lossy string, distributed bridge contact, and the thread. The model is proven to be unconditionally stable, and the numerical solver used has a unique solution as a result of choices made in the discretisation process. The effects of distributing the bridge contact forces, compared with a single-point bridge, and of introducing tension modulation are studied in simulations. This model is intended for use in furthering the understanding of the physics of the tanpura and for informing the development of algorithms for sound synthesis of the tanpura and similar stringed instruments.
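For orientation, tension modulation in string models is commonly expressed through a Kirchhoff-Carrier-style term in which the nominal tension grows with the instantaneous stretching of the string. The expression below is the standard textbook form and is given only as background, not as the exact formulation used in the paper.

```latex
% Kirchhoff-Carrier tension modulation (standard form, for orientation only):
% the nominal tension T_0 is augmented by a term proportional to the
% instantaneous elongation of a string of length L, stiffness EA,
% and transverse displacement u(x,t).
T(t) \;=\; T_0 \;+\; \frac{EA}{2L}\int_0^{L}\left(\frac{\partial u}{\partial x}\right)^{2}\mathrm{d}x
```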
Physically Derived Synthesis Model of a Cavity Tone

The cavity tone is the sound generated when air flows over the open surface of a cavity and a number of physical conditions are met. Equations obtained from fluid dynamics and aerodynamics research are utilised to produce authentic cavity tones without the need for complex computations. Synthesis is performed with a physical model in which the geometry of the cavity is used in the sound synthesis calculations. The model operates in real time, making it ideal for integration within a game or virtual reality environment. Evaluation is carried out by comparing the output of our model to previously published experimental, theoretical and computational results. Results show an accurate implementation of theoretical acoustic intensity and sound propagation equations as well as very good frequency predictions.

NOMENCLATURE
c = speed of sound (m/s)
f = frequency (Hz)
ω = angular frequency, ω = 2πf (rad/s)
u = air flow speed (m/s)
Re = Reynolds number (dimensionless)
St = Strouhal number (dimensionless)
r = distance between listener and sound source (m)
φ = elevation angle between listener and sound source
ϕ = azimuth angle between listener and sound source
ρ_air = mass density of air (kg/m³)
µ_air = dynamic viscosity of air (Pa s)
M = Mach number, M = u/c (dimensionless)
L = length of cavity (m)
d = depth of cavity (m)
b = width of cavity (m)
κ = wave number, κ = ω/c (1/m)
δ = shear layer thickness (m)
δ* = effective shear layer thickness (m)
δ0 = shear layer thickness at edge separation (m)
θ0 = shear layer momentum thickness at edge separation (m)
C2 = pressure coefficient (dimensionless)
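Cavity-tone frequency predictions of this kind are typically based on Rossiter's semi-empirical formula, which relates the mode frequencies to the flow speed, cavity length, and Mach number via a Strouhal-number relation. The sketch below evaluates that standard formula with typical literature constants; it is an assumption for illustration and not necessarily the exact relation implemented in the paper's synthesis model.

```python
def rossiter_mode_frequencies(u, L, c=343.0, n_modes=3, alpha=0.25, kappa_v=0.57):
    """Estimate cavity-tone frequencies with Rossiter's semi-empirical formula.

    f_m = (u / L) * (m - alpha) / (M + 1 / kappa_v),  with M = u / c

    u       : free-stream flow speed (m/s)
    L       : cavity length (m)
    alpha   : empirical phase-lag constant (typical value ~0.25)
    kappa_v : ratio of vortex convection speed to flow speed (~0.57)
    The constants are typical literature values, not necessarily those
    used in the paper.
    """
    M = u / c  # Mach number
    return [(u / L) * (m - alpha) / (M + 1.0 / kappa_v)
            for m in range(1, n_modes + 1)]

# Example: 40 m/s flow over a 5 cm cavity
print(rossiter_mode_frequencies(u=40.0, L=0.05))
```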
A Continuous Frequency Domain Description of Adjustable Boundary Conditions for Multidimensional Transfer Function Models

Physical modeling of string vibrations strongly depends on the conditions at the system boundaries: the more complex the boundary conditions, the more complex the physical modeling process becomes. Based on prior work, this contribution derives a general concept for incorporating complex boundary conditions into a transfer function model designed with simple boundary conditions. The concept is related to control theory and separates the treatment of the boundary conditions from the design of the string model.
EVERTims: Open Source Framework for Real-time Auralization in Architectural Acoustics and Virtual Reality

This paper presents recent developments of the EVERTims project, an auralization framework for virtual acoustics and real-time room acoustic simulation. The EVERTims framework relies on three independent components: a scene graph editor, a room acoustic modeler, and a spatial audio renderer for auralization. The framework was first published and detailed in [1, 2]. The recent developments presented here concern the complete re-design of the scene graph editor unit and the C++ implementation of a new spatial renderer based on the JUCE framework. EVERTims now functions as a Blender add-on, supporting real-time auralization of any 3D room model both during its creation in Blender and during its exploration in the Blender Game Engine. The EVERTims framework is published as open source software: http://evertims.ircam.fr.
A Comparison of Player Performance in a Gamified Localisation Task Between Spatial Loudspeaker Systems

This paper presents an experiment comparing player performance in a gamified localisation task between three loudspeaker configurations: stereo, 7.1 surround-sound, and an equidistantly spaced octagonal array. The test was designed as a step towards determining whether spatialised game audio can improve player performance in a video game, thus influencing their overall experience. The game required players to find as many sound sources as possible, using only sonic cues, in a 3D virtual game environment. Based on feedback from 24 participants, results suggest that the task was significantly easier when listening over the 7.1 surround-sound system, which was also the most preferred of the three listening conditions. The result was not entirely expected, in that the octagonal array did not outperform 7.1. It is thought that, for the given stimuli, this may be a consequence of the octagonal array sacrificing an optimal front stereo pair for more consistent imaging all around the listening space.
Accurate Reverberation Time Control in Feedback Delay Networks

The reverberation time is one of the most prominent acoustical qualities of a physical room. Therefore, it is crucial that artificial reverberation algorithms match a specified target reverberation time accurately. In feedback delay networks, a popular framework for modeling room acoustics, the reverberation time is determined by combining delay and attenuation filters such that the frequency-dependent attenuation response is proportional to the delay length, thereby complying with a global attenuation-per-second. However, only a few details are available on the attenuation filter design, as the approximation errors of the filter design are often regarded as negligible. In this work, we demonstrate that the error of the filter approximation propagates in a non-linear fashion to the resulting reverberation time, possibly causing large deviations from the specified target. For the special case of a proportional graphic equalizer, we propose a non-linear least squares solution and demonstrate the improved accuracy with a Monte Carlo simulation.
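The proportionality between delay length and attenuation follows from the standard feedback-delay-network relation: a delay of L samples must attenuate by 60·L/(fs·T60) dB per pass so that recirculating energy decays by 60 dB after T60 seconds. The broadband sketch below illustrates only this relation; the paper addresses the harder frequency-dependent case, where the attenuation filter approximation error distorts the resulting reverberation time.

```python
import numpy as np

def fdn_attenuation_gains(delay_lengths, t60, fs=48000):
    """Broadband attenuation gain per delay line for a target reverberation time.

    Each delay of L samples contributes L/fs seconds per round trip, so its
    gain must be -60 * (L/fs) / T60 dB to realise a global decay of 60 dB
    after T60 seconds. Frequency-dependent attenuation filters generalise
    this per frequency band, which is where approximation errors matter.
    """
    delay_lengths = np.asarray(delay_lengths, dtype=float)
    gain_db = -60.0 * (delay_lengths / fs) / t60
    return 10.0 ** (gain_db / 20.0)

# Example: four delay lines, 2-second target reverberation time
print(fdn_attenuation_gains([1499, 1889, 2381, 2999], t60=2.0))
```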
Binauralization of Omnidirectional Room Impulse Responses - Algorithm and Technical Evaluation

The auralization of acoustic environments over headphones is often realized with data-based dynamic binaural synthesis. The binaural room impulse responses (BRIRs) required for the convolution process can be acquired by performing measurements with an artificial head for different head orientations and positions. This procedure is rather costly and therefore not always feasible in practice. Because a plausible representation is sufficient for many practical applications, a simpler approach is of interest. In this paper we present the BinRIR (Binauralization of omnidirectional room impulse responses) algorithm, which synthesizes BRIR datasets for dynamic auralization based on a single measured omnidirectional room impulse response (RIR). Direct sound, early reflections, and diffuse reverberation are extracted from the omnidirectional RIR and are spatialized separately. Spatial information is added based on assumptions about the room geometry and on typical properties of diffuse reverberation. The early part of the RIR is described by a parametric model and can easily be modified and adapted; the approach can thus even accommodate modifications of the listener position. The late reverberation is synthesized using binaural noise, which is adapted to the energy decay curve of the measured RIR. In order to examine differences between measured and synthesized BRIRs, we performed a technical evaluation for two rooms, comparing measured BRIRs to synthesized BRIRs to analyze the inaccuracies of the proposed algorithm.
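The energy decay curve that the synthetic late reverberation is matched to is conventionally obtained by Schroeder backward integration of the measured RIR. The sketch below shows that standard computation; it is given as background, not as the paper's specific implementation.

```python
import numpy as np

def energy_decay_curve(rir, db=True):
    """Schroeder backward integration of a room impulse response.

    Returns the energy decay curve (EDC), i.e. the energy remaining after
    each sample, normalised to 0 dB at the start. This is the quantity a
    synthetic binaural late reverberation would be adapted to.
    """
    energy = np.cumsum(rir[::-1] ** 2)[::-1]
    edc = energy / energy[0]
    return 10.0 * np.log10(edc + 1e-12) if db else edc
```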
Pinna Morphological Parameters Influencing HRTF Sets

Head-Related Transfer Functions (HRTFs) are one of the main aspects of binaural rendering. By definition, these functions express the deep linkage that exists between hearing and morphology, especially of the torso, head, and ears. Although the perceptual effect of HRTFs is undeniable, the exact influence of human morphology is still unclear. Its reduction to a few anthropometric measurements has led to numerous studies aiming to establish a ranking of these parameters; however, no consensus has yet been reached. In this paper, we study the influence of the anthropometric measurements of the ear, as defined by the CIPIC database, on the HRTFs. This is done by computing HRTFs with the Fast Multipole Boundary Element Method (FM-BEM) from a parametric model of torso, head, and ears. Their variations are measured with 4 different spectral metrics over 4 frequency bands spanning 0 to 16 kHz. Our contribution is the establishment of a ranking of the selected parameters and a comparison with what has already been obtained by the community. Additionally, we discuss the relevance of each approach, especially when it relies on the CIPIC data, as well as the limitations of the CIPIC database.
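As an example of how HRTF variation can be quantified within a frequency band, the sketch below computes a generic RMS log-spectral difference between two HRTFs. The metric, band edges, and function name are illustrative assumptions; the four metrics used in the paper may differ.

```python
import numpy as np

def band_spectral_distortion(hrtf_a, hrtf_b, freqs, band):
    """RMS log-spectral difference (dB) between two HRTFs within one band.

    hrtf_a, hrtf_b : complex frequency responses on the same frequency grid
    freqs          : frequency grid in Hz
    band           : (f_low, f_high) tuple in Hz
    This is a generic spectral-distortion measure given for illustration;
    the paper's four metrics and band definitions may differ.
    """
    mask = (freqs >= band[0]) & (freqs < band[1])
    diff_db = 20.0 * np.log10(np.abs(hrtf_a[mask]) / np.abs(hrtf_b[mask]))
    return np.sqrt(np.mean(diff_db ** 2))
```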