Download An Evaluation of Audio Feature Extraction Toolboxes
Audio feature extraction underpins a massive proportion of audio processing, music information retrieval, audio effect design and audio synthesis. Design, analysis, synthesis and evaluation often rely on audio features, but there are a large and diverse range of feature extraction tools presented to the community. An evaluation of existing audio feature extraction libraries was undertaken. Ten libraries and toolboxes were evaluated with the Cranfield Model for evaluation of information retrieval systems, reviewing the coverage, effort, presentation and time lag of a system. Comparisons are undertaken of these tools and example use cases are presented as to when toolboxes are most suitable. This paper allows a software engineer or researcher to quickly and easily select a suitable audio feature extraction toolbox.
Download On studying auditory distance perception in concert halls with multichannel auralizations
Virtual acoustics and auralizations have been previously used to study the perceptual properties of concert hall acoustics in a descriptive profiling framework. The results have indicated that the apparent auditory distance to the orchestra might play a crucial role in enhancing the listening experience and the appraisal of hall acoustics. However, it is unknown how the acoustics of the hall influence auditory distance perception in such large spaces. Here, we present one step towards studying auditory distance perception in concert halls with virtual acoustics. The aims of this investigation were to evaluate the feasibility of the auralizations and the system to study perceived distances as well as to obtain first evidence on the effects of hall acoustics and the source materials to distance perception. Auralizations were made from measured spatial impulse responses in two concert halls at 14 and 22 meter distances from the center of a calibrated loudspeaker orchestra on stage. Anechoic source materials included symphonic music and pink noise as well as signals produced by concatenating random segments of anechoic instrument recordings. Forty naive test subjects were blindfolded before entering the listening room, where they verbally reported distances to sound sources in the auralizations. Despite the large variance in distance judgments between the individuals, the reported distances were on average in the same range as the actual distances. The results show significant main effects of halls, distances and signals, but also some unexpected effects associated with the presentation order of the stimuli.
Download Towards Transient Restoration in Score-informed Audio Decomposition
Our goal is to improve the perceptual quality of transient signal components extracted in the context of music source separation. Many state-of-the-art techniques are based on applying a suitable decomposition to the magnitude of the Short-Time Fourier Transform (STFT) of the mixture signal. The phase information required for the reconstruction of individual component signals is usually taken from the mixture, resulting in a complex-valued, modified STFT (MSTFT). There are different methods for reconstructing a time-domain signal whose STFT approximates the target MSTFT. Due to phase inconsistencies, these reconstructed signals are likely to contain artifacts such as pre-echos preceding transient components. In this paper, we propose a simple, yet effective extension of the iterative signal reconstruction procedure by Griffin and Lim to remedy this problem. In a first experiment, under laboratory conditions, we show that our method considerably attenuates pre-echos while still showing similar convergence properties as the original approach. A second, more realistic experiment involving score-informed audio decomposition shows that the proposed method still yields improvements, although to a lesser extent, under non-idealized conditions.
Download Two polarisation finite difference model of bowed strings with nonlinear contact and friction forces
Recent bowed string sound synthesis has relied on physical modelling techniques; the achievable realism and flexibility of gestural control are appealing, and the heavier computational cost becomes less significant as technology improves. A bowed string is simulated in two polarisations by discretising the partial differential equations governing its behaviour, using the finite difference method; a globally energy balanced scheme is used, as a guarantee of numerical stability under highly nonlinear conditions. In one polarisation, a nonlinear contact model is used for the normal forces exerted by the dynamic bow hair, left hand fingers, and fingerboard. In the other polarisation, a force-velocity friction curve is used for the resulting tangential forces. The scheme update requires the solution of two nonlinear vector equations.Sound examples and video demonstrations are presented.
Download A toolkit for experimentation with signal interaction
This paper will describe a toolkit for experimentation with signal interaction techniques, also commonly referred to as cross adaptive processing. The technique allows analyzed features of one audio signal to inform the processing of another. Earlier used mainly for mixing and post production purposes, we now want to use it creatively as an intervention in the musical communication between two performers. The idea stems from Stockhausen’s use of intermodulation in the 1960’s, and as such we might also call the updated technique interprocessing. Our interest in the technique comes as a natural extension to previous research on live processing as an instrumental and performative activity. The automatic control of signal processing routines is related to previous work on adaptive audio effects and automatic mixing. The focus for our investigation and experimentation with the current toolkit will be how this affects the musical communication between performers, and how it changes what they can and will play. The program code for the toolkit is available as a github repository1 under an open source license.
Download Implementing a Low Latency Parallel Graphic Equalizer with Heterogeneous Computing
This paper describes the implementation of a recently introduced parallel graphic equalizer (PGE) in a heterogeneous way. The control and audio signal processing parts of the PGE are distributed to a PC and to a signal processor, of WaveCore architecture, respectively. This arrangement is particularly suited to the algorithm in question, benefiting from the low-latency characteristics of the audio signal processor as well as general purpose computing power for the more demanding filter coefficient computation. The design is achieved cleanly in a high-level language called Kronos, which we have adapted for the purposes of heterogeneous code generation from a uniform program source.