A Cosine-Distance Based Neural Network for Music Artist Recognition Using Raw I-Vector Feature
Recently, i-vector features have entered the field of Music Information Retrieval (MIR), exhibiting highly promising performance in important tasks such as music artist recognition or music similarity estimation. The i-vector modelling approach relies on a complex processing chain that is limited by its use of engineered features such as MFCCs. The goal of the present paper is to make an important step towards a truly end-to-end modelling system inspired by the i-vector pipeline, exploiting the power of Deep Neural Networks (DNNs) to learn optimized feature spaces and transformations. Several authors have already tried to combine the power of DNNs with i-vector features, using DNNs for feature extraction, scoring or classification. In this paper, we use neural networks for the important step of i-vector post-processing and classification for the task of music artist recognition. Specifically, we propose a novel neural network for i-vector features with a cosine-distance loss function, optimized with stochastic gradient descent (SGD). We first show that current networks do not perform well with unprocessed i-vector features, and that post-processing methods such as Within-Class Covariance Normalization (WCCN) and Linear Discriminant Analysis (LDA) are crucially important to improve the i-vector representation. We further demonstrate that these linear projections (WCCN and LDA) cannot be learned using the general objective functions usually used in neural networks. We examine our network on a 50-class music artist recognition dataset using i-vectors extracted from frame-level timbre features. Our experiments suggest that, using our network with fully unprocessed i-vectors, we can achieve the performance of the i-vector pipeline which uses i-vector post-processing methods such as LDA and WCCN.
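The abstract does not specify the network or the exact form of its loss, so the following is only a minimal sketch of the underlying idea of cosine-distance scoring. It assumes each artist is represented by a single model vector; the names `cosine_distance`, `classify` and the toy vectors are hypothetical and not taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def classify(vector, class_models):
    """Pick the class whose model vector is closest in cosine distance.

    class_models maps a class name to one model vector per class
    (an assumption made for this illustration).
    """
    return min(class_models, key=lambda name: cosine_distance(vector, class_models[name]))


# Toy two-dimensional "i-vectors", purely for illustration.
models = {"artist_a": [1.0, 0.0], "artist_b": [0.0, 1.0]}
predicted = classify([0.9, 0.1], models)
```

Because cosine distance ignores vector length, only the direction of the vector matters for the decision, which is one reason it is a common choice for scoring i-vectors.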
Assessing The Suitability of the Magnitude Slope Deviation Detection Criterion For Use In Automatic Acoustic Feedback Control
Acoustic feedback is a recurrent problem in live sound reinforcement scenarios. Many attempts have been made to produce an automated feedback cancellation system, but none have seen widespread use due to concerns over the accuracy and transparency of feedback howl cancellation. This paper investigates the use of the Magnitude Slope Deviation (MSD) algorithm to intelligently identify feedback howl in live sound scenarios. A new variation on this algorithm is developed, tested, and shown to be much more computationally efficient without compromising detection accuracy. The effect of varying the length of the frequency spectrum history buffer available for analysis is evaluated across various live sound scenarios. The MSD algorithm is shown to be very accurate in detecting howl frequencies amongst the speech and classical music stimuli tested here, but inaccurate in the rock music scenario even when a long history buffer is used. Finally, a new algorithm for setting the depth of howl-cancelling notch filters is proposed and investigated. The algorithm shows promise in keeping frequency attenuation to a minimum required level, but the approach has some problems in terms of time taken to cancel howl.
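The abstract does not define the MSD criterion or the proposed variation. As a rough illustration only, the sketch below assumes a simplified reading: a howling bin grows at a nearly constant rate in dB, so the frame-to-frame slopes of its magnitude history deviate little from their mean. The function names and the `max_deviation` threshold are hypothetical and are not the paper's values.

```python
import math


def slope_deviation(db_history):
    """Standard deviation of frame-to-frame dB slopes for one frequency bin."""
    if len(db_history) < 3:
        raise ValueError("need at least three frames to compare slopes")
    slopes = [b - a for a, b in zip(db_history, db_history[1:])]
    mean = sum(slopes) / len(slopes)
    variance = sum((s - mean) ** 2 for s in slopes) / len(slopes)
    return math.sqrt(variance)


def looks_like_howl(db_history, max_deviation=0.5):
    """Flag a bin whose level rises steadily (assumed threshold in dB/frame)."""
    slopes = [b - a for a, b in zip(db_history, db_history[1:])]
    rising = sum(slopes) / len(slopes) > 0.0
    return rising and slope_deviation(db_history) < max_deviation


# History buffer of one bin, in dB: steady growth versus fluctuating content.
steady = [-60.0, -57.0, -54.0, -51.0, -48.0]
fluctuating = [-60.0, -50.0, -58.0, -49.0, -61.0]
```

In this reading, the length of `db_history` plays the role of the history buffer length discussed in the abstract: a longer buffer gives more slopes to compare, at the cost of a slower decision.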
A Computational Model of the Hammond Organ Vibrato/Chorus using Wave Digital Filters
We present a computational model of the Hammond tonewheel organ vibrato/chorus, a musical audio effect comprising an LC ladder circuit and an electromechanical scanner. We model the LC ladder using the Wave Digital Filter (WDF) formalism, and introduce a new approach to resolving multiple nonadaptable linear elements at the root of a WDF tree. Additionally, we formalize how to apply the well-known warped Bilinear Transform to the WDF discretization of capacitors and inductors, and review WDF polarity inverters. To model the scanner, we propose a simplified and physically informed approach. We discuss the time- and frequency-domain behavior of the model, emphasizing the spectral properties of interpolation between the taps of the LC ladder.
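As a small illustration of the warped bilinear transform mentioned above, the sketch below uses the textbook form s = c (1 - z^-1) / (1 + z^-1), where c = 2 f_s without warping and c = ω0 / tan(ω0 / (2 f_s)) when the mapping is made exact at ω0. Under this form a WDF capacitor has port resistance 1 / (c C) and an inductor has port resistance c L. The function names and component values are hypothetical, and the paper's own formalization may differ in detail.

```python
import math


def bilinear_constant(sample_rate, match_freq_hz=None):
    """Return the bilinear transform constant c.

    Without a matching frequency this is the usual 2 * sample_rate.
    With one, c is chosen so the analog and digital frequency axes
    agree exactly at match_freq_hz (frequency prewarping).
    """
    if match_freq_hz is None:
        return 2.0 * sample_rate
    if not 0.0 < match_freq_hz < sample_rate / 2.0:
        raise ValueError("matching frequency must lie between 0 and Nyquist")
    omega0 = 2.0 * math.pi * match_freq_hz
    return omega0 / math.tan(omega0 / (2.0 * sample_rate))


def capacitor_port_resistance(capacitance, c):
    """Port resistance that adapts a WDF capacitor: 1 / (c * C)."""
    return 1.0 / (c * capacitance)


def inductor_port_resistance(inductance, c):
    """Port resistance that adapts a WDF inductor: c * L."""
    return c * inductance


# Illustrative values only: 48 kHz sampling, mapping matched at 1 kHz.
fs = 48000.0
c_plain = bilinear_constant(fs)
c_warped = bilinear_constant(fs, match_freq_hz=1000.0)
r_cap = capacitor_port_resistance(1e-6, c_warped)
r_ind = inductor_port_resistance(0.5, c_warped)
```

The warped constant is always slightly smaller than 2 f_s, so the capacitor's port resistance rises and the inductor's falls relative to the unwarped case.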
Model-Based Obstacle Sonification for the Navigation of Visually Impaired Persons
This paper proposes a sonification model for encoding visual 3D information into sounds, inspired by the impact properties of the objects encountered during blind navigation. The proposed model is compared against two sonification models developed for orientation and mobility, chosen based on their common technical requirements. An extensive validation of the proposed model is reported: five legally blind and five normally sighted participants evaluated the proposed model against the two competing models in a simplified experimental navigation scenario. The evaluation addressed not only the accuracy of the responses in terms of psychophysical measurements but also the cognitive load and emotional stress of the participants, by means of biophysiological signals and evaluation questionnaires. Results show that the proposed impact sound model adequately conveys the relevant information to the participants with low cognitive load, following a short training session.