Onset Detection Revisited
Various methods have been proposed for detecting the onset times of musical notes in audio signals. We examine recent work on onset detection using spectral features such as the magnitude, phase and complex domain representations, and propose improvements to these methods: a weighted phase deviation function and a half-wave rectified complex difference. We compare these new algorithms with several state-of-the-art algorithms from the literature, testing all of them on a standard data set of short excerpts from a range of instruments (1060 onsets) and on a much larger data set of piano music (106054 onsets). Some of the results contradict previously published results and suggest that a similarly high level of performance can be obtained with a magnitude-based (spectral flux), a phase-based (weighted phase deviation) or a complex domain (complex difference) onset detection function.
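As an illustration of the magnitude-based approach named in this abstract, the following is a minimal sketch of a spectral flux onset detection function: the half-wave rectified increase in STFT magnitude, summed over frequency bins. The function name, frame size, hop size and Hann window are assumptions made for this example and are not taken from the paper; peak picking is omitted.

```python
import numpy as np

def spectral_flux(signal, frame_size=1024, hop_size=512):
    """Return one spectral flux value per frame (the first frame is set to 0).

    frame_size, hop_size and the Hann window are illustrative choices.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_size
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_size : i * hop_size + frame_size] * window
        for i in range(n_frames)
    ])
    # Magnitude spectrum of each frame.
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Frame-to-frame change in magnitude, keeping only increases
    # (half-wave rectification), then summing over frequency bins.
    diff = np.diff(mag, axis=0)
    flux = np.maximum(diff, 0.0).sum(axis=1)
    return np.concatenate(([0.0], flux))

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)
    tone[:8000] = 0.0  # silence, then a tone starting at sample 8000
    flux = spectral_flux(tone)
    print("frame with largest flux:", int(np.argmax(flux)))
```

Rectification keeps only energy increases, so note offsets and decays do not contribute to the function; onsets would then be taken as peaks above some threshold.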
On the evaluation of perceptual similarity measures for music
Several applications in the field of content-based interaction with music repositories rely on measures which estimate the perceived similarity of music. These applications include automatic genre recognition, playlist generation, and recommender systems. In this paper we study methods to evaluate the performance of such measures. We compare five measures which use only the information extracted from the audio signal and discuss how these measures can be evaluated qualitatively and quantitatively without resorting to large-scale listening tests.
A Comparison of Extended Source-Filter Models for Musical Signal Reconstruction
Recently, we have witnessed an increasing use of the source-filter model in music analysis, achieved by integrating the source-filter model into a non-negative matrix factorisation (NMF) framework or into statistical models. Combining the source-filter model with the NMF framework reduces the number of free parameters needed and makes the model easier to extend. This paper compares four extended source-filter models: the source-filter-decay (SFD) model, the NMF with time-frequency activations (NMF-ARMA) model, the multi-excitation (ME) model and the source-filter model based on β-divergence (SFbeta model). The first two models represent time-varying spectra by adding a loss filter and a time-varying filter, respectively. The latter two are extended by using multiple excitations and by including a scale factor, respectively. The models are tested using sounds of 15 instruments from the RWC Music Database. Performance is evaluated based on the relative reconstruction error. The results show that the NMF-ARMA model outperforms the other models, but uses the largest set of parameters.
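To make the evaluation criterion concrete, here is a minimal sketch of plain NMF with multiplicative updates, scored by a relative reconstruction error. This is ordinary Euclidean NMF, not any of the four extended source-filter models. The abstract does not give the error formula, so the ratio of Frobenius norms used here is an assumption, as are the function names and the random test matrix.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (bins x frames) as W @ H.

    Uses the standard multiplicative updates for the Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations, then spectral templates; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_reconstruction_error(V, V_hat):
    """Assumed definition: ||V - V_hat||_F / ||V||_F."""
    return np.linalg.norm(V - V_hat) / np.linalg.norm(V)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic non-negative "spectrogram" built from two components.
    V = rng.random((64, 2)) @ rng.random((2, 100))
    W, H = nmf(V, rank=2)
    print("relative error:", relative_reconstruction_error(V, W @ H))
```

Normalising by the norm of the input makes errors comparable across sounds of different loudness, which matters when scores are averaged over many instruments.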
Characterisation of Acoustic Scenes Using a Temporally-constrained Shift-invariant Model
In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used for extracting time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated into the SIPLCA formulation to enforce temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are conducted using a corpus of train station recordings, classified into 6 scene classes. Results show that the proposed model is able to model salient events within a scene and outperforms the non-negative matrix factorization algorithm on the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.