Sub-Band Independent Subspace Analysis for Drum Transcription
While Independent Subspace Analysis provides a means of separating sound sources from a single channel signal, making it an effective tool for drum transcription, it does have a number of problems. Not least of these is that the amount of information required to allow separation of sound sources varies from signal to signal. To overcome this indeterminacy and improve the robustness of transcription, an extension of Independent Subspace Analysis to include sub-band processing is proposed. The use of this approach is demonstrated by its application in a simple drum transcription algorithm.
Independent subspace analysis using locally linear embedding
While Independent Subspace Analysis provides a means of blindly separating sound sources from a single channel signal, it does have a number of problems. In particular, the amount of information required for separation of sources varies with the signal. This is a result of the variance-based nature of Principal Component Analysis, which is used for dimensional reduction in the Independent Subspace Analysis algorithm. In an attempt to overcome this problem, the use of a non-variance-based dimensional reduction method, Locally Linear Embedding, is proposed. Locally Linear Embedding is a geometry-based dimensional reduction technique. The use of this approach is demonstrated by its application to single channel source separation, and its merits are discussed.
An Efficient Phasiness Reduction Technique for Moderate Audio Time-Scale Modification
Phase vocoder approaches to time-scale modification of audio introduce a reverberant/phasy artifact into the time-scaled output due to a loss in phase coherence between short-time Fourier transform (STFT) bins. Recent improvements to the phase vocoder have reduced the presence of this artifact; however, it remains a problem. A method of time-scaling is presented that results in a further reduction in phasiness, for moderate time-scale factors, by taking advantage of some flexibility that exists in the choice of phase required so as to maintain horizontal phase coherence between related STFT bins. Furthermore, the approach leads to a reduction in computational load within the range of time-scaling factors for which phasiness is reduced.
Single-Note Ornamentation Transcription for the Irish Tin Whistle Based on Onset Detection
Ornamentation plays a very important role in Irish traditional music, giving more expression to the music by altering or embellishing small pieces of a melody. Single-note ornamentation, such as cuts and strikes, is the most common type in Irish traditional music and is played by articulating the note pitch during the onset stage. A technique for transcribing single-note ornamentation for the tin whistle based on onset detection is presented. This method focuses on the characteristics of the tin whistle within Irish traditional music, customising a time-frequency based representation for detecting the instant when new notes played using single-note ornamentation start and release.
Sound Source Separation: Azimuth Discrimination and Resynthesis
In this paper we present a novel sound source separation algorithm which requires no prior knowledge, no learning, assisted or otherwise, and performs the task of separation based purely on azimuth discrimination within the stereo field. The algorithm exploits the use of the pan pot as a means to achieve image localisation within stereophonic recordings. As such, only an interaural intensity difference exists between left and right channels for a single source. We use gain scaling and phase cancellation techniques to expose frequency dependent nulls across the azimuth domain, from which source separation and resynthesis are carried out. We present results obtained from real recordings, and show that for musical recordings, the algorithm improves upon the output quality of current source separation schemes.
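The gain-scaling and cancellation idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`dft_bin`, `azimuth_null`), the gain resolution `beta`, and the toy panning ratios are all assumptions made for illustration, and it only handles sources that are louder in the right channel. For one frequency bin, the right channel is scaled by a gain g between 0 and 1 and subtracted from the left channel; the gain at which the magnitude falls to a null estimates that source's left/right intensity ratio.

```python
import cmath
import math


def dft_bin(frame, k):
    """Return the k-th DFT coefficient of a real-valued frame."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))


def azimuth_null(left, right, k, beta=100):
    """Scan gains g = 0, 1/beta, ..., 1 and return the g minimising |L(k) - g*R(k)|."""
    l_k, r_k = dft_bin(left, k), dft_bin(right, k)
    gains = [i / beta for i in range(beta + 1)]
    return min(gains, key=lambda g: abs(l_k - g * r_k))


# Toy stereo frame: two sinusoids on exact bin centres, each panned with a
# different left/right intensity ratio (values chosen for illustration only).
N = 256
src_a = [math.sin(2 * math.pi * 10 * i / N) for i in range(N)]  # bin 10
src_b = [math.sin(2 * math.pi * 25 * i / N) for i in range(N)]  # bin 25
left = [0.3 * a + 0.8 * b for a, b in zip(src_a, src_b)]
right = [a + b for a, b in zip(src_a, src_b)]

null_a = azimuth_null(left, right, 10)  # expected near the 0.3 panning ratio
null_b = azimuth_null(left, right, 25)  # expected near the 0.8 panning ratio
print(null_a, null_b)
```

In this toy case each bin contains a single source, so the null position recovers the panning ratio directly. Real recordings have overlapping partials and spectral leakage, which a full separation scheme would need to handle.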
Generalised Prior Subspace Analysis for Polyphonic Pitch Transcription
A reformulation of Prior Subspace Analysis (PSA) is presented, which restates the problem as that of fitting an undercomplete signal dictionary to a spectrogram. Further, a generalisation of PSA is derived which allows the transcription of polyphonic pitched instruments. This involves the translation of a single frequency prior subspace of a note to approximate other notes, overcoming the problem of needing a separate basis function for each note played by an instrument. Examples are then demonstrated which show the utility of the generalised PSA algorithm for the purposes of polyphonic pitch transcription.
Time and pitch scale modification: A real-time framework and tutorial
A framework is presented which is designed to address the issues related to the real-time implementation of time-scale and pitch-scale modification algorithms. This framework can be used as the basis for the development of applications which allow for a seamless real-time transition between continually varying time-scale and pitch-scale parameters which arise as a result of manual or automatic intervention.
Self-Authentication of Audio signals by Chirp Coding
This paper discusses a new approach to ‘watermarking’ digital signals using linear frequency modulated or ‘chirp’ coding. The principles underlying this approach are based on the use of a matched filter to provide a reconstruction of a chirped code that is uniquely robust in the case of signals with very low signal-to-noise ratios. Chirp coding for authenticating data is generic in the sense that it can be used for a range of data types and applications (the authentication of speech and audio signals, for example). The theoretical and computational aspects of the matched filter and the properties of a chirp are revisited to provide the essential background to the method. Signal code generating schemes are then addressed and details of the coding and decoding techniques considered. Finally, the paper briefly describes an example application which is available on-line for readers who are interested in using the approach for audio data authentication working with either WAV or MP3 files.
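As a rough illustration of why a matched filter suits chirp codes at low signal-to-noise ratios, the sketch below embeds a short bit sequence as positive or negative chirps beneath a noise-like host and recovers it by correlation. It is a simplified sketch, not the paper's coding scheme: the sampling rate, sweep range, embedding amplitude, bit pattern, Gaussian host signal, and the function names `make_chirp`, `embed`, and `decode` are all assumptions chosen for illustration.

```python
import math
import random


def make_chirp(n, f0, f1, fs):
    """Linear frequency-modulated sine sweep from f0 to f1 Hz over n samples."""
    duration = n / fs
    rate = (f1 - f0) / duration  # sweep rate in Hz per second
    return [math.sin(2 * math.pi * (f0 * (i / fs) + 0.5 * rate * (i / fs) ** 2))
            for i in range(n)]


def embed(host, bits, chirp, amplitude):
    """Add +chirp for a 1 bit and -chirp for a 0 bit, one chirp length per bit."""
    out = list(host)
    n = len(chirp)
    for b, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for i in range(n):
            out[b * n + i] += sign * amplitude * chirp[i]
    return out


def decode(signal, n_bits, chirp):
    """Matched filter at zero lag: correlate each segment with the chirp."""
    n = len(chirp)
    bits = []
    for b in range(n_bits):
        corr = sum(signal[b * n + i] * chirp[i] for i in range(n))
        bits.append(1 if corr > 0 else 0)
    return bits


random.seed(0)
FS, N = 8000, 4000  # assumed sampling rate and chirp length in samples
chirp = make_chirp(N, 100.0, 1000.0, FS)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
host = [random.gauss(0.0, 1.0) for _ in range(N * len(bits))]  # stand-in for audio
marked = embed(host, bits, chirp, amplitude=0.3)
decoded = decode(marked, len(bits), chirp)
print(decoded == bits)
```

With these assumed values the chirp sits roughly 13 dB below the host, yet summing the correlation over the full chirp length still separates the two bit polarities by a wide margin.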
Using tensor factorisation models to separate drums from polyphonic music
This paper describes the use of Non-negative Tensor Factorisation models for the separation of drums from polyphonic audio. Improved separation of the drums is achieved through the incorporation of Gamma Chain priors into the Non-negative Tensor Factorisation framework. In contrast to many previous approaches, the method used in this paper requires little or no pre-training or use of drum templates. The utility of the technique is shown on real-world audio examples.
Shifted NMF with Group Sparsity for Clustering NMF Basis Functions
Recently, Non-negative Matrix Factorisation (NMF) has found application in separation of individual sound sources. NMF decomposes the spectrogram of an audio mixture into an additive parts-based representation where the parts typically correspond to individual notes or chords. However, there is a need to cluster the NMF basis functions to their sources. Although many attempts have been made to improve the clustering of the basis functions to sources, much research is still required in this area. Recently, Shifted Non-negative Matrix Factorisation (SNMF) was used to cluster these basis functions. To this end, we propose that the incorporation of group sparsity to the Shifted NMF-based methods may benefit the clustering algorithms. We have tested this on SNMF algorithms, with improved separation quality. Results show that this gives improved clustering of pitched basis functions over previous methods.
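For readers unfamiliar with the underlying decomposition, the following is a minimal sketch of plain NMF using the standard multiplicative updates for the Euclidean cost. It does not implement the shifted or group-sparse variants discussed in the abstract, and the toy "spectrogram", the rank, the iteration count, and the helper names are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for the Euclidean cost; returns W, H, initial error."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    start_error = frob_error(v, w, h)
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[hij * n / (d + eps) for hij, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(h, num, den)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[wij * n / (d + eps) for wij, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(w, num, den)]
    return w, h, start_error


# Toy magnitude "spectrogram": 5 frequency bins x 6 frames built from two parts.
templates = [[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.7], [0.2, 0.3]]
activations = [[1.0, 0.0, 0.5, 1.0, 0.0, 0.2], [0.0, 1.0, 0.5, 0.0, 0.8, 0.2]]
V = matmul(templates, activations)
W, H, start_error = nmf(V, rank=2)
final_error = frob_error(V, W, H)
print(start_error, final_error)
```

The columns of `W` play the role of the basis functions that the abstract proposes to cluster, and the rows of `H` are their activations over time. Because every update multiplies by a non-negative ratio, both factors stay non-negative throughout.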