Sliding with a constant Q
The linear frequency (constant-bandwidth) scale of the FFT has long been recognised as a disadvantage for audio processing. Long analysis windows are required for adequate low-frequency resolution, while small windows offer lower latency, better handling of transients, and reduced computational cost. A constant-Q form of analysis offers the possibility of increased low-frequency resolution for a given window size; such resolution is essential for many fundamental processing tasks such as pitch shifting. We consider the application of the Sliding Discrete Fourier Transform to a constant-Q analysis. The increased flexibility of sliding allows for a variety of data alignments, and we give the mathematical formulation of these. Windowing in the frequency domain introduces further complications. Finally, we consider the implementation of the analysis on both serial and parallel computers.
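For readers unfamiliar with the sliding transform mentioned in this abstract, the following is a minimal sketch of the standard one-sample sliding DFT update on a linear frequency scale. It does not reproduce the paper's constant-Q formulation, data alignments, or frequency-domain windowing; the function names `direct_dft` and `sliding_dft` are illustrative assumptions, not taken from the paper.

```python
import cmath

def direct_dft(frame):
    """Plain O(N^2) DFT of one frame, used to initialise and to check the update."""
    N = len(frame)
    return [sum(frame[m] * cmath.exp(-2j * cmath.pi * k * m / N) for m in range(N))
            for k in range(N)]

def sliding_dft(x, N):
    """Return the N-point DFT of every length-N window of x, hopping by one sample.

    Standard recurrence: X_{n+1}[k] = (X_n[k] - x[n] + x[n+N]) * exp(+2j*pi*k/N).
    """
    # Per-bin rotation that accounts for shifting the window by one sample.
    twiddle = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]
    X = direct_dft(x[:N])          # one direct transform to start
    frames = [list(X)]
    for n in range(len(x) - N):
        delta = x[n + N] - x[n]    # newest sample in, oldest sample out
        X = [(X[k] + delta) * twiddle[k] for k in range(N)]
        frames.append(list(X))
    return frames
```

Each update costs O(N) per sample for all bins, and every bin is updated independently of the others, which is one reason the approach maps naturally onto parallel hardware.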
Vocal melody detection in the presence of pitched accompaniment using harmonic matching methods
Vocal music is characterized by a melodically salient singing voice accompanied by one or more instruments. With a pitched instrument background, multiple periodicities are simultaneously present and the task becomes one of identifying and tracking the vocal pitch based on pitch strength and smoothness constraints. Frequency domain harmonic matching methods can be applied to detect pitch via the harmonically related frequencies that fit the signal’s measured spectral peaks. The specific spectral fitness measure is expected to influence the performance of vocal pitch detection depending on the nature of the polyphonic mixture. In this work, we consider Indian classical music, which provides important examples of singing voice accompanied by strongly pitched instruments. It is shown that the spectral fitness measure of the two-way mismatch method is well suited to tracking vocal pitch in the presence of pitched percussion with its strong but sparse harmonic structure. The detected pitch is further used to obtain a measure of voicing that reliably discriminates vocal segments from purely instrumental regions.
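To illustrate the idea of harmonic matching in this abstract, here is a deliberately simplified, frequency-only sketch of a two-way mismatch score. It is not the fitness measure evaluated in the paper: the published two-way mismatch method also weights errors by peak amplitude and uses tuned exponents, which are omitted here. The function name, the relative-deviation normalisation, and the `weight` parameter are all assumptions made for illustration.

```python
def two_way_mismatch(f0, peaks, weight=1.0):
    """Simplified mismatch error between harmonics of f0 and measured peak frequencies (Hz).

    Lower is better. `weight` (hypothetical) balances the two directions.
    """
    f_max = max(peaks)
    # Predict harmonics up to roughly the highest measured peak.
    n_harm = max(1, round(f_max / f0))
    predicted = [f0 * h for h in range(1, n_harm + 1)]
    # Predicted-to-measured: penalises harmonics with no nearby peak
    # (this discourages candidates that are too low, e.g. f0 / 2).
    err_pm = sum(min(abs(p - m) for m in peaks) / p for p in predicted) / len(predicted)
    # Measured-to-predicted: penalises peaks not explained by any harmonic
    # (this discourages candidates that are too high, e.g. 2 * f0).
    err_mp = sum(min(abs(m - p) for p in predicted) / m for m in peaks) / len(peaks)
    return err_pm + weight * err_mp

def best_f0(candidates, peaks):
    """Pick the candidate with the smallest simplified mismatch error."""
    return min(candidates, key=lambda f0: two_way_mismatch(f0, peaks))
```

For example, `best_f0([110.0, 220.0, 440.0], [220.0, 440.0, 660.0, 880.0, 1100.0])` selects 220.0, because the candidate one octave below leaves half of its predicted harmonics unmatched and the candidate one octave above leaves several measured peaks unexplained.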
Multiple-F0 tracking based on a high-order HMM model
This paper addresses multiple-F0 tracking and the estimation of the number of harmonic source streams in music sound signals. A source stream is understood as generated from a note played by a musical instrument. A note is described by a hidden Markov model (HMM) having two states: the attack state and the sustain state. It is proposed to first perform the tracking of F0 candidates using a high-order hidden Markov model, based on a forward-backward dynamic programming scheme. The propagated weights are calculated in the forward tracking stage, followed by an iterative tracking of the most likely trajectories in the backward tracking stage. Then, the estimation of the underlying source streams is carried out by iteratively pruning the candidate trajectories in a maximum likelihood manner. The proposed system is evaluated on a specially constructed polyphonic music database. Compared with frame-based estimation systems, the tracking mechanism significantly improves the accuracy rate.