Improved hidden Markov model partial tracking through time-frequency analysis
In this article we propose a modification to the combinatorial hidden Markov model developed in [1] for tracking partial frequency trajectories. We employ the Wigner-Ville distribution and the Hough transform to (re)estimate the frequency and chirp rate of the partials in each analysis frame. We estimate the initial phase and amplitude of each partial by minimizing the squared error in the time domain. We then formulate a new scoring criterion for the hidden Markov model that makes the tracker more robust to non-stationary and noisy signals. The tracker performs well on crossing linear chirps and crossing FM signals in white noise, as well as on real instrument recordings.
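The amplitude and phase step described above can be illustrated with a small least-squares fit. This is only a sketch under stated assumptions, not the authors' implementation: it assumes a single real partial of the form A*cos(theta[n] + phi) whose frequency and chirp rate are already known, and the names `fit_amp_phase`, `f0`, `chirp`, and `fs` are hypothetical. Writing the model as a*cos(theta) + b*sin(theta) makes the problem linear in a and b, so the squared error is minimized by solving a 2x2 system.

```python
import math

def fit_amp_phase(x, f0, chirp, fs):
    """Least-squares amplitude A and initial phase phi of A*cos(theta[n] + phi).

    theta[n] = 2*pi*(f0*t + 0.5*chirp*t**2) with t = n/fs (assumed model).
    """
    scc = scs = sss = sxc = sxs = 0.0
    for n, xn in enumerate(x):
        t = n / fs
        theta = 2.0 * math.pi * (f0 * t + 0.5 * chirp * t * t)
        c, s = math.cos(theta), math.sin(theta)
        # Accumulate the entries of the 2x2 normal equations.
        scc += c * c
        scs += c * s
        sss += s * s
        sxc += xn * c
        sxs += xn * s
    det = scc * sss - scs * scs
    a = (sxc * sss - sxs * scs) / det   # a = A*cos(phi)
    b = (sxs * scc - sxc * scs) / det   # b = -A*sin(phi)
    return math.hypot(a, b), math.atan2(-b, a)

# Usage: synthesize a noiseless linear chirp and recover its amplitude and phase.
fs, f0, chirp = 8000.0, 440.0, 2000.0   # illustrative values only
x = [0.7 * math.cos(2.0 * math.pi * (f0 * n / fs + 0.5 * chirp * (n / fs) ** 2) + 0.3)
     for n in range(256)]
amp, phi = fit_amp_phase(x, f0, chirp, fs)
```

With several partials per frame, the same idea would extend to a larger linear system with one cosine and one sine column per partial.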
Sparse Atomic Modeling of Audio: a Review
Research into sparse atomic models has recently intensified in the image and audio processing communities. While other reviews exist, we believe this paper provides a good starting point for the uninitiated reader, as it concisely summarizes the state of the art and presents most of the major topics in an accessible manner. We discuss several approaches to the sparse approximation problem, including various greedy algorithms, iteratively re-weighted least squares, iterative shrinkage, and Bayesian methods. We provide pseudo-code for several of the algorithms, and have released software that includes fast dictionaries and reference implementations for many of them. We discuss the relevance of the different approaches for audio applications, and include numerical comparisons. We also illustrate several audio applications of sparse atomic modeling.
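As a concrete example of the greedy family mentioned above, here is a minimal sketch of plain matching pursuit. It is not taken from the paper's pseudo-code or released software. It assumes a small explicit dictionary of unit-norm atoms stored as Python lists, and the name `matching_pursuit` is hypothetical. At each iteration it picks the atom most correlated with the residual and subtracts that atom's contribution.

```python
import math

def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse approximation of `signal` over unit-norm `atoms`."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Inner product of the residual with every atom.
        corrs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        # Select the atom with the largest absolute correlation.
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        coeffs[k] += corrs[k]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corrs[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

# Usage: a redundant dictionary (standard basis of R^3 plus one diagonal atom).
h = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [h, h, 0.0]]
coeffs, residual = matching_pursuit([2.0, 2.0, 0.0], atoms, n_iter=1)
```

In this toy case the diagonal atom explains the signal in a single iteration, whereas the standard basis alone would need two atoms. That is the kind of gain a redundant dictionary is meant to provide.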
Analysis/Synthesis Using Time-Varying Windows and Chirped Atoms
Audio signals are commonly assumed to be short-term stationary. In other words, it is typically assumed that the statistical properties of audio signals change slowly enough that they can be considered nearly constant over a short interval. However, with a fixed analysis window (which is typical in practice), there is no way to change the analysis parameters over time in order to track the slowly evolving properties of the audio signal. For example, while a long window may be appropriate for analyzing tonal phenomena, it will smear subsequent note onsets. Furthermore, the audio signal may not be completely stationary over the duration of the analysis window. This is often true of sounds containing glissando, vibrato, and other transient phenomena. In this paper we build upon previous work targeted at non-stationary analysis/synthesis. In particular, we discuss how to simultaneously adapt the window length and the chirp rate of the analysis frame in order to maximally concentrate the spectral energy. This is done by (a) finding the analysis window that leads to the minimum entropy spectrum, and (b) estimating the chirp rate using the distribution derivative method. We also discuss a fast method of analysis/synthesis using the fan-chirp transform and overlap-add. Finally, we analyze several real and synthetic signals and show a qualitative improvement in the spectral energy concentration.
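Step (a) above can be illustrated with a small window-length search. This is a rough sketch under stated assumptions rather than the paper's method: it assumes a Hann window, Shannon entropy of the normalized power spectrum, and zero-padding to a common DFT size so that entropies for different window lengths are comparable. The function names and parameters are hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def spectral_entropy(frame, nfft):
    """Shannon entropy of the normalized power spectrum of a Hann-windowed frame.

    Assumes the frame is not all zeros and len(frame) <= nfft.
    """
    length = len(frame)
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1)) for n in range(length)]
    xw = [w * v for w, v in zip(win, frame)]
    power = []
    for k in range(nfft // 2 + 1):
        # Naive zero-padded DFT bin (O(N^2) overall; fine for a sketch).
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / nfft) for n, v in enumerate(xw))
        power.append(abs(acc) ** 2)
    total = sum(power)
    entropy = 0.0
    for p in power:
        q = p / total
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy

def min_entropy_window(signal, start, lengths, nfft):
    """Return the candidate window length whose spectrum has the lowest entropy."""
    return min(lengths, key=lambda n: spectral_entropy(signal[start:start + n], nfft))

# Usage: for a stationary sinusoid, the longest window concentrates energy best.
tone = [math.cos(2.0 * math.pi * 0.2 * n) for n in range(128)]
best = min_entropy_window(tone, 0, [32, 64, 128], nfft=256)
```

For a signal with an onset or a fast frequency sweep inside the frame, a shorter candidate may win instead, which is the behavior that adaptive window selection relies on.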