Download Uniform Noise Sequences for Nonlinear System Identification
Noise-based nonlinear system identification techniques using Hammerstein and Wiener forms have found wide application in biological system modeling, and been applied to modeling nonlinear audio processors such as the ring modulator. These methods apply noise to the system, and project the system output onto a set of orthogonal polynomials to reveal parameters of the model. Though Gaussian sequences are invariably used to drive the unknown system, it seems clear that the statistics of the input will affect the model estimate. Motivated by the limited input and output ranges supported by analog systems, in this work, the use of an input noise sequence having a uniform distribution is explored. In addition, an error measure indicating harmonic distortion modeling accuracy is introduced. Simulation results identifying Hammerstein and Wiener systems show that the uniform and Gaussian distributions perform differently, with the uniform distribution generally producing more accurate harmonic responses. Finally, uniform noise and Gaussian noise are used to model a saturating low-pass circuit similar to that of the Tube Screamer, with the uniform distribution providing a modest improvement in noise response error.
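As a rough illustration of the projection step only, the sketch below drives a static (memoryless) nonlinearity with uniform noise on [-1, 1] and projects the output onto Legendre polynomials, which are orthogonal under that distribution. The choice of Legendre polynomials, the function name `legendre_projection`, and the restriction to a memoryless system are assumptions made for this example; the abstract does not specify the basis, and a full Hammerstein or Wiener identification would also estimate the linear dynamics.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_projection(x, y, order):
    """Estimate Legendre coefficients of y = f(x) from uniform noise x in [-1, 1].

    Uses E[P_n(x)^2] = 1 / (2n + 1) for x uniform on [-1, 1], so that
    c_n = (2n + 1) * E[y * P_n(x)], with the expectation replaced by a sample mean.
    """
    coeffs = np.empty(order + 1)
    for n in range(order + 1):
        # Coefficient vector [0, ..., 0, 1] selects the single polynomial P_n.
        basis = legendre.legval(x, [0] * n + [1])
        coeffs[n] = (2 * n + 1) * np.mean(y * basis)
    return coeffs
```

For example, feeding uniform noise through the cubic `x ** 3` should give coefficients near `[0, 0.6, 0, 0.4]`, since x^3 = (3 P1 + 2 P3) / 5, up to Monte Carlo error that shrinks as the sequence gets longer.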
Download Musical Aspects of Vowel Formants in the Extreme Metal Voice
Download Analysis/Synthesis Using Time-Varying Windows and Chirped Atoms
A common assumption about audio signals is that they are short-term stationary. In other words, it is typically assumed that the statistical properties of audio signals change slowly enough that they can be considered nearly constant over a short interval. However, with a fixed analysis window (which is typical in practice) there is no way to change the analysis parameters over time in order to track the slowly evolving properties of the audio signal. For example, while a long window may be appropriate for analyzing tonal phenomena, it will smear subsequent note onsets. Furthermore, the audio signal may not be completely stationary over the duration of the analysis window. This is often true of sounds containing glissando, vibrato, and other transient phenomena. In this paper we build upon previous work targeted at non-stationary analysis/synthesis. In particular, we discuss how to simultaneously adapt the window length and the chirp rate of the analysis frame in order to maximally concentrate the spectral energy. This is done by a) finding the analysis window that leads to the minimum entropy spectrum; and b) estimating the chirp rate using the distribution derivative method. We also discuss a fast method of analysis/synthesis using the fan-chirp transform and overlap-add. Finally, we analyze several real and synthetic signals and show a qualitative improvement in the spectral energy concentration.
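Step a) can be sketched roughly as follows: for each candidate window length, compute the spectrum of the frame and keep the length whose normalized power spectrum has the lowest entropy. The Hann window, the Shannon entropy measure, the fixed FFT size, and the function names are assumptions made for this example, and the chirp-rate adaptation of step b) is not covered.

```python
import numpy as np

def spectral_entropy(frame, n_fft):
    """Shannon entropy of the normalized power spectrum of one frame."""
    power = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    p = power / (power.sum() + 1e-12)
    return -np.sum(p * np.log(p + 1e-12))

def min_entropy_window_length(signal, center, lengths, n_fft=4096):
    """Return (length, entropy) for the candidate length with the lowest entropy.

    Each candidate frame is centered on sample index `center` and tapered with
    a Hann window; all frames are zero-padded to the same n_fft so that their
    entropies are comparable. Every length must be <= n_fft and fit in the signal.
    """
    best_len, best_h = None, np.inf
    for n in lengths:
        start = center - n // 2
        frame = signal[start:start + n] * np.hanning(n)
        h = spectral_entropy(frame, n_fft)
        if h < best_h:
            best_len, best_h = n, h
    return best_len, best_h
```

For a stationary sinusoid the longest candidate wins, because a longer window narrows the main lobe and so concentrates the energy in fewer bins. Shorter windows become preferable only when the signal changes within the frame.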
Download Onset Time Estimation for the Analysis of Percussive Sounds using Exponentially Damped Sinusoids
Exponentially damped sinusoids (EDS) model-based analysis of sound signals often requires a precise estimation of initial amplitudes and phases of the components found in the sound, in addition to a good estimation of their frequencies and damping. This can be of the utmost importance in many applications such as high-quality re-synthesis or identification of structural properties of sound generators (e.g. a physical coupling of vibrating devices). Therefore, in those specific applications, an accurate estimation of the onset time is required. In this paper we present a two-step onset time estimation procedure designed for that purpose. It consists of a “rough” estimation using an STFT-based method followed by a time-domain method to “refine” the previous results. Tests carried out on synthetic signals show that it is possible to estimate onset times with errors as small as 0.2 ms. These tests also confirm that operating first in the frequency domain and then in the time domain achieves a better resolution vs. speed compromise than using only one frequency-based or one time-based onset detection method. Finally, experiments on real sounds (plucked strings and actual percussion) illustrate how well this method performs in more realistic situations.
Download Sound Morphing by Audio Descriptors and Parameter Interpolation
We present a strategy for static morphing that relies on the sophisticated interpolation of the parameters of the signal model and the independent control of high-level audio features. The source and target signals are decomposed into deterministic, quasi-deterministic and stochastic parts, and are processed separately according to sinusoidal modeling and spectral envelope estimation. We gain further intuitive control over the morphing process by altering the interpolated spectrum according to target values of audio descriptors through an optimization process. The proposed approach leads to convincing morphing results in the case of sustained or percussive, harmonic and inharmonic sounds of possibly different durations.
Download On the Design and Use of Once-differentiable High Dynamic Resolution Atoms for the Distribution Derivative Method
The accuracy of the Distribution Derivative Method (DDM) [1] is evaluated on mixtures of chirp signals. It is shown that accurate estimation can be obtained when the sets of atoms for which the inner product is large are disjoint. This amounts to designing atoms with windows whose Fourier transform exhibits low sidelobes but which are once-differentiable in the time-domain. A technique for designing once-differentiable approximations to windows is presented and the accuracy of these windows in estimating the parameters of sinusoidal chirps in mixture is evaluated.
Download REDS: A New Asymmetric Atom for Sparse Audio Decomposition and Sound Synthesis
In this paper, we introduce a function designed specifically for sparse audio representations. A progression in the selection of dictionary elements (atoms) to sparsely represent audio has occurred: starting with symmetric atoms, then to damped sinusoid and hybrid atoms, and finally to the re-appropriation of the gammatone (GT) and formant-wave-function (FOF) into atoms. These asymmetric atoms have already shown promise in sparse decomposition applications, where they prove to be highly correlated with natural sounds and musical audio, but since neither was originally designed for this application their utility remains limited. An in-depth comparison of each existing function was conducted based on application-specific criteria. A directed design process was completed to create a new atom, the ramped exponentially damped sinusoid (REDS), that satisfies all desired properties: the REDS can adapt to a wide range of audio signal features and has good mathematical properties that enable efficient sparse decompositions and synthesis. Moreover, the REDS is proven to be approximately equal to the previous functions under some common conditions.
Download Fast Partial Tracking of Audio with Real-Time Capability through Linear Programming
This paper proposes a new partial tracking method, based on linear programming, that can run in real-time, is simple to implement, and performs well in difficult tracking situations by considering spurious peaks, crossing partials, and a non-stationary short-term sinusoidal model. Complex constant parameters of a generalized short-term signal model are explicitly estimated to inform peak matching decisions. Peak matching is formulated as a variation of the linear assignment problem. Combinatorially optimal peak-to-peak assignments are found in polynomial time using the Hungarian algorithm. Results show that the proposed method creates high-quality representations of monophonic and polyphonic sounds.
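A minimal sketch of peak matching as a linear assignment problem is given below. It is a simplification under stated assumptions: the cost is plain frequency distance rather than a cost derived from the generalized short-term model, spurious peaks are handled only by a distance gate (`max_jump_hz`, a hypothetical parameter), and the solver is SciPy's `linear_sum_assignment`, which solves the same assignment problem as the Hungarian algorithm although its documentation describes a modified Jonker-Volgenant implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_peaks(prev_freqs, next_freqs, max_jump_hz=50.0):
    """Match spectral peaks between two consecutive frames.

    Returns a list of (prev_index, next_index) pairs that minimizes the total
    frequency distance. Pairs farther apart than max_jump_hz are dropped, so
    those peaks are left unmatched (treated as partial births or deaths).
    """
    prev = np.asarray(prev_freqs, dtype=float)
    nxt = np.asarray(next_freqs, dtype=float)
    # Cost matrix: absolute frequency difference for every prev/next pair.
    cost = np.abs(prev[:, None] - nxt[None, :])
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if cost[r, c] <= max_jump_hz]
```

With `prev_freqs = [440, 880, 1320]` and `next_freqs = [1325, 445, 3000, 878]`, the three partials are linked to their nearest continuations and the peak at 3000 Hz stays unmatched. The cost matrix may be rectangular, so the two frames do not need the same number of peaks.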
Download Damped Chirp Mixture Estimation via Nonlinear Bayesian Regression
Estimating mixtures of damped chirp sinusoids in noise is a problem that affects audio analysis, coding, and synthesis applications. Phase-based non-stationary parameter estimators assume that sinusoids can be resolved in the Fourier transform domain, whereas high-resolution methods estimate superimposed components with accuracy close to the theoretical limits, but only for sinusoids with constant frequencies. We present a new method for estimating the parameters of superimposed damped chirps that has an accuracy competitive with existing non-stationary estimators but also has a high resolution, like subspace techniques. After providing the analytical expression for a Gaussian-windowed damped chirp signal’s Fourier transform, we propose an efficient variational EM algorithm for nonlinear Bayesian regression that jointly estimates the amplitudes, phases, frequencies, chirp rates, and decay rates of multiple non-stationary components that may be obfuscated under the same local maximum in the frequency spectrum. Quantitative results show that the new method not only has an estimation accuracy that is close to the Cramér-Rao bound, but also a high resolution that outperforms the state-of-the-art.
Download On the Estimation of Sinusoidal Parameters via Parabolic Interpolation of Scaled Magnitude Spectra
Sinusoids are widely used to represent the oscillatory modes of music and speech. The estimation of the sinusoidal parameters directly affects the quality of the representation. A parabolic interpolation of the peaks of the log-magnitude spectrum is commonly used to get a more accurate estimation of the frequencies and the amplitudes of the sinusoids at a relatively low computational cost. Recently, Werner and Germain proposed an improved sinusoidal estimator that performs parabolic interpolation of the peaks of a power-scaled magnitude spectrum. For each analysis window type and size, a power-scaling factor p is pre-calculated via a computationally demanding heuristic. Consequently, the power-scaling estimation method is currently constrained to a few tabulated power-scaling factors for pre-selected window sizes, limiting its practical applications. In this article, we propose a method to obtain the power-scaling factor p for any window size from the tabulated values. Additionally, we investigate the impact of zero-padding on the estimation accuracy of the power-scaled sinusoidal parameter estimator.
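The two interpolation variants mentioned above can be sketched with one helper that fits a parabola through three scaled magnitude samples around a spectral peak. The function name `parabolic_peak` is hypothetical, and any value of p passed to it in an example is purely illustrative: it is not one of the tabulated power-scaling factors, which this listing does not give.

```python
import numpy as np

def parabolic_peak(mag, k, p=None):
    """Refine the peak at bin k of a magnitude spectrum.

    A parabola is fitted through the scaled magnitudes at bins k-1, k, k+1.
    With p=None the scaling is the natural log (the classic estimator);
    otherwise magnitudes are raised to the power p (power-scaled variant).
    Returns (fractional bin position, peak magnitude on the linear scale).
    """
    scale = np.log if p is None else (lambda m: m ** p)
    a, b, c = scale(mag[k - 1]), scale(mag[k]), scale(mag[k + 1])
    # Vertex of the parabola through (-1, a), (0, b), (1, c).
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)
    peak_scaled = b - 0.25 * (a - c) * offset
    # Undo the scaling to recover a linear magnitude.
    amp = np.exp(peak_scaled) if p is None else peak_scaled ** (1.0 / p)
    return k + offset, amp
```

To use it, take `mag = np.abs(np.fft.rfft(x * w))` for a windowed frame, locate a local maximum `k`, and convert the returned bin position to hertz with `fs / n`. Dividing the returned magnitude by `w.sum() / 2` gives an amplitude estimate for a real sinusoid.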