Reservoir Computing: a powerful Framework for Nonlinear Audio Processing
This paper proposes reservoir computing as a general framework for nonlinear audio processing. Reservoir computing is a novel approach to recurrent neural network training with the advantage of a very simple, linear learning algorithm. In theory it can approximate arbitrary nonlinear dynamical systems with arbitrary precision, and it has an inherent temporal processing capability, which makes it well suited to many nonlinear audio processing problems. It can be applied whenever nonlinear relationships are present in the data and time information is crucial. Examples from three application areas are presented: nonlinear system identification of a tube amplifier emulator algorithm; nonlinear audio prediction, as needed in wireless audio transmission where dropouts may occur; and automatic melody transcription from a polyphonic audio stream, as one example from the broad field of music information retrieval. Reservoir computing outperformed state-of-the-art alternative models in all studied tasks.
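The "very simple and linear learning algorithm" mentioned in this abstract can be illustrated with a minimal echo state network, one common form of reservoir computing. This is a sketch under stated assumptions, not the paper's implementation: the reservoir size, the weight scaling, the ridge parameter, the toy target signal, and all function names are hypothetical choices made for illustration. Only the readout weights are trained, by ridge regression.

```python
import math
import random


def make_reservoir(n_units, seed=0, scale=0.9):
    """Draw fixed random weights; they are never trained.

    The recurrent matrix is rescaled so its infinity norm equals `scale`,
    which bounds the spectral radius by `scale` (assumed < 1 here).
    """
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]
    max_row_sum = max(sum(abs(v) for v in row) for row in w)
    w = [[scale * v / max_row_sum for v in row] for row in w]
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Collect x[n] = tanh(W x[n-1] + w_in u[n]) for every input sample."""
    x = [0.0] * len(w_in)
    states = []
    for u in inputs:
        x = [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + wi * u)
             for row, wi in zip(w, w_in)]
        states.append(x + [1.0])  # constant 1.0 acts as a bias feature
    return states


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_readout(states, targets, ridge=1e-4):
    """Ridge regression: solve (S^T S + ridge * I) w_out = S^T y."""
    d = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(d)]
    return solve(gram, rhs)


def predict(states, w_out):
    """Linear readout: one dot product per time step."""
    return [sum(wi * si for wi, si in zip(w_out, s)) for s in states]


# Toy task (hypothetical): a soft clipper with one sample of memory.
rng = random.Random(1)
u = [rng.uniform(-1.0, 1.0) for _ in range(300)]
y = [math.tanh(2.0 * u[n] + (u[n - 1] if n > 0 else 0.0)) for n in range(300)]
w_in, w = make_reservoir(10)
states = run_reservoir(w_in, w, u)
w_out = train_readout(states, y)
y_hat = predict(states, w_out)
mse = sum((a - b) ** 2 for a, b in zip(y_hat, y)) / len(y)
```

Because the recurrent and input weights stay fixed, training reduces to a single linear solve. A practical system would typically also discard an initial washout period and tune the reservoir scaling, which this sketch omits.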
Automatic Noise Gate Settings for Multitrack Drum Recordings
A method has been developed for automating the settings of a noise gate. The method has been applied to a kick drum track containing bleed from secondary drum sources and white noise. The optimal settings are found by maximising the signal-to-distortion ratio (SDR). The SDR has contributions from the distortion caused to the kick drum signal and from the residual bleed and noise. These two components are weighted, enabling the gate to be controlled by a single parameter. It is shown that an improvement in the SDR can be obtained when the two components of the SDR are approximated, which enables the optimal settings to be calculated from the noisy signal and a single kick drum hit. It is found that the optimal threshold is slightly above the peak level of the noise component of the signal.
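The idea of choosing gate settings by maximising a weighted SDR can be sketched as a grid search over the threshold. This is a heavily simplified illustration, not the method of the paper: the gate here is a sample-wise hard gate with no attack, release, or hold, the clean kick and the bleed are assumed to be available separately, and the exact form of the weighting (a single `weight` between 0 and 1) is an assumption. All names are hypothetical.

```python
import math


def gate_gains(mixed, threshold):
    """Hard gate: gain 1.0 where |sample| reaches the threshold, else 0.0."""
    return [1.0 if abs(x) >= threshold else 0.0 for x in mixed]


def weighted_sdr_db(kick, bleed, gains, weight=0.5):
    """SDR in dB with a weighted sum of the two error components.

    distortion: energy removed from the wanted kick signal by the gate.
    residual:   energy of bleed and noise that the gate lets through.
    Assumes the kick signal is not completely silent.
    """
    signal = sum(k * k for k in kick)
    distortion = sum(((g - 1.0) * k) ** 2 for g, k in zip(gains, kick))
    residual = sum((g * b) ** 2 for g, b in zip(gains, bleed))
    error = weight * distortion + (1.0 - weight) * residual
    return float("inf") if error == 0.0 else 10.0 * math.log10(signal / error)


def best_threshold(kick, bleed, candidates, weight=0.5):
    """Return the first candidate threshold with the highest weighted SDR."""
    mixed = [k + b for k, b in zip(kick, bleed)]
    return max(
        candidates,
        key=lambda t: weighted_sdr_db(kick, bleed, gate_gains(mixed, t), weight),
    )
```

With a synthetic kick burst of amplitude 1.0 and bleed that peaks at 0.1, the search picks the lowest candidate that lies above the bleed peak, in line with the abstract's finding that the optimal threshold sits slightly above the peak noise level.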
Blind Separation of Monaural Signals using Complex Wavelets
In this paper, a new method of blind source separation of monaural signals is presented. It is based on similarity criteria between the envelopes and frequency trajectories of the components of the signal, and on their onset and offset times. The main difference from previous work is that the input signal is filtered using a flexible complex band-pass filter bank that is a discrete version of the Complex Continuous Wavelet Transform (CCWT). Our main purpose is to show that the CCWT can be a powerful tool in blind separation, due to its strong coherence in both the time and frequency domains. The presented separation algorithm is a first approximation to this important task. An example set of four synthetically mixed monaural signals has been analyzed by this method. The obtained results are promising.
A method for the modification of acoustic instrument tone dynamics
A method is described for making natural-sounding modifications of the dynamic level of tones produced by acoustic instruments. Each tone is first analyzed in the frequency domain and divided into a harmonic and a noise component. The two components are modified separately using filters based on spectral envelopes extracted from recordings of isolated tones played at different dynamic levels. When transforming from low to high dynamics, additional high-frequency partials are added to the spectrum to enhance the brightness of the sound. Finally, the two modified components are summed and a time-domain signal is synthesized.
Modeling Harmonic Phases at Glottal Closure Instants
We propose a model that predicts harmonic phases at glottal closure instants. Phases are obtained from the scaled derivative of the harmonic amplitude envelope. This method is able to generate convincing synthesis results while avoiding typical phasiness artifacts. A clear advantage of such a model is that it simplifies sample concatenation in sample-based synthesizers. In addition, it helps to improve the sound quality of voice transformations in several contexts.
Chroma and MFCC Based Pattern Recognition in Audio Files Utilizing Hidden Markov Models And Dynamic Programming
In this paper we present an algorithm to reveal the immanent musical structure within pieces of popular music. Our proposed model uses an estimate of the harmonic progression, which is obtained by calculating beat-synchronous chroma vectors and letting a Hidden Markov Model (HMM) decide the most probable sequence of chords. In addition, MFCC vectors are computed to retrieve basic timbral information that cannot be described by harmony. Subsequently, a dynamic programming algorithm is used to detect repetitive patterns in these feature sequences. Based on these patterns, a second dynamic programming stage tries to find corresponding patterns and link them into larger segments that reflect the musical structure.
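Letting an HMM "decide the most probable sequence of chords" is usually done with Viterbi decoding, itself a dynamic programming algorithm. The following is a generic sketch in the log domain, assuming that states stand for chord labels and that each frame is one beat-synchronous chroma vector already converted to per-chord log-likelihoods. The abstract does not specify the emission or transition model, so those inputs and the function name are hypothetical.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return the most probable state path of an HMM.

    log_init[s]        log-probability of starting in state s
    log_trans[p][s]    log-probability of moving from state p to state s
    log_emit[t][s]     log-likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at frame t + 1
    for frame in log_emit[1:]:
        pointers, new_score = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + frame[s])
        back.append(pointers)
        score = new_score
    # Trace the best final state back to the first frame.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With "sticky" transitions that favour staying on the same chord, a single weakly contradicting frame is smoothed away, whereas a sustained change in the evidence still produces a chord change.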
Music Genre visualization and Classification Exploiting a Small set of High-level Semantic Features
In this paper a system for continuous analysis, visualization and classification of musical streams is proposed. The system performs the visualization and classification tasks by means of three high-level semantic features, extracted by reducing a multidimensional low-level feature vector with Gaussian Mixture Models. The visualization of the semantic characteristics of the audio stream has been implemented by mapping the values of the high-level features onto a triangular plot and by assigning a primary color to each feature. In this manner, besides representing the musical evolution of the signal, we also obtain representative colors for each musical part of the analyzed streams. The classification exploits a set of one-against-one three-dimensional Support Vector Machines trained on some target genres. The results obtained on the visualization and classification tasks are very encouraging: our tests on heterogeneous genre streams have shown the validity of the proposed approach.
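One plausible way to place three feature values on a triangular plot and derive a color from them is to treat the normalized features as barycentric coordinates. The sketch below assumes an equilateral triangle, non-negative features, and RGB as the three primary colors. None of these details are given in the abstract, and the function name is hypothetical.

```python
import math

# Triangle corners, one per feature (an assumed equilateral layout).
VERTICES = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0))


def triangle_point_and_color(features):
    """Map three non-negative feature values to (x, y) and an RGB triple."""
    total = sum(features)
    if len(features) != 3 or min(features) < 0.0 or total <= 0.0:
        raise ValueError("need three non-negative features, not all zero")
    # Normalized features are the barycentric weights of the three corners.
    weights = [f / total for f in features]
    x = sum(w * vx for w, (vx, _) in zip(weights, VERTICES))
    y = sum(w * vy for w, (_, vy) in zip(weights, VERTICES))
    # Each feature drives one color channel in proportion to its weight.
    color = tuple(round(255 * w) for w in weights)
    return (x, y), color
```

A stream dominated by one feature lands on the corresponding corner with a pure primary color, while equal features land on the centroid with a neutral gray.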
Local Key estimation Based on Harmonic and Metric Structures
In this paper, we present a method for estimating the local keys of an audio signal. We propose to address the problem of local key finding by investigating possible combinations and extensions of previously proposed global key estimation approaches. The distinguishing feature of our approach is that the key is made dependent on the harmonic and metric structures. In this work, we focus on the relationship between the chord progression and the local key progression in a piece of music. A contribution of our work is that we address the problem of finding a good analysis window length for local key estimation by introducing information related to the metric structure into our model. Key estimation is not performed on segments of empirically chosen length but on segments that are adapted to the analyzed piece and independent of the tempo. We evaluate and analyze our results on a new database composed of classical music pieces.
Spring Reverberation: A Physical Perspective
Spring-based artificial reverberation was one of the earliest attempts at compact replication of room-like reverberation for studio use. The popularity and unique sound of this effect have given it a status and desirability apart from its original use. Standard methods for modeling analog audio effects are not well suited to modeling spring reverberation, due to the complex and dispersive nature of its mechanical vibration. Therefore, new methods must be examined. A typical impulse response of a spring used for reverberation is examined, and important perceptual parameters are identified. Mathematical models of spring vibration are considered, with the purpose of drawing conclusions relevant to their application in an audio environment. These models are used to produce new results relevant to the design of digital systems for the emulation of spring reverberation units. The numerical solution of these models via the finite difference method is considered. A set of measurements of two typical spring reverberation units is presented.
Pitch glide analysis and synthesis from Recorded Tones
Pitch glide is an important effect that occurs in nearly all plucked string instruments. In essence, large-amplitude waves traveling on a string during the note onset increase the string tension above its nominal value and therefore cause the pitch to increase temporarily. Measurements are presented showing an exponential relaxation of all the partial frequencies to their nominal values, with a time constant related to the decay rate of transverse waves propagating on the string. This exponential pitch trajectory is supported by a simple physical model in which the increased tension is somewhat counterbalanced by the increased length of the string. Finally, a method for synthesizing the plucked string via a novel hybrid digital waveguide-modal synthesis model is presented, with implementation details for time-varying resonators.
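The exponential relaxation described in this abstract can be written as f(t) = f_nominal + delta * exp(-t / tau), where delta is the initial upward offset and tau the time constant. The sketch below evaluates that trajectory and drives a plain phase-accumulating sine oscillator with it. It is not the hybrid waveguide-modal model of the paper. The function names and any parameter values passed to them are hypothetical.

```python
import math


def glide_frequency(t, f_nominal, initial_offset_hz, tau):
    """Partial frequency in Hz at time t (s), relaxing exponentially to f_nominal."""
    return f_nominal + initial_offset_hz * math.exp(-t / tau)


def synthesize_glide(f_nominal, initial_offset_hz, tau, duration, sample_rate=44100):
    """Sine tone whose phase is the running sum of the glide frequency."""
    phase = 0.0
    samples = []
    for n in range(int(round(duration * sample_rate))):
        samples.append(math.sin(phase))
        f = glide_frequency(n / sample_rate, f_nominal, initial_offset_hz, tau)
        # Advance the phase by one sample at the current instantaneous frequency.
        phase += 2.0 * math.pi * f / sample_rate
    return samples
```

For example, `glide_frequency(0.0, 440.0, 5.0, 0.1)` starts 5 Hz sharp, and after one time constant the offset has decayed to 5/e Hz. Accumulating phase, rather than evaluating sin(2*pi*f(t)*t) directly, keeps the instantaneous frequency equal to f(t).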