The Beating Equalizer and its Application to the Synthesis and Modification of Piano Tones
This paper presents an improved method for simulating and modifying the beating effect in piano tones. The beating effect is an audible phenomenon characteristic of the piano and should therefore be accounted for in realistic piano synthesis. The proposed method, which is independent of the synthesis technique, consists of a cascade of second-order equalizing filters, where each filter produces the beating effect for a single partial by modulating the peak gain. Moreover, the method offers a way to control the beating frequency and the beating depth, and it can be used to modify the beating envelope in existing tones. The results show that the proposed method is able to simulate the desired beating effect.
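As a rough illustration of the idea, the sketch below modulates the peak gain of a narrow second-order peaking equalizer at the beating frequency. The RBJ-cookbook-style biquad design, the sinusoidal gain law, and all parameter values are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def peaking_coeffs(fc, Q, gain_db, fs):
    """Second-order peaking-EQ biquad (RBJ cookbook style), normalized by a0."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * fc / fs
    alpha = np.sin(w0) / (2.0 * Q)
    b = np.array([1.0 + alpha * A, -2.0 * np.cos(w0), 1.0 - alpha * A])
    a = np.array([1.0 + alpha / A, -2.0 * np.cos(w0), 1.0 - alpha / A])
    return b / a[0], a / a[0]

def beating_stage(x, fc, beat_freq, beat_depth_db, fs, Q=30.0):
    """One stage of the cascade: a narrow peaking filter at the partial
    frequency fc whose peak gain oscillates at beat_freq Hz."""
    y = np.zeros_like(x)
    z1 = z2 = 0.0  # transposed direct-form II state
    for n in range(len(x)):
        # Gain swings +/- beat_depth_db around 0 dB at the beating rate.
        g = beat_depth_db * np.sin(2.0 * np.pi * beat_freq * n / fs)
        b, a = peaking_coeffs(fc, Q, g, fs)
        y[n] = b[0] * x[n] + z1
        z1 = b[1] * x[n] - a[1] * y[n] + z2
        z2 = b[2] * x[n] - a[2] * y[n]
    return y
```

A full beating equalizer would cascade one such stage per beating partial.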
Simplified, Physically-Informed Models of Distortion and Overdrive Guitar Effects Pedals
This paper explores a computationally efficient, physically informed approach to designing algorithms that emulate guitar distortion circuits. Two iconic effects pedals are studied: the “Distortion” pedal and the “Tube Screamer” or “Overdrive” pedal. The primary distortion mechanism in both pedals is a diode clipper with an embedded low-pass filter, which is shown to follow a nonlinear ordinary differential equation whose solution is computationally expensive for real-time use. In the proposed method, a simplified model, comprising the cascade of a conditioning filter, a memoryless nonlinearity, and an equalization filter, is chosen for its computational efficiency and numerical robustness. Often, the design of distortion algorithms involves tuning the parameters of this filter-distortion-filter model by ear to match the sound of a prototype circuit. Here, the filter transfer functions and memoryless nonlinearities are derived by analysis of the prototype circuit. Comparisons of the resulting algorithms to the actual pedals show good agreement and demonstrate that the efficient algorithms presented reproduce the general character of the modeled pedals.
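The filter-distortion-filter structure can be sketched as follows. Here tanh stands in for the nonlinearity the paper derives from the diode clipper, and the one-pole filters and corner frequencies are placeholders for the transfer functions obtained by circuit analysis.

```python
import numpy as np

def one_pole(x, fc, fs, highpass=False):
    """First-order filter used for the pre- and post-filters in this sketch."""
    a = np.exp(-2.0 * np.pi * fc / fs)
    y = np.zeros_like(x)
    lp = 0.0
    for n in range(len(x)):
        lp = (1.0 - a) * x[n] + a * lp
        y[n] = x[n] - lp if highpass else lp
    return y

def distortion_pedal(x, fs, drive=30.0):
    """Conditioning filter -> memoryless nonlinearity -> equalization filter."""
    pre = one_pole(x, 720.0, fs, highpass=True)  # trim low end before clipping
    clipped = np.tanh(drive * pre)               # stand-in for the diode curve
    return one_pole(clipped, 5000.0, fs)         # post-EQ / tone filter
```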
Simulation of the Diode Limiter in Guitar Distortion Circuits by Numerical Solution of Ordinary Differential Equations
The diode clipper circuit with an embedded low-pass filter lies at the heart of both diode-clipping “Distortion” and “Overdrive” or “Tube Screamer” effects pedals. An accurate simulation of this circuit requires the solution of a nonlinear ordinary differential equation (ODE). Numerical methods with stiff stability (Backward Euler, the Trapezoidal Rule, and the second-order Backward Difference Formula) allow the use of relatively low sampling rates at the cost of accuracy and aliasing. However, these methods require iteration at each time step to solve a nonlinear equation, and this added complexity must be weighed against simple explicit methods such as Forward Euler and fourth-order Runge–Kutta, which require very high sampling rates for stability. This paper surveys and compares the basic ODE solvers as they apply to simulating circuits for audio processing. These methods are also compared to a static nonlinearity with a pre-filter. It is found that implicit or semi-implicit solvers are preferred and that the filter/static-nonlinearity approximation is often perceptually adequate.
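For concreteness, here is a minimal comparison of one explicit and one implicit solver on a generic diode-clipper ODE of the form dv/dt = (vin - v)/(R*C) - (2*Is/C)*sinh(v/Vt); the component values and diode constants are illustrative, not taken from the paper.

```python
import numpy as np

# Diode clipper with embedded RC low-pass (illustrative constants):
R, C = 2.2e3, 10e-9
Is, Vt = 2.52e-9, 45.3e-3

def f(v, vin):
    """Right-hand side of the clipper ODE."""
    return (vin - v) / (R * C) - (2.0 * Is / C) * np.sinh(v / Vt)

def forward_euler(vin, fs):
    """Explicit: cheap per step, but needs a very high fs for stability."""
    v = np.zeros_like(vin)
    T = 1.0 / fs
    for n in range(1, len(vin)):
        v[n] = v[n - 1] + T * f(v[n - 1], vin[n - 1])
    return v

def backward_euler(vin, fs, iters=8):
    """Implicit: solve v[n] = v[n-1] + T*f(v[n], vin[n]) by Newton iteration."""
    v = np.zeros_like(vin)
    T = 1.0 / fs
    for n in range(1, len(vin)):
        x = v[n - 1]
        for _ in range(iters):
            g = x - v[n - 1] - T * f(x, vin[n])
            dg = 1.0 + T * (1.0 / (R * C)
                            + (2.0 * Is / (C * Vt)) * np.cosh(x / Vt))
            x -= g / dg
        v[n] = x
    return v
```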
A Generic System for Audio Indexing: Application to Speech/Music Segmentation and Music Genre Recognition
In this paper we present a generic system for audio indexing (classification/segmentation) and apply it to two common problems: speech/music segmentation and music genre recognition. We first present some requirements for the design of a generic system. Its training part is based on a succession of four steps: feature extraction, feature selection, feature space transformation and statistical modeling. We then propose several approaches for the indexing part, depending on the local/global characteristics of the indexes to be found. In particular, we propose the use of segment-statistical models. The system is then applied to the two problems. The first is the speech/music segmentation of a radio stream. The application is developed in a real industrial framework using real-world categories and data. The performance obtained for the pure speech/music classification problem is good; however, when the non-pure categories (mixed, bed) are also considered, the performance of the system drops. The second problem is music genre recognition. Since the indexes to be found are global, segment-statistical models are used, leading to results close to the state of the art.
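A rough scikit-learn analog of the four-step training chain is sketched below; random data stands in for the output of the feature extraction step, and the specific selector, transform, and model are stand-ins, not the authors' choices.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB

# One 20-dimensional feature vector per frame, with a class label each.
X = np.random.randn(500, 20)
y = np.random.randint(0, 2, 500)               # e.g. 0 = speech, 1 = music

train = Pipeline([
    ("selection", SelectKBest(f_classif, k=10)),   # feature selection
    ("transform", LinearDiscriminantAnalysis()),   # feature space transform
    ("model", GaussianNB()),                       # statistical modeling
])
train.fit(X, y)

# Local indexes: per-frame decisions. Global indexes would instead pool
# statistics over a whole segment (the "segment-statistical model" idea).
frame_decisions = train.predict(X)
```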
Analytical Features for the Classification of Percussive Sounds: The Case of the Pandeiro
There is an increasing need to automatically classify sounds for MIR and interactive music applications. In the context of supervised classification, we describe an approach that improves the performance of the general bag-of-frames scheme without losing its generality. The method is based on the construction and exploitation of specific audio features, called analytical features, used as input to classifiers. These features are better, in a sense we define precisely, than standard general features, or even than ad hoc features designed by hand for specific problems. To construct these features, our method explores a very large space of functions by composing basic operators in syntactically correct ways. These operators are taken from the mathematical and audio processing domains. Our method allows us to build a large number of these features and to evaluate and select them automatically for arbitrary audio classification problems. We present a specific study concerning the analysis of Pandeiro (Brazilian tambourine) sounds. Two problems are considered: the classification of entire sounds, for MIR applications, and the classification of the attack portion of the sound only, for interactive music applications. We evaluate precisely the gain obtained by analytical features on these two problems in comparison with standard approaches.
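The feature-construction idea can be sketched as follows; the operator set, the random composition strategy, and the naming scheme are illustrative assumptions, not the paper's actual search procedure.

```python
import random
import numpy as np

# A tiny subset of basic operators from the mathematical and
# audio-processing domains, for illustration only.
OPERATORS = {
    "abs":     np.abs,
    "diff":    lambda x: np.diff(x),
    "fft_mag": lambda x: np.abs(np.fft.rfft(x)),
    "log":     lambda x: np.log(np.abs(x) + 1e-12),
}
REDUCERS = {"mean": np.mean, "std": np.std, "max": np.max}

def random_analytical_feature(depth=3):
    """Compose operators into one syntactically valid chain, ending with
    a reducer so the candidate feature yields a single scalar value."""
    ops = [random.choice(list(OPERATORS)) for _ in range(depth)]
    red = random.choice(list(REDUCERS))
    def feature(x):
        for name in ops:           # apply the chain left to right
            x = OPERATORS[name](x)
        return REDUCERS[red](x)
    feature.name = red + " o " + " o ".join(reversed(ops))
    return feature

# Many such candidates would be generated, evaluated as classifier
# inputs, and only the best-performing ones retained.
```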
Automatic Music Detection in Television Productions
This paper presents methods for the automatic detection of music within audio streams, whether in the foreground or background. The problem arises in the context of a real-world application, namely the analysis of TV productions with respect to their use of music. In contrast to plain speech/music discrimination, detecting music in TV productions is extremely difficult, since music is often used to accentuate scenes while speech and all kinds of noise signals may be present concurrently. We present results of extensive experiments with a set of standard machine learning algorithms and standard features, investigate the difference between frame-level and clip-level features, and demonstrate the importance of applying smoothing functions as a post-processing step. Finally, we propose a new feature, called Continuous Frequency Activation (CFA), designed especially for music detection, and show experimentally that this feature identifies segments with music in audio streams more precisely than the other approaches.
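A much-simplified sketch of the intuition behind CFA: sustained musical tones show up as frequency bins that stay active over many consecutive frames, which speech and noise rarely produce. The threshold and aggregation below are assumptions; the paper's actual computation of the feature is more elaborate.

```python
import numpy as np

def cfa_sketch(spectrogram, thresh=0.1, top_k=5):
    """Crude continuous-frequency-activation score.
    spectrogram: magnitude array of shape (n_bins, n_frames)."""
    # Binarize: which frequency bins are "on" in each frame.
    active = spectrogram > thresh * spectrogram.max()
    # Fraction of frames during which each bin stays active.
    activation = active.mean(axis=1)
    # Steadily active bins indicate sustained tones, i.e. music.
    return float(np.sort(activation)[-top_k:].sum())
```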
Chorus Detection with Combined Use of MFCC and Chroma Features and Image Processing Filters
A computationally efficient method for detecting a chorus section in popular and rock music is presented. The method utilizes a distance matrix representation obtained by summing two separate distance matrices calculated using mel-frequency cepstral coefficient and pitch chroma features. The benefit of computing two separate distance matrices is that different enhancement operations can be applied to each; an enhancement operation is found beneficial only for the chroma distance matrix. This is followed by the detection of off-diagonal segments of small distance in the distance matrix. From the detected segments, an initial chorus section is selected using a scoring mechanism based on several heuristics, and this section is then subjected to further processing using image processing filters in a neighborhood of the distance matrix surrounding it. The final position and length of the chorus are selected based on the filtering results. On a database of 206 popular and rock music pieces, an average F-measure of 86% is obtained. Processing a song with an average duration of three to four minutes takes about ten seconds on a Windows XP computer with a 2.8 GHz Intel Xeon processor.
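The combined distance-matrix representation can be sketched as below; the Euclidean metric and max-normalization are assumptions, and the enhancement, segment detection, and image-filtering steps are omitted.

```python
import numpy as np

def distance_matrix(features):
    """Pairwise Euclidean distances between frame feature vectors.
    features: shape (n_frames, dim)."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def combined_matrix(mfcc, chroma):
    """Sum of normalized MFCC and chroma distance matrices. Off-diagonal
    stripes of small distance mark repeated sections such as choruses;
    enhancement would be applied to the chroma matrix only."""
    d_mfcc = distance_matrix(mfcc)
    d_chroma = distance_matrix(chroma)
    return d_mfcc / d_mfcc.max() + d_chroma / d_chroma.max()
```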
A Matlab Toolbox for Musical Feature Extraction from Audio
We present MIRtoolbox, an integrated set of functions written in Matlab dedicated to the extraction of musical features from audio files. The design is based on a modular framework: the different algorithms are decomposed into stages, formalized using a minimal set of elementary mechanisms, and integrate different variants proposed by alternative approaches, including new strategies we have developed, which users can select and parametrize. This paper offers an overview of the set of features that can be extracted with MIRtoolbox, related, among others, to timbre, tonality, rhythm and form. Four particular analyses are provided as examples. The toolbox also includes functions for statistical analysis, segmentation and clustering. Particular attention has been paid to the design of a syntax that offers both simplicity of use and transparent adaptiveness to a multiplicity of possible input types. Each feature extraction method can accept as its argument an audio file or any preliminary result from intermediary stages of the chain of operations. The same syntax can also be used for analyses of single audio files, batches of files, series of audio segments, multichannel signals, etc. For that purpose, the data and methods of the toolbox are organised in an object-oriented architecture.
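MIRtoolbox itself is written in Matlab; the Python sketch below only illustrates the syntactic idea that each extractor accepts either an audio file or the result of an earlier stage, and is not the toolbox's actual API.

```python
import numpy as np
from scipy.io import wavfile

def _as_signal(source, fs=44100.0):
    """Accept either a file path or an array produced by an earlier stage,
    so every extractor can be called with either kind of input."""
    if isinstance(source, str):
        fs, x = wavfile.read(source)           # mono audio assumed
        return x.astype(float), float(fs)
    return np.asarray(source, dtype=float), fs

def spectral_centroid(source, fs=44100.0):
    """One timbre feature; the same call works on a file or a signal."""
    x, fs = _as_signal(source, fs)
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return float((freqs * mag).sum() / mag.sum())

# Same syntax for a file or an intermediate result:
# spectral_centroid("tune.wav")  or  spectral_centroid(segment, fs)
```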
Real-Time Visualisation of Loudness Along Different Time Scales
We propose a set of design criteria for visualising the loudness features of an audio signal, measured along different time scales. A novel real-time loudness meter, based on these criteria, is presented. The meter simultaneously shows short-term loudness, long-term loudness and peak level. The short-term loudness is displayed using a circular bar graph, while the long-term loudness is shown by means of a circular envelope graph organized according to an absolute time scale, similar in appearance to a radar display. Typically, the loudness measured during the past hour is visible. The algorithms underlying the meter's loudness and peak level measurements take into account recent ITU-R recommendations and research into loudness modelling.
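As a crude stand-in for the meter's measurements, the sketch below tracks signal power with leaky integrators at two time constants and reports a sample peak; a real implementation would use an ITU-R-based loudness model rather than plain power, and the time constants are assumptions.

```python
import numpy as np

def loudness_tracks(x, fs, t_short=0.4, t_long=10.0):
    """Track level on two time scales with leaky integrators of power."""
    x = np.asarray(x, dtype=float)
    a_s = np.exp(-1.0 / (t_short * fs))
    a_l = np.exp(-1.0 / (t_long * fs))
    p_s = p_l = 0.0
    short = np.empty_like(x)
    long_ = np.empty_like(x)
    for n, v in enumerate(x):
        p = v * v
        p_s = a_s * p_s + (1.0 - a_s) * p      # short-term smoother
        p_l = a_l * p_l + (1.0 - a_l) * p      # long-term smoother
        short[n] = 10.0 * np.log10(p_s + 1e-12)
        long_[n] = 10.0 * np.log10(p_l + 1e-12)
    peak_db = 20.0 * np.log10(np.max(np.abs(x)) + 1e-12)
    return short, long_, peak_db
```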
The Origins of DAFx and its Future within the Sound and Music Computing Field
DAFx is an established conference that has become a reference gathering for researchers working on audio signal processing. In this presentation I will go back ten years to the beginning of the conference and to the ideas that prompted it. I will then jump to the present, to the current context of our research field, which differs from that of ten years ago, and offer some personal reflections on the current situation and the challenges we are encountering.