An Evaluation of Audio Feature Extraction Toolboxes Audio feature extraction underpins a large proportion of audio processing, music information retrieval, audio effect design and audio synthesis. Design, analysis, synthesis and evaluation often rely on audio features, but a large and diverse range of feature extraction tools is available to the community. An evaluation of existing audio feature extraction libraries was undertaken: ten libraries and toolboxes were evaluated with the Cranfield Model for the evaluation of information retrieval systems, reviewing the coverage, effort, presentation and time lag of each system. These tools are compared and example use cases are presented showing when each toolbox is most suitable. This paper allows a software engineer or researcher to quickly and easily select a suitable audio feature extraction toolbox.
Alloy Sounds: Non-Repeating Sound Textures With Probabilistic Cellular Automata Contemporary musicians commonly face the challenge of finding new, characteristic sounds that can make their compositions more distinct. They often resort to computers and algorithms, which can significantly aid creative processes by generating unexpected material through controlled probabilistic processes. In particular, algorithms that exhibit emergent behavior, such as genetic algorithms and cellular automata, have fostered a broad diversity of musical explorations. This article proposes an original technique for the computer-assisted creation and manipulation of sound textures. The technique uses Probabilistic Cellular Automata, which are as yet seldom explored in the music domain, to blend two audio tracks into a third, different one. The proposed blending process works by dividing the source tracks into frequency bands and then associating each of the automaton's cells with a frequency band. Only one source, chosen by the cell's state, is active within each band. The resulting track has a non-repeating textural pattern that follows the changes in the cellular automaton. This blending process allows the musician to choose the original material and the blend granularity, significantly changing the resulting blends. We demonstrate how to use the proposed blending process in sound design and its application in experimental and popular music.
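The abstract above describes the blending mechanism (frequency bands, one cell per band, cell state selecting the active source) but not the automaton's update rule or the band layout. The sketch below is a hypothetical minimal realization: a one-dimensional probabilistic majority-rule automaton, non-overlapping FFT frames, and linearly spaced bands are all assumptions, not the paper's actual design.

```python
import numpy as np

def pca_step(cells, p_flip=0.1, rng=None):
    """One step of a probabilistic elementary CA: each cell takes the
    majority of its 3-cell neighborhood, then flips with probability p_flip."""
    rng = rng or np.random.default_rng()
    left, right = np.roll(cells, 1), np.roll(cells, -1)
    majority = ((left + cells + right) >= 2).astype(int)
    flips = rng.random(cells.size) < p_flip
    return np.where(flips, 1 - majority, majority)

def blend(a, b, n_bands=16, frame=2048, p_flip=0.1, seed=0):
    """Blend two equal-rate tracks: in each frame, every frequency band
    takes its content from source a (cell state 0) or b (cell state 1)."""
    rng = np.random.default_rng(seed)
    cells = rng.integers(0, 2, n_bands)
    out = np.zeros(min(len(a), len(b)))
    edges = np.linspace(0, frame // 2 + 1, n_bands + 1).astype(int)
    for start in range(0, len(out) - frame + 1, frame):
        A = np.fft.rfft(a[start:start + frame])
        B = np.fft.rfft(b[start:start + frame])
        Y = np.empty_like(A)
        for k in range(n_bands):
            lo, hi = edges[k], edges[k + 1]
            Y[lo:hi] = B[lo:hi] if cells[k] else A[lo:hi]
        out[start:start + frame] = np.fft.irfft(Y, frame)
        cells = pca_step(cells, p_flip, rng)  # texture evolves per frame
    return out
```

Raising p_flip or n_bands changes the blend granularity, which is the control the abstract attributes to the musician; a practical implementation would use overlap-add windows to avoid frame-boundary clicks.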
Vibrato extraction and parameterization in the Spectral Modeling Synthesis framework Periodic or quasi-periodic low-frequency components (i.e. vibrato and tremolo) are present in steady-state portions of sustained instrumental sounds. Whether we are interested in studying their expressive meaning or in building a hierarchical multi-level representation of sound in order to manipulate and transform it for musical purposes, those components should be isolated and separated from the amplitude and frequency envelopes. Within the SMS analysis framework it is now feasible to extract high-level time-evolving attributes starting from basic analysis data. In the case of frequency envelopes we can apply STFTs to them, check whether there is a prominent peak in the vibrato/tremolo range and, if so, smooth it away in the frequency domain; finally, we can apply an IFFT to each frame in order to reconstruct an envelope that has been cleaned of those quasi-periodic low-frequency components. Two important problems nevertheless have to be tackled, and ways of overcoming them are discussed in this paper: first, the periodicity of vibrato and tremolo, which is quite exact only when the performers are professional musicians; second, the interactions between formants and fundamental frequency trajectories, which blur the real tremolo component and hinder its analysis.
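The analyze-then-flatten loop described in this abstract (transform the envelope, locate a peak in the vibrato range, smooth it away, inverse-transform) can be sketched in a few lines. This single-window version is a simplification of the per-frame STFT processing the paper describes; the 4-8 Hz search band and the notch width are illustrative assumptions.

```python
import numpy as np

def remove_vibrato(f0_track, frame_rate, lo=4.0, hi=8.0, width=1):
    """Flatten a quasi-periodic vibrato component from an f0 envelope.

    f0_track: fundamental-frequency envelope, one value per analysis frame.
    frame_rate: analysis frames per second.
    Returns (cleaned envelope, detected vibrato rate in Hz or None).
    """
    mean = np.mean(f0_track)
    spectrum = np.fft.rfft(f0_track - mean)
    freqs = np.fft.rfftfreq(len(f0_track), d=1.0 / frame_rate)
    band = (freqs >= lo) & (freqs <= hi)
    mags = np.abs(spectrum) * band  # only consider the vibrato range
    if mags.max() == 0:
        return f0_track.copy(), None
    peak = int(np.argmax(mags))
    # Smooth the peak (and its immediate neighbours) away in the
    # frequency domain, then reconstruct the cleaned envelope.
    cleaned = spectrum.copy()
    cleaned[max(peak - width, 0):peak + width + 1] = 0
    smooth = np.fft.irfft(cleaned, len(f0_track)) + mean
    return smooth, float(freqs[peak])
```

The detected peak frequency doubles as the vibrato-rate parameter, so the removed component can later be re-synthesized or transformed independently.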
Automatic Tablature Transcription of Electric Guitar Recordings by Estimation of Score- and Instrument-Related Parameters In this paper we present a novel algorithm for automatic analysis, transcription, and parameter extraction from isolated polyphonic guitar recordings. In addition to general score-related information such as note onset, duration, and pitch, instrument-specific information such as the plucked string and the applied plucking and expression styles is retrieved automatically. For this purpose, we adapted several state-of-the-art approaches for onset and offset detection, multipitch estimation, string estimation, feature extraction, and multi-class classification. Furthermore, we investigated a partial tracking algorithm robust to inharmonicity, an extensive extraction of novel and known audio features, as well as the exploitation of instrument-based knowledge in the form of plausibility filtering to obtain more reliable predictions. Our system achieved very high accuracy values of 98 % for onset and offset detection as well as multipitch estimation. For the instrument-related parameters, the proposed algorithm also showed very good performance with accuracy values of 82 % for the string number, 93 % for the plucking style, and 83 % for the expression style. Index Terms - playing techniques, plucking style, expression style, multiple fundamental frequency estimation, string classification, fretboard position, fingering, electric guitar, inharmonicity coefficient, tablature
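The inharmonicity-robust partial tracking and the inharmonicity coefficient mentioned above rest on the standard stiff-string model, in which partials are progressively stretched above exact harmonics. The sketch below shows that model; it is the textbook formula, not necessarily the exact variant used in the paper.

```python
import math

def inharmonic_partials(f0, B, n_partials):
    """Frequencies of the first n partials of a stiff string with
    fundamental f0 (Hz) and inharmonicity coefficient B (dimensionless),
    following the standard model f_k = k * f0 * sqrt(1 + B * k**2)."""
    return [k * f0 * math.sqrt(1.0 + B * k * k)
            for k in range(1, n_partials + 1)]
```

For a guitar low E string (f0 about 82.4 Hz) with a typical B around 1e-4, the tenth partial lands noticeably sharp of 10*f0, which is why a tracker assuming exact harmonics loses high partials and why estimating B per string helps both partial tracking and string classification.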
Barberpole Phasing and Flanging Illusions Various ways to implement infinitely rising or falling spectral notches, also known as the barberpole phaser and flanging illusions, are described and studied. The first method is inspired by the Shepard-Risset illusion and is based on a series of cascaded notch filters spaced one octave apart and moving in frequency. The second method, called a synchronized dual flanger, realizes the desired effect in an innovative and economical way using two cascaded time-varying comb filters and cross-fading between them. The third method is based on the use of single-sideband modulation, also known as frequency shifting. The proposed techniques effectively reproduce the illusion of endlessly moving spectral notches, particularly at slow modulation speeds and for input signals with a rich frequency spectrum. These effects can be programmed in real time and implemented as part of a digital audio processing system.
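The first method above, the Shepard-Risset-inspired cascade, can be illustrated by computing the notch center frequencies and depths as a function of LFO phase. The raised-sine depth taper and the specific wrapping scheme below are illustrative assumptions; the abstract only specifies octave spacing and endless upward motion.

```python
import math

def barberpole_notches(phase, n_notches=6, f_lo=100.0):
    """Center frequencies and depths for a rising barberpole phaser.

    phase in [0, 1): each notch sits one octave above the previous one
    and glides up by one octave per LFO cycle, wrapping at the top.
    Depth is tapered over the log-frequency span so notches fade in at
    the bottom and out at the top, as in the Shepard-Risset illusion.
    """
    notches = []
    for k in range(n_notches):
        pos = (k + phase) % n_notches       # position in octaves, wrapping
        freq = f_lo * 2.0 ** pos            # octave spacing on log axis
        depth = math.sin(math.pi * pos / n_notches) ** 2
        notches.append((freq, depth))
    return notches
```

Because the taper brings a notch in silently at f_lo and removes it silently at the top, the wrap of each notch is inaudible and the ensemble appears to rise forever.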
Modulation Extraction for LFO-driven Audio Effects Low frequency oscillator (LFO) driven audio effects such as phaser, flanger, and chorus modify an input signal using time-varying filters and delays, resulting in characteristic sweeping or widening effects. It has been shown that these effects can be modeled using neural networks when conditioned with the ground-truth LFO signal. However, in most cases the LFO signal is not accessible and measuring it from the audio signal is nontrivial, hindering the modeling process. To address this, we propose a framework capable of extracting arbitrary LFO signals from processed audio across multiple digital audio effects, parameter settings, and instrument configurations. Since our system imposes no restrictions on the LFO signal shape, we demonstrate its ability to extract quasi-periodic, combined, and distorted modulation signals that are relevant to effect modeling. Furthermore, we show how coupling the extraction model with a simple processing network enables training of end-to-end black-box models of unseen analog or digital LFO-driven audio effects using only dry and wet audio pairs, overcoming the need to access the audio effect or internal LFO signal. We make our code available and provide the trained audio effect models in a real-time VST plugin.
Explicit Vector Wave Digital Filter Modeling of Circuits with a Single Bipolar Junction Transistor The recently developed extension of Wave Digital Filters based on vector wave variables has broadened the class of circuits with linear two-port elements that can be modeled in a modular and explicit fashion in the Wave Digital (WD) domain. In this paper, we apply the vector definition of wave variables to nonlinear two-port elements. In particular, we present two vector WD models of a Bipolar Junction Transistor (BJT) using characteristic equations derived from an extended Ebers-Moll model. One, implicit, is based on a modified Newton-Raphson method; the other, explicit, is based on a neural network trained in the WD domain and is shown to allow fully explicit implementation of circuits with a single BJT, which can be executed in real time.
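The characteristic equations underlying both WD models come from the Ebers-Moll description of the BJT. As background, the classic (non-extended) Ebers-Moll terminal currents for an NPN device can be evaluated directly; the parameter defaults below are illustrative, and the paper's extended model adds further terms not shown here.

```python
import math

def ebers_moll_npn(v_be, v_bc, i_s=1e-14, beta_f=200.0,
                   beta_r=2.0, v_t=0.02585):
    """Terminal currents of an NPN BJT under the classic Ebers-Moll model.

    v_be, v_bc: base-emitter and base-collector voltages (V).
    Returns (i_c, i_b, i_e) in amperes; v_t is the thermal voltage at
    room temperature. Parameter values are illustrative defaults.
    """
    e_be = math.exp(v_be / v_t) - 1.0   # forward-junction exponential
    e_bc = math.exp(v_bc / v_t) - 1.0   # reverse-junction exponential
    i_c = i_s * (e_be - e_bc) - (i_s / beta_r) * e_bc
    i_b = (i_s / beta_f) * e_be + (i_s / beta_r) * e_bc
    return i_c, i_b, i_c + i_b
```

The strong exponential nonlinearity of these equations is what forces either an iterative (Newton-Raphson) solve or a trained explicit surrogate in the WD domain, which is exactly the trade-off between the two models the paper presents.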
DDSP-Based Neural Waveform Synthesis of Polyphonic Guitar Performance From String-Wise MIDI Input We explore the use of neural synthesis for acoustic guitar from string-wise MIDI input. We propose four different systems and compare them with both objective metrics and subjective evaluation against natural audio and a sample-based baseline. We iteratively develop these four systems by making various considerations on the architecture and intermediate tasks, such as predicting pitch and loudness control features. We find that formulating the control feature prediction task as a classification task rather than a regression task yields better results. Furthermore, we find that our simplest proposed system, which directly predicts synthesis parameters from MIDI input, performs the best out of the four proposed systems. Audio examples and code are available.
Characterisation and Excursion Modelling of Audio Haptic Transducers Defining and calculating objective performance metrics for audio haptic transducers facilitates the optimisation of multi-sensory sound reproduction systems. Measurements of existing haptic transducers are applied to the calculation of a series of performance metrics to demonstrate a means of comparative objective analysis. The frequency response, transient response and moving-mass excursion characteristics of each measured transducer are quantified using novel and previously defined metrics. Objective data drawn from a series of practical measurements shows that the proposed metrics and the means of excursion modelling applied herein are appropriate for haptic transducer evaluation and for protection against over-excursion, respectively.
Modeling and Extending the RCA Mark II Sound Effects Filter We have analyzed the Sound Effects Filter from the one-of-a-kind RCA Mark II sound synthesizer and modeled it as a Wave Digital Filter using the Faust language, to make this once exclusive device widely available. By studying the original schematics and measurements of the device, we discovered several circuit modifications. Building on these, we proposed a number of extensions to the circuit which increase its usefulness in music production.