A Model for Adaptive Reduced-Dimensionality Equalisation
We present a method for mapping between the input space of a parametric equaliser and a lower-dimensional representation, whilst preserving the effect’s dependency on the incoming audio signal. The model consists of a parameter weighting stage in which the parameters are scaled to spectral features of the audio signal, followed by a mapping process, in which the equaliser’s 13 inputs are converted to (x, y) coordinates. The model is trained with parameter space data representing two timbral adjectives (warm and bright), measured across a range of musical instrument samples, allowing users to impose a semantically-meaningful timbral modification using the lower-dimensional interface. We test 10 mapping techniques, comprising dimensionality reduction and reconstruction methods, and show that a stacked autoencoder algorithm exhibits the lowest parameter reconstruction variance, thus providing an accurate map between the input and output space. We demonstrate that the model provides an intuitive method for controlling the audio effect’s parameter space, whilst accurately reconstructing the trajectories of each parameter and adapting to the incoming audio spectrum.
Towards a Fuzzy Logic Approach to Drum Pattern Humanisation
A fuzzy logic-based approach can be used to simulate human agents in many control situations. Numerous authors have noted that this methodology has advantages for a variety of tasks within the realm of computer music. In this paper, a review of such projects is conducted and a rudimentary example application of fuzzy logic techniques is presented. The application automatically achieves a basic level of 'humanisation' of a drum pattern through strike velocity modification. Such a tool could significantly reduce the time spent on editing individual drum hits in a music production environment and has potential applications for rhythmic composition and performance.
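As a rough illustration of what fuzzy velocity modification could look like, the sketch below fuzzifies a beat-strength value with triangular membership functions and defuzzifies with a weighted average of rule outputs. The input variable, the membership shapes, the three rules, and the velocity offsets are all assumptions chosen for illustration; they are not taken from the paper.

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def humanise_velocity(base_velocity, beat_strength):
    """Adjust a MIDI velocity from a beat strength in [0, 1].

    Hypothetical rule base (not from the paper):
      weak beat   -> play softer  (offset -20)
      medium beat -> leave as is  (offset   0)
      strong beat -> play louder  (offset +15)
    """
    # Fuzzification: degree to which the beat is weak, medium, or strong.
    weak = tri(beat_strength, -0.5, 0.0, 0.5)
    medium = tri(beat_strength, 0.0, 0.5, 1.0)
    strong = tri(beat_strength, 0.5, 1.0, 1.5)

    # Defuzzification: weighted average of the rule outputs.
    total = weak + medium + strong
    offset = (weak * -20.0 + medium * 0.0 + strong * 15.0) / total

    # Clamp to the valid MIDI note-on velocity range.
    return max(1, min(127, round(base_velocity + offset)))


# A beat strength of 0.25 is half "weak" and half "medium".
velocity = humanise_velocity(100, 0.25)
```

Because neighbouring membership functions overlap, the offset changes smoothly as the beat strength moves between categories, rather than jumping at a hard threshold.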
GstPEAQ – an Open Source Implementation of the PEAQ Algorithm
In 1998, the ITU published a recommendation for an algorithm for objective measurement of audio quality, aiming to predict the outcome of listening tests. Despite its age, only one implementation of that algorithm that meets the conformance requirements exists today. Two open source implementations of the basic version of the algorithm are also available, but neither meets the conformance requirements. In this paper, yet another non-conforming open source implementation, GstPEAQ, is presented. However, it improves upon the previous ones by coming closer to conformance and being computationally more efficient. Furthermore, it implements not only the basic, but also the advanced version of the algorithm. As is also shown, despite the nonconformance, the results obtained computationally still closely resemble those of listening tests.
A Direct Microdynamics Adjusting Processor with Matching Paradigm and Differentiable Implementation
In this paper, we propose a new processor capable of directly changing the microdynamics of an audio signal primarily via a single dedicated user-facing parameter. The novelty of our processor is that it has built into it a measure of relative level, a short-term signal strength measurement which is robust to changes in signal macrodynamics. Consequent dynamic range processing is signal level-independent in its nature, and attempts to directly alter its observed relative level measurements. The inclusion of such a meter within our proposed processor also gives rise to a natural solution to the dynamics matching problem, where we attempt to transfer the microdynamic characteristics of one audio recording to another by means of estimating appropriate settings for the processor. We suggest a means of providing a reasonable initial guess for processor settings, followed by an efficient iterative algorithm to refine our estimates. Additionally, we implement the processor as a differentiable recurrent layer and show its effectiveness when wrapped around a gradient descent optimizer within a deep learning framework. Moreover, we illustrate that the proposed processor has more favorable gradient characteristics relative to a conventional dynamic range compressor. Throughout, we consider extensions of the processor, matching algorithm, and differentiable implementation for the multiband case.
Audio Morphing Using Matrix Decomposition and Optimal Transport
This paper presents a system for morphing between audio recordings in a continuous parameter space. The proposed approach combines matrix decompositions used for audio source separation with displacement interpolation enabled by 1D optimal transport. By interpolating the spectral components obtained using nonnegative matrix factorization of the source and target signals, the system allows varying the timbre of a sound in real time, while maintaining its temporal structure. Using harmonic / percussive source separation as a pre-processing step, the system affords more detailed control of the interpolation in perceptually meaningful dimensions.
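Displacement interpolation in one dimension has a simple form: the quantile function of the interpolated distribution is the linear blend of the two input quantile functions. The sketch below applies that idea to two magnitude spectra treated as distributions over frequency bins. The function name, the quantile sampling scheme, and the normalisation of the output to unit mass are assumptions made for illustration, not details of the paper's system.

```python
import numpy as np


def displacement_interpolate(p, q, alpha, n_quantiles=512):
    """Interpolate between spectra p and q (same length) via 1D optimal transport.

    alpha = 0 returns (normalised) p, alpha = 1 returns (normalised) q.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    n_bins = p.size

    # Treat each spectrum as a probability distribution over bin indices.
    cdf_p = np.cumsum(p / p.sum())
    cdf_q = np.cumsum(q / q.sum())

    # Sample both quantile functions (inverse CDFs) at the same levels.
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    pos_p = np.minimum(np.searchsorted(cdf_p, levels), n_bins - 1)
    pos_q = np.minimum(np.searchsorted(cdf_q, levels), n_bins - 1)

    # In 1D, mass moves along straight lines between matched quantiles.
    pos = (1.0 - alpha) * pos_p + alpha * pos_q

    # Accumulate the moved mass back onto the original bin grid.
    edges = np.arange(n_bins + 1) - 0.5
    hist, _ = np.histogram(pos, bins=edges)
    return hist / n_quantiles


# Two single-peak spectra: the midpoint has one peak halfway between them.
p = np.zeros(64)
p[10] = 1.0
q = np.zeros(64)
q[30] = 1.0
mid = displacement_interpolate(p, q, 0.5)
peak_bin = int(np.argmax(mid))  # bin 20
```

A linear cross-fade of the same two spectra would instead give two half-height peaks at bins 10 and 30, which is why transport-based interpolation is attractive for sliding spectral peaks.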
Alloy Sounds: Non-Repeating Sound Textures With Probabilistic Cellular Automata
Contemporary musicians commonly face the challenge of finding new, characteristic sounds that can make their compositions more distinct. They often resort to computers and algorithms, which can significantly aid in creative processes by generating unexpected material in controlled probabilistic processes. In particular, algorithms that present emergent behaviors, like genetic algorithms and cellular automata, have fostered a broad diversity of musical explorations. This article proposes an original technique for the computer-assisted creation and manipulation of sound textures. The technique uses Probabilistic Cellular Automata, which are still seldom explored in the music domain, to blend two audio tracks into a third, different one. The proposed blending process works by dividing the source tracks into frequency bands and then associating each of the automaton’s cells to a frequency band. Only one source, chosen by the cell’s state, is active within each band. The resulting track has a non-repeating textural pattern that follows the changes in the Cellular Automata. This blending process allows the musician to choose the original material and the blend granularity, significantly changing the resulting blends. We demonstrate how to use the proposed blending process in sound design and its application in experimental and popular music.
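The band-selection idea can be sketched with a one-dimensional binary automaton whose cells each pick one of two sources for their frequency band. The update rule below (copy a random neighbour, then flip with a small probability) is a hypothetical stand-in, since the abstract does not specify the rule. The inputs are assumed to be band-by-frame arrays already produced by some filter bank or STFT, and updating the automaton once per frame is likewise an assumption.

```python
import numpy as np


def step(state, flip_prob, rng):
    """One update of a hypothetical probabilistic rule on a ring of cells."""
    n = state.size
    # Each cell copies itself or its left/right neighbour, chosen at random.
    offsets = rng.integers(-1, 2, size=n)
    copied = state[(np.arange(n) + offsets) % n]
    # Each cell then flips with probability flip_prob.
    flips = rng.random(n) < flip_prob
    return np.where(flips, 1 - copied, copied)


def blend(bands_a, bands_b, flip_prob=0.05, seed=0):
    """Blend two (n_bands, n_frames) arrays; one source is active per band."""
    rng = np.random.default_rng(seed)
    n_bands, n_frames = bands_a.shape
    state = rng.integers(0, 2, size=n_bands)
    out = np.empty_like(bands_a)
    for t in range(n_frames):
        # State 1 selects source B for that band, state 0 selects source A.
        out[:, t] = np.where(state == 1, bands_b[:, t], bands_a[:, t])
        state = step(state, flip_prob, rng)
    return out


# Constant test inputs make the selection pattern directly visible.
a = np.zeros((16, 200))
b = np.ones((16, 200))
pattern = blend(a, b)
```

With these test inputs, `pattern` is simply the history of the automaton's states, which shows how the texture of the blend follows the automaton over time.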
Modeling the Frequency-Dependent Sound Energy Decay of Acoustic Environments with Differentiable Feedback Delay Networks
Differentiable machine learning techniques have recently proved effective for finding the parameters of Feedback Delay Networks (FDNs) so that their output matches desired perceptual qualities of target room impulse responses. However, we show that existing methods tend to fail at modeling the frequency-dependent behavior of sound energy decay that characterizes real-world environments unless properly trained. In this paper, we introduce a novel perceptual loss function based on the mel-scale energy decay relief, which generalizes the well-known time-domain energy decay curve to multiple frequency bands. We also augment the prototype FDN by incorporating differentiable wideband attenuation and output filters, and train them via backpropagation along with the other model parameters. The proposed approach improves upon existing strategies for designing and training differentiable FDNs, making it more suitable for audio processing applications where realistic and controllable artificial reverberation is desirable, such as gaming, music production, and virtual reality.
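For reference, the time-domain energy decay curve mentioned above is commonly computed by Schroeder backward integration: the energy remaining in the impulse response from each sample to the end, normalised and expressed in dB. The sketch below shows only this single-band curve on a synthetic decaying response. The sampling rate, decay constant, and the floor value used to avoid taking the logarithm of zero are arbitrary choices for illustration, and the sketch is not the paper's loss function.

```python
import numpy as np


def energy_decay_curve_db(h, floor=1e-12):
    """Schroeder backward integration of an impulse response, in dB."""
    energy = np.asarray(h, dtype=float) ** 2
    # Reverse cumulative sum: energy remaining from each sample to the end.
    edc = np.cumsum(energy[::-1])[::-1]
    # Normalise so the curve starts at 0 dB.
    edc = edc / edc[0]
    return 10.0 * np.log10(np.maximum(edc, floor))


# Synthetic exponentially decaying response (illustrative values only).
fs = 1000
t = np.arange(fs) / fs
h = np.exp(-3.0 * t)
edc_db = energy_decay_curve_db(h)
```

An energy decay relief applies the same backward integration separately to the energy in each frequency band of a time-frequency representation, giving one decay curve per band.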
Blind Source Separation Using Repetitive Structure
Blind source separation algorithms typically involve decorrelating time-aligned mixture signals. The usual assumption is that all sources are active at all times. However, if this is not the case, we show that the unique pattern of source activity/inactivity helps separation. Music is the most obvious example of sources exhibiting repetitive structure because it is carefully constructed. We present a novel source separation algorithm based on spatial time-time distributions that capture the repetitive structure in audio. Our method outperforms time-frequency source separation when source spectra are highly overlapping.
Model Bending: Teaching Circuit Models New Tricks
A technique is introduced for generating novel signal processing systems grounded in analog electronic circuits, called model bending. By applying the ideas behind circuit bending to models of nonlinear analog circuits it is possible to create novel nonlinear signal processors which mimic the behavior of analog electronics, but which are not possible to implement in the analog realm. The history of both circuit bending and circuit modeling is discussed, as well as a theoretical basis for how these approaches can complement each other. Potential pitfalls to the practical application of model bending are highlighted and suggested solutions to those problems are provided, with examples.
Equalizing Loudspeakers in Reverberant Environments Using Deep Convolutive Dereverberation
Loudspeaker equalization is an established topic in the literature, and currently many techniques are available to address most practical use cases. However, most of these rely on accurate measurements of the loudspeaker in an anechoic environment, which in some cases is not feasible. This is the case, for example, with custom digital organs, whose loudspeakers are built into a large and geometrically-complex piece of furniture. Such an instrument may be too heavy and large to be transported to a measurement room, or may require a big one, making traditional impulse response measurements impractical for most users. In this work we propose a method to find the inverse of the sound emission system in a reverberant environment, based on a Deep Learning dereverberation algorithm. The method is agnostic of the room characteristics and can thus be conducted in an automated fashion in any environment. A real use case is discussed and results are provided, showing the effectiveness of the approach in designing filters that closely match the magnitude response of the ideal inverting filters.