Model Bending: Teaching Circuit Models New Tricks
A technique is introduced for generating novel signal processing systems grounded in analog electronic circuits, called model bending. By applying the ideas behind circuit bending to models of nonlinear analog circuits it is possible to create novel nonlinear signal processors which mimic the behavior of analog electronics, but which are not possible to implement in the analog realm. The history of both circuit bending and circuit modeling is discussed, as well as a theoretical basis for how these approaches can complement each other. Potential pitfalls to the practical application of model bending are highlighted and suggested solutions to those problems are provided, with examples.
Modeling and Extending the RCA Mark II Sound Effects Filter
We have analyzed the Sound Effects Filter from the one-of-a-kind RCA Mark II sound synthesizer and modeled it as a Wave Digital Filter using the Faust language, to make this once exclusive device widely available. By studying the original schematics and measurements of the device, we discovered several circuit modifications. Building on these, we proposed a number of extensions to the circuit which increase its usefulness in music production.
Deforming the Oscillator: Iterative Phases Over Parametrizable Closed Paths
Iterative phase formulations allow for the generalization of many oscillatory sound synthesis methods from circles to general parametrizable loops, with or without explicit geometric contexts. This paper describes this approach, leading to the ability to perform modulation, feedback and chaotic oscillations over deformed circles that can include ill-behaved geometries, while allowing modulations or feedback to be deformed as well.
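As a rough illustration of the idea, and not the paper's own formulation, the sketch below advances a phase sample by sample and reads the output from a point on a closed path instead of a circle. The superellipse path, the function names `superellipse` and `render`, and the way feedback is added to the phase increment are all assumptions made for this example.

```python
import math

def superellipse(theta, exponent=4.0):
    """Point on the closed path |x|^n + |y|^n = 1 (hypothetical choice of loop).

    With exponent == 2 this reduces to the unit circle (cos, sin).
    """
    c, s = math.cos(theta), math.sin(theta)
    p = 2.0 / exponent
    x = math.copysign(abs(c) ** p, c)
    y = math.copysign(abs(s) ** p, s)
    return x, y

def render(num_samples, freq=220.0, sample_rate=48000.0,
           feedback=0.0, exponent=4.0):
    """Iterate the phase and read the y coordinate of the path as the output."""
    theta = 0.0
    step = 2.0 * math.pi * freq / sample_rate
    out = []
    for _ in range(num_samples):
        _, y = superellipse(theta, exponent)
        out.append(y)
        # Assumed feedback form: the previous output perturbs the phase increment.
        theta = (theta + step + feedback * y) % (2.0 * math.pi)
    return out
```

Setting `exponent=2.0` and `feedback=0.0` recovers an ordinary sine oscillator, which gives a convenient sanity check before deforming the path or adding feedback.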
Differentiable Time–Frequency Scattering on GPU
Joint time–frequency scattering (JTFS) is a convolutional operator in the time–frequency domain which extracts spectrotemporal modulations at various rates and scales. It offers an idealized model of spectrotemporal receptive fields (STRF) in the primary auditory cortex, and thus may serve as a biologically plausible surrogate for human perceptual judgments at the scale of isolated audio events. Yet, prior implementations of JTFS and STRF have remained outside of the standard toolkit of perceptual similarity measures and evaluation methods for audio generation. We trace this issue down to three limitations: differentiability, speed, and flexibility. In this paper, we present an implementation of time–frequency scattering in Python. Unlike prior implementations, ours accommodates NumPy, PyTorch, and TensorFlow as backends and is thus portable on both CPU and GPU. We demonstrate the usefulness of JTFS via three applications: unsupervised manifold learning of spectrotemporal modulations, supervised classification of musical instruments, and texture resynthesis of bioacoustic sounds.
On the Challenges of Embedded Real-Time Music Information Retrieval
Real-time applications of Music Information Retrieval (MIR) have recently been gaining interest. However, as deep learning becomes more and more ubiquitous for music analysis tasks, several challenges and limitations need to be overcome to deliver accurate and quick real-time MIR systems. In addition, modern embedded computers offer great potential for compact systems that use MIR algorithms, such as digital musical instruments. However, embedded computing hardware is generally resource constrained, posing additional limitations. In this paper, we identify and discuss the challenges and limitations of embedded real-time MIR. Furthermore, we discuss potential solutions to these challenges, and demonstrate their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques. The classifier achieved 99.2% accuracy in distinguishing pitched and percussive techniques and a 99.1% average accuracy in distinguishing four distinct percussive techniques with a fifth class for pitched sounds. The full classification task is a considerably more complex learning problem, with our preliminary results reaching only 56.5% accuracy. The results were produced with an average latency of 30.7 ms.
Time-Varying Filter Stability and State Matrix Products
We show a new sufficient criterion for time-varying digital filter stability: that the matrix norm of the product of state matrices over a certain finite number of time steps is bounded by 1. This extends Laroche’s Criterion 1, which only considered one time step, while hinting at extensions to two time steps. Further extending these results, we also show that there is no intrinsic requirement that filter coefficients be frozen over any time scale, and extend to any dimension a helpful theorem that allows us to avoid explicitly performing eigen- or singular value decompositions in studying the matrix norm. We give a number of case studies on filters known to be time-varying stable, that cannot be proven time-varying stable with the original criterion, where the new criterion succeeds.
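The criterion can be explored numerically with a small sketch like the one below. It is restricted to 2×2 state matrices so that the spectral norm has a closed form, and it makes several assumptions that the abstract does not state: a state update of the form x[n+1] = A[n] x[n] (so later matrices multiply on the left), the spectral norm as the matrix norm, and a check over every sliding window of `k` steps. The helper names are hypothetical.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]

def spectral_norm2(a):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of A^T A."""
    t = sum(x * x for row in a for x in row)              # trace(A^T A)
    d = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2      # det(A^T A)
    disc = max(t * t - 4.0 * d, 0.0)                      # guard against rounding
    return math.sqrt((t + math.sqrt(disc)) / 2.0)

def max_window_norm(mats, k):
    """Largest spectral norm of A[n+k-1] ... A[n] over all windows of k steps."""
    worst = 0.0
    for n in range(len(mats) - k + 1):
        prod = [[1.0, 0.0], [0.0, 1.0]]
        for m in mats[n:n + k]:
            prod = matmul2(m, prod)  # assumed ordering: newest matrix on the left
        worst = max(worst, spectral_norm2(prod))
    return worst

if __name__ == "__main__":
    # Hypothetical coefficient trajectory alternating between two state matrices.
    a = [[0.5, 1.0], [0.0, 0.5]]
    b = [[0.5, 0.8], [0.0, 0.5]]
    mats = [a, b] * 4
    for k in (1, 2, 3):
        print(k, max_window_norm(mats, k))
```

In this made-up example the one-step norm exceeds 1, so a single-step test is inconclusive, while every three-step product has norm below 1, which is the kind of situation the multi-step criterion is meant to cover.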
Two Datasets of Room Impulse Responses for Navigation in Six Degrees-of-Freedom: A Symphonic Concert Hall and a Former Planetarium
This paper presents two datasets of room impulse responses (RIRs) for navigable virtual acoustics. The first is a set of 240 mono and Ambisonic RIRs recorded at the Maison Symphonique, a symphonic concert hall in Montreal renowned for its great acoustic characteristics. The second is a set of 67 third-order Ambisonic RIRs recorded in the former planetarium of Montreal (currently known as the Centech), a space whose acoustics include a focal point where extreme reverberation times occur. The article first describes the two datasets and the methods that were used to capture them. A use case for these RIRs is then presented: an audio rendering of scene navigation using interpolation among RIRs.
Deep Learning Conditioned Modeling of Optical Compression
Deep learning models applied to raw audio are rapidly gaining relevance in modeling analog audio devices. This paper investigates the use of different deep architectures for modeling audio optical compression. The models take raw audio samples as input and produce raw audio samples as output at audio rate, and they work with no or small input buffers, allowing a theoretical real-time and low-latency implementation. In this study, two compressor parameters, the ratio and the threshold, have been included in the modeling process, aiming to condition the inference of the trained network. Deep learning architectures, including feed-forward, recurrent, and encoder-decoder models, are compared to model an all-tube optical mono compressor. The results of this study show that feed-forward and long short-term memory architectures present limitations in modeling the triggering phase of the compressor, performing well only on the sustained phase. On the other hand, encoder-decoder models outperform other architectures in replicating the overall compression process, but they overpredict the energy of high-frequency components.
Realistic Gramophone Noise Synthesis Using a Diffusion Model
This paper introduces a novel data-driven strategy for synthesizing gramophone noise audio textures. A diffusion probabilistic model is applied to generate highly realistic quasiperiodic noises. The proposed model is designed to generate samples of length equal to one disk revolution, but a method to generate plausible periodic variations between revolutions is also proposed. A guided approach is also applied as a conditioning method, where an audio signal generated with manually-tuned signal processing is refined via reverse diffusion to improve realism. The method has been evaluated in a subjective listening test, in which the participants were often unable to distinguish the synthesized signals from the real ones. The synthetic noises produced with the best proposed unconditional method are statistically indistinguishable from real noise recordings. This work shows the potential of diffusion models for highly realistic audio synthesis tasks.
A Quaternion-Phase Oscillator
An approach to designing dynamical systems with a three-dimensional state space is described that can be used to build a variety of non-periodic oscillators. The state space is taken to be a 3-sphere, which is identified with the manifold of unit quaternions. Any such system can be described as a quaternion-valued ordinary differential equation, which is digitally realized using an approximation as a finite difference equation. Two examples are shown. Compared to previous applications of dynamical systems used to generate audio samples, the approach described here offers a wide choice of specific flows which can neither diverge nor approach a stable limit point.
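To make the construction more concrete, here is a minimal sketch of one way such a system could be discretized. It is not taken from the paper: the ODE form dq/dt = ½ ω(q) q with a pure-imaginary ω, the explicit Euler step followed by renormalization, the particular `omega` function, and the choice of output component are all assumptions made for illustration.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def normalize(q):
    """Project a nonzero quaternion back onto the unit 3-sphere."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def omega(q):
    """Hypothetical state-dependent angular velocity (pure quaternion, rad/s)."""
    _, x, _, z = q
    two_pi = 2.0 * math.pi
    return (0.0, two_pi * 220.0, two_pi * 330.0 * z, two_pi * 110.0 * x)

def run(num_samples, sample_rate=48000.0):
    """Explicit Euler step of dq/dt = 0.5 * omega(q) * q, then renormalize."""
    h = 1.0 / sample_rate
    q = (1.0, 0.0, 0.0, 0.0)
    out = []
    for _ in range(num_samples):
        dq = qmul(omega(q), q)
        q = normalize(tuple(c + 0.5 * h * d for c, d in zip(q, dq)))
        out.append(q[1])  # assumed output tap: one coordinate of the state
    return out, q
```

Because the state is renormalized at every step, each coordinate stays within [-1, 1], so any single component can be used directly as an audio signal. Whether a given `omega` yields non-periodic behavior depends on the flow chosen and is not demonstrated by this sketch.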