Neural Net Tube Models for Wave Digital Filters
Herein, we demonstrate the use of neural nets for simulating multiport nonlinearities inside a wave digital filter. We introduce a resolved wave definition which allows us to extract features from a Kirchhoff domain dataset and train our neural networks directly in the wave domain. A hyperparameter search is performed to minimize error and runtime complexity. To illustrate the method, we model a tube amplifier circuit inspired by the preamplifier stage of the Fender Pro-Junior guitar amplifier. We analyze the performance of our neural net models by comparing their distortion characteristics and transconductances. Our results suggest that activation function selection has a significant effect on the distortion characteristic created by the neural net.
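For orientation, the mapping between Kirchhoff variables and wave variables can be sketched with the conventional voltage-wave definition, a = v + R·i and b = v − R·i. This is an assumption made for illustration: the abstract does not spell out its resolved wave definition, and the function names and the port resistance value below are hypothetical.

```python
def kirchhoff_to_wave(voltage, current, port_resistance):
    """Conventional voltage waves (assumed here, not the paper's resolved definition)."""
    incident = voltage + port_resistance * current   # a = v + R*i
    reflected = voltage - port_resistance * current  # b = v - R*i
    return incident, reflected


def wave_to_kirchhoff(incident, reflected, port_resistance):
    """Inverse mapping: recover the port voltage and current from the wave pair."""
    voltage = 0.5 * (incident + reflected)
    current = (incident - reflected) / (2.0 * port_resistance)
    return voltage, current


if __name__ == "__main__":
    a, b = kirchhoff_to_wave(2.0, 0.001, 1000.0)
    print(a, b, wave_to_kirchhoff(a, b, 1000.0))
```

A Kirchhoff-domain dataset of (v, i) samples could be converted point by point in this way before training in the wave domain.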
A Structural Similarity Index Based Method to Detect Symbolic Monophonic Patterns in Real-Time
Automatic detection of musical patterns is an important task in the field of Music Information Retrieval because of its use in multiple applications such as automatic music transcription, genre or instrument identification, music classification, and music recommendation. A significant sub-task in pattern detection is real-time pattern detection in music, owing to its relevance in application domains such as the Internet of Musical Things. In this study, we present a method to identify the occurrence of known patterns in symbolic monophonic music streams in real-time. We introduce a matrix-based representation that describes musical notes by their pitch, pitch-bend, amplitude, and duration. We propose an algorithm based on an independent similarity index for each note attribute. We also introduce the Match Measure, which is a numerical value signifying the degree of the match between a pattern and a sequence of notes. We have tested the proposed algorithm against three datasets: a human recorded dataset, a synthetically designed dataset, and the JKUPDD dataset. Overall, a detection rate of 95% was achieved. The low computational load and minimal running time demonstrate the suitability of the method for real-world, real-time implementations on embedded systems.
Two Datasets of Room Impulse Responses for Navigation in Six Degrees-of-Freedom: A Symphonic Concert Hall and a Former Planetarium
This paper presents two datasets of room impulse responses (RIRs) for navigable virtual acoustics. The first is a set of 240 mono and Ambisonic RIRs recorded at the Maison Symphonique, a symphonic concert hall in Montreal renowned for its great acoustic characteristics. The second is a set of 67 third-order Ambisonic RIRs recorded in the former planetarium of Montreal (currently known as the Centech), a space whose acoustics include a focal point at which extreme reverberation times occur. The article first describes the two datasets and the methods that were used to capture them. A use case for these RIRs is then presented: an audio rendering of scene navigation using interpolation among RIRs.
On the Challenges of Embedded Real-Time Music Information Retrieval
Real-time applications of Music Information Retrieval (MIR) have recently been gaining interest. However, as deep learning becomes increasingly ubiquitous for music analysis tasks, several challenges and limitations need to be overcome to deliver accurate and quick real-time MIR systems. In addition, modern embedded computers offer great potential for compact systems that use MIR algorithms, such as digital musical instruments. However, embedded computing hardware is generally resource constrained, posing additional limitations. In this paper, we identify and discuss the challenges and limitations of embedded real-time MIR. Furthermore, we discuss potential solutions to these challenges, and demonstrate their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques. The classifier achieved 99.2% accuracy in distinguishing pitched and percussive techniques and a 99.1% average accuracy in distinguishing four distinct percussive techniques with a fifth class for pitched sounds. The full classification task is a considerably more complex learning problem, with our preliminary results reaching only 56.5% accuracy. The results were produced with an average latency of 30.7 ms.
Flutter Echo Modeling
Flutter echo is a well-known acoustic phenomenon that occurs when sound waves bounce between two parallel reflective surfaces, creating a repetitive sound. In this work, we introduce a method to recreate flutter echo as an audio effect. The proposed algorithm is based on a feedback structure utilizing velvet noise that aims to synthesize the fluttery components of a reference room impulse response presenting flutter echo. Among these components, the repetition time defines the length of the delay line in a feedback filter. The specific spectral properties of the flutter are obtained with a bandpass attenuation filter and a ripple filter, which enhances the harmonic behavior of the sound. Additional temporal shaping of a velvet-noise filter, which processes the output of the feedback loop, is performed based on the properties of the reference flutter. The comparison between synthetic and measured flutter echo impulse responses shows good agreement in terms of both the repetition time and reverberation time values.
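The link between repetition time and delay-line length can be illustrated with a plain feedback comb filter. This is a deliberately reduced sketch, not the proposed algorithm: it omits the velvet-noise filter, the bandpass attenuation filter, and the ripple filter, and the function name, loop gain, and sample rate are assumptions chosen for the example.

```python
def feedback_comb_impulse_response(repetition_time_s, sample_rate_hz, loop_gain, num_samples):
    """Impulse response of y[n] = x[n] + loop_gain * y[n - delay].

    The delay length in samples is the repetition time multiplied by the sample rate.
    """
    delay = max(1, round(repetition_time_s * sample_rate_hz))
    output = [0.0] * num_samples
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0                          # unit impulse input
        feedback = output[n - delay] if n >= delay else 0.0
        output[n] = x + loop_gain * feedback
    return delay, output


if __name__ == "__main__":
    delay, h = feedback_comb_impulse_response(0.01, 1000, 0.5, 35)
    # Print the delay length and the indices of the non-zero echoes.
    print(delay, [n for n, value in enumerate(h) if value != 0.0])
```

Each pass around the loop produces one echo, so the echoes are spaced by the repetition time and decay by the loop gain on every repetition.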
Dark Velvet Noise
This paper proposes dark velvet noise (DVN) as an extension of the original velvet noise with a lowpass spectrum. The lowpass spectrum is achieved by allowing each pulse in the sparse sequence to have a randomized pulse width. The cutoff frequency is controlled by the density of the sequence. The modulated pulse width can be implemented efficiently utilizing a discrete set of recursive running-sum filters, one for each unique pulse width. DVN may be used in reverberation algorithms. Typical room reverberation has a frequency-dependent decay, where the high frequencies decay faster than the low ones. A similar effect is achieved by lowering the density and increasing the pulse width of DVN over time, thereby making the DVN suitable for artificial reverberation.
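A minimal sketch of the idea, assuming the usual velvet-noise construction (one randomly placed, randomly signed pulse per grid period) and adding a random integer width to each pulse. The function name, the uniform width distribution, and the choice to sum overlapping pulses are assumptions for illustration; the sequence is written out directly rather than through the running-sum filters mentioned above.

```python
import random


def dark_velvet_noise(num_samples, grid_size, min_width, max_width, seed=0):
    """Generate a sparse +/-1 pulse sequence whose pulses have random widths.

    One pulse is placed per grid period of `grid_size` samples. Each pulse is a
    rectangle of random sign and random width in [min_width, max_width].
    """
    rng = random.Random(seed)
    signal = [0.0] * num_samples
    pulses = []  # (position, sign, width) for inspection
    for grid_start in range(0, num_samples, grid_size):
        position = grid_start + rng.randrange(grid_size)
        sign = rng.choice((-1.0, 1.0))
        width = rng.randint(min_width, max_width)
        pulses.append((position, sign, width))
        # Write the rectangular pulse, clipping it at the end of the signal.
        for n in range(position, min(position + width, num_samples)):
            signal[n] += sign
    return signal, pulses


if __name__ == "__main__":
    signal, pulses = dark_velvet_noise(400, 20, 1, 4)
    print(len(pulses), pulses[:3])
```

Setting both width limits to 1 reduces the sketch to ordinary velvet noise, while wider pulses act as short moving-average filters and so tilt the spectrum toward low frequencies.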
Higher-Order Scattering Delay Networks for Artificial Reverberation
Computer simulations of room acoustics suffer from an efficiency versus accuracy trade-off, with highly accurate wave-based models being highly computationally expensive, and delay-network-based models lacking in physical accuracy. The Scattering Delay Network (SDN) is a highly efficient recursive structure that renders first-order reflections exactly while approximating higher-order ones. With the purpose of improving the accuracy of SDNs, in this paper, several variations on SDNs are investigated, including appropriate node placement for exact modeling of higher-order reflections, redesigned scattering matrices for physically-motivated scattering, and pruned network connections for reduced computational complexity. The results of these variations are compared to state-of-the-art geometric acoustic models for different shoebox room simulations. Objective measures (Normalized Echo Densities (NEDs) and Energy Decay Curves (EDCs)) showed a close match between the proposed methods and the references. A formal listening test was carried out to evaluate differences in perceived naturalness of the synthesized Room Impulse Responses. Results show that increasing SDNs’ order and adding directional scattering in a fully-connected network improves perceived naturalness, and higher-order pruned networks give similar performance at a much lower computational cost.
Multichannel Interleaved Velvet Noise
The cross-correlation of multichannel reverberation generated using interleaved velvet noise is studied. The interleaved velvet-noise reverberator was proposed recently for synthesizing the late reverb of an acoustic space. In addition to providing a computationally efficient structure and a perceptually smooth response, the interleaving method allows combining its independent branch outputs in different permutations, which are all equally smooth and flutter-free. For instance, a four-branch output can be combined in 4! or 24 ways. Additionally, each branch output set is mixed orthogonally, which increases the number of permutations from M! to M²!, since sign inversions are taken along. Using specific matrices for this operation, which change the sign of velvet-noise sequences, decreases the correlation of some of the combinations. This paper shows that many selections of permutations offer a set of well-decorrelated output channels that produce a diffuse and colorless sound field, as validated with spatial variation. The results of this work can be applied in the design of computationally efficient multichannel reverberators.
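The four-branch figure can be checked by enumerating the orderings of four branch labels. The sketch below covers only that plain permutation count; the labels are hypothetical, and the larger count that arises once sign inversions are included is not reproduced here.

```python
from itertools import permutations

# Hypothetical labels for the four independent branch outputs.
branches = ("b1", "b2", "b3", "b4")

# Every ordering in which the four branch outputs can be combined.
orderings = list(permutations(branches))

if __name__ == "__main__":
    print(len(orderings))  # 4! = 24
    print(orderings[0], orderings[-1])
```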
Pyroadacoustics: A Road Acoustics Simulator Based on Variable Length Delay Lines
In the development of algorithms for sound source detection, identification and localization, the ability to generate datasets in a flexible and fast way is of utmost importance. However, most of the available acoustic simulators used for this purpose target indoor applications, and their usefulness is limited when it comes to outdoor environments such as that of a road, involving fast-moving sources and long distances travelled by the sound waves. In this paper we present an acoustic propagation simulator specifically designed for road scenarios. In particular, the proposed Python software package makes it possible to simulate the observed sound resulting from a source moving on an arbitrary trajectory relative to the observer, exploiting variable length delay lines to implement sound propagation and Doppler effect. An acoustic model of the road reflection and air absorption properties has been designed and implemented using digital FIR filters. The architecture of the proposed software is flexible and open to extensions, allowing the package to kick-start the implementation of further outdoor acoustic simulation scenarios.
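How a variable length delay line yields propagation delay and Doppler shift can be sketched as follows. This is not the package's API: the function names are hypothetical, the speed of sound is an assumed constant, linear interpolation stands in for whatever fractional-delay method the package uses, and the delay is computed from the source position at the reception instant, which is a simplification.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed constant


def read_fractional(signal, index):
    """Read `signal` at a fractional index using linear interpolation."""
    if index < 0 or index > len(signal) - 1:
        return 0.0  # outside the stored signal: nothing has arrived yet
    lower = int(math.floor(index))
    upper = min(lower + 1, len(signal) - 1)
    fraction = index - lower
    return (1.0 - fraction) * signal[lower] + fraction * signal[upper]


def propagate(source_signal, source_positions, observer_position, sample_rate_hz):
    """Delay the source signal by the time-varying source-observer travel time.

    `source_positions` holds one (x, y) position per output sample. A shrinking
    distance shortens the delay, so the signal is read faster (upward Doppler
    shift); a growing distance has the opposite effect.
    """
    observed = []
    for n, position in enumerate(source_positions):
        distance = math.dist(position, observer_position)
        delay_samples = distance / SPEED_OF_SOUND * sample_rate_hz
        observed.append(read_fractional(source_signal, n - delay_samples))
    return observed


if __name__ == "__main__":
    ramp = [float(n) for n in range(30)]
    static_source = [(343.0, 0.0)] * 30  # 343 m away: 1 s, i.e. 10 samples at 10 Hz
    print(propagate(ramp, static_source, (0.0, 0.0), 10))
```

With a static source the output is the input shifted by a constant delay; moving the source changes the delay from sample to sample, which is what produces the Doppler effect.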
A Study of Control Methods for Percussive Sound Synthesis Based on GANs
The process of creating drum sounds has seen significant evolution in the past decades. The development of analogue drum synthesizers, such as the TR-808, and modern sound design tools in Digital Audio Workstations led to a variety of drum timbres that defined entire musical genres. Recently, drum synthesis research has been revived with a new focus on training generative neural networks to create drum sounds. Different interfaces have previously been proposed to control the generative process, from low-level latent space navigation to high-level semantic feature parameterisation, but no comprehensive analysis has been presented to evaluate how each approach relates to the creative process. We aim to evaluate how different interfaces support creative control over drum generation by conducting a user study based on the Creative Support Index. We experiment with both a supervised method that decodes semantic latent space directions and an unsupervised Closed-Form Factorization approach from computer vision literature to parameterise the generation process and demonstrate that the latter is the preferred means to control a drum synthesizer based on the StyleGAN2 network architecture.