Extensions and Applications of Modal Dispersive Filters
Dispersive delay and comb filters, implemented as a parallel sum of high-Q mode filters tuned to provide a desired frequency-dependent delay characteristic, have advantages over dispersive filters that are implemented using cascade or frequency-domain architectures. Here we present techniques for designing the modal filter parameters for music and audio applications. Through examples, we show that this parallel structure is conducive to interactive and time-varying modifications, and we introduce extensions to the basic model.
A Minimal Passive Model of the Operational Amplifier: Application to Sallen-Key Analog Filters
This paper stems from the fact that, whereas there are passive models of transistors and tubes, a minimal passive model of the operational amplifier does not seem to exist. A new behavioural model is presented that is memoryless, fully described by its interaction ports, with a minimal number of equations, for which a passive power balance can be defined. The proposed model handles saturation and asymmetric power supply, and can be used with nonideal voltage references. To illustrate the model in audio applications, the non-inverting voltage amplifier and a saturating Sallen-Key lowpass filter are considered.
Generalizations of Velvet Noise and their Use in 1-Bit Music
A family of spectrally-flat noise sequences called “Velvet Noise” has found use in reverb modeling, decorrelation, speech synthesis, and abstract sound synthesis. These noise sequences are ternary: they consist of only the values −1, 0, and +1. They are also sparse in time, with pulse density being their main design parameter, and at typical audio sampling rates need only several thousand non-zero samples per second to sound “smooth.” This paper proposes “Crushed Velvet Noise” (CVN) generalizations to the classic family of Velvet Noise sequences including “Original Velvet Noise” (OVN), “Additive Random Noise” (ARN), and “Totally Random Noise” (TRN). In these generalizations, the probability of getting a positive or negative impulse is a free parameter. Manipulating this probability gives Crushed OVN and ARN low-shelf spectra rather than the flat spectra of standard Velvet Noise, while the spectrum of Crushed TRN is still flat. This new family of noise sequences is still ternary and sparse in time. However, pulse density now controls the shelf cutoff frequency, and the distribution of polarities controls the shelf depth. Crushed Velvet Noise sequences with pulses of only a single polarity are particularly useful in a niche style of music called “1-bit music”: music with a binary waveform consisting of only 0s and 1s. We propose Crushed Velvet Noise as a valuable tool in 1-bit music composition, where its sparsity allows for good approximations to operations, such as addition, which are impossible for signals in general in the 1-bit domain.
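As a rough illustration of the idea, the sketch below generates a sparse ternary sequence with one impulse at a random position in each grid period and a free polarity probability. It assumes the commonly used Original Velvet Noise construction; the function name `crushed_ovn` and its parameters are hypothetical, and the paper's exact definitions may differ.

```python
import random

def crushed_ovn(num_samples, grid_period, p_positive=0.5, seed=None):
    """Sparse ternary sequence: one impulse per grid period.

    Each impulse is +1 with probability p_positive, otherwise -1.
    (Assumed construction, not taken from the paper.)
    """
    rng = random.Random(seed)
    seq = [0] * num_samples
    for start in range(0, num_samples, grid_period):
        # Random position inside the current grid period,
        # shortened if the period runs past the end of the buffer.
        width = min(grid_period, num_samples - start)
        pos = start + rng.randrange(width)
        seq[pos] = 1 if rng.random() < p_positive else -1
    return seq

# One second at 44.1 kHz with about 2000 pulses per second.
noise = crushed_ovn(44100, grid_period=22, p_positive=0.5, seed=0)

# Single-polarity case: only 0s and 1s, as needed for a 1-bit waveform.
one_bit = crushed_ovn(44100, grid_period=22, p_positive=1.0, seed=0)
```

With `p_positive=0.5` this reduces to the symmetric case; moving the value toward 0 or 1 skews the polarity distribution, which is the parameter the abstract associates with shelf depth.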
Improved Reverberation Time Control for Feedback Delay Networks
Artificial reverberation algorithms generally imitate the frequency-dependent decay of sound in a room quite inaccurately. Previous research suggests that a 5% error in the reverberation time (T60) can be audible. In this work, we propose to use an accurate graphic equalizer as the attenuation filter in a Feedback Delay Network reverberator. We use a modified octave graphic equalizer with a cascade structure and insert a high-shelf filter to control the gain at the high end of the audio range. One such equalizer is placed at the end of each delay line of the Feedback Delay Network. The gains of the equalizer are optimized using a new weighting function that acknowledges nonlinear error propagation from filter magnitude response to reverberation time values. Our experiments show that in real-world cases, the target T60 curve can be reproduced in a perceptually accurate manner at standard octave center frequencies. However, for an extreme test case in which the T60 varies dramatically between neighboring octave bands, the error still exceeds the limit of the just noticeable difference but is smaller than that obtained with previous methods. This work leads to more realistic artificial reverberation.
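The nonlinear relation between attenuation-filter gain and reverberation time can be illustrated with the standard rule that a T60 of t seconds corresponds to a decay of 60 dB over t seconds. The sketch below is not the paper's weighting function; the function names, the delay length, and the T60 targets are hypothetical values chosen for illustration.

```python
def delay_line_gain_db(delay_samples, t60_seconds, sample_rate):
    """Attenuation (dB) a delay line must apply per pass to reach the target T60."""
    # 60 dB of decay in t60 seconds means -60 / (t60 * fs) dB per sample.
    return -60.0 * delay_samples / (t60_seconds * sample_rate)

def t60_from_gain_db(delay_samples, gain_db, sample_rate):
    """Inverse relation: T60 produced by a given per-pass gain in dB."""
    return -60.0 * delay_samples / (gain_db * sample_rate)

def relative_t60_error(delay_samples, t60_seconds, sample_rate, gain_error_db):
    """Relative T60 error caused by a fixed gain error in dB."""
    target = delay_line_gain_db(delay_samples, t60_seconds, sample_rate)
    actual = t60_from_gain_db(delay_samples, target + gain_error_db, sample_rate)
    return abs(actual - t60_seconds) / t60_seconds

# Hypothetical octave-band targets (center frequency in Hz -> T60 in seconds).
t60_targets = {125: 2.0, 1000: 1.5, 8000: 0.5}
gains_db = {f: delay_line_gain_db(1000, t60, 48000) for f, t60 in t60_targets.items()}

# The same 0.1 dB gain error hurts a long T60 far more than a short one.
long_err = relative_t60_error(1000, 2.0, 48000, 0.1)
short_err = relative_t60_error(1000, 0.5, 48000, 0.1)
```

For this 1000-sample delay line at 48 kHz, a 0.1 dB gain error shifts the 2.0 s target by about 19% but the 0.5 s target by only about 4%, which is why a flat dB-domain error criterion is a poor proxy for T60 accuracy.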
The Shape of RemiXXXes to Come: Audio Texture Synthesis with Time-frequency Scattering
This article explains how to apply time–frequency scattering, a convolutional operator extracting modulations in the time–frequency domain at different rates and scales, to the re-synthesis and manipulation of audio textures. After implementing phase retrieval in the scattering network by gradient backpropagation, we introduce scale-rate DAFx, a class of audio transformations expressed in the domain of time–frequency scattering coefficients. One example of scale-rate DAFx is chirp rate inversion, which causes each sonic event to be locally reversed in time while leaving the arrow of time globally unchanged. Over the past two years, our work has led to the creation of four electroacoustic pieces: FAVN; Modulator (Scattering Transform); Experimental Palimpsest; Inspection (Maida Vale Project) and Inspection II; as well as XAllegroX (Hecker Scattering.m Sequence), a remix of Lorenzo Senni’s XAllegroX, released by Warp Records on a vinyl entitled The Shape of RemiXXXes to Come.
Analysis and Correction of Maps Dataset
Automatic music transcription (AMT) is the process of converting a music signal into a symbolic digital representation. The MIDI Aligned Piano Sounds (MAPS) dataset was established in 2010 and is the most widely used benchmark dataset for automatic piano music transcription. In this paper, errors are screened algorithmically, and three data annotation problems are found in ENSTDkCl, a subset of MAPS that is usually used for algorithm evaluation: (1) there are 342 deviation errors in the MIDI annotation; (2) there are 803 unplayed note errors; (3) there are 1613 slow starting process errors. After algorithmic correction and manual confirmation, the corrected dataset is released. Finally, the better-performing Google model and our model are evaluated on the corrected dataset. The F values are 85.94% and 85.82%, respectively; both improve relative to the original dataset, which shows that the correction of the dataset is meaningful.
Sound Source Separation in the Higher Order Ambisonics Domain
In this article we investigate how the local Gaussian model (LGM) can be applied to separate sound sources in the higher-order ambisonics (HOA) domain. First, we show that in the HOA domain, the mathematical formalism of the local Gaussian model remains the same as in the microphone domain. Second, using an off-the-shelf source separation toolbox (FASST) based on the local Gaussian model, we validate the efficiency of the approach in the HOA domain by comparing the performance of the toolbox in the HOA domain with its performance in the microphone domain. To do this, we discuss and run some simulations to ensure a fair comparison. Third, we check the efficiency of the local Gaussian model compared to other available source separation techniques in the HOA domain. Simulation results show that separating sources in the HOA domain results in a 1 to 12 dB increase in signal-to-distortion ratio, compared to the microphone domain.
Keywords: Multichannel source separation, local Gaussian model, Wiener filtering, 3D audio, Higher Order Ambisonics (HOA).
Analysis and Emulation of Early Digitally-Controlled Oscillators Based on the Walsh-Hadamard Transform
Early analog synthesizer designs are very popular nowadays, and the discrete-time emulation of voltage-controlled oscillator (VCO) circuits is covered by a large number of virtual analog (VA) textbooks, papers and tutorials. One of the issues of well-known VCOs is their tuning instability and sensitivity to environmental conditions. For this reason, digitally-controlled oscillators were later introduced to provide stable tuning. Up to now, such designs have received much less attention in the music processing literature. In this paper, we examine one such design, which is based on the Walsh-Hadamard transform. The concept was employed in the ARP Pro Soloist and in the Welson Syntex, among others. Some historical background is provided, along with a discussion of the principle, the actual implementation and a band-limited virtual analog derivation.
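For readers unfamiliar with the transform, the following is a minimal sketch of an unnormalized fast Walsh-Hadamard transform in Hadamard (natural) ordering, applied to one period of an 8-step staircase. It is a generic textbook routine and an illustrative input, not the circuit analysis or the band-limited derivation from the paper; the function name `fwht` is hypothetical.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, Hadamard ordering.

    The input length must be a power of two.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine pairs that are h samples apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data

# One period of a descending 8-step staircase (a coarse sawtooth).
staircase = [7, 5, 3, 1, -1, -3, -5, -7]
coeffs = fwht(staircase)

# The transform is its own inverse up to a factor of 1/N.
restored = [c / len(coeffs) for c in fwht(coeffs)]
```

Only three coefficients are non-zero for this input (indices 1, 2 and 4, with values 8, 16 and 32), so the staircase is a binary-weighted sum of three square waves at octave-related rates.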
Visualaudio-Design – Towards a Graphical Sounddesign
VisualAudio-Design (VAD) is a spectral-node based approach to visually design audio collages and sounds. The spectrogram, as a visualization of the frequency domain, can be intuitively manipulated with tools known from image processing. This yields a more comprehensible approach to sound design, as an alternative to the common abstract interfaces for DSP algorithms that still use direct value inputs, sliders, or knobs. In addition to interaction in the time domain of audio and conventional analysis and restoration tasks, there are many new possibilities for spectral manipulation of audio material. Here, affine transformations and two-dimensional convolution filters are proposed.
Real-Time Black-Box Modelling With Recurrent Neural Networks
This paper proposes to use a recurrent neural network for black-box modelling of nonlinear audio systems, such as tube amplifiers and distortion pedals. As a recurrent unit structure, we test both Long Short-Term Memory and a Gated Recurrent Unit. We compare the proposed neural network with a WaveNet-style deep neural network, which has been suggested previously for tube amplifier modelling. The neural networks are trained with several minutes of guitar and bass recordings, which have been passed through the devices to be modelled. A real-time audio plugin implementing the proposed networks has been developed in the JUCE framework. It is shown that the recurrent neural networks achieve similar accuracy to the WaveNet model, while requiring significantly less processing power to run. The Long Short-Term Memory recurrent unit is also found to outperform the Gated Recurrent Unit overall. The proposed neural network is an important step forward in computationally efficient yet accurate emulation of tube amplifiers and distortion pedals.