Download Non-Iterative Phaseless Reconstruction From Wavelet Transform Magnitude
In this work, we present an algorithm for phaseless reconstruction from magnitude-only wavelet coefficients. The method relies on an explicit relation between the log-magnitude and phase gradients of analytic wavelet transforms and an extension of the Phase-Gradient Heap Integration (PGHI) algorithm recently introduced for Gabor phaseless reconstruction. This relation is exact for a certain family of mother wavelets including Cauchy wavelets of arbitrary order, but only holds approximately otherwise. The presented experiments show that, in practice, the proposed wavelet PGHI method provides competitive quality for various mother wavelets. Furthermore, wavelet PGHI is a non-iterative scheme, and thus its computational performance is significantly better than that of established alternating projection methods.
Download High-Definition Time-Frequency Representation Based on Adaptive Combination of Fan-Chirp Transforms via Structure Tensor
This paper presents a novel technique for producing high-definition time-frequency representations by combining different instances of short-time fan-chirp transforms. The proposed method uses directional information provided by an image processing technique named structure tensor, applied over a spectrogram of the input signal. This information indicates the best analysis window size and chirp parameter for each time-frequency bin, and feeds a simple interpolation procedure, which produces the final representation. The method allows the proper representation of more than one sound source simultaneously via fan-chirp transforms with different resolutions, and provides a precise reproduction of transient information. Experiments in both synthetic and real audio illustrate the performance of the proposed system.
Download Non-Iterative Solvers For Nonlinear Problems: The Case of Collisions
Nonlinearity is a key feature in musical instruments and electronic circuits alike, and thus in simulation, for the purposes of physics-based modeling and virtual analog emulation, the numerical solution of nonlinear differential equations is unavoidable. Ensuring numerical stability is thus a major consideration. In general, one may construct implicit schemes using well-known discretisation methods such as the trapezoid rule, requiring computationally-costly iterative solvers at each time step. Here, a novel family of provably numerically stable time-stepping schemes is presented, avoiding the need for iterative solvers, and thus of greatly reduced computational cost. An application to the case of the collision interaction in musical instrument modeling is detailed.
Download Generalizations of Velvet Noise and their Use in 1-Bit Music
A family of spectrally-flat noise sequences called “Velvet Noise” has found use in reverb modeling, decorrelation, speech synthesis, and abstract sound synthesis. These noise sequences are ternary: they consist of only the values −1, 0, and +1. They are also sparse in time, with pulse density being their main design parameter, and at typical audio sampling rates need only several thousand non-zero samples per second to sound “smooth.” This paper proposes “Crushed Velvet Noise” (CVN) generalizations to the classic family of Velvet Noise sequences including “Original Velvet Noise” (OVN), “Additive Random Noise” (ARN), and “Totally Random Noise” (TRN). In these generalizations, the probability of getting a positive or negative impulse is a free parameter. Manipulating this probability gives Crushed OVN and ARN low-shelf spectra rather than the flat spectra of standard Velvet Noise, while the spectrum of Crushed TRN is still flat. This new family of noise sequences is still ternary and sparse in time. However, pulse density now controls the shelf cutoff frequency, and the distribution of polarities controls the shelf depth. Crushed Velvet Noise sequences with pulses of only a single polarity are particularly useful in a niche style of music called “1-bit music”: music with a binary waveform consisting of only 0s and 1s. We propose Crushed Velvet Noise as a valuable tool in 1-bit music composition, where its sparsity allows for good approximations to operations, such as addition, which are impossible for signals in general in the 1-bit domain.
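As a rough illustration of the idea described in this abstract, the following Python sketch generates an OVN-style sequence in which the probability of a positive impulse is a free parameter. The placement rule (one impulse at a uniformly random position inside each grid interval of length sample rate divided by pulse density) is the commonly used OVN construction and is an assumption here, not a detail taken from the abstract. The function name `crushed_ovn` and its parameters are hypothetical.

```python
import math
import random


def crushed_ovn(num_samples, sample_rate, density, p_positive=0.5, seed=None):
    """Return a sparse ternary list with one impulse per grid interval.

    Hypothetical helper: `density` is the pulse density in impulses per
    second, `p_positive` the probability that an impulse is +1 (else -1).
    """
    rng = random.Random(seed)
    grid = sample_rate / density  # average impulse spacing in samples
    if grid < 1.0:
        raise ValueError("density must not exceed the sample rate")
    out = [0] * num_samples
    m = 0
    while m * grid < num_samples:
        # Uniformly random position inside the m-th grid interval
        # (assumed OVN placement rule).
        pos = math.floor(m * grid + rng.random() * (grid - 1.0) + 0.5)
        if pos < num_samples:
            # The polarity probability is the free "crushing" parameter.
            out[pos] = 1 if rng.random() < p_positive else -1
        m += 1
    return out


if __name__ == "__main__":
    # p_positive=0.5 gives balanced polarities; p_positive=1.0 gives a
    # single-polarity sequence containing only 0s and 1s.
    balanced = crushed_ovn(48000, 48000, 2000, p_positive=0.5, seed=1)
    unipolar = crushed_ovn(48000, 48000, 2000, p_positive=1.0, seed=1)
    print(sum(1 for v in balanced if v != 0), set(unipolar))
```

With `p_positive=1.0` every impulse has the same polarity, which corresponds to the single-polarity case the abstract highlights for 1-bit music.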
Download Real-Time Implementation of an Elasto-Plastic Friction Model using Finite-Difference Schemes
The simulation of a bowed string is challenging due to the strongly non-linear relationship between the bow and the string. This relationship can be described through a model of friction. Several friction models in the literature have been proposed, from simple velocity-dependent to more accurate ones. Similarly, a highly accurate technique to simulate a stiff string is the use of finite-difference time-domain (FDTD) methods. As these models are generally computationally heavy, implementation in real-time is challenging. This paper presents a real-time implementation of the combination of a complex friction model, namely the elasto-plastic friction model, and a stiff string simulated using FDTD methods. We show that it is possible to keep the CPU usage of a single bowed string below 6 percent. For real-time control of the bowed string, the Sensel Morph is used.
Download Improving Monophonic Pitch Detection Using the ACF And Simple Heuristics
This paper studies the performance of the short-time autocorrelation function (ACF) for determining correct pitch candidates in non-stationary sounds. Input segments of a music or speech signal are analyzed by extracting the autocorrelation function, and a weighting function is applied to the candidates to assess their harmonic strength. Furthermore, a decision rule is devised that flags possible unrelated jumps in the fundamental frequency track. A technique to modify the spectral content of the signal is presented to compensate for these jumps, together with a heuristic that returns a steady fundamental frequency track for monophonic recordings. The system is evaluated on several databases and compared with other algorithms. With the compensation algorithm, the performance of the ACF increases and exceeds that of current detection algorithms.
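For readers unfamiliar with the baseline, the sketch below shows a plain short-time ACF pitch estimate in Python: it picks the lag with the largest normalized autocorrelation inside an assumed search range. It does not implement the paper's weighting function, jump detection, or compensation step. The function name `acf_pitch` and the default range `f_min`/`f_max` are illustrative assumptions.

```python
import math


def acf_pitch(frame, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate the fundamental frequency of one frame via the ACF peak.

    Hypothetical helper: returns a frequency in Hz, or None when the
    frame is silent or too short for the requested search range.
    """
    n = len(frame)
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    energy = sum(x * x for x in frame)
    if energy == 0.0 or lag_min < 1 or lag_min > lag_max:
        return None
    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation: fewer terms at larger lags, so peaks at
        # multiples of the period are attenuated relative to the first one.
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        if r > best_val:
            best_lag, best_val = lag, r
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2.0 * math.pi * 220.0 * i / sr) for i in range(1024)]
    # The estimate is quantized to integer lags, so it is only
    # approximately 220 Hz.
    print(acf_pitch(tone, sr))
```

Because the lag is an integer number of samples, the estimate is coarse at high frequencies; interpolating around the peak is a common refinement that is left out here.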
Download A Real-Time Audio Effect Plug-In Inspired by the Processes of Traditional Indonesian Gamelan Music
This paper presents Gamelanizer, a novel real-time audio effect inspired by Javanese gamelan music theory. It is composed of anticipatory or “negative” delay and time and pitch manipulations based on the phase vocoder. An open-source real-time C++ Virtual Studio Technology (VST) implementation of the effect, made with the JUCE framework, is available at github.com/lukemcraig/DAFx19-Gamelanizer, as well as audio examples and Python implementations of vectorized and frame-by-frame approaches.
Download Exploring audio immersion using user-generated recordings
The abundance and ever-growing expansion of user-generated content defines a paradigm in multimedia consumption. While user immersion through audio has gained relevance in recent years due to the growing interest in virtual and augmented reality immersion technologies, the existing user-generated content visualization techniques still do not make use of immersion technologies. Here we propose a new technique to visualize multimedia content that provides immersion through the audio. While our technique focuses on audio immersion, we also propose to combine it with a video interface that aims at providing an enveloping visual experience to end-users. The technique combines professional audio recordings with user-generated audio recordings of the same event. Immersion is granted through the spatialization of the user-generated audio content with head-related transfer functions.
Download Analysis and Correction of Maps Dataset
Automatic music transcription (AMT) is the process of converting a music signal into a symbolic digital representation. The MIDI Aligned Piano Sounds (MAPS) dataset was established in 2010 and is the most used benchmark dataset for automatic piano music transcription. In this paper, errors are screened algorithmically, and three data annotation problems are found in ENSTDkCl, a subset of MAPS that is usually used for algorithm evaluation: (1) there are 342 deviation errors in the MIDI annotation; (2) there are 803 unplayed note errors; (3) there are 1613 slow starting process errors. After algorithmic correction and manual confirmation, the corrected dataset is released. Finally, the better-performing Google model and our model are evaluated on the corrected dataset. The F values are 85.94% and 85.82%, respectively, both improved compared with the original dataset, which shows that the correction of the dataset is meaningful.
Download Optimization of audio graphs by resampling
Interactive music systems are dynamic real-time systems which combine control and signal processing based on an audio graph. They are often used on platforms where there are no reliable and precise real-time guarantees. Here, we present a method of optimizing audio graphs and finding a compromise between audio quality and gain in execution time by downsampling parts of the graph. We present models of quality and execution time and we evaluate the models and our optimization algorithm experimentally.