A Real-Time Approach for Estimating Pulse Tracking Parameters for Beat-Synchronous Audio Effects
Predominant Local Pulse (PLP) estimation, an established method for extracting beat positions and other periodic pulse information from audio signals, has recently been extended with an online variant tailored for real-time applications. In this paper, we introduce a novel approach to generating various real-time control signals from the original online PLP output. While the PLP activation function encodes both predominant pulse information and pulse stability, we propose several normalization procedures to discern local pulse oscillation from stability, utilizing the PLP activation envelope. Through this, we generate pulse-synchronous Low Frequency Oscillators (LFOs) and supplementary confidence-based control signals, enabling dynamic control over audio effect parameters in real time. Additionally, our approach enables beat position prediction, providing a look-ahead capability, for example, to compensate for system latency. To showcase the effectiveness of our control signals, we introduce an audio plugin prototype designed for integration within a Digital Audio Workstation (DAW), facilitating real-time applications of beat-synchronous effects during live mixing and performances. Moreover, this plugin serves as an educational tool, providing insights into PLP principles and the tempo structure of analyzed music signals.
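The separation of oscillation from stability described above can be illustrated with a minimal sketch: divide the activation by a local-maximum envelope to obtain a unit-amplitude, pulse-synchronous LFO, and reuse the envelope as a confidence signal. The function names and the sliding-max envelope estimator here are our own illustrative choices, not the paper's exact normalization procedures.

```python
import numpy as np

def sliding_max(x, win):
    """Local maximum of |x| over a centred window (simple envelope proxy)."""
    pad = win // 2
    xp = np.pad(np.abs(x), pad, mode="edge")
    return np.array([xp[i:i + win].max() for i in range(len(x))])

def plp_to_control_signals(activation, win=101, eps=1e-8):
    """Split an activation curve into a unit-amplitude LFO and a
    confidence envelope. Illustrative normalization only; the paper's
    envelope estimator and normalizations may differ."""
    env = sliding_max(activation, win)
    lfo = activation / np.maximum(env, eps)   # pulse-synchronous oscillator in [-1, 1]
    conf = env / max(env.max(), eps)          # pulse-stability confidence in [0, 1]
    return lfo, conf
```

With a stable pulse the confidence stays near 1; as the pulse weakens, the LFO keeps oscillating at full amplitude while the confidence signal drops, so the two can drive separate effect parameters.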
Impedance Synthesis for Hybrid Analog-Digital Audio Effects
Most real systems, from acoustics to analog electronics, are characterised by bidirectional coupling amongst elements rather than neat, unidirectional signal flows between self-contained modules. Integrating digital processing into physical domains becomes a significant engineering challenge when the application requires bidirectional coupling across the physical-digital boundary rather than separate, well-defined inputs and outputs. We introduce an approach to hybrid analog-digital audio processing using synthetic impedance: digitally simulated circuit elements integrated into an otherwise analog circuit. This approach combines the physicality and classic character of analog audio circuits with the precision and flexibility of digital signal processing (DSP). Our impedance synthesis system consists of a voltage-controlled current source and a microcontroller-based DSP system. We demonstrate our technique by modifying an iconic guitar distortion pedal, the Boss DS-1, showing the ability of the synthetic impedance to both replicate and extend the behaviour of the pedal’s diode clipping stage. We discuss the behaviour of the synthetic impedance in isolated laboratory conditions and in the DS-1 pedal, highlighting the technical and creative potential of the technique as well as its practical limitations and future extensions.
Estimation of Multi-Slope Amplitudes in Late Reverberation
The common-slope model is used to model late reverberation of complex room geometries such as multiple coupled rooms. The model fits band-limited room impulse responses using a set of common decay rates, with amplitudes varying based on listener positions. This paper investigates amplitude estimation methods within the common-slope model framework. We compare several traditional least-squares estimation methods and propose LINEX regression, a maximum-likelihood approach based on log-squared RIR statistics. Through statistical analysis and simulation tests, we demonstrate that LINEX regression improves accuracy and reduces bias when compared to traditional methods.
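The key idea can be sketched as follows. If squared-RIR samples are modelled as exponentially distributed with mean mu(t) = sum_i A_i * exp(-k_i * t), the negative log-likelihood of the log-domain residual e = log x - log mu is, up to constants, the asymmetric LINEX loss exp(e) - e - 1. The crude grid-search minimizer below is our own stand-in for the paper's estimator, for illustration only; the decay rates are assumed known, as in the common-slope setting.

```python
import numpy as np

def linex_loss(amps, decays, t, x):
    """LINEX loss on log-domain residuals e = log x - log mu, where
    mu(t) = sum_i A_i * exp(-k_i * t); equals the exponential-noise
    negative log-likelihood up to additive constants."""
    mu = np.exp(-np.outer(t, decays)) @ amps
    e = np.log(x) - np.log(mu)
    return np.sum(np.exp(e) - e - 1.0)

def fit_amplitudes(decays, t, x, grid=np.logspace(-4, 1, 40)):
    """Brute-force minimizer over amplitude pairs on a log-spaced grid;
    a simple, robust stand-in for a proper optimizer."""
    best, best_loss = None, np.inf
    for a0 in grid:
        for a1 in grid:
            loss = linex_loss(np.array([a0, a1]), decays, t, x)
            if loss < best_loss:
                best, best_loss = np.array([a0, a1]), loss
    return best
```

Because the loss penalizes underestimating mu (e large and positive) much more than overestimating it, the estimator avoids the downward bias that plain least squares on log-squared data exhibits.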
Differentiable Scattering Delay Networks for Artificial Reverberation
Scattering delay networks (SDNs) provide a flexible and efficient framework for artificial reverberation and room acoustic modeling. In this work, we introduce a differentiable SDN, enabling gradient-based optimization of its parameters to better approximate the acoustics of real-world environments. By formulating key parameters such as scattering matrices and absorption filters as differentiable functions, we employ gradient descent to optimize an SDN based on a target room impulse response. Our approach minimizes discrepancies in perceptually relevant acoustic features, such as energy decay and frequency-dependent reverberation times. Experimental results demonstrate that the learned SDN configurations significantly improve the accuracy of synthetic reverberation, highlighting the potential of data-driven room acoustic modeling.
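The optimization principle can be shown on a deliberately tiny example: a single lossy delay loop whose per-pass gain is tuned by gradient descent so that its reverberation time matches a target. This one-parameter toy is not the paper's differentiable SDN (which optimizes scattering matrices and absorption filters against a measured RIR), but it shows the same loss-and-gradient mechanics.

```python
import numpy as np

def t60_from_gain(g, delay, fs):
    """RT60 of a single lossy delay loop: the level drops by
    20*log10(g) dB every delay/fs seconds."""
    return 60.0 * delay / (fs * (-20.0 * np.log10(g)))

def fit_gain(target_t60, delay=1000, fs=48000, lr=5e-4, n_iter=5000):
    """Gradient descent on the loop gain g to match a target RT60.
    Toy stand-in for the paper's gradient-based SDN optimization."""
    g = 0.95
    c = 60.0 * delay / fs
    for _ in range(n_iter):
        L = -20.0 * np.log10(g)                     # dB loss per pass
        t60 = c / L
        dt60_dg = c * 20.0 / (g * np.log(10.0) * L * L)
        grad = 2.0 * (t60 - target_t60) * dt60_dg   # d/dg of squared error
        g = float(np.clip(g - lr * grad, 0.5, 0.999))
    return g
```

In the full differentiable SDN the same loop runs over many parameters at once, with the gradient supplied by automatic differentiation and the loss built from perceptual features such as energy-decay curves rather than a single RT60 value.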
DataRES and PyRES: A Room Dataset and a Python Library for Reverberation Enhancement System Development, Evaluation, and Simulation
Reverberation is crucial in the acoustical design of physical spaces, especially halls for live music performances. Reverberation Enhancement Systems (RESs) are active acoustic systems that can control the reverberation properties of physical spaces, allowing them to adapt to specific acoustical needs. The performance of RESs strongly depends on the properties of the physical room and the architecture of the Digital Signal Processor (DSP). However, room-impulse-response (RIR) measurements and the DSP code from previous studies on RESs have never been made open access, leading to non-reproducible results. In this study, we present DataRES and PyRES—a RIR dataset and a Python library to increase the reproducibility of studies on RESs. The dataset contains RIRs measured in RES research and development rooms and professional music venues. The library offers classes and functionality for the development, evaluation, and simulation of RESs. The implemented DSP architectures are made differentiable, allowing their components to be trained in a machine-learning-like pipeline. The replication of previous studies by the authors shows that PyRES can become a useful tool in future research on RESs.
Generation of Non-repetitive Everyday Impact Sounds for Interactive Applications
The use of high-quality sound effects is growing rapidly in multimedia, interactive, and virtual reality applications, where impact sounds are among the most common audio events. The sound effects in such environments can be pre-recorded or synthesized in real time as a result of a physical event. However, one of the biggest problems when using pre-recorded sound effects is the monotonous repetition of these sounds, which quickly becomes tedious for the listener. In this paper, we present a new algorithm which generates non-repetitive impact sound effects using parameters from the physical interaction. Our approach uses audio grains to create finely controlled synthesized sounds based on recordings of impact sounds. The proposed algorithm can also be used in a wide range of audio analysis, representation, and compression applications. A subjective test was carried out to evaluate the perceptual quality of the synthesized sounds.
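The grain-based idea can be sketched in a few lines: slice a recorded impact into windowed grains, then resynthesize each new impact by overlap-adding a random selection of grains with jittered onsets and decaying gains, so no two renders are identical. The grain extraction and the exponential gain law below are our own illustrative assumptions, not the paper's exact parameter mapping from the physical interaction.

```python
import numpy as np

def extract_grains(recording, grain_len=256, hop=128):
    """Slice a recorded impact into Hann-windowed grains."""
    win = np.hanning(grain_len)
    n = (len(recording) - grain_len) // hop + 1
    return [recording[i * hop : i * hop + grain_len] * win for i in range(n)]

def synth_impact(grains, length=2048, n_grains=12, velocity=1.0, rng=None):
    """Overlap-add randomly chosen grains with jittered onsets and
    decaying gains; each call yields a slightly different impact."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.zeros(length)
    glen = len(grains[0])
    for _ in range(n_grains):
        grain = grains[rng.integers(len(grains))]
        onset = int(rng.integers(0, length - glen))
        gain = velocity * np.exp(-4.0 * onset / length)  # earlier grains louder
        out[onset : onset + glen] += gain * grain
    return out
```

Driving `velocity`, `n_grains`, or the grain pool from collision parameters (impact speed, material) is what ties the synthesis to the physical event while the randomization removes repetition.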
Voice Features For Control: A Vocalist Dependent Method For Noise Measurement And Independent Signals Computation
Information about the human spoken and singing voice is conveyed through the articulations of the individual’s vocal folds and vocal tract. The signal receiver, either human or machine, works at different levels of abstraction to extract and interpret only the relevant context-specific information needed. Traditionally in the field of human-machine interaction, the human voice is used to drive and control events that are discrete in terms of time and value. We propose to use the voice as a source of real-valued and time-continuous control signals that can be employed to interact with any multidimensional human-controllable device in real time. The isolation of noise sources and the independence of the control dimensions play a central role. Their dependence on the individual voice represents an additional challenge. In this paper we introduce a method to compute case-specific independent signals from the vocal sound, together with an individual study of feature computation and selection for noise rejection.
Polyphonic Instrument Recognition for Exploring Semantic Similarities in Music
Similarity is a key concept for estimating associations among a set of objects. Music similarity is usually exploited to retrieve relevant items from a dataset containing audio tracks. In this work, we approach the problem of semantic similarity between short pieces of music by analysing their instrumentations. Our aim is to label audio excerpts with the most salient instruments (e.g. piano, human voice, drums) and use this information to estimate a semantic relation (i.e. similarity) between them. We present three methods for integrating frame-based classifier decisions along an audio excerpt to derive its instrumental content. Similarity between audio files is then determined solely by their attached labels. We evaluate our algorithm in terms of label assignment and similarity assessment, observing significant differences when comparing it to commonly used audio similarity metrics. We test on music from various Western genres to simulate real-world scenarios.
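One simple way to integrate frame-based decisions, sketched below, is to average the classifier's per-frame class probabilities over the excerpt, threshold them into a salient-instrument label set, and compare excerpts by the Jaccard similarity of their label sets. The instrument list, the averaging rule, and the Jaccard choice are our own illustrative assumptions; the paper compares three integration methods.

```python
import numpy as np

# Hypothetical label set for illustration.
INSTRUMENTS = ["piano", "voice", "drums", "guitar"]

def excerpt_labels(frame_probs, threshold=0.5):
    """Integrate frame-based decisions: average class probabilities over
    all frames of the excerpt and keep classes above the threshold."""
    mean_p = frame_probs.mean(axis=0)
    return {INSTRUMENTS[i] for i in range(len(INSTRUMENTS)) if mean_p[i] >= threshold}

def label_similarity(a, b):
    """Jaccard similarity between two excerpts' salient-label sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Two excerpts sharing piano but differing in voice versus drums would score 1/3 under this measure, independently of any low-level audio similarity metric.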
Music-Content-Adaptive Robust Principal Component Analysis for a Semantically Consistent Separation of Foreground and Background in Music Audio Signals
Robust Principal Component Analysis (RPCA) is a technique to decompose signals into sparse and low rank components, and has recently drawn the attention of the MIR field for the problem of separating leading vocals from accompaniment, with appealing results obtained on small excerpts of music. However, the performance of the method drops when processing entire music tracks. We present an adaptive formulation of RPCA that incorporates music content information to guide the decomposition. Experiments on a set of complete music tracks of various genres show that the proposed algorithm is able to better process entire pieces of music that may exhibit large variations in the music content, and compares favorably with the state-of-the-art.
Separating Piano Recordings into Note Events Using a Parametric Imitation Approach
In this paper we present a working system for separating a piano recording into events representing individual piano notes. Each note is parameterized with a transient-plus-harmonics model that, should all the parameters be reliably estimated, would produce near-perfect reconstruction for each note as well as for the whole recording. However, interference between overlapping notes makes it hard to estimate parameters from their combination. In this work we propose to assess the estimability of sinusoidal parameters via their apparent degree of interference, estimate the estimable ones using algorithms suited to different interference situations, and infer the hard-to-estimate parameters from the estimated ones. The outcome is a sequence of separate, parameterized piano notes that perceptually closely resemble, if not exactly match, the notes in the original recording. This allows for later analysis and processing stages using algorithms designed for separate notes.