Topology-Preserving Deformations of Digital Audio
Topology provides global invariants for data as well as spaces of deformation. In this paper we discuss the deformations of audio signals which preserve topological information specified by sublevel set persistent homology. It is well known that the topological information only changes at extrema. We introduce box snakes as a data structure that captures permissible editing and deformation of signals and preserves the extremal properties of the signal while allowing for monotone deformations between them. The resulting algorithm works on any ordered discrete data and hence can be applied to finite-length audio signals in both the time and frequency domains.
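To make the invariance claim concrete, the following is a minimal sketch of 0-dimensional sublevel set persistence for a one-dimensional sequence, computed with a union-find structure and the usual elder rule. It is not the box snake data structure from the paper, and the function name `sublevel_persistence` is a hypothetical one chosen for illustration. It only shows that inserting a sample on a monotone run between two extrema leaves the persistence pairs unchanged.

```python
from math import inf


def sublevel_persistence(signal):
    """Return sorted (birth, death) pairs of 0-dim sublevel set persistence.

    Illustrative helper (not from the paper): samples enter the sublevel set
    in order of increasing value; when two components meet, the one with the
    larger birth value dies (elder rule).
    """
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    parent = [None] * n   # None means "not yet in the sublevel set"
    birth = {}            # component root -> birth value

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                # Record only components with non-zero lifetime.
                if birth[young] < signal[i]:
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((birth[find(order[0])], inf))
    return sorted(pairs)


original = [0, 2, 1, 3, 0.5, 4]
# Same extrema, with one extra sample (1.5) on the monotone rise from 0 to 2.
deformed = [0, 1.5, 2, 1, 3, 0.5, 4]

print(sublevel_persistence(original))
print(sublevel_persistence(deformed))
```

Both calls produce the same list of pairs, because only the values at the local minima and maxima enter the computation with non-zero lifetime.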
Characterisation and Excursion Modelling of Audio Haptic Transducers
Statement and calculation of objective audio haptic transducer performance metrics facilitate optimisation of multi-sensory sound reproduction systems. Measurements of existing haptic transducers are applied to the calculation of a series of performance metrics to demonstrate a means of comparative objective analysis. The frequency response, transient response and moving mass excursion characteristics of each measured transducer are quantified using novel and previously defined metrics. Objective data drawn from a series of practical measurements shows that the proposed metrics and means of excursion modelling applied herein are appropriate for haptic transducer evaluation and protection against over-excursion respectively.
Impedance Synthesis for Hybrid Analog-Digital Audio Effects
Most real systems, from acoustics to analog electronics, are
characterised by bidirectional coupling amongst elements rather
than neat, unidirectional signal flows between self-contained modules. Integrating digital processing into physical domains becomes
a significant engineering challenge when the application requires
bidirectional coupling across the physical-digital boundary rather
than separate, well-defined inputs and outputs. We introduce an
approach to hybrid analog-digital audio processing using synthetic
impedance: digitally simulated circuit elements integrated into an
otherwise analog circuit. This approach combines the physicality and classic character of analog audio circuits with the
precision and flexibility of digital signal processing (DSP). Our
impedance synthesis system consists of a voltage-controlled current source and a microcontroller-based DSP system. We demonstrate our technique by modifying an iconic guitar distortion pedal, the Boss DS-1, showing the ability of the synthetic
impedance to both replicate and extend the behaviour of the pedal’s
diode clipping stage. We discuss the behaviour of the synthetic
impedance in isolated laboratory conditions and in the DS-1 pedal,
highlighting the technical and creative potential of the technique as
well as its practical limitations and future extensions.
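As a rough illustration of the idea, a synthetic impedance can be thought of as a loop that measures the voltage across two circuit nodes, evaluates a current-voltage law in software, and commands that current from the voltage-controlled current source. The sketch below assumes an anti-parallel diode pair modelled by the Shockley equation; the function names, the parameter values and the `i_max` output limit are all hypothetical placeholders, not values taken from the DS-1 or from the authors' system.

```python
import math


def diode_pair_current(v, i_s=1e-9, n=1.8, v_t=0.02585):
    """Current (A) through two anti-parallel Shockley diodes at voltage v (V).

    i = I_s*(exp(v/(n*V_t)) - 1) - I_s*(exp(-v/(n*V_t)) - 1)
      = 2*I_s*sinh(v/(n*V_t))
    The parameter values are illustrative placeholders.
    """
    return 2.0 * i_s * math.sinh(v / (n * v_t))


def synthetic_impedance_step(v_measured, law=diode_pair_current, i_max=0.02):
    """One hypothetical control-loop iteration: voltage in, current command out.

    i_max stands in for the finite output range of a real current source.
    """
    i_cmd = law(v_measured)
    return max(-i_max, min(i_max, i_cmd))


# Sweep a few voltages to see the clipping characteristic.
for v in (-0.6, -0.3, 0.0, 0.3, 0.6):
    print(f"{v:+.1f} V -> {synthetic_impedance_step(v):+.3e} A")
```

Because the law is an ordinary function, swapping in a different `law` (for example an asymmetric or scaled characteristic) is one way such a system could go beyond the behaviour of the physical diodes.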
Learning Nonlinear Dynamics in Physical Modelling Synthesis Using Neural Ordinary Differential Equations
Modal synthesis methods are a long-standing approach for modelling distributed musical systems. In some cases extensions are
possible in order to handle geometric nonlinearities. One such
case is the high-amplitude vibration of a string, where geometric nonlinearities lead to perceptually important effects including pitch glides and a dependence of brightness on striking amplitude. A modal decomposition leads to a coupled nonlinear system of ordinary differential equations. In recent work, applied machine learning approaches (in particular neural ordinary differential equations) have been used to model lumped dynamic systems
such as electronic circuits automatically from data. In this work,
we examine how modal decomposition can be combined with neural ordinary differential equations for modelling distributed musical systems. The proposed model leverages the analytical solution
for linear vibration of the system’s modes and employs a neural network to account for nonlinear dynamic behaviour. Physical parameters of a system remain easily accessible after the training without
the need for a parameter encoder in the network architecture. As
an initial proof of concept, we generate synthetic data for a nonlinear transverse string and show that the model can be trained to
reproduce the nonlinear dynamics of the system. Sound examples
are presented.
DataRES and PyRES: A Room Dataset and a Python Library for Reverberation Enhancement System Development, Evaluation, and Simulation
Reverberation is crucial in the acoustical design of physical
spaces, especially halls for live music performances. Reverberation Enhancement Systems (RESs) are active acoustic systems that
can control the reverberation properties of physical spaces, allowing them to adapt to specific acoustical needs. The performance of
RESs strongly depends on the properties of the physical room and
the architecture of the Digital Signal Processor (DSP). However,
room-impulse-response (RIR) measurements and the DSP code
from previous studies on RESs have never been made open access, leading to non-reproducible results. In this study, we present
DataRES and PyRES, an RIR dataset and a Python library, to increase the reproducibility of studies on RESs. The dataset contains RIRs measured in RES research and development rooms and
professional music venues. The library offers classes and functionality for the development, evaluation, and simulation of RESs.
The implemented DSP architectures are made differentiable, allowing their components to be trained in a machine-learning-like
pipeline. The replication of previous studies by the authors shows
that PyRES can become a useful tool in future research on RESs.
Biquad Coefficients Optimization via Kolmogorov-Arnold Networks
Conventional Deep Learning (DL) approaches to Infinite Impulse
Response (IIR) filter coefficient estimation from an arbitrary frequency response are quite limited. They often suffer from inefficiencies such as tight training requirements, high complexity, and
limited accuracy. As an alternative, in this paper, we explore the
use of Kolmogorov-Arnold Networks (KANs) to predict IIR
filter coefficients, specifically those of biquads, effectively. By leveraging the high interpretability and accuracy of KANs, we achieve
smooth coefficient optimization. Furthermore, by constraining
the search space and exploring different loss functions, we demonstrate improved performance in speed and accuracy. Our approach
is evaluated against other existing differentiable IIR filter solutions. The results show significant advantages of KANs over existing methods, offering steadier convergence and more accurate
results. This offers new possibilities for integrating digital IIR
filters into deep-learning frameworks.
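For readers unfamiliar with the setting, the sketch below shows the three ingredients such an optimization typically needs: the frequency response of a biquad, a search space restricted to stable filters, and a spectral loss. It does not reproduce the KAN model or the loss functions of the paper. The tanh-based mapping onto the biquad stability triangle is one commonly used parametrization and is assumed here purely for illustration; all function names are hypothetical.

```python
import numpy as np


def biquad_response(b, a, n_freqs=512):
    """Complex frequency response of one biquad on [0, pi).

    b = (b0, b1, b2) are numerator coefficients, a = (a1, a2) the
    denominator coefficients, with a0 normalised to 1.
    """
    w = np.linspace(0.0, np.pi, n_freqs, endpoint=False)
    z1 = np.exp(-1j * w)  # z^-1 evaluated on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1**2
    den = 1.0 + a[0] * z1 + a[1] * z1**2
    return num / den


def stable_denominator(x1, x2):
    """Map two unconstrained reals to (a1, a2) inside the stability triangle.

    Assumed parametrization: |a1| < 2 and |a1| - 1 < a2 < 1.
    """
    a1 = 2.0 * np.tanh(x1)
    a2 = ((2.0 - abs(a1)) * np.tanh(x2) + abs(a1)) / 2.0
    return float(a1), float(a2)


def log_magnitude_loss(h_pred, h_target, eps=1e-8):
    """Mean squared error between log-magnitude responses."""
    diff = np.log(np.abs(h_pred) + eps) - np.log(np.abs(h_target) + eps)
    return float(np.mean(diff**2))


target = biquad_response((0.2, 0.4, 0.2), (-0.5, 0.3))
guess = biquad_response((1.0, 0.0, 0.0), stable_denominator(0.1, -0.2))
print(log_magnitude_loss(guess, target))
```

An optimizer or network would then adjust the unconstrained values (`x1`, `x2`, and the numerator) to reduce the loss, while the mapping keeps every candidate filter stable.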
Zero-Phase Sound via Giant FFT
Given the speedy computation of the FFT in current computer
hardware, there are new possibilities for examining transformations for very long sounds. A zero-phase version of any audio
signal can be obtained by zeroing the phase angle of its complex
spectrum and taking the inverse FFT. This paper recommends additional processing steps, including zero-padding, transient suppression at the signal’s start and end, and gain compensation, to
enhance the resulting sound quality. As a result, a sound with the
same spectral characteristics as the original one, but with different temporal events, is obtained. Repeating rhythm patterns are
retained, however. Zero-phase sounds are palindromic in the sense
that they are symmetric in time. A comparison of the zero-phase
conversion to the autocorrelation function helps to understand its
properties, such as why the rhythm of the original sound is emphasized. It is also argued that the zero-phase signal has the same
autocorrelation function as the original sound. One exciting variation of the method is to apply it separately to the real
and imaginary parts of the spectrum to produce a stereo effect. A
frame-based technique enables the use of the zero-phase conversion in real-time audio processing. The zero-phase conversion is
another member of the giant FFT toolset, allowing the modification of sampled sounds, such as drum loops or entire songs.
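The core step described above, zeroing the phase of the spectrum and inverting it, can be sketched in a few lines of NumPy. This is only a minimal illustration with zero-padding; the transient suppression and gain compensation recommended in the paper are not implemented, and the function name `zero_phase` and the `pad_factor` parameter are assumptions made for this example.

```python
import numpy as np


def zero_phase(x, pad_factor=2):
    """Return a zero-phase version of x computed with one long FFT.

    The signal is zero-padded to pad_factor * len(x) samples, the phase of
    its spectrum is discarded by keeping only the magnitude, and the result
    is transformed back and centred with fftshift.
    """
    x = np.asarray(x, dtype=float)
    n_fft = pad_factor * len(x)
    spectrum = np.fft.rfft(x, n=n_fft)      # zero-pads x to n_fft samples
    magnitude = np.abs(spectrum)            # phase angle set to zero
    y = np.fft.irfft(magnitude, n=n_fft)    # real and circularly even
    return np.fft.fftshift(y)               # move the symmetry point to the middle


rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = zero_phase(x)

# Palindromic in time (sample 0 is the single unpaired point for even lengths).
print(np.allclose(y[1:], y[1:][::-1]))
# Same magnitude spectrum as the zero-padded original.
print(np.allclose(np.abs(np.fft.rfft(y)), np.abs(np.fft.rfft(x, n=len(y)))))
```

Since the circular autocorrelation is the inverse transform of the squared magnitude spectrum, the second check also illustrates why the zero-phase signal and the zero-padded original share the same autocorrelation.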
Generative Latent Spaces for Neural Synthesis of Audio Textures
This paper investigates the synthesis of audio textures and the
structure of generative latent spaces using Variational Autoencoders (VAEs) within two paradigms of neural audio synthesis:
DSP-inspired and data-driven approaches. For each paradigm, we
propose VAE-based frameworks that allow fine-grained temporal
control. We introduce datasets across three categories of environmental sounds to support our investigations. We evaluate and compare the models’ reconstruction performance using objective metrics, and investigate their generative capabilities and latent space
structure through latent space interpolations.
Efficient Simulation of the Bowed String in Modal Form
The motion of a bowed string is a typical nonlinear phenomenon resulting from a friction force via interaction with the bow. The system can be described using suitable differential equations. Implicit numerical discretisation methods are known to yield energy-consistent algorithms, essential to ensure stability of the time-stepping schemes. However, reliance on iterative nonlinear root finders carries significant implementation issues. This paper explores a recently developed method which allows nonlinear systems of ordinary differential equations to be solved non-iteratively. Case studies of a mass-spring system and an ideal string coupled with a bow are investigated. Finally, a stiff string with loss is also considered. Combining semi-discretisation and a modal approach results in an algorithm yielding faster-than-real-time simulation of typical musical strings.
A Structural Similarity Index Based Method to Detect Symbolic Monophonic Patterns in Real-Time
Automatic detection of musical patterns is an important task in the field of Music Information Retrieval due to its usage in multiple applications such as automatic music transcription, genre or instrument identification, music classification, and music recommendation. A significant sub-task in pattern detection is real-time pattern detection in music, owing to its relevance in application domains such as the Internet of Musical Things. In this study, we present a method to identify the occurrence of known patterns in symbolic monophonic music streams in real-time. We introduce a matrix-based representation that describes musical notes by their pitch, pitch-bend, amplitude, and duration. We propose an algorithm based on an independent similarity index for each note attribute. We also introduce the Match Measure, which is a numerical value signifying the degree of the match between a pattern and a sequence of notes. We have tested the proposed algorithm against three datasets: a human recorded dataset, a synthetically designed dataset, and the JKUPDD dataset. Overall, a detection rate of 95% was achieved. The low computational load and minimal running time demonstrate the suitability of the method for real-world, real-time implementations on embedded systems.