Expressive Piano Performance Rendering from Unpaired Data
Recent advances in data-driven expressive performance rendering have enabled automatic models to reproduce the characteristics and the variability of human performances of musical compositions. However, these models need to be trained with aligned pairs of scores and performances and they rely notably on score-specific markings, which limits their scope of application. This work tackles the piano performance rendering task in a low-informed setting by only considering the score note information and without aligned data. The proposed model relies on adversarial training in which the basic score note properties are modified in order to reproduce the expressive qualities contained in a dataset of real performances. First results for unaligned score-to-performance rendering are presented through a listening test. While the interpretation quality is not on par with highly supervised methods and human renditions, our method shows promising results for transferring realistic expressivity into scores.
Differentiable Attenuation Filters for Feedback Delay Networks
We introduce a novel method for designing attenuation filters in digital audio reverberation systems based on Feedback Delay Networks (FDNs). Our approach uses Second Order Sections (SOS) of Infinite Impulse Response (IIR) filters arranged as parametric equalizers (PEQ), enabling fine control over frequency-dependent reverberation decay. Unlike traditional graphic equalizer designs, which require numerous filters per delay line, we propose a scalable solution where the number of filters can be adjusted. The frequency, gain, and quality factor (Q) parameters are shared across delay lines, and only the gain is adjusted based on delay length. This design not only reduces the number of optimization parameters, but also remains fully differentiable and compatible with gradient-based learning frameworks. Leveraging principles of analog filter design, our method allows for efficient and accurate filter fitting using supervised learning. Our method delivers a flexible and differentiable design, achieving state-of-the-art performance while significantly reducing computational cost.
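As a rough illustration of why the gain is the one parameter that depends on delay length, the sketch below applies the standard FDN relation between a target reverberation time (T60) and the attenuation a delay line of a given length must apply. It is not the paper's fitting procedure: the function names, the delay lengths, and the single broadband T60 value are assumptions chosen for the example. In a frequency-dependent design the same scaling would be applied to the dB gain of each band.

```python
def attenuation_db(delay_samples, t60_seconds, sample_rate):
    """Gain in dB for one pass through a delay line so that the
    recirculating signal decays by 60 dB after t60_seconds."""
    return -60.0 * delay_samples / (sample_rate * t60_seconds)


def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


# Hypothetical example values (not taken from the paper).
sample_rate = 48000
t60_seconds = 2.0
delay_lengths = [1021, 1511, 2003]

# The dB gain grows in magnitude in proportion to the delay length,
# so longer delay lines attenuate more per pass.
gains_db = [attenuation_db(m, t60_seconds, sample_rate) for m in delay_lengths]
gains_linear = [db_to_linear(g) for g in gains_db]
```

Because the dB gain is linear in the delay length, a filter shape shared by all delay lines only needs its gain rescaled per line, which is consistent with the parameter sharing described above.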
Designing a Library for Generative Audio in Unity
This paper overviews URALi, a library designed to add generative sound synthesis capabilities to Unity. This project is directed in particular towards audiovisual artists who are keen on working with algorithmic systems in Unity but cannot find native solutions for procedural sound synthesis to pair with their visual and control ones. After overviewing the options available in Unity concerning audio, this paper reports on the functioning and architecture of the library, which is an ongoing project.
Partiels – Exploring, Analyzing and Understanding Sounds
This article presents Partiels, an open-source application developed at IRCAM to analyze digital audio files and explore sound characteristics. The application uses Vamp plug-ins to extract various information on different aspects of the sound, such as spectrum, partials, pitch, tempo, text, and chords. Partiels is the successor to AudioSculpt, offering a modern, flexible interface for visualizing, editing, and exporting analysis results, addressing a wide range of issues from musicological practice to sound creation and signal processing research. The article describes Partiels’ key features, including analysis organization, audio file management, results visualization and editing, as well as data export and sharing options, and its interoperability with other software such as Max and Pure Data. In addition, it highlights the numerous analysis plug-ins developed at IRCAM, based in particular on machine learning models, as well as the IRCAM Vamp extension, which overcomes certain limitations of the original Vamp format.
SCHAEFFER: A Dataset of Human-Annotated Sound Objects for Machine Learning Applications
Machine learning for sound generation is rapidly expanding within the computer music community. However, most datasets used to train models are built from field recordings, foley sounds, instrumental notes, or commercial music. This presents a significant limitation for composers working in acousmatic and electroacoustic music, who require datasets tailored to their creative processes. To address this gap, we introduce the SCHAEFFER Dataset (Spectromorphological Corpus of Human-annotated Audio with Electroacoustic Features For Experimental Research), a curated collection of 1000 sound objects designed and annotated by composers and students of electroacoustic composition. The dataset, distributed under Creative Commons licenses, features annotations combining technical and poetic descriptions, alongside classifications based on pre-defined spectromorphological categories.
Decorrelation for Immersive Audio Applications and Sound Effects
Audio decorrelation is a fundamental building block for immersive audio applications. It has applications in parametric spatial audio coding, audio upmix, audio sound effects and audio rendering for virtual or augmented reality applications. In this paper, we provide insights into the practical design considerations of an audio decorrelator, using the example of the decorrelator contained within the upcoming MPEG-I Immersive Audio ISO standard [1]. We describe the desirable properties of such a decorrelator, common approaches for implementation and our particular technology choices for the decorrelator used in MPEG-I for rendering sound sources with homogeneous extent.
Pywdf: An Open Source Library for Prototyping and Simulating Wave Digital Filter Circuits in Python
This paper introduces a new open-source Python library for the modeling and simulation of wave digital filter (WDF) circuits. The library, called pywdf, allows users to easily create and analyze WDF circuit models in a high-level, object-oriented manner. The library includes a variety of built-in components, such as voltage sources, capacitors, and diodes, as well as the ability to create custom components and circuits. Additionally, pywdf includes a variety of analysis tools, such as frequency response and transient analysis, to aid in the design and optimization of WDF circuits. We demonstrate the library’s efficacy in replicating the nonlinear behavior of an analog diode clipper circuit, and in creating an allpass filter that cannot be realized in the analog world. The library is well-documented and includes several examples to help users get started. Overall, pywdf is a powerful tool for anyone working with WDF circuits, and we hope it can be of great use to researchers and engineers in the field.
Automatic Recognition of Cascaded Guitar Effects
This paper reports on a new multi-label classification task for guitar effect recognition that is closer to the actual use case of guitar effect pedals. To generate the dataset, we used multiple clean guitar audio datasets and applied various combinations of 13 commonly used guitar effects. We compared four neural network structures: a simple Multi-Layer Perceptron as a baseline, ResNet models, a CRNN model, and a sample-level CNN model. The ResNet models achieved the best performance in terms of accuracy and robustness under various setups (with or without clean audio, seen or unseen dataset), with a micro F1 of 0.876 and a macro F1 of 0.906 in the hardest setup. An ablation study on the ResNet models further indicates the necessary model complexity for the task.
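For readers unfamiliar with the two F1 variants reported above, the following minimal sketch shows how micro and macro F1 differ in a multi-label setting. It is illustrative only: the function names and the toy label matrices are assumptions, and it does not reproduce the paper's evaluation code or data. Micro F1 pools true positives, false positives, and false negatives over all labels before computing one score, while macro F1 averages the per-label scores.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from true positive, false positive, false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for binary multi-label indicator rows."""
    n_labels = len(y_true[0])
    counts = []
    for k in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 scores, each label weighted equally.
    macro = sum(f1_from_counts(*c) for c in counts) / n_labels
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1_from_counts(
        sum(c[0] for c in counts),
        sum(c[1] for c in counts),
        sum(c[2] for c in counts),
    )
    return micro, macro


# Toy example: 3 clips, 2 hypothetical effect labels per clip.
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [1, 0], [0, 1]]
micro, macro = micro_macro_f1(y_true, y_pred)
```

Because macro F1 weights every label equally, it can exceed micro F1 when rarely occurring labels are recognized well, and fall below it when they are not.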
Revisiting the Second-Order Accurate Non-Iterative Discretization Scheme
In the field of virtual analog modeling, a variety of methods have been proposed to systematically derive simulation models from circuit schematics. However, they typically rely on implicit numerical methods to transform the differential equations governing the circuit to difference equations suitable for simulation. For circuits with non-linear elements, this usually means that a non-linear equation has to be solved at run-time at high computational cost. As an alternative to fully-implicit numerical methods, a family of non-iterative discretization schemes has recently been proposed, allowing a significant reduction of the computational load. However, in the original presentation, several assumptions are made regarding the structure of the ODE, limiting the generality of these schemes. Here, we show that for the second-order accurate variant in particular, the method is applicable to general ODEs. Furthermore, we point out an interesting connection to the implicit midpoint method.
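To make the run-time cost mentioned above concrete, here is a small sketch of the classical implicit midpoint method for a scalar ODE x' = f(x), solved with Newton iteration at every step. This is the textbook implicit scheme, not the non-iterative scheme studied in the paper; the function name, the iteration limits, and the test equation are assumptions made for illustration.

```python
import math


def implicit_midpoint_step(f, dfdx, x, h, max_iterations=20, tol=1e-12):
    """Advance x' = f(x) by one step of size h with the implicit midpoint rule:
    x_next = x + h * f((x + x_next) / 2), solved for x_next by Newton's method."""
    x_next = x + h * f(x)  # explicit Euler step as the initial guess
    for _ in range(max_iterations):
        mid = 0.5 * (x + x_next)
        residual = x_next - x - h * f(mid)
        slope = 1.0 - 0.5 * h * dfdx(mid)  # derivative of residual w.r.t. x_next
        step = residual / slope
        x_next -= step
        if abs(step) < tol:
            break
    return x_next


# Linear test equation x' = -x with x(0) = 1, integrated up to t = 1.
h = 0.01
x = 1.0
for _ in range(100):
    x = implicit_midpoint_step(lambda v: -v, lambda v: -1.0, x, h)

error = abs(x - math.exp(-1.0))
```

For a linear right-hand side Newton converges in one iteration, but for a non-linear circuit element the loop may run several times per sample, which is the cost that non-iterative schemes aim to avoid.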
Flutter Echo Modeling
Flutter echo is a well-known acoustic phenomenon that occurs when sound waves bounce between two parallel reflective surfaces, creating a repetitive sound. In this work, we introduce a method to recreate flutter echo as an audio effect. The proposed algorithm is based on a feedback structure utilizing velvet noise that aims to synthesize the fluttery components of a reference room impulse response presenting flutter echo. Among these, the repetition time defines the length of the delay line in a feedback filter. The specific spectral properties of the flutter are obtained with a bandpass attenuation filter and a ripple filter, which enhances the harmonic behavior of the sound. Additional temporal shaping of a velvet-noise filter, which processes the output of the feedback loop, is performed based on the properties of the reference flutter. The comparison between synthetic and measured flutter echo impulse responses shows good agreement in terms of both the repetition time and reverberation time values.
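The link between repetition time and delay-line length can be sketched with a plain feedback comb filter. This is a deliberately reduced illustration under stated assumptions: it omits the velvet-noise, bandpass, and ripple filters described above, and the function name, the broadband feedback gain derived from a reverberation time, and the example values are hypothetical rather than taken from the paper.

```python
def flutter_comb(x, repetition_time, t60_seconds, sample_rate):
    """Feedback comb y[n] = x[n] + g * y[n - M], where the delay M is set by
    the repetition time and g makes the echoes decay by 60 dB in t60_seconds."""
    delay = max(1, round(repetition_time * sample_rate))
    gain = 10.0 ** (-3.0 * delay / (sample_rate * t60_seconds))
    y = []
    for n, sample in enumerate(x):
        fed_back = gain * y[n - delay] if n >= delay else 0.0
        y.append(sample + fed_back)
    return y, delay, gain


# Hypothetical values: 10 ms repetition time, 0.6 s decay, 1 kHz sample rate.
impulse = [1.0] + [0.0] * 600
response, delay, gain = flutter_comb(impulse, 0.01, 0.6, 1000)
```

The impulse response contains one echo every `delay` samples, each scaled by a further factor of `gain`, so the echo arriving at 0.6 s is 60 dB below the direct sound.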