A Source-Filter Model for Quasi-Harmonic Instruments
In this paper we propose a new method for building a generalized model of the time-varying spectral characteristics of quasi-harmonic instruments. The approach comprises a linear source-filter model, a parameter estimation method, and a model evaluation based on the prototype’s variance. The source-filter model is composed of an excitation source that generates sinusoidal parameter trajectories and a resonance filter, with basis splines (B-splines) used to model continuous trajectories. To estimate the model parameters we apply a gradient descent method to a training database, and the prototype’s variance is estimated on a test database. Such a model could later be used as a priori knowledge for polyphonic instrument recognition, polyphonic transcription, and source separation algorithms, as well as for resynthesis.
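The decomposition described in this abstract can be illustrated with a small, static sketch. It is not the authors' implementation: it assumes a log-amplitude model in which each partial's level is the sum of a per-partial source term and a resonance filter evaluated at the partial's frequency, with the filter expressed as a cubic B-spline curve. The knot layout, the synthetic data, and the names `bspline_basis`, `build_design`, and `fit_gradient_descent` are all hypothetical, and the time-varying trajectories of the paper are omitted.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Cox-de Boor recursion; returns a (len(x), n_basis) design matrix."""
    x = np.asarray(x, dtype=float)
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.array([(x >= knots[i]) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float).T
    for d in range(1, degree + 1):
        nxt = np.zeros((len(x), B.shape[1] - 1))
        for i in range(B.shape[1] - 1):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                nxt[:, i] += (x - knots[i]) / left * B[:, i]
            if right > 0:
                nxt[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = nxt
    return B

def build_design(f0s, n_partials, knots, degree):
    """One row per (note, partial): one-hot partial index, then filter basis."""
    k = np.tile(np.arange(1, n_partials + 1), len(f0s))
    freqs = k * np.repeat(f0s, n_partials)
    source_part = np.eye(n_partials)[k - 1]            # source depends on partial index
    filter_part = bspline_basis(freqs, knots, degree)  # filter depends on frequency
    return np.hstack([source_part, filter_part]), k, freqs

def fit_gradient_descent(X, y, n_iter=2000):
    """Plain gradient descent on the mean squared error, step size 1/L."""
    theta = np.zeros(X.shape[1])
    step = len(y) / np.linalg.norm(X, 2) ** 2  # L = sigma_max(X)^2 / N
    losses = []
    for _ in range(n_iter):
        resid = X @ theta - y
        losses.append(0.5 * np.mean(resid ** 2))
        theta -= step * (X.T @ resid) / len(y)
    return theta, np.array(losses)

# Assumed setup: clamped cubic knots on [0, 5000) Hz, 25 notes, 10 partials each.
degree, f_max, n_partials = 3, 5000.0, 10
interior = np.linspace(0.0, f_max, 9)[1:-1]
knots = np.concatenate([[0.0] * (degree + 1), interior, [f_max] * (degree + 1)])
f0s = np.linspace(100.0, 400.0, 25)
X, k, freqs = build_design(f0s, n_partials, knots, degree)

# Synthetic log-amplitudes: decaying source plus one resonance bump near 1.5 kHz.
y = -0.5 * k + 6.0 * np.exp(-((freqs - 1500.0) / 500.0) ** 2)
theta, losses = fit_gradient_descent(X, y)
source_est, filter_weights = theta[:n_partials], theta[n_partials:]
```

Note that a constant offset can be moved freely between the source and filter terms, so only their sum is identifiable in this sketch; a practical model would need an additional constraint or regularizer.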
Pitch-Conditioned Instrument Sound Synthesis From an Interactive Timbre Latent Space
This paper presents a novel approach to neural instrument sound
synthesis using a two-stage semi-supervised learning framework
capable of generating pitch-accurate, high-quality music samples
from an expressive timbre latent space. Existing approaches that
achieve sufficient quality for music production often rely on high-dimensional latent representations that are difficult to navigate and
provide unintuitive user experiences. We address this limitation
through a two-stage training paradigm: first, we train a pitch-timbre disentangled 2D representation of audio samples using a
Variational Autoencoder; second, we use this representation as
conditioning input for a Transformer-based generative model. The
learned 2D latent space serves as an intuitive interface for navigating and exploring the sound landscape. We demonstrate that the
proposed method effectively learns a disentangled timbre space,
enabling expressive and controllable audio generation with reliable
pitch conditioning. Experimental results show the model’s ability to capture subtle variations in timbre while maintaining a high
degree of pitch accuracy. The usability of our method is demonstrated in an interactive web application, highlighting its potential
as a step towards future music production environments that are
both intuitive and creatively empowering:
https://pgesam.faresschulz.com/.
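As a rough illustration of the first stage, the sketch below shows the two standard ingredients of a Variational Autoencoder with a 2D latent: the reparameterized sample and the KL divergence to a standard normal prior. It is not the authors' code. The helper `conditioning_vector` is hypothetical: the abstract does not say how pitch and the 2D timbre code are passed to the Transformer, so concatenating a one-hot MIDI pitch with the latent point is only an assumption made for this example.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), one value per row."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def conditioning_vector(z, midi_pitch, n_pitches=128):
    """Hypothetical conditioning: one-hot pitch concatenated with the 2D timbre code."""
    one_hot = np.zeros(n_pitches)
    one_hot[midi_pitch] = 1.0
    return np.concatenate([one_hot, z])

rng = np.random.default_rng(0)
# Two example encoder outputs for a 2D latent (values are made up).
mu = np.array([[0.0, 0.0], [1.0, -2.0]])
log_var = np.array([[0.0, 0.0], [-1.0, 0.5]])

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
cond = conditioning_vector(z[1], midi_pitch=60)
```

In an interactive setting, a point picked by the user in the 2D plane would take the place of `z[1]`, while the pitch input stays under separate control.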