Download Audio Effect Chain Estimation and Dry Signal Recovery From Multi-Effect-Processed Musical Signals
In this paper we propose a method that can address a novel task, audio effect (AFX) chain estimation and dry signal recovery. AFXs are indispensable in modern sound design workflows. Sound engineers often cascade different AFXs (as an AFX chain) to achieve their desired soundscapes. Given a multi-AFX-applied solo instrument performance (wet signal), our method can automatically estimate the applied AFX chain and recover its unprocessed dry signal, while previous research only addresses one of them. The estimated chain is useful for novice engineers in learning practical usages of AFXs, and the recovered signal can be reused with a different AFX chain. To solve this task, we first develop a deep neural network model that estimates the last-applied AFX and undoes its AFX at a time. We then iteratively apply the same model to estimate the AFX chain and eventually recover the dry signal from the wet signal. Our experiments on guitar phrase recordings with various AFX chains demonstrate the validity of our method for both the AFX-chain estimation and dry signal recovery. We also confirm that the input wet signal can be reproduced by applying the estimated AFX chain to the recovered dry signal.
Download Hyperbolic Embeddings for Order-Aware Classification of Audio Effect Chains
Audio effects (AFXs) are essential tools in music production, frequently applied in chains to shape timbre and dynamics. The order of AFXs in a chain plays a crucial role in determining the final sound, particularly when non-linear (e.g., distortion) or timevariant (e.g., chorus) processors are involved. Despite its importance, most AFX-related studies have primarily focused on estimating effect types and their parameters from a wet signal. To address this gap, we formulate AFX chain recognition as the task of jointly estimating AFX types and their order from a wet signal. We propose a neural-network-based method that embeds wet signals into a hyperbolic space and classifies their AFX chains. Hyperbolic space can represent tree-structured data more efficiently than Euclidean space due to its exponential expansion property. Since AFX chains can be represented as trees, with AFXs as nodes and edges encoding effect order, hyperbolic space is well-suited for modeling the exponentially growing and non-commutative nature of ordered AFX combinations, where changes in effect order can result in different final sounds. Experiments using guitar sounds demonstrate that, with an appropriate curvature, the proposed method outperforms its Euclidean counterpart. Further analysis based on AFX type and chain length highlights the effectiveness of the proposed method in capturing AFX order.