Declaratively Programmable Ultra Low-Latency Audio Effects Processing on FPGA
WaveCore is a coarse-grained reconfigurable processor architecture, based on data-flow principles. The processor architecture consists of a scalable and interconnected cluster of Processing Units (PU), where each PU embodies a small floating-point RISC processor. The processor has been designed in technology-independent VHDL and mapped on a commercially available FPGA development platform. The programming methodology is declarative, and optimized to the application domain of audio and acoustical modeling. A benchmark demonstrator algorithm (guitar-model, comprehensive effects-gear box, and distortion/cabinet model) has been developed and applied to the WaveCore development platform. The demonstrator algorithm proved that WaveCore is very well suited for efficient modeling of complex audio/acoustical algorithms with negligible latency and virtually zero jitter. An experimental Faust-to-WaveCore compiler has shown the feasibility of automated compilation of Faust code to the WaveCore processor target. Keywords: ultra-low latency, zero-jitter, coarse-grained reconfigurable computing, declarative programming, automated manycore compilation, Faust-compatible, massively-parallel
Polyphonic Pitch Detection by Iterative Analysis of the Autocorrelation Function
In this paper, a polyphonic pitch detection approach is presented, which is based on the iterative analysis of the autocorrelation function. The idea of a two-channel front-end with periodicity estimation using the autocorrelation is inspired by an algorithm from Tolonen and Karjalainen. However, the analysis of the periodicity in the summary autocorrelation function is enhanced with a more advanced iterative peak picking and pruning procedure. The proposed algorithm is compared to other systems in an evaluation with common data sets and yields results comparable to those of state-of-the-art systems.
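As a rough illustration of the periodicity estimation that the abstract builds on, the following Python sketch estimates a single pitch from the strongest autocorrelation peak. It is a deliberately simplified, monophonic example: the two-channel front-end, the summary autocorrelation function, and the iterative peak picking and pruning of the paper are not reproduced, and the function names and the `f_min`/`f_max` search range are assumptions made for this example.

```python
import math

def autocorrelation(x, max_lag):
    """Return r[lag] = sum over n of x[n] * x[n + lag], for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_pitch(x, sample_rate, f_min=50.0, f_max=1000.0):
    """Pick the lag with the largest autocorrelation inside the search range.

    f_min and f_max are illustrative defaults, not values from the paper.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    r = autocorrelation(x, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sample_rate / best_lag

# A 200 Hz sine sampled at 8 kHz has a period of 40 samples.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(1024)]
pitch = estimate_pitch(tone, sample_rate)
```

A polyphonic detector cannot simply take the global maximum, since peaks of several sources and their multiples overlap; this is where an iterative picking and pruning stage such as the one described in the abstract comes in.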
Music-Content-Adaptive Robust Principal Component Analysis for a Semantically Consistent Separation of Foreground and Background in Music Audio Signals
Robust Principal Component Analysis (RPCA) is a technique to decompose signals into sparse and low rank components, and has recently drawn the attention of the MIR field for the problem of separating leading vocals from accompaniment, with appealing results obtained on small excerpts of music. However, the performance of the method drops when processing entire music tracks. We present an adaptive formulation of RPCA that incorporates music content information to guide the decomposition. Experiments on a set of complete music tracks of various genres show that the proposed algorithm is able to better process entire pieces of music that may exhibit large variations in the music content, and compares favorably with the state-of-the-art.
Semi-Blind Audio Source Separation of Linearly Mixed Two-Channel Recordings via Guided Matching Pursuit
This paper describes a source separation system intended for use in high-quality audio post-processing tasks. The system is to be used as the front-end of a larger system capable of modifying the individual sources of existing, two-channel, multi-source recordings. Possible applications include spatial re-configuration such as up-mixing and pan-transformation, re-mixing, source suppression/elimination, source extraction, elaborate filtering, time-stretching and pitch-shifting. The system is based on a new implementation of the Matching Pursuit algorithm and uses a known mixing matrix. We compare the results of the proposed system with those from mpd-demix of the ’MPTK’ software package and show that we get similar evaluation scores and in some cases better perceptual scores. We also compare against a segmentation algorithm which is based on the same principles but uses the STFT as the front-end, and show that source separation algorithms based on adaptive decomposition schemes tend to give better results. The novelty of this work is a new implementation of the original Matching Pursuit algorithm which adds a pre-processing step to the main sequence of the basic algorithm. The purpose of this step is to analyse the signal and, based on important extracted features (e.g., frequency components), create a mini-dictionary comprising atoms that match well with a specific part of the signal, thus leading to focused and more efficient exhaustive searches around centres of energy in the signal.
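For readers unfamiliar with the basic algorithm that the paper extends, here is a minimal Python sketch of plain Matching Pursuit over a small dictionary of unit-norm atoms. It only illustrates the greedy search-and-subtract loop; the guided pre-processing step, the mini-dictionary construction, and the use of the mixing matrix described in the abstract are not modelled, and the toy dictionary below is an assumption made for this example.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition of `signal` over unit-norm atoms.

    Returns a list of (atom_index, coefficient) picks and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Exhaustive search: correlate the residual with every atom.
        coeffs = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(coeffs[i]))
        picks.append((k, coeffs[k]))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - coeffs[k] * a for r, a in zip(residual, dictionary[k])]
    return picks, residual

# Toy dictionary (illustrative only): three unit vectors plus one diagonal atom.
s = 1.0 / math.sqrt(2.0)
dictionary = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [s, s, 0.0],
]
picks, residual = matching_pursuit([3.0, 0.0, 1.0], dictionary, n_iter=2)
```

Restricting the search in each iteration to a small set of well-matched atoms, as the mini-dictionary in the abstract does, would shrink the `coeffs` computation, which dominates the cost of the loop.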
Finite Volume Perspectives on Finite Difference Schemes and Boundary Formulations for Wave Simulation
Time-domain finite difference (FD) and digital waveguide mesh (DWM) methods have seen extensive exploration as techniques for physical modelling sound synthesis and artificial reverberation. Various formulations of these methods have been unified under the FD framework, but many discrete boundary models important in room acoustics applications have not been. In this paper, the finite volume (FV) framework is used to unify various FD and DWM topologies, as well as associated boundary models. Additional geometric insights on existing stability conditions provide guidance into the FV meshing pre-processing step necessary for the acoustic modelling of irregular and realistic room geometries. DWM “1-D” boundary terminations are shown, through an equivalent FV formulation, to have a consistent multidimensional interpretation that is approximated to second-order accuracy; however, the geometry and wall admittances being approximated may vary from what is desired. It is also shown that certain re-entrant corner configurations can lead to instabilities, and an alternative stable update is provided for one problematic configuration.
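To make the notion of a stability condition concrete, the following Python sketch runs the standard explicit leapfrog FD scheme for the 1-D wave equation with fixed (Dirichlet) ends. This is a textbook scheme chosen for illustration, not the FV formulation or any boundary model from the paper; the function name, grid size, and initial condition are assumptions. In 1-D the scheme is stable when the Courant number (wave speed times time step divided by grid spacing) does not exceed 1.

```python
def simulate_string(n_points, n_steps, courant):
    """Leapfrog FD update for the 1-D wave equation with fixed ends.

    `courant` is the Courant number c * T / h (wave speed c, time step T,
    grid spacing h). Returns the displacement after `n_steps` updates.
    """
    lam2 = courant ** 2
    prev = [0.0] * n_points
    prev[n_points // 2] = 1.0   # initial displacement: a single impulse
    curr = list(prev)           # same state at step 1 (crude zero-velocity start)
    for _ in range(n_steps):
        nxt = [0.0] * n_points  # end points stay at zero (Dirichlet boundary)
        for i in range(1, n_points - 1):
            nxt[i] = (2.0 * (1.0 - lam2) * curr[i]
                      + lam2 * (curr[i - 1] + curr[i + 1])
                      - prev[i])
        prev, curr = curr, nxt
    return curr

stable = simulate_string(n_points=51, n_steps=200, courant=0.9)
unstable = simulate_string(n_points=51, n_steps=200, courant=1.1)
stable_peak = max(abs(u) for u in stable)
unstable_peak = max(abs(u) for u in unstable)
```

With a Courant number of 0.9 the displacement stays bounded, whereas 1.1 makes the highest spatial frequencies grow exponentially. On irregular meshes and at boundaries the corresponding conditions depend on cell geometry, which is the kind of question the FV view in the abstract addresses.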
A Cross-Adaptive Dynamic Spectral Panning Technique
This work presents an algorithm that is able to achieve novel spatialization effects on multitrack audio signals. It relies on a cross-adaptive framework that dynamically maps the azimuth positions of each track’s time-frequency bins with the goal of reducing masking between source signals by dynamically separating them across space. The outputs of this system are compared to traditional panning strategies in a subjective evaluation, and the scores indicate that it performs well as a novel effect that can be used in live sound applications and creative sound design or mixing.
Low-Delay Error Concealment with Low Computational Overhead for Audio over IP Applications
A major problem in low-latency Audio over IP transmission is the unpredictable impact of the underlying network, leading to jitter and packet loss. Typically, error concealment strategies are employed at the receiver to counteract audible artifacts produced by missing audio data resulting from these network characteristics. Known concealment methods tend to either achieve unsatisfactory audio quality or incur high computational costs. Hence, this study aims at finding a new low-cost concealment strategy using the simplest possible algorithms. The proposed system essentially consists of a period extraction and alignment module that synthesizes concealment signals from previous data. The audio quality is evaluated in the form of automated measurements using PEAQ. Furthermore, the system’s complexity is analyzed by determining the computational costs of all required modules in all operating modes and comparing the computational load with that of another concealment method based on auto-regressive modeling.
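The general idea of period-based concealment can be sketched in a few lines of Python. The example below estimates a period from previously received samples and fills a lost block by repeating the most recent cycle. The abstract does not say how the period is extracted or aligned; the average magnitude difference function (AMDF) used here, the lag range, and all function names are assumptions made for this sketch.

```python
import math

def estimate_period(history, min_lag, max_lag):
    """Return the lag that minimizes the average magnitude difference (AMDF)."""
    def amdf(lag):
        n = len(history) - lag
        return sum(abs(history[i] - history[i + lag]) for i in range(n)) / n
    return min(range(min_lag, max_lag + 1), key=amdf)

def conceal(history, gap_length, min_lag=20, max_lag=200):
    """Synthesize `gap_length` samples by repeating the last detected cycle.

    The lag range is an illustrative default, not a value from the paper.
    """
    period = estimate_period(history, min_lag, max_lag)
    last_cycle = history[-period:]
    # Starting at the beginning of the last cycle keeps the phase continuous.
    return [last_cycle[i % period] for i in range(gap_length)]

# Periodic test signal: 400 received samples, then a lost block of 100 samples.
history = [math.sin(2 * math.pi * n / 50.0) for n in range(400)]
lost = [math.sin(2 * math.pi * n / 50.0) for n in range(400, 500)]
filled = conceal(history, gap_length=100)
max_error = max(abs(a - b) for a, b in zip(filled, lost))
```

For a strictly periodic signal any multiple of the true period gives an equally good fill, so the sketch checks the reconstruction error rather than the detected lag. Real audio is only quasi-periodic, which is why a practical system also needs an alignment step when the received stream resumes.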
Categorisation of Distortion Profiles in Relation to Audio Quality
Since digital audio is encoded as discrete samples of the audio waveform, much can be said about a recording by the statistical properties of these samples. In this paper, a dataset of CD audio samples is analysed; the probability mass function of each audio clip informs a feature set which describes attributes of the musical recording related to loudness, dynamics and distortion. This allows musical recordings to be classified according to their “distortion character”, a concept which describes the nature of amplitude distortion in mastered audio. A subjective test was designed in which such recordings were rated according to the perception of their audio quality. It is shown that participants can discern between three different distortion characters; ratings of audio quality were significantly different (F(1, 2) = 5.72, p < 0.001, η² = 0.008) as were the words used to describe the attributes on which quality was assessed (χ²(8, N = 547) = 33.28, p < 0.001). This expands upon previous work showing links between the effects of dynamic range compression and audio quality in musical recordings, by highlighting perceptual differences.
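As a small illustration of what a probability mass function of audio samples looks like in practice, the Python sketch below builds an empirical PMF from integer sample values and derives one simple quantity from it. The feature shown (the probability mass at the two extreme sample values, which tends to rise when a signal is hard-clipped) is a hypothetical example chosen for this sketch; the abstract does not list the features actually used.

```python
from collections import Counter

def sample_pmf(samples):
    """Empirical probability mass function of integer PCM sample values."""
    counts = Counter(samples)
    total = len(samples)
    return {value: counts[value] / total for value in sorted(counts)}

def extreme_value_mass(pmf):
    """Hypothetical feature: mass at the lowest and highest values present."""
    lowest, highest = min(pmf), max(pmf)
    if lowest == highest:
        return pmf[lowest]
    return pmf[lowest] + pmf[highest]

# Toy "clip" with a tiny value range; real CD audio spans -32768..32767.
clip = [-3, -3, -1, 0, 1, 3, 3, 3]
pmf = sample_pmf(clip)
mass = extreme_value_mass(pmf)
```

In this toy clip five of the eight samples sit at the extreme values, so `mass` is 0.625; an unclipped recording would normally concentrate its mass near zero instead.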
Morphing of granular sounds
Granular sounds are commonly used in video games, but the conventional approach of using recorded samples does not allow sound designers to modify these sounds. In this paper we present a technique to synthesize granular sound whose tone color lies at an arbitrary point between two given granular sound samples. We first extract grains and noise profiles from the recordings, morph between them, and finally synthesize sound using the morphed data. During sound synthesis a number of parameters, such as the number of grains per second or the loudness distribution of the grains, can be altered to vary the sound. The proposed method not only allows new sounds to be created in real time, it also drastically reduces the memory footprint of granular sounds by reducing a long recording to a few hundred grains, each a few milliseconds long.
Reverberation still in business: Thickening and Propagating micro-textures in physics-based sound modeling
Artificial reverberation is usually introduced as a digital audio effect to give a sense of enclosing architectural space. In this paper we argue for the effectiveness and usefulness of diffusive reverberators in physically-inspired sound synthesis. Examples are given for the synthesis of textural sounds as they emerge from solid mechanical interactions, as well as from aerodynamic and liquid phenomena.