Spherical Decomposition of Arbitrary Scattering Geometries for Virtual Acoustic Environments
A method is proposed to encode the acoustic scattering of objects for virtual acoustic applications through a multiple-input, multiple-output framework. The scattering is encoded as a matrix in the spherical harmonic domain, which can be re-used and manipulated (rotated, scaled, and translated) to synthesize various sound scenes. The proposed method is applied and validated using Boundary Element Method simulations, which show close agreement between the reference and the synthesized results. The method is compatible with existing frameworks such as Ambisonics and image source methods.
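As a rough illustration of the core idea (not the paper's implementation), the scattering operator can be stored as a matrix that maps incident-field spherical harmonic coefficients to scattered-field coefficients; the matrix entries here are random placeholders, whereas the paper obtains them from BEM simulations:

```python
import numpy as np

def sh_size(order):
    # Number of spherical harmonic coefficients up to and including `order`
    return (order + 1) ** 2

def scatter(S, b_incident):
    # Scattered field in the SH domain is a matrix-vector product:
    # the scattering matrix maps incident to scattered SH coefficients
    return S @ b_incident

order = 3
n = sh_size(order)                      # 16 coefficients for order 3
rng = np.random.default_rng(0)
S = 0.1 * rng.standard_normal((n, n))   # placeholder; the paper derives S from BEM
b_in = np.zeros(n)
b_in[0] = 1.0                           # omnidirectional incident field
b_out = scatter(S, b_in)
```

Because the scattering is a linear operator in the SH domain, re-use amounts to transforming `S` (e.g. conjugating it with an SH rotation matrix) rather than re-running the simulation.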
Sound Effects for a Silent Computer System
This paper proposes a sonification of computer system activity that allows the user to monitor basic performance parameters of the system, such as CPU load, hard-disk read and write activity, or network traffic. Although current computer systems still produce acoustic background noise, future and emerging systems will be increasingly optimized with respect to their noise emission. In contrast to most auditory-feedback concepts, which present a particular sound in response to a user's command, the proposed feedback is mediated by the running computer system: the user's interaction stimulates the system, and the resulting feedback therefore offers more realistic information about the system's current performance state. On the one hand, the proposed sonification can mimic the acoustic behavior of the operating components inside a computer system; on the other hand, new sound qualities can be synthesized that enrich interaction with the device. Different forms of sound-effect generation for the proposed auditory feedback are realized and tested in an environment of silent computer systems.
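A minimal sketch of such a parameter mapping, under an entirely hypothetical choice of mapping (the paper explores richer sound-effect designs): a CPU-load value in [0, 1] is mapped to the pitch and level of a short sine burst.

```python
import numpy as np

def sonify_cpu_load(load, sr=8000, dur=0.25):
    # Hypothetical mapping, not the paper's: higher CPU load
    # is rendered as a higher-pitched, louder sine burst
    f = 200.0 + 800.0 * load      # frequency in Hz: 200 (idle) .. 1000 (busy)
    a = 0.1 + 0.9 * load          # linear amplitude
    t = np.arange(int(sr * dur)) / sr
    return a * np.sin(2 * np.pi * f * t)

quiet = sonify_cpu_load(0.05)   # barely audible low tone
busy = sonify_cpu_load(0.95)    # loud high tone
```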
High frequency magnitude spectrogram reconstruction for music mixtures using convolutional autoencoders
We present a new approach to audio bandwidth extension for music signals using convolutional neural networks (CNNs). Inspired by the concept of inpainting from the field of image processing, we seek to reconstruct the high-frequency region (i.e., above a cutoff frequency) of a time-frequency representation given the observation of a band-limited version. We then invert this reconstructed time-frequency representation, using the phase information from the band-limited input, to provide an enhanced musical output. We contrast the performance of two musically adapted CNN architectures that are trained separately on the STFT and on an invertible CQT. Through our evaluation, we demonstrate that the CQT, with its logarithmic frequency spacing, provides better reconstruction performance as measured by the signal-to-distortion ratio.
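A toy sketch of the problem setup, with random data standing in for a real spectrogram: band-limiting zeroes all STFT bins above the cutoff (the region the CNN must inpaint), and the signal-to-distortion ratio measures reconstruction quality.

```python
import numpy as np

def band_limit(spec_mag, sr, cutoff_hz):
    # Zero all STFT bins above the cutoff; reconstructing this missing
    # high-frequency region from the surviving low band is the CNN's task
    freqs = np.linspace(0.0, sr / 2.0, spec_mag.shape[0])
    out = spec_mag.copy()
    out[freqs > cutoff_hz, :] = 0.0
    return out

def sdr_db(ref, est):
    # Signal-to-distortion ratio in dB, the evaluation metric
    return 10.0 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))

sr = 44100
rng = np.random.default_rng(1)
spec = np.abs(rng.standard_normal((1025, 100)))   # toy magnitude spectrogram
limited = band_limit(spec, sr, cutoff_hz=4000.0)
```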
A comparison of music similarity measures for a P2P application
In this paper we compare different methods of computing music similarity between songs. The presented approaches have been reported by other authors in the field, and we implemented them with minor improvements. We evaluated the different methods on a common database of MP3-encoded songs covering different genres, albums, and artists. The best-performing approach from the evaluation was then used in a P2P scenario to compute song profiles and recommendations for similar songs. We describe this integration in the second part of the paper.
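One common family of timbre-similarity measures (a plausible example of the kind compared here, not necessarily the paper's winning method) models each song's MFCC frames as a Gaussian and compares songs by a symmetric Kullback-Leibler divergence; a sketch for diagonal-covariance Gaussians:

```python
import numpy as np

def symmetric_kl(mu1, var1, mu2, var2):
    # Symmetric KL divergence between two diagonal Gaussians; a small
    # value means the two songs have similar feature statistics
    def kl(m1, v1, m2, v2):
        return 0.5 * np.sum(v1 / v2 + (m2 - m1) ** 2 / v2 - 1.0 + np.log(v2 / v1))
    return kl(mu1, var1, mu2, var2) + kl(mu2, var2, mu1, var1)

# Hypothetical 13-dimensional MFCC summary statistics for two songs
mu_a, var_a = np.zeros(13), np.ones(13)
mu_b, var_b = np.full(13, 0.5), np.ones(13)
d_same = symmetric_kl(mu_a, var_a, mu_a, var_a)   # identical models
d_diff = symmetric_kl(mu_a, var_a, mu_b, var_b)   # shifted means
```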
Spatialized audio in a vision rehabilitation game for training orientation and mobility skills
Serious games can be used to train the orientation and mobility skills of visually impaired children and youngsters. Here we present a serious game for training sound localization skills and concepts usually covered in orientation and mobility classes, such as front/back and left/right. In addition, the game helps the players train simple body-rotation mobility skills. The game was designed for touch-screen mobile devices and features a virtual auditory environment created with 3D spatialized audio rendered via head-related transfer functions. The results of a usability test with blind students show that the game can have a positive impact on the players' skills, namely on their motor coordination and localization skills, as well as on their self-confidence.
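The HRTF-based rendering reduces, per virtual direction, to convolving the mono source with a pair of head-related impulse responses. A minimal sketch with toy HRIRs (a real game would load measured HRTF data):

```python
import numpy as np

def binauralize(mono, hrir_l, hrir_r):
    # Render a mono source at a virtual direction by convolving it with
    # the left/right head-related impulse responses for that direction
    return np.convolve(mono, hrir_l), np.convolve(mono, hrir_r)

# Toy HRIRs for a source on the listener's left: earlier and louder
# at the left ear (interaural time and level differences)
hrir_l = np.zeros(32); hrir_l[0] = 1.0
hrir_r = np.zeros(32); hrir_r[8] = 0.5
mono = np.sin(2 * np.pi * 440 * np.arange(512) / 8000)
left, right = binauralize(mono, hrir_l, hrir_r)
```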
Probabilistic Reverberation Model Based on Echo Density and Kurtosis
This article proposes a probabilistic model for synthesizing room impulse responses (RIRs) for use in convolution artificial reverberators. The proposed method is based on the concept of echo density. Echo density is a measure of the number of echoes per second in an impulse response and is a demonstrated perceptual metric of artificial reverberation quality. As echo density is related to the statistical measure of kurtosis, this article demonstrates that the statistics of an RIR can be modeled using a probabilistic mixture model. A mixture model designed specifically for modeling RIRs is proposed. The proposed method is useful for statistically replicating RIRs of a measured environment, thereby synthesizing new independent observations of an acoustic space. A perceptual pilot study is carried out to evaluate the fidelity of the replication process in monophonic and stereo artificial reverberators.
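Echo density can be computed from an RIR with a sliding window: per window, the fraction of taps whose magnitude exceeds the local standard deviation, normalized by the same fraction for Gaussian noise, erfc(1/√2) ≈ 0.317 (this follows the well-known Abel-Huang formulation; whether the paper uses exactly this estimator is an assumption). A fully diffuse, Gaussian-like tail yields values near 1:

```python
import math
import numpy as np

def echo_density_profile(h, win=1024):
    # Normalized echo density: per window, the fraction of taps whose
    # magnitude exceeds the local standard deviation, divided by the
    # expected fraction for Gaussian noise, erfc(1/sqrt(2)) ~= 0.317.
    # Diffuse (Gaussian) reverberation gives values near 1.
    gauss_frac = math.erfc(1.0 / math.sqrt(2.0))
    frames = range(0, len(h) - win + 1, win)
    return np.array([np.mean(np.abs(h[i:i + win]) > np.std(h[i:i + win]))
                     for i in frames]) / gauss_frac

rng = np.random.default_rng(2)
noise_tail = rng.standard_normal(8192)       # stand-in for a diffuse RIR tail
profile = echo_density_profile(noise_tail)   # values should hover near 1
```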
Neural Grey-Box Guitar Amplifier Modelling with Limited Data
This paper combines recurrent neural networks (RNNs) with the discretised Kirchhoff nodal analysis (DK-method) to create a grey-box guitar amplifier model. Both the objective and subjective results suggest that the proposed model outperforms a baseline black-box RNN model in the task of modelling a guitar amplifier, including realistically recreating the behaviour of the amplifier's equaliser circuit, whilst requiring significantly less training data. Furthermore, we adapt the linear part of the DK-method to a deep learning scenario to derive multiple state-space filters simultaneously. We frequency-sample the filter transfer functions in parallel and perform frequency-domain filtering, which considerably reduces the required training times compared to recursive state-space filtering. This study shows that separately modelling the linear and nonlinear parts of a guitar amplifier with supervised learning is a powerful approach.
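The frequency-sampling trick can be illustrated on a single scalar filter (the paper applies it to the DK-method's state-space filters in parallel): sample H(z) = B(z)/A(z) on a zero-padded FFT grid and filter by multiplication, avoiding the sample-by-sample recursion.

```python
import numpy as np

def freq_sampled_filter(b, a, x):
    # Apply an IIR filter H(z) = B(z)/A(z) without recursion: sample the
    # transfer function on an FFT grid and multiply in the frequency
    # domain. Accurate when the zero-padded FFT is long enough for the
    # impulse response to decay (circular-convolution caveat).
    n = 8 * len(x)
    H = np.fft.rfft(b, n) / np.fft.rfft(a, n)
    y = np.fft.irfft(np.fft.rfft(x, n) * H, n)
    return y[:len(x)]

# One-pole lowpass: y[k] = x[k] + 0.5 * y[k-1]  ->  b = [1], a = [1, -0.5]
x = np.zeros(32); x[0] = 1.0                       # unit impulse
y = freq_sampled_filter([1.0], [1.0, -0.5], x)     # should match 0.5**k
```

Because every output sample is computed from one FFT rather than a serial recursion, many such filters can be evaluated in parallel during training.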
Unison Source Separation
In this work we present a new scenario for analyzing and separating linear mixtures of musical instrument signals. When instruments play in unison, traditional source separation methods perform poorly. Although the sources share the same pitch, they often still differ in their modulation behaviour, caused by vibrato and/or tremolo effects. In this paper we propose source separation schemes that exploit these AM/FM characteristics to improve the separation quality of such mixtures. We present a method that processes mixtures based on differences in the amplitude modulation frequencies of the sources using non-negative tensor factorization. Further, we propose an informed warped-time-domain approach for separating mixtures based on variations in the instantaneous frequencies of the sources.
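A simple stand-in for the AM feature that distinguishes unison sources (the paper's tensor factorization operates on richer modulation representations): rectify the signal, smooth away the carrier, and pick the spectral peak of the envelope to estimate the tremolo rate.

```python
import numpy as np

def am_rate(x, sr, smooth=200):
    # Estimate the dominant amplitude-modulation (tremolo) rate:
    # rectify, smooth away the carrier with a moving average,
    # then pick the spectral peak of the residual envelope
    env = np.convolve(np.abs(x), np.ones(smooth) / smooth, mode='same')
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env))
    return np.fft.rfftfreq(len(env), 1.0 / sr)[np.argmax(spec)]

sr = 8000
t = np.arange(sr) / sr                                        # 1 second
carrier = np.sin(2 * np.pi * 440 * t)
tremolo = (1.0 + 0.5 * np.sin(2 * np.pi * 5 * t)) * carrier   # 5 Hz tremolo
rate = am_rate(tremolo, sr)                                   # close to 5 Hz
```

Two unison sources with different tremolo rates would show distinct peaks along this modulation-frequency axis, which is the cue the proposed factorization exploits.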
Hierarchical Gaussian tree with inertia ratio maximization for the classification of large musical instrument databases
Score level timbre transformations of violin sounds
The ability of a sound synthesizer to provide realistic sounds depends to a great extent on the availability of expressive controls. One of the most important expressive features a user of the synthesizer would desire to control is timbre. Timbre is a complex concept related to many musical indications in a score, such as dynamics, accents, hand position, string played, or even indications referring to timbre itself. Musical indications are in turn related to low-level performance controls such as bow velocity or bow force. With the help of a data acquisition system that records sound synchronized with performance controls and aligned to the performed score, and by means of statistical analysis, we model the interrelations among sound (timbre), controls, and musical score indications. In this paper we present a procedure for score-controlled timbre transformations of violin sounds within a sample-based synthesizer. Given a sound sample and its trajectory of performance controls: 1) the controls trajectory is transformed according to the score indications; 2) a new timbre corresponding to the transformed trajectory is predicted by means of a timbre model that relates timbre to performance controls; and 3) the timbre of the original sound is transformed by applying a time-varying filter, calculated frame by frame as the difference between the original and predicted envelopes.
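Step 3 above reduces, per frame, to an envelope-difference filter: a subtraction of spectral envelopes in dB, i.e. a ratio in linear magnitude, applied to the STFT frames. A minimal sketch with placeholder envelopes (the paper predicts the target envelope from its timbre model):

```python
import numpy as np

def envelope_difference_filter(frames_mag, env_orig, env_pred, eps=1e-8):
    # Time-varying filter computed frame by frame as the difference of
    # predicted and original spectral envelopes: a subtraction in dB,
    # equivalently a per-bin gain ratio in linear magnitude
    gain = env_pred / (env_orig + eps)
    return frames_mag * gain

rng = np.random.default_rng(3)
frames = np.abs(rng.standard_normal((64, 10)))   # |STFT|: 64 bins x 10 frames
env_o = np.ones((64, 10))                        # placeholder original envelope
env_p = 2.0 * env_o                              # placeholder predicted envelope (+6 dB)
out = envelope_difference_filter(frames, env_o, env_p)
```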