Polyphonic music analysis by signal processing and support vector machines
In this paper an original system for the analysis of harmony and polyphonic music is introduced. The system is based on signal processing and machine learning. A new fast, multi-resolution analysis method is devised to extract the time-frequency energy spectrum at the signal processing stage, while a support vector machine is used as the machine learning technology. Aiming at the analysis of rather general audio content, experiments are made on a large set of recorded samples, using 19 musical instruments, alone or in combination, at different degrees of polyphony. Experimental results show that fundamental frequencies are detected with a remarkable success ratio and that the method can provide excellent results in general cases.
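The paper's exact multi-resolution method is not reproduced here, but the general idea of extracting magnitude spectra at several time-frequency resolutions and stacking them into one feature vector can be sketched as follows. This is a minimal illustration; the window sizes, the common frequency grid, and the interpolation scheme are all assumptions, not the authors' design:

```python
import numpy as np

def multires_spectrum(x, sr, win_sizes=(512, 1024, 2048)):
    """Compute Hann-windowed magnitude spectra at several window sizes
    and resample each onto a common frequency grid, so that coarse-time
    and coarse-frequency views of the same signal can be stacked."""
    grid = np.linspace(0.0, sr / 2.0, 256)   # shared frequency axis
    bands = []
    for n in win_sizes:
        frame = x[:n] * np.hanning(n)
        mag = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(n, 1.0 / sr)
        bands.append(np.interp(grid, freqs, mag))
    return grid, np.stack(bands)             # shape: (len(win_sizes), 256)

sr = 44100
t = np.arange(4096) / sr
x = np.sin(2 * np.pi * 440.0 * t)            # 440 Hz test tone
grid, feats = multires_spectrum(x, sr)
print(feats.shape)                           # (3, 256)
```

A feature matrix of this kind could then be flattened and fed to an SVM classifier, one decision per candidate fundamental frequency.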
A Framework for Sonification of Vicon Motion Capture Data
This paper describes experiments on sonifying data obtained using the VICON motion capture system. The main goal is to build the necessary infrastructure in order to be able to map motion parameters of the human body to sound. For sonification the following three software frameworks were used: Marsyas, traditionally used for music information retrieval with audio analysis and synthesis; ChucK, an on-the-fly real-time synthesis language; and the Synthesis Toolkit (STK), a toolkit for sound synthesis that includes many physical models of instruments and sounds. An interesting possibility is the use of motion capture data to control parameters of digital audio effects. In order to experiment with the system, different types of motion data were collected. These include traditional performance on musical instruments, acting out emotions, as well as data from individuals having impairments in sensorimotor coordination. Rhythmic motion (e.g. walking), although complex, can be highly periodic and maps quite naturally to sound. We hope that this work will eventually assist patients in identifying and correcting problems related to motor coordination through sound.
Acoustic localization of tactile interactions for the development of novel tangible interfaces
Speech/music discrimination based on a new warped LPC-based feature and linear discriminant analysis
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents a low-complexity but effective approach, which exploits only one simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC). A comparison between WLPC-SC and the classical features proposed in [9] is performed, aiming to assess the good discriminatory power of the proposed feature. The length of the vector describing the proposed psychoacoustically based feature is reduced to a few statistical values (mean, variance and skewness), which are then transformed to a new feature space by applying LDA with the aim of improving classification accuracy. The classification task is performed by applying an SVM to the features in the transformed space. The classification results for different types of music and speech show the good discriminating power of the proposed approach.
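The pipeline described above, per-track statistics of a spectral-centroid feature, an LDA projection, then an SVM, can be sketched with scikit-learn. The warped-LPC centroid computation itself is not reproduced; the synthetic "speech" and "music" centroid tracks below are stand-ins invented for illustration:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def summarize(centroid_track):
    """Reduce a per-frame centroid track to (mean, variance, skewness),
    as the abstract describes for WLPC-SC."""
    c = np.asarray(centroid_track, dtype=float)
    mu, var = c.mean(), c.var()
    skew = ((c - mu) ** 3).mean() / (var ** 1.5 + 1e-12)
    return np.array([mu, var, skew])

rng = np.random.default_rng(0)
# Stand-in data: "speech" centroid tracks fluctuate more than "music" ones.
speech = [rng.normal(0.3, 0.15, 200) for _ in range(50)]
music = [rng.normal(0.5, 0.05, 200) for _ in range(50)]
X = np.array([summarize(t) for t in speech + music])
y = np.array([0] * 50 + [1] * 50)

# LDA projects the 3-D statistics to 1-D; the SVM classifies in that space.
clf = make_pipeline(LinearDiscriminantAnalysis(n_components=1), SVC())
clf.fit(X, y)
print(clf.score(X, y))
```

With two classes LDA can produce at most one discriminant component, which is why `n_components=1` is used here.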
Adaptive Network-Based Fuzzy Inference System for Automatic Speech/Music Discrimination
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents an effective approach based on an Adaptive Network-Based Fuzzy Inference System (ANFIS) for the classification stage required in a speech/music discrimination system. A new simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC), is also proposed. Comparison between WLPC-SC and some of the classical features proposed in [11] is performed, aiming to assess the good discriminatory power of the proposed feature. The length of the vector for describing the proposed psychoacoustic-based feature is reduced to a few statistical values (mean, variance and skewness). To evaluate the performance of the ANFIS system for speech/music discrimination, comparison to other commonly used classifiers is reported. The classification results for different types of music and speech show the good discriminating power of the proposed approach.
j-DAFx - Digital Audio Effects in Java
This paper describes an attempt to provide an online learning platform for digital audio effects. After a comprehensive study of different technologies for presenting multimedia content that reacts dynamically to user input, we decided to use Java Applets. Further investigations address implementation issues, especially the processing and visualization of audio data, and present a general framework used in our department. Recent and future digital effects implemented in this framework can be found on our web site.
Comparing synthetic and real templates for dynamic time warping to locate partial envelope features
In this paper we compare the performance of a number of different templates for the purposes of split-point identification in various clarinet envelopes. These templates were generated with Attack-Decay-Sustain-Release (ADSR) descriptions commonly used in musical synthesis, along with a set of real templates obtained using k-means clustering of manually prepared test data. The goodness of fit of the templates to the data was evaluated using the Dynamic Time Warping (DTW) cost function, and by evaluating the squared distance between the identified split points and the manually identified split points in the test data. It was found that the best templates for split-point identification were the synthetic templates, followed by the real templates, with a sharp attack and release, as is characteristic of the clarinet envelope.
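The template-matching step can be illustrated with a plain dynamic-programming DTW implementation. This is a sketch only: the four-segment ADSR template, its segment lengths, and the synthetic test envelope are made-up values, not the paper's templates or data.

```python
import numpy as np

def dtw_path(template, envelope):
    """Classic DTW: fill the accumulated-cost matrix, then backtrack.
    Returns the total alignment cost and the warping path as
    (template_index, envelope_index) pairs."""
    n, m = len(template), len(envelope)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(template[i - 1] - envelope[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

# A hypothetical ADSR template with a sharp attack and release.
adsr = np.concatenate([np.linspace(0.0, 1.0, 5),    # attack
                       np.linspace(1.0, 0.7, 5),    # decay
                       np.full(20, 0.7),            # sustain
                       np.linspace(0.7, 0.0, 5)])   # release
# A time-stretched copy stands in for a measured clarinet envelope.
env = np.interp(np.linspace(0, 1, 70), np.linspace(0, 1, len(adsr)), adsr)
cost, path = dtw_path(adsr, env)
# Split-point idea: project a template segment boundary (here the end
# of the attack, template index 4) onto the envelope via the path.
attack_end = max(j for i, j in path if i == 4)
```

Because every template index appears on the warping path, each ADSR segment boundary maps to a position in the envelope, which is how split points can be read off the alignment.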
An auditory 3D file manager designed from interaction patterns
This paper presents the design, implementation and evaluation of an auditory user interface for a file-manager application. The prototype was built to validate concepts developed to support user interface designers with design patterns for creating robust and efficient auditory displays. The paper describes the motivation for introducing a mode-independent meta domain in which the design patterns were defined, to overcome the problem of translating mainly visual concepts to the auditory domain. The prototype was implemented using the IEM Ambisonics libraries for Pure Data to produce high-quality binaural audio rendering, and used head tracking and a joystick as the main interaction devices.
A New Functional Framework for a Sound System for Realtime Flight Simulation
We present a new sound framework and concept for realistic flight simulation. Because the system must deal with a highly complex network of mechanical systems acting as physical sound sources, the main focus is on a fully modular, extensible and scalable design. The prototype we developed is part of a fully functional Full Flight Simulator for pilot training.
Performing Expressive Rhythms with BillaBoop Voice-Driven Drum Generator
In previous work we presented a system for transcribing spoken rhythms into a symbolic score. The system was then extended to process the vocal stream in real time, so that a musician can use it as a voice-driven drum generator. The extensions to this work are the following. First, we studied the system's classification accuracy on typical onomatopoeia used in Western beat boxing, with the aim of building a general supervised model for immediate use. Second, we want the user to be able to generate expressive rhythms beyond the symbolic drum representation. We therefore considered a class-specific mapping of continuous vocal-stream descriptors to either effects or synthesis parameters of the drum generator. The extraction of the symbolic drum stream is implemented in the BillaBoop VST Core plug-in. The class-specific mapping and the sound synthesis are carried out in the Plogue Bidule framework. All these components are integrated into a low-latency application suitable for live performance.
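As a toy illustration of class-specific mapping (not the BillaBoop implementation, which uses a supervised model trained on beat-boxing onomatopoeia), one descriptor of an onset frame can pick the drum class while a second descriptor drives a per-class synthesis parameter. The threshold, the class names, and the parameter names below are all hypothetical:

```python
import numpy as np

def classify_and_map(frame):
    """Toy class-specific mapping: a zero-crossing-rate threshold picks
    kick vs. hi-hat, and a per-class continuous descriptor (RMS) is
    routed to a different hypothetical synthesis parameter per class."""
    frame = np.asarray(frame, dtype=float)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0  # crossings/sample
    rms = float(np.sqrt(np.mean(frame ** 2)))
    if zcr < 0.1:                       # low ZCR -> voiced/pitched -> kick
        return "kick", {"amplitude": rms}
    return "hihat", {"decay": 1.0 - min(rms, 1.0)}

sr = 44100
t = np.arange(1024) / sr
kick_like = 0.8 * np.sin(2 * np.pi * 100.0 * t)               # low, voiced
hat_like = 0.3 * np.random.default_rng(1).standard_normal(1024)  # noisy
print(classify_and_map(kick_like)[0], classify_and_map(hat_like)[0])
```

In a real system of this kind, the classifier output would trigger the drum voice while the continuous descriptor stream modulates its effect or synthesis parameters frame by frame.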