Download Nonlinear time series analysis of musical signals
In this work, the techniques of chaotic time series analysis are applied to music. The audio stream from a musical recording is treated as experimental data from a dynamical system. Several performances of well-known classical pieces are analysed using recurrence analysis, stationarity measures, information metrics, and other time-series-based approaches. The benefits of such analysis are reported.
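To make the approach concrete, the sketch below applies one such technique, recurrence analysis, to a toy audio signal. It is a minimal illustration rather than the paper's implementation; the embedding dimension, delay, and distance threshold are arbitrary choices.

    # Minimal recurrence-plot sketch for an audio signal (illustrative
    # parameters, not the paper's): delay-embed the signal, then threshold
    # pairwise distances between the reconstructed state vectors.
    import numpy as np

    def recurrence_matrix(x, dim=3, delay=10, threshold=0.1):
        """Binary recurrence matrix of a 1-D signal via time-delay embedding."""
        n = len(x) - (dim - 1) * delay
        # Delay-embedded trajectory: each row is one reconstructed state vector.
        states = np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])
        # Pairwise Euclidean distances between state vectors.
        dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
        return (dists < threshold * dists.max()).astype(np.uint8)

    # Example: recurrence structure of a decaying two-partial tone.
    t = np.linspace(0, 1, 2000)
    signal = np.exp(-3 * t) * (np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 330 * t))
    R = recurrence_matrix(signal[::4], dim=3, delay=8, threshold=0.1)
    print(R.shape, R.mean())  # recurrence rate: fraction of recurrent pairs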
Download Extraction of long-term structures in musical signals using the empirical mode decomposition
Long-term musical structures provide information concerning rhythm, melody and composition. Although highly relevant musically, these structures are difficult to determine using standard signal processing. In this paper, a new technique based on the time-domain empirical mode decomposition is presented which enables the analysis of both short-term information and long-term structures in musical signals. It provides insight into perceived rhythms and their relationship to the signal. The technique is explained, and results are reported and discussed. Keywords: Empirical Mode Decomposition (EMD), Music Analysis, Santur, Long-term Structures, Fundamental Frequency, Rhythm.
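As a rough illustration of the decomposition itself (not the paper's implementation), the sketch below sifts a toy signal into intrinsic mode functions, assuming the third-party PyEMD package. In this construction the fast IMFs capture the pitched content while the slow IMFs and residue capture the long-term component.

    # Sketch: sift a toy signal into intrinsic mode functions (IMFs) with EMD.
    # Assumes the third-party PyEMD package; the fast IMFs carry the tone,
    # the slow IMFs and residue carry the long-term structure.
    import numpy as np
    from PyEMD import EMD

    fs = 8000
    t = np.arange(0, 2.0, 1.0 / fs)
    # Toy signal: a 440 Hz tone plus a slow 2 Hz component standing in for
    # long-term structure.
    x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 2 * t)

    imfs = EMD()(x)                    # rows: IMF_1 (fastest) ... residue (slowest)
    for k, imf in enumerate(imfs):
        # Mean oscillation rate via zero crossings, as a rough timescale label.
        rate = np.sum(np.abs(np.diff(np.sign(imf))) > 0) / (2 * t[-1])
        print(f"IMF {k + 1}: ~{rate:.1f} Hz mean oscillation rate")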
Download Improved control for selective minimization of masking using inter-channel dependency effects
A digital audio effect for real-time mixing applications, which dynamically adapts to the multi-channel input, has been implemented. The resulting audio mix is the direct result of analysing the content of each individual channel with respect to the other channels. The implementation permits the enhancement of a source with respect to the rest of the mixture by selectively unmasking its spectral content from spectrally related channels. A masking measurement has also been implemented in order to measure the efficiency of the algorithm.
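The sketch below illustrates the underlying idea in a simplified form: measure, band by band, where the rest of the mix dominates a target channel, and derive attenuation gains for the maskers in exactly those bands. The band count, FFT size, activity floor, and maximum cut are illustrative placeholders, not the published algorithm's values.

    # Illustrative sketch (not the paper's algorithm): estimate where a target
    # channel's spectrum is masked by the rest of the mix, and derive per-band
    # gains that duck the maskers in those bands only.
    import numpy as np

    def band_energies(x, fs, n_bands=32, n_fft=4096):
        win = np.hanning(min(len(x), n_fft))
        spec = np.abs(np.fft.rfft(x[:n_fft] * win, n_fft)) ** 2
        edges = np.logspace(np.log10(50), np.log10(fs / 2), n_bands + 1)
        freqs = np.fft.rfftfreq(n_fft, 1 / fs)
        return np.array([spec[(freqs >= lo) & (freqs < hi)].sum() + 1e-12
                         for lo, hi in zip(edges[:-1], edges[1:])])

    def unmasking_gains(target, maskers, fs, max_cut_db=-6.0):
        et = band_energies(target, fs)
        em = sum(band_energies(m, fs) for m in maskers)
        active = et > 0.01 * et.max()          # bands where the target is present
        ratio_db = 10 * np.log10(em / et)      # positive where maskers dominate
        cut_db = np.where(active, np.clip(-ratio_db, max_cut_db, 0.0), 0.0)
        return 10 ** (cut_db / 20)

    # Example: a guitar crowding a vocal around 440 Hz gets cut in that band.
    fs = 44100
    t = np.arange(fs) / fs
    vocal = np.sin(2 * np.pi * 440 * t)
    guitar = 2 * np.sin(2 * np.pi * 445 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)
    g = unmasking_gains(vocal, [guitar], fs)   # per-band gains for the guitar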
Download A Cross-Adaptive Dynamic Spectral Panning Technique
This work presents an algorithm that is able to achieve novel spatialization effects on multitrack audio signals. It relies on a cross-adaptive framework that dynamically maps the azimuth positions of each track’s time-frequency bins with the goal of reducing masking between source signals by dynamically separating them across space. The outputs of the system are compared to traditional panning strategies in a subjective evaluation, where scores indicate that it performs well as a novel effect suited to live sound applications and to creative sound design or mixing.
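A toy version of the idea for a two-track case is sketched below: per time-frequency bin, the more two tracks overlap in magnitude, the further their bins are pushed toward opposite azimuths using constant-power panning. This is an illustration under simplified assumptions, not the published system.

    # Toy sketch (not the published system): where two tracks overlap in a
    # time-frequency frame, push their bins to opposite azimuths with
    # constant-power panning; bins one source clearly owns stay near centre.
    import numpy as np
    from scipy.signal import stft, istft

    def spectral_pan_pair(a, b, fs, max_azimuth=np.pi / 4):
        f, t, A = stft(a, fs, nperseg=2048)
        _, _, B = stft(b, fs, nperseg=2048)
        overlap = np.minimum(np.abs(A), np.abs(B)) / (np.maximum(np.abs(A), np.abs(B)) + 1e-12)
        theta = overlap * max_azimuth          # more overlap -> wider separation
        # Constant-power pan per TF bin: A toward the left, B toward the right.
        La, Ra = A * np.cos(np.pi / 4 - theta), A * np.sin(np.pi / 4 - theta)
        Lb, Rb = B * np.cos(np.pi / 4 + theta), B * np.sin(np.pi / 4 + theta)
        _, left = istft(La + Lb, fs, nperseg=2048)
        _, right = istft(Ra + Rb, fs, nperseg=2048)
        return left, right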
Download Automatic subgrouping of multitrack audio
Subgrouping is a mixing technique where the outputs of a subset of audio tracks in a multitrack are summed to a single audio bus. This is done so that the mix engineer can apply signal processing to an entire subgroup, speed up the mix workflow and manipulate a number of audio tracks at once. In this work, we investigate which audio features from a set of 159 can be used to automatically subgroup multitrack audio. We determine a subset of audio features from the original 159 audio features to use for automatic subgrouping, by performing feature selection using a Random Forest classifier on a dataset of 54 individual multitracks. We show that by using agglomerative clustering on 5 test multitracks, the entire set of audio features incorrectly clusters 35.08% of the audio tracks, while the subset of audio features incorrectly clusters only 7.89% of the audio tracks. Furthermore, we also show that using the entire set of audio features, ten incorrect subgroups are created. However, when using the subset of audio features, only five incorrect subgroups are created. This indicates that our reduced set of audio features provides a significant increase in classification accuracy for the creation of subgroups automatically.
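The pipeline can be sketched with standard scikit-learn components, as below. The random data, label scheme, number of retained features, and cluster count are placeholders; only the structure (Random Forest feature ranking followed by agglomerative clustering of a test multitrack) mirrors the described method.

    # Sketch of the described pipeline (features, labels, and thresholds are
    # placeholders): rank features with a Random Forest, keep the strongest,
    # then agglomeratively cluster tracks of a new multitrack into subgroups.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.cluster import AgglomerativeClustering

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 159))          # 159 features per training track
    y_train = rng.integers(0, 4, size=200)         # instrument-group labels

    forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)
    keep = np.argsort(forest.feature_importances_)[::-1][:20]   # top-20 features

    X_test = rng.normal(size=(12, 159))            # tracks of one test multitrack
    labels = AgglomerativeClustering(n_clusters=4).fit_predict(X_test[:, keep])
    print(labels)                                  # one subgroup index per track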
Download An Evaluation of Audio Feature Extraction Toolboxes
Audio feature extraction underpins a massive proportion of audio processing, music information retrieval, audio effect design and audio synthesis. Design, analysis, synthesis and evaluation often rely on audio features, but a large and diverse range of feature extraction tools is available to the community. An evaluation of existing audio feature extraction libraries was undertaken. Ten libraries and toolboxes were evaluated with the Cranfield Model for evaluation of information retrieval systems, reviewing the coverage, effort, presentation and time lag of a system. These tools are compared, and example use cases are presented indicating when each toolbox is most suitable. This paper allows a software engineer or researcher to quickly and easily select a suitable audio feature extraction toolbox.
Download Latent Force Models for Sound: Learning Modal Synthesis Parameters and Excitation Functions from Audio Recordings
Latent force models are a Bayesian learning technique that combines physical knowledge with dimensionality reduction: sets of coupled differential equations are modelled via shared dependence on a low-dimensional latent space. Analogously, modal sound synthesis is a technique that links physical knowledge about the vibration of objects to acoustic phenomena that can be observed in data. We apply latent force modelling to sinusoidal models of audio recordings, simultaneously inferring modal synthesis parameters (stiffness and damping) and the excitation or contact force required to reproduce the behaviour of the observed vibrational modes. Exposing this latent excitation function to the user constitutes a controllable synthesis method that runs in real time and enables sound morphing through interpolation of learnt parameters.
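The generative half of the model can be sketched as a bank of damped modal resonators driven by a shared excitation signal, as below. In the paper the per-mode stiffness and damping and the latent excitation are inferred from audio via Bayesian learning; here they are hand-picked, so the sketch shows only the synthesis side.

    # Sketch of the synthesis half only: a bank of damped modal resonators
    # driven by a shared excitation function. The per-mode parameters and the
    # excitation below are hand-picked, not inferred as in the paper.
    import numpy as np

    def modal_bank(excitation, freqs, decays, gains, fs=44100):
        """Sum of two-pole resonators: one mode per (frequency, decay, gain)."""
        out = np.zeros_like(excitation)
        for f0, d, g in zip(freqs, decays, gains):
            r = np.exp(-d / fs)                 # per-sample damping factor
            a1, a2 = 2 * r * np.cos(2 * np.pi * f0 / fs), -r * r
            y1 = y2 = 0.0
            y = np.empty_like(excitation)
            for n, fn in enumerate(excitation):
                y0 = fn + a1 * y1 + a2 * y2     # driven resonator recurrence
                y[n] = y0
                y1, y2 = y0, y1
            out += g * y
        return out

    fs = 44100
    force = np.zeros(fs)                        # 1 s; a short strike as excitation
    force[:32] = np.hanning(64)[:32]
    tone = modal_bank(force, freqs=[220, 440, 660], decays=[4, 6, 9],
                      gains=[1, 0.5, 0.3], fs=fs)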
Download Investigation of a Drum Controlled Cross-adaptive Audio Effect for Live Performance
Electronic music often uses dynamic and synchronized digital audio effects that cannot easily be recreated in live performances. Cross-adaptive effects provide a simple solution to such problems, since they can use multiple feature inputs to control dynamic variables in real time. We propose a generic scheme for cross-adaptive effects in which onset detection on a drum track dynamically triggers effects on other tracks. This allows a percussionist to orchestrate effects across multiple instruments during a performance. We describe the general structure, which includes an onset detection and feature extraction algorithm, envelope and LFO synchronization, and an interface that lets the user associate different effects to be triggered depending on the cue from the percussionist. Subjective evaluation is performed based on use in live performance. Implications for music composition and performance are also discussed. Keywords: Cross-adaptive digital audio effects, live processing, real-time control, Csound.
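A minimal version of the scheme is sketched below: frame-wise energy on the drum track provides a crude onset detector, each onset retriggers a decaying envelope, and that envelope modulates an effect parameter (here, tremolo depth) on another track. Frame size, threshold, and decay rate are illustrative, and the published system's Csound implementation and feature set are not reproduced.

    # Generic sketch of the scheme (parameters are illustrative): energy-based
    # onset detection on a drum track retriggers a decaying envelope that
    # modulates an effect parameter on another track, frame by frame.
    import numpy as np

    def onset_envelope(drums, fs, frame=512, threshold=2.0, decay=0.92):
        frames = drums[: len(drums) // frame * frame].reshape(-1, frame)
        energy = (frames ** 2).mean(axis=1)
        env, value = np.zeros(len(energy)), 0.0
        for i in range(1, len(energy)):
            if energy[i] > threshold * energy[i - 1]:   # crude onset test
                value = 1.0                             # retrigger envelope
            env[i] = value
            value *= decay
        return np.repeat(env, frame)                    # per-sample control signal

    # Example: drive a tremolo depth on a synth track from the drum onsets.
    fs = 44100
    drums = np.random.default_rng(1).normal(0, 0.05, fs)
    drums[2048::11025] = 5.0                            # four hits over one second
    synth = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)
    depth = onset_envelope(drums, fs)
    wet = synth[: len(depth)] * (1 - 0.8 * depth)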
Download Physically Derived Synthesis Model of a Cavity Tone
The cavity tone is the sound generated when air flows over the open surface of a cavity and a number of physical conditions are met. Equations obtained from fluid dynamics and aerodynamics research are utilised to produce authentic cavity tones without the need to solve complex computations. Synthesis is performed with a physical model in which the geometry of the cavity enters the sound synthesis calculations directly. The model operates in real time, making it ideal for integration within a game or virtual reality environment. Evaluation is carried out by comparing the output of our model to previously published experimental, theoretical and computational results. Results show an accurate implementation of theoretical acoustic intensity and sound propagation equations as well as very good frequency predictions.
NOMENCLATURE
c = speed of sound (m/s)
f = frequency (Hz)
ω = angular frequency, ω = 2πf (rad/s)
u = air flow speed (m/s)
Re = Reynolds number (dimensionless)
St = Strouhal number (dimensionless)
r = distance between listener and sound source (m)
φ = elevation angle between listener and sound source
ϕ = azimuth angle between listener and sound source
ρair = mass density of air (kg m−3)
µair = dynamic viscosity of air (Pa s)
M = Mach number, M = u/c (dimensionless)
L = length of cavity (m)
d = depth of cavity (m)
b = width of cavity (m)
κ = wave number, κ = ω/c (m−1)
δ = shear layer thickness (m)
δ* = effective shear layer thickness (m)
δ0 = shear layer thickness at edge separation (m)
θ0 = shear layer momentum thickness at edge separation (m)
C2 = pressure coefficient (dimensionless)
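For frequency prediction, cavity tones are commonly modelled with Rossiter's semi-empirical equation, f_m = (u/L)(m − α)/(M + 1/κ_v), with the usual empirical constants α ≈ 0.25 and κ_v ≈ 0.57 (the vortex convection speed ratio, distinct from the wave number κ above). The sketch below evaluates the first few modes; this is the standard aeroacoustic relation, not necessarily the paper's exact formulation.

    # Frequency-prediction sketch using Rossiter's semi-empirical equation for
    # cavity tones: f_m = (u / L) * (m - alpha) / (M + 1 / kappa_v).
    # alpha = 0.25 and kappa_v = 0.57 are the usual empirical constants;
    # kappa_v is the vortex convection speed ratio, not the wave number kappa.

    c = 343.0          # speed of sound (m/s)
    u = 60.0           # free-stream flow speed (m/s)
    L = 0.04           # cavity length (m)
    M = u / c          # Mach number
    alpha, kappa_v = 0.25, 0.57

    for m in (1, 2, 3):                       # first three Rossiter modes
        f = (u / L) * (m - alpha) / (M + 1.0 / kappa_v)
        print(f"mode {m}: {f:.0f} Hz  (St = {f * L / u:.2f})")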
Download Unsupervised Taxonomy of Sound Effects
Sound effect libraries are commonly used by sound designers in a range of industries. Taxonomies exist for the classification of sounds into groups based on subjective similarity, sound source or common environmental context. However, these taxonomies are not standardised, and no taxonomy based purely on the sonic properties of audio exists. We present a method using feature selection, unsupervised learning and hierarchical clustering to develop an unsupervised taxonomy of sound effects based entirely on the sonic properties of the audio within a sound effect library. The unsupervised taxonomy is then related back to the perceived meaning of the relevant audio features.
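The clustering stage can be sketched with standard SciPy tools, as below: standardize a feature matrix, build a Ward linkage, and cut the resulting tree at successive depths to read off coarse-to-fine taxonomy levels. The feature values and group counts here are placeholders, not those of the paper.

    # Sketch of the clustering stage (features are placeholders): standardize
    # a feature matrix for a sound effect library, build an agglomerative
    # linkage, and cut it at successive depths to obtain taxonomy levels.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.stats import zscore

    rng = np.random.default_rng(2)
    features = rng.normal(size=(40, 8))        # 40 effects x 8 selected features
    Z = linkage(zscore(features, axis=0), method="ward")

    for n_groups in (2, 4, 8):                 # coarse-to-fine taxonomy levels
        labels = fcluster(Z, t=n_groups, criterion="maxclust")
        print(n_groups, "groups:", labels[:10], "...")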