Time mosaics - An image processing approach to audio visualization

Jessie Xin Zhang; Jacqueline L. Whalley; Stephen Brooks
DAFx-2008 - Espoo
This paper presents a new approach to the visualization of monophonic audio files that simultaneously illustrates general audio properties and the component sounds that comprise a given input file. This approach represents sound clip sequences using archetypal images which are subjected to image processing filters driven by audio characteristics such as power, pitch and signalto-noise ratio. Where the audio is comprised of a single sound it is represented by a single image that has been subjected to filtering. Heterogeneous audio files are represented as a seamless image mosaic along a time axis where each component image in the mosaic maps directly to a discovered component sound. To support this, in a given audio file, the system separates individual sounds and reveals the overlapping period between sound clips. Compared with existing visualization methods such as oscilloscopes and spectrograms, this approach yields more accessible illustrations of audio files, which are suitable for casual and nonexpert users. We propose that this method could be used as an efficient means of scanning audio database queries and navigating audio databases through browsing, since the user can visually scan the file contents and audio properties simultaneously.
Download