Generating similarity-based playlists using traveling salesman algorithms
When using a mobile music player on the move, the user can usually pay little attention to operating it. Nonetheless, it is desirable that all music stored on the device can be accessed quickly and that tracks played in sequence match up. In this paper, we present an approach to satisfy these constraints: a playlist containing all tracks stored in the music player is generated such that, on average, consecutive pieces are maximally similar. This is achieved by applying a Traveling Salesman algorithm to the pieces, using timbral similarities as the distances. The generated playlist is linear and circular, so the whole collection can easily be browsed with a single input wheel. When a chosen track finishes playing, the player advances to the next tracks in the playlist, generally playing tracks similar to the chosen one. This behavior could be a favorable alternative to the well-known shuffle function offered by most current devices, such as the iPod shuffle. We evaluate the fitness of four different Traveling Salesman algorithms for this purpose. The evaluated aspects are runtime, the length of the resulting route, and the genre distribution entropy. We implemented a Java applet to demonstrate the application and its usability.
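The playlist idea can be illustrated with a minimal sketch. It assumes a symmetric matrix of pairwise track distances and uses a greedy nearest-neighbor heuristic, which is chosen here only for brevity and is not necessarily one of the four algorithms evaluated in the paper. The names `nearest_neighbor_playlist`, `tour_length`, and `toy_dist` are hypothetical.

```python
def nearest_neighbor_playlist(dist, start=0):
    """Return a track order that visits every track once, greedily by distance."""
    unvisited = set(range(len(dist)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        last = order[-1]
        # Pick the closest unvisited track; ties resolve to the lowest index.
        nxt = min(sorted(unvisited), key=lambda j: dist[last][j])
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def tour_length(dist, order):
    """Length of the circular tour, including the hop from the last track back to the first."""
    n = len(order)
    return sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))


# Toy distances (made up): tracks 0 and 1 are similar, tracks 2 and 3 are similar.
toy_dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
playlist = nearest_neighbor_playlist(toy_dist)
print(playlist, tour_length(toy_dist, playlist))
```

Because the tour is circular, a player could start at any position in `playlist` and wrap around, which matches the single-wheel browsing described above.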
Automatic Music Detection in Television Productions
This paper presents methods for the automatic detection of music within audio streams, whether in the foreground or the background. The problem occurs in the context of a real-world application, namely the analysis of TV productions with respect to their use of music. In contrast to plain speech/music discrimination, detecting music in TV productions is extremely difficult, since music is often used to accentuate scenes while speech and all kinds of noise may be present at the same time. We present results of extensive experiments with a set of standard machine learning algorithms and standard features, investigate the difference between frame-level and clip-level features, and demonstrate the importance of applying smoothing functions as a post-processing step. Finally, we propose a new feature, called Continuous Frequency Activation (CFA), especially designed for music detection, and show experimentally that this feature is more precise than the other approaches in identifying segments with music in audio streams.
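The role of smoothing as a post-processing step can be sketched as follows. The abstract does not say which smoothing functions were used, so this example assumes a simple sliding-window majority vote over binary per-frame decisions; `smooth_labels`, the window size, and the sample labels are all hypothetical.

```python
def smooth_labels(labels, window=5):
    """Majority-vote smoothing of binary per-frame music (1) / no-music (0) labels."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        # The window is truncated at the start and end of the sequence.
        segment = labels[max(0, i - half): i + half + 1]
        # A frame counts as music if more than half of its neighbourhood does.
        smoothed.append(1 if 2 * sum(segment) > len(segment) else 0)
    return smoothed


# Made-up classifier output: one isolated false alarm and one dropout inside a music segment.
raw = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1]
clean = smooth_labels(raw, window=5)
print(clean)
```

In this toy input the isolated positive is removed and the gap inside the music segment is filled, at the cost of shifting the segment boundary by one frame.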
Informed Selection of Frames for Music Similarity Computation
In this paper we present a new method to compute frame-based audio similarities, based on nearest neighbour density estimation. We do not recommend it as a practical method for large collections because of its high runtime. Rather, we use this new method for a detailed analysis to gain deeper insight into how a bag-of-frames (BOF) approach determines similarities among songs and, in particular, to identify those audio frames that make two songs similar from a machine's point of view. Our analysis reveals that audio frames of very low energy, which are of course not the most salient with respect to human perception, have a surprisingly large influence on current similarity measures. Based on this observation, we propose to remove these low-energy frames before computing song models and show, via classification experiments, that the proposed frame selection strategy improves the audio similarity measure.
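The frame selection step can be sketched as a filter applied before any song model is built. The abstract gives no threshold, so this example assumes a cutoff relative to the loudest frame; `frame_energy`, `drop_low_energy_frames`, and `rel_threshold` are hypothetical names and values.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def drop_low_energy_frames(frames, rel_threshold=0.05):
    """Keep only frames whose energy is at least rel_threshold times the maximum energy."""
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    # Assumption: the cutoff is relative to the loudest frame of the same song.
    cutoff = rel_threshold * max(energies)
    return [f for f, e in zip(frames, energies) if e >= cutoff]


# Made-up frames: two with clear signal content, two that are nearly silent.
frames = [
    [0.5, -0.4, 0.6, -0.5],
    [0.01, -0.02, 0.01, 0.0],
    [0.3, 0.2, -0.3, -0.2],
    [0.0, 0.0, 0.001, 0.0],
]
kept = drop_low_energy_frames(frames)
print(len(kept), "of", len(frames), "frames kept")
```

Only the frames in `kept` would then be passed on to feature extraction and song modelling.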
A High-Level Audio Feature for Music Retrieval and Sorting
We describe an audio analysis method to create a high-level audio annotation, expressed as a single scalar. Typically, low values of this feature indicate songs with dominant harmonic elements while high values indicate the dominance of mainly percussive or drum-like sounds. The proposed feature is based on a simple idea: Filters known from image processing are used to extract attack and harmonic parts of the spectrum, and the ratio of their overall strengths is used as the final feature. The feature takes values in the unit range, and is highly independent of the overall loudness. We present a number of experiments that indicate the potential of the proposed feature. A suggested application scenario is to write the feature value into the comments field of an audio file, so that it can be used by a number of existing audio players in conjunction with metadata-based search mechanisms, most notably genre.
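A rough sketch of such a ratio feature follows. The abstract does not name the image-processing filters, so this example assumes one-dimensional median filters: smoothing along time keeps sustained harmonic components, and smoothing along frequency keeps broadband attacks. The function names, the filter size, and the toy spectrograms are hypothetical.

```python
from statistics import median


def median_filter_1d(values, size=3):
    """Median filter with the window truncated at both ends of the sequence."""
    half = size // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def percussiveness(spec, size=3):
    """spec[f][t] is a non-negative magnitude at frequency bin f and time frame t."""
    n_bins, n_frames = len(spec), len(spec[0])
    # Smoothing along time keeps horizontal ridges (sustained, harmonic content).
    harmonic = [median_filter_1d(row, size) for row in spec]
    # Smoothing along frequency keeps vertical ridges (broadband attacks).
    attack = [median_filter_1d([spec[f][t] for f in range(n_bins)], size)
              for t in range(n_frames)]
    h_total = sum(sum(row) for row in harmonic)
    p_total = sum(sum(col) for col in attack)
    total = h_total + p_total
    # The ratio is undefined for silence; 0.5 is an arbitrary neutral choice.
    return p_total / total if total > 0 else 0.5


# Toy 5x5 spectrograms: a steady tone (one active bin) and a click (one active frame).
tone = [[1.0 if f == 2 else 0.0 for t in range(5)] for f in range(5)]
click = [[1.0 if t == 2 else 0.0 for t in range(5)] for f in range(5)]
mixed = [[1.0 if (f == 2 or t == 2) else 0.0 for t in range(5)] for f in range(5)]
louder = [[4.0 * v for v in row] for row in mixed]
print(percussiveness(tone), percussiveness(click), percussiveness(mixed))
```

Since both totals scale with the input, multiplying the spectrogram by a constant leaves the ratio unchanged, which mirrors the loudness independence claimed for the feature.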
Fusing Block-level Features for Music Similarity Estimation
In this paper we present a novel approach to computing music similarity based on block-level features. We first introduce three novel block-level features: the Variance Delta Spectral Pattern (VDSP), the Correlation Pattern (CP) and the Spectral Contrast Pattern (SCP). Then we describe how to combine the extracted features into a single similarity function. A comprehensive evaluation based on genre classification experiments shows that the combined block-level similarity measure (BLS) is comparable, in terms of quality, to the best current method from the literature. However, BLS has the important advantage of being based on a vector space representation, which directly facilitates a number of useful operations, such as PCA, k-means clustering and visualization. We also show that there is still potential for further improvement of music similarity measures by combining BLS with another state-of-the-art algorithm; the combined algorithm then outperforms all other algorithms in our evaluation. Additionally, we discuss the problem of album and artist effects in the context of similarity-based recommendation and show that one can detect the presence of such effects in a given dataset by analyzing the nearest neighbor classification results.