Quad-Based Audio Fingerprinting Robust to Time and Frequency Scaling

Reinhard Sonnleitner; Gerhard Widmer
DAFx-2014 - Erlangen
We propose a new audio fingerprinting method that adapts findings from the field of blind astrometry to define simple, efficiently representable characteristic feature combinations called quads. Based on these, we describe an audio identification algorithm that is robust to large amounts of noise and to speed, tempo, and pitch-shifting distortions. In addition to reliably identifying audio queries modified in this way, it also accurately estimates the scaling factors of the applied time/frequency distortions. We experimentally evaluate the performance of the method for a diverse set of noise, speed, and tempo modifications, and identify a number of advantages of the new method over a recently published distortion-invariant audio copy detection algorithm.