Retrieval Oriented Robust Audio Hashing
A robust audio hashing scheme is proposed for content-based audio retrieval (CBAR) applications. First, the audio is divided into fixed-length frames, and the low-frequency and high-frequency components of each frame are obtained by a three-level lifting-based wavelet transform. Second, using non-negative matrix factorization (NMF), each audio frame is approximately represented as the product of a base matrix and an encoding (coefficient) matrix. Finally, the sum of each column of the coefficient matrix is calculated and quantized to produce one bit of the hash sequence. Experimental results show that the proposed scheme is robust against MP3 compression, Real compression, filtering, amplitude compression, equalization, echo, and similar operations. It is insensitive to small local changes and is suitable for distinguishing different audio clips.
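The steps named in the abstract (framing, three-level lifting wavelet transform, NMF, column sums, one-bit quantization) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. The abstract does not give the wavelet filter, how the frame is arranged into a non-negative matrix, the NMF rank, or the quantization rule, so everything below is an assumption: a Haar lifting step, the absolute low-frequency coefficients reshaped into an 8-row matrix, rank-2 NMF with multiplicative updates, and a median threshold on the column sums. All function names and parameters (`audio_hash`, `frame_len`, `rows`, `rank`, `bit_error_rate`) are hypothetical.

```python
import numpy as np


def haar_lifting_step(x):
    """One Haar lifting level: split into even/odd, predict, update."""
    even, odd = x[0::2], x[1::2]
    detail = odd - even            # predict step: high-frequency part
    approx = even + detail / 2.0   # update step: pairwise mean, low-frequency part
    return approx, detail


def wavelet_3level(frame):
    """Three lifting levels; returns (a3, [d3, d2, d1])."""
    approx, details = frame, []
    for _ in range(3):
        approx, detail = haar_lifting_step(approx)
        details.insert(0, detail)
    return approx, details


def nmf(V, rank, iters=200, seed=0):
    """Basic multiplicative-update NMF: V is approximated by W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3   # base matrix
    H = rng.random((rank, V.shape[1])) + 1e-3   # coefficient matrix
    eps = 1e-9                                  # avoids division by zero
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def audio_hash(signal, frame_len=1024, rows=8, rank=2):
    """Hash a 1-D signal; a trailing partial frame is dropped."""
    assert frame_len % 8 == 0 and (frame_len // 8) % rows == 0
    bits = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = np.asarray(signal[start:start + frame_len], dtype=float)
        a3, _ = wavelet_3level(frame)
        # Assumption: build the non-negative matrix from |a3| only.
        V = np.abs(a3).reshape(rows, -1)
        _, H = nmf(V, rank)
        sums = H.sum(axis=0)                    # one sum per column of H
        # Assumption: quantize each sum against the frame's median.
        bits.append((sums > np.median(sums)).astype(np.uint8))
    return np.concatenate(bits) if bits else np.zeros(0, dtype=np.uint8)


def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(h1 != h2))
```

With these assumed parameters, each 1024-sample frame yields 16 hash bits. In a retrieval setting, two clips would be compared by calling `bit_error_rate` on their hashes and checking the result against a threshold chosen empirically.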
Edited by: Donald C. Wunsch II, Honghua Tan, Dehuai Zeng, Qi Luo
D. L. Cui and J. L. Zuo, "Retrieval Oriented Robust Audio Hashing", Advanced Materials Research, Vols. 121-122, pp. 854-859, 2010