A Resilient Novel Compressed-Domain Audio Recognition Method for Anti-Linear Speed Change

Abstract:

Audio fingerprinting plays an increasingly important role in audio content identification, audio information security, industrial process monitoring, and related areas. Because compressed formats have become the main way to store and transmit audio files, extracting audio fingerprints directly from the compressed domain is of greater practical significance. In general, existing compressed-domain audio fingerprinting schemes are robust to common time-frequency-domain distortions, including noise, echo, band-pass filtering, and MP3 compression at 32 kbps. However, they have difficulty handling large linear speed changes, a form of audio processing used frequently in television and broadcasting. This paper proposes a novel compressed-domain audio recognition algorithm that resists linear speed changes in the range of -10% to +10% (with a recognition rate above 90%) by extracting the fingerprint after applying a Fourier-Mellin transform to the sub-band energy sequence of the MDCT spectrum. This range is sufficient to cover almost all cases of audio acceleration or deceleration that occur in commercial applications. In other respects, its performance is comparable to that of the best existing compressed-domain audio recognition algorithms.
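
The abstract does not spell out how the Fourier-Mellin step is computed, so the following is only a minimal sketch of the general property it relies on, not the authors' algorithm. It assumes a synthetic one-dimensional test sequence (the hypothetical `bump` function) in place of real MDCT sub-band energies, and all function names are illustrative. The idea shown: resampling a sequence on a logarithmic time axis turns a speed change into a shift, and the magnitude of a Fourier transform is largely insensitive to shifts.

```python
import cmath
import math


def log_resample(x, m):
    """Linearly interpolate x (sampled at t = 1..len(x)) onto m points
    spaced uniformly in log(t), so that time scaling becomes a shift."""
    step = math.log(len(x)) / m
    out = []
    for k in range(m):
        pos = math.exp(k * step) - 1.0  # fractional index into x
        i = int(pos)
        frac = pos - i
        out.append(x[i] * (1.0 - frac) + x[i + 1] * frac)
    return out


def dft_magnitude(x):
    """Magnitude of the discrete Fourier transform (direct O(n^2) form)."""
    n = len(x)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x)))
        for k in range(n)
    ]


def mellin_descriptor(x, m=128):
    """Fourier magnitude of the log-time resampled sequence."""
    return dft_magnitude(log_resample(x, m))


def rel_dist(a, b):
    """Euclidean distance between a and b, relative to the norm of a."""
    num = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return num / math.sqrt(sum(p * p for p in a))


def bump(t):
    """Hypothetical smooth test signal (an assumption, not MDCT data)."""
    return math.exp(-0.5 * ((math.log(t) - 3.0) / 0.4) ** 2)


N = 256
original = [bump(t) for t in range(1, N + 1)]
faster = [bump(1.1 * t) for t in range(1, N + 1)]  # played 10% faster

d_raw = rel_dist(original, faster)
d_mellin = rel_dist(mellin_descriptor(original), mellin_descriptor(faster))
print("raw distance:", d_raw)
print("Fourier-Mellin magnitude distance:", d_mellin)
```

In this toy setting the descriptor distance comes out much smaller than the distance between the raw sequences, which is the behaviour a speed-change-resistant fingerprint needs. How the paper quantizes such a descriptor into fingerprint bits is not stated in the abstract and is not modelled here.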

Info:

Pages:

613-618

Online since:

August 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
