Discriminative Minimum Statistics Projection Coefficient Feature for Acoustic Context Recognition

Abstract:

Acoustic environment recognition, which provides important acoustic context, is widely used in many applications but remains a considerably difficult problem in real-life, complex environments. This paper proposes a discriminative minimum statistics projection coefficient (MSPC) feature that incorporates class information through partial least squares (PLS) analysis. With the minimum statistics (MS) tracked from the input sound, the discriminative MSPC feature is extracted by projecting the MS onto a lower-dimensional feature subspace learned by PLS. Based on the proposed feature, acoustic environment recognition is implemented with a Gaussian mixture model (GMM) for each sound class. The experimental results show that the proposed PLS-based discriminative MSPC feature outperforms the MSPC feature based on principal component analysis (PCA) for acoustic environment recognition.

Info:

Pages: 304-309

Online since: October 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved

Citation:

[1] A. K. Dey: Understanding and using context, Personal and ubiquitous computing, vol. 5, no. 1, pp.4-7, (2001).

[2] J. J. Aucouturier, Y. Nonaka, K. Katahira and K. Okanoya: Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models, J. Acoust. Soc. Amer, vol. 130, no. 5, pp.2969-2977, (2011).

DOI: 10.1121/1.3641377

[3] J. Pineau, M. Montemerlo, M. Pollack, N. Roy and S. Thrun: Towards robotic assistants in nursing homes: Challenges and results, Special Iss. Socially Interactive Robots, Robot., Autonomous Syst., vol. 42, no. 3-4, pp.271-281, (2003).

DOI: 10.1016/s0921-8890(02)00381-0

[4] A. Kalmbach, Y. Girdhar and G. Dudek: Unsupervised Environment Recognition and Modeling using Sound Sensing, in Proc. Robotics and Automation, (2013), pp.2699-2704.

DOI: 10.1109/icra.2013.6630948

[5] S. Chu, S. Narayanan, C. -C. J. Kuo and M. J. Mataric: Where am I? Scene recognition for mobile robots using audio features, in Proc. ICME, (2006), pp.885-888.

DOI: 10.1109/icme.2006.262661

[6] T. Heittola, A. Mesaros, A. Eronen and T. Virtanen: Context-dependent sound event detection, EURASIP Journal on Audio, Speech, and Music Processing, (2013).

DOI: 10.1186/1687-4722-2013-1

[7] R. Cai, L. Lu, A. Hanjalic, H. Zhang and L. -H. Cai: A flexible framework for key audio effects detection and auditory context inference, IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp.1026-1039, (2006).

DOI: 10.1109/tsa.2005.857575

[8] R. Cai, L. Lu and A. Hanjalic: Co-clustering for auditory scene categorization, IEEE Trans. on Multimedia, vol. 18, no. 6, pp.596-606, (2008).

DOI: 10.1109/tmm.2008.921739

[9] A. J. Eronen, V. T. Peltonen, J. T. Tuomi, A. P. Klapuri, S. Fagerlund, T. Sorsa, G. Lorho and J. Huopaniemi: Audio-Based context recognition, IEEE Trans. on Audio, Speech, and Language Processing, vol. 14, no. 1, pp.321-329, (2006).

DOI: 10.1109/tsa.2005.854103

[10] L. Ma, B. Milner and D. Smith: Acoustic environment classification, ACM Trans. Speech Lang. Process., vol. 3, no. 2, pp.1-22, (2006).

DOI: 10.1145/1149290.1149292

[11] S. Chu, S. Narayanan and C. -C. Jay Kuo: Environmental sound recognition with time-frequency audio features, IEEE Trans. on Audio, Speech, and Language Processing, vol. 17, no. 6, pp.1142-1158, (2009).

DOI: 10.1109/tasl.2009.2017438

[12] R. Mogi and H. Kasai: Noise-robust environmental sound classification method based on combination of ICA and MP features, Artificial Intelligence Research, vol. 2, no. 1, pp.107-121, (2013).

DOI: 10.5430/air.v2n1p107

[13] S. -W. Deng, J. -Q. Han, C. -Z. Zhang, T. -R. Zheng and G. -B. Zheng: Robust minimum statistics project coefficients feature for acoustic environment recognition, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2014).

DOI: 10.1109/icassp.2014.6855206

[14] B. V. Srinivasan, Y. -C. Luo, G. -R. Daniel, D. N. Zotkin and R. Duraiswami: A symmetric kernel partial least squares framework for speaker recognition, IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 7, pp.1415-1423, (2013).

DOI: 10.1109/tasl.2013.2253096

[15] Q. Wang, F. Chen, W. Xu and M. H. Yang: Object tracking via partial least squares analysis, IEEE Transactions on Image Processing, vol. 21, no. 10, pp.4454-4465, (2012).

DOI: 10.1109/tip.2012.2205700

[16] R. Martin: Noise power spectral density estimation based on optimal smoothing and minimum statistics, IEEE Trans. on Speech and Audio Processing, vol. 9, no. 5, pp.504-512, (2001).

DOI: 10.1109/89.928915

[17] A. Hoskuldsson: PLS regression methods, Journal of Chemometrics, vol. 2, pp.211-228, (1988).

[18] Online free sound resource: http://www.freesound.org
