An Unsupervised Approach to Close-Talk Speech Enhancement

Abstract:

A K-means-based unsupervised approach to close-talk speech enhancement is proposed in this paper. Within the framework of computational auditory scene analysis (CASA), the dual-microphone energy difference (DMED) is used as the cue to classify time-frequency (T-F) units as noise-dominant or target-speech-dominant. A ratio mask is then used to separate the target speech from the noise. Experimental results show that the proposed algorithm performs more robustly than the Wiener filtering algorithm.
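
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the authors' implementation: the abstract does not give the exact DMED definition, the number of clusters, or the mask formula. The sketch assumes the DMED is the per-unit energy ratio of the two microphones in dB, that K-means is run with K = 2 on those values, and that noise energy is roughly equal at both microphones, so the secondary-microphone energy approximates the noise in the primary channel. The function names `dmed_db`, `two_means_1d`, and `soft_mask` are hypothetical.

```python
import math

def dmed_db(e_primary, e_secondary, eps=1e-12):
    """Energy difference in dB between the two microphones for one T-F unit (assumed form)."""
    return 10.0 * math.log10((e_primary + eps) / (e_secondary + eps))

def two_means_1d(values, iters=50):
    """Plain 1-D K-means with K = 2; returns (low_centroid, high_centroid)."""
    lo, hi = min(values), max(values)  # initialise centroids at the extremes
    for _ in range(iters):
        low_grp = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_grp = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low_grp) / len(low_grp) if low_grp else lo
        new_hi = sum(high_grp) / len(high_grp) if high_grp else hi
        if new_lo == lo and new_hi == hi:
            break  # assignments no longer change
        lo, hi = new_lo, new_hi
    return lo, hi

def soft_mask(E1, E2):
    """Per-unit gain in [0, 1] from primary (E1) and secondary (E2) T-F energies."""
    cues = [[dmed_db(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(E1, E2)]
    lo, hi = two_means_1d([c for row in cues for c in row])
    mask = []
    for r1, r2, rc in zip(E1, E2, cues):
        row = []
        for a, b, c in zip(r1, r2, rc):
            # Units closer to the high-DMED centroid are treated as target-dominant.
            target_dominant = abs(c - hi) < abs(c - lo)
            # Assumption: b approximates the noise energy contained in a.
            gain = max(a - b, 0.0) / a if (target_dominant and a > 0) else 0.0
            row.append(gain)
        mask.append(row)
    return mask

# Toy 2x2 energy grids: two units carry strong close-talk speech at the primary mic.
E1 = [[100.0, 1.0], [1.1, 50.0]]
E2 = [[1.0, 1.0], [1.0, 1.0]]
mask = soft_mask(E1, E2)
```

In this toy case the two units with a large energy difference fall into the high-DMED cluster and receive gains close to 1, while the other two are suppressed. A practical system would compute the energies from a filterbank or short-time Fourier transform of each channel and apply the mask to the primary channel before resynthesis.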

Info:

Pages:

363-366

Online since:

September 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

