An Unsupervised Approach to Close-Talk Speech Enhancement

Yi Jiang; Yuan Yuan Zu; Ying Ze Wang

doi:10.4028/www.scientific.net/AMM.614.363


Abstract:

A K-means-based unsupervised approach to close-talk speech enhancement is proposed in this paper. Within the framework of computational auditory scene analysis (CASA), the dual-microphone energy difference (DMED) is used as the cue to classify time-frequency (T-F) units as noise-dominated or target-speech-dominated. A ratio mask is then used to separate the target speech from the noise. Experimental results show that the proposed algorithm performs more robustly than the Wiener filtering algorithm.
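
The pipeline summarized in the abstract (DMED cue, two-class K-means over T-F units, ratio mask) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the normalized form of the DMED, the plain two-cluster K-means with min/max initialization, the per-channel noise estimate, and the mask formula. The function names (dmed, kmeans_1d_two_clusters, ratio_mask) are hypothetical, and the inputs are assumed to be per-unit energies of shape (channels, frames) from a primary (close) and a secondary microphone.

```python
import numpy as np


def dmed(e_primary, e_secondary, eps=1e-12):
    """Normalized energy difference per T-F unit, in [-1, 1].

    Assumed definition; the paper's exact DMED formula is not in the abstract.
    """
    return (e_primary - e_secondary) / (e_primary + e_secondary + eps)


def kmeans_1d_two_clusters(values, n_iter=50):
    """Plain two-cluster K-means on scalar cues.

    Returns (labels, centers). Label 1 is the cluster with the larger
    center, i.e. the units assumed to be dominated by close-talk speech.
    """
    flat = values.ravel()
    # Initialize the two centers at the extremes of the cue.
    centers = np.array([flat.min(), flat.max()], dtype=float)
    labels = np.zeros(flat.shape, dtype=int)
    for _ in range(n_iter):
        # Assign each unit to the nearer center.
        labels = (np.abs(flat - centers[1]) < np.abs(flat - centers[0])).astype(int)
        new_centers = centers.copy()
        for k in (0, 1):
            if np.any(labels == k):
                new_centers[k] = flat[labels == k].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.reshape(values.shape), centers


def ratio_mask(e_primary, labels, eps=1e-12):
    """Soft mask in [0, 1]: estimated speech energy over observed energy.

    Assumed form: the noise energy of each frequency channel is the mean
    primary-microphone energy over the units labelled as noise (label 0).
    """
    noise_units = labels == 0
    noise_sum = np.where(noise_units, e_primary, 0.0).sum(axis=1)
    noise_count = np.maximum(noise_units.sum(axis=1), 1)
    noise_est = noise_sum / noise_count
    return np.clip(1.0 - noise_est[:, None] / (e_primary + eps), 0.0, 1.0)
```

In this sketch the mask would be multiplied with the primary-microphone T-F representation before resynthesis. No labelled training data is needed, because the two classes come from clustering the cue itself.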


Info:

Periodical: Applied Mechanics and Materials (Volume 614)

Pages: 363-366

DOI: https://doi.org/10.4028/www.scientific.net/AMM.614.363


Online since: September 2014

Authors: Yi Jiang*, Yuan Yuan Zu, Ying Ze Wang

Keywords: Computational Auditory Scene Analysis (CASA), Dual-Microphone Energy Differences (DMED), Ratio Mask, Speech Enhancement, Unsupervised Classification


* Corresponding author
