Energy Difference Based Speech Segregation for Close-Talk System



Within the framework of computational auditory scene analysis (CASA), a speech separation algorithm based on energy difference is proposed for close-talk systems. Two microphones simultaneously receive a mixture of nearby target speech and distant noise. The inter-microphone intensity differences (IMID) between the two microphones are calculated in each time-frequency (T-F) unit and used as cues to generate binary masks with a two-class K-means clustering method. Experiments indicated that this novel algorithm could separate the target speech from the mixture and performed well in high-noise environments.
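
The masking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the preview does not specify. The per-unit energies `e_near` and `e_far` are assumed to have been computed already by some T-F decomposition. The IMID is assumed to be a decibel ratio of those energies. The cluster with the higher IMID is assumed to be target-dominated, since close speech should be much louder at the near microphone while distant noise reaches both microphones at similar levels. All function names are hypothetical.

```python
import math


def imid_db(e_near, e_far, eps=1e-12):
    """Assumed IMID definition: per-unit energy ratio in dB (near mic over far mic)."""
    return [
        [10.0 * math.log10((a + eps) / (b + eps)) for a, b in zip(row_near, row_far)]
        for row_near, row_far in zip(e_near, e_far)
    ]


def two_class_kmeans_1d(values, iters=50):
    """1-D K-means with K=2; returns (low_centroid, high_centroid).

    With two centroids on a line, nearest-centroid assignment is the same
    as thresholding at the midpoint between them.
    """
    lo, hi = min(values), max(values)  # initialise centroids at the extremes
    for _ in range(iters):
        thr = (lo + hi) / 2.0
        low = [v for v in values if v <= thr]
        high = [v for v in values if v > thr]
        if not low or not high:
            break  # degenerate case: all values fall in one cluster
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi


def binary_mask(e_near, e_far):
    """Binary mask: 1 where a T-F unit falls in the high-IMID cluster.

    Assumption: the high-IMID cluster is the one dominated by the close target.
    """
    imid = imid_db(e_near, e_far)
    flat = [v for row in imid for v in row]
    lo, hi = two_class_kmeans_1d(flat)
    thr = (lo + hi) / 2.0
    return [[1 if v > thr else 0 for v in row] for row in imid]
```

In this sketch, a call such as `binary_mask(e_near, e_far)` takes two equally shaped lists of lists (frames by frequency channels) and returns a mask of the same shape. Applying the mask to the near-microphone T-F representation before resynthesis would then retain the units attributed to the target speech.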



Edited by:

Mohamed Othman




H. Zhou et al., "Energy Difference Based Speech Segregation for Close-Talk System", Applied Mechanics and Materials, Vols. 229-231, pp. 1738-1741, 2012

Online since:

November 2012



