Noise-Robust Voice Activity Detector Based on Four States-Based HMM

Abstract:

Voice activity detection (VAD) is increasingly essential in noisy environments for accurate speech recognition. In this paper, we propose a method based on the left-right hidden Markov model (HMM) to identify the start and end of speech. Instead of the existing two-state approach, the method builds two models, one for non-speech and one for speech, each of which may include several states. We also analyze other features of speech and non-speech, such as pitch index, pitch magnitude, and fractal dimension. We compare the VAD results of the proposed algorithm with those of the two-state HMM. Experiments show that the proposed method performs better than two-state HMMs in VAD, especially in low signal-to-noise ratio (SNR) environments.
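As a rough illustration of the idea in the abstract, the sketch below decodes a frame sequence with a four-state left-right HMM in which two states stand for the non-speech model and two for the speech model. It is not the authors' implementation: the two-states-per-model layout, the transition and emission values, the single log-energy feature, and the names `viterbi` and `speech_segments` are all assumptions made for the example. The paper's own features (pitch index, pitch magnitude, fractal dimension) and its training procedure are not reproduced here.

```python
import math

# Hypothetical layout: states 0-1 form the non-speech model,
# states 2-3 form the speech model.
IS_SPEECH = [False, False, True, True]

# Left-right transitions inside each model; the last state of one model
# may hand over to the first state of the other. Values are illustrative.
TRANS = [
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.1, 0.0, 0.0, 0.9],
]
START = [1.0, 0.0, 0.0, 0.0]  # assumption: a recording begins in non-speech

# Illustrative Gaussian emissions (mean, variance) for a 1-D log-energy
# feature; a real system would estimate these from training data.
EMIT = [(0.0, 1.0), (0.5, 1.0), (4.0, 1.5), (5.0, 1.5)]


def safe_log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(frames):
    """Return the most likely state index for each frame (log domain)."""
    n = len(IS_SPEECH)
    score = [safe_log(START[s]) + log_gauss(frames[0], *EMIT[s]) for s in range(n)]
    back = []
    for x in frames[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + safe_log(TRANS[r][s]))
            ptr.append(best)
            score.append(prev[best] + safe_log(TRANS[best][s]) + log_gauss(x, *EMIT[s]))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def speech_segments(path):
    """Collapse a state path into (start_frame, end_frame) speech segments."""
    segments, start = [], None
    for i, s in enumerate(path):
        if IS_SPEECH[s] and start is None:
            start = i
        elif not IS_SPEECH[s] and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(path) - 1))
    return segments


if __name__ == "__main__":
    # Made-up log-energy values: quiet, then loud, then quiet again.
    frames = [0.1, -0.2, 0.3, 0.0, 4.5, 5.2, 4.8, 5.1, 0.2, -0.1, 0.0]
    print(speech_segments(viterbi(frames)))
```

Because each model has more than one state and transitions only move forward, the decoder cannot flip between speech and non-speech on a single frame, which is one plausible reason a multi-state model could be steadier than a two-state HMM at low SNR.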

Periodical: Applied Mechanics and Materials (Volumes 411-414)
Pages: 743-748
DOI: https://doi.org/10.4028/www.scientific.net/AMM.411-414.743
Online since: September 2013
Authors: Bin Zhou, Jing Liu, Zheng Pei
Keywords: K-Means Clustering, Left-Right Hidden Markov Model, Low Signal-to-Noise Ratio, Voice Activity Detection

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
