Voice Activity Detection Based on Multiple Statistical Models

Chang Peng Ji; Mo Gao; Jie Yang

doi:10.4028/www.scientific.net/AMR.181-182.765



Abstract:

A key issue in practical speech processing is achieving voice activity detection (VAD) that is robust to background noise. Most statistical model-based approaches assume a Gaussian distribution in the discrete Fourier transform (DFT) domain, an assumption that deviates from real observations. For a class of VAD algorithms based on the Gaussian and Laplacian models, we incorporate the complex Laplacian probability density function into our analysis of the statistical properties. Because the statistical characteristics of the speech signal are affected differently by noise type and level, our approach copes with time-varying environments by adaptively selecting an appropriate statistical model online. The performance of the proposed VAD approaches in stationary noise environments is evaluated using an objective measure.
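As a rough illustration of what choosing between statistical models can look like, the sketch below compares the maximum-likelihood fit of a zero-mean Gaussian and a zero-mean Laplacian to a window of real-valued samples (for example, the real parts of DFT coefficients in one frequency bin) and returns the better-fitting model. It is not the authors' algorithm: the function names, the real-valued simplification of the complex models, and the use of a plain log-likelihood comparison are all assumptions made for illustration.

```python
import math
import random

# Hypothetical helper names; a simplified, real-valued stand-in for the
# complex Gaussian and complex Laplacian models named in the abstract.
_FLOOR = 1e-12  # guards against log(0) on an all-zero window


def gaussian_loglik(x):
    """Log-likelihood of zero-mean samples under a Gaussian with ML variance."""
    n = len(x)
    var = max(sum(v * v for v in x) / n, _FLOOR)
    # With var set to its ML estimate, the exponent term sums to n / 2.
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def laplacian_loglik(x):
    """Log-likelihood of zero-mean samples under a Laplacian with ML scale."""
    n = len(x)
    b = max(sum(abs(v) for v in x) / n, _FLOOR)
    # With b set to its ML estimate, the exponent term sums to n.
    return -n * (math.log(2.0 * b) + 1.0)


def select_model(x):
    """Return the name of the model with the higher maximized likelihood."""
    if gaussian_loglik(x) >= laplacian_loglik(x):
        return "gaussian"
    return "laplacian"
```

In an online setting, `select_model` could be re-run on a sliding window of recent frames so that the chosen model follows changes in noise type and level; the window length would be a tuning choice not specified in the abstract.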


Info:

Periodical: Advanced Materials Research (Volumes 181-182)

Pages: 765-769

DOI: https://doi.org/10.4028/www.scientific.net/AMR.181-182.765

Online since: January 2011

Authors: Chang Peng Ji, Mo Gao, Jie Yang

Keywords: DD, KS Test, Maximum Likelihood, Pdfs, VAD

