Video Based Visual Speech Feature Model Construction

Xi Bin Jia; Mei Xia Zheng

doi:10.4028/www.scientific.net/AMM.182-183.1367

Abstract:

This paper presents a solution for constructing a Chinese visual speech feature model based on the hidden Markov model (HMM). We propose and discuss three kinds of representation for visual speech: lip geometrical features, lip motion features, and lip texture features. A model that combines local LBP texture information with global DCT texture information performs better than a single feature. Likewise, a model that combines local LBP information with geometrical information performs better than a single feature. By computing the viseme recognition rate obtained with each model, the paper shows that the HMM, which describes the dynamics of speech, coupled with the combined feature describing global and local texture, is the best model.
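As a rough illustration of what combining local LBP and global DCT texture information can look like, the sketch below concatenates a local binary pattern histogram with the low-frequency block of a 2-D DCT computed on a grayscale lip region. It is not the authors' implementation: the abstract does not state the LBP variant, the number of DCT coefficients kept, or any normalisation, so the basic 8-neighbour LBP, the `k` x `k` coefficient block, and the function names `lbp_histogram`, `dct2_lowfreq`, and `combined_texture_feature` are all assumptions made for this example.

```python
import math


def lbp_histogram(img):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes.

    `img` is a list of rows of grayscale values; border pixels are skipped.
    (Assumed LBP variant; the abstract does not say which one is used.)
    """
    h, w = len(img), len(img[0])
    # Neighbour offsets, clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = (h - 2) * (w - 2)
    return [count / total for count in hist]


def dct2_lowfreq(img, k):
    """Top-left k x k block of the orthonormal 2-D DCT-II, row-major.

    Computed directly from the definition, which is slow but dependency-free.
    (Keeping a k x k block is an assumed coefficient selection.)
    """
    h, w = len(img), len(img[0])
    coeffs = []
    for u in range(k):
        scale_u = math.sqrt((1 if u == 0 else 2) / h)
        for v in range(k):
            scale_v = math.sqrt((1 if v == 0 else 2) / w)
            acc = 0.0
            for y in range(h):
                cos_y = math.cos(math.pi * (2 * y + 1) * u / (2 * h))
                for x in range(w):
                    cos_x = math.cos(math.pi * (2 * x + 1) * v / (2 * w))
                    acc += img[y][x] * cos_y * cos_x
            coeffs.append(scale_u * scale_v * acc)
    return coeffs


def combined_texture_feature(img, k=4):
    """Concatenate local (LBP) and global (DCT) texture descriptors."""
    return lbp_histogram(img) + dct2_lowfreq(img, k)
```

Computing one such vector per video frame would give an observation sequence that could then be passed to a per-viseme HMM. Simple concatenation is only one possible fusion scheme, chosen here for brevity.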

Info:

Periodical:

Applied Mechanics and Materials (Volumes 182-183)

Pages:

1367-1371

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.182-183.1367

Online since:

June 2012

Authors:

Xi Bin Jia, Mei Xia Zheng

Keywords:

Feature Analysis, HMM, Visual Speech
