A Novel Method to Extract Lip-Reading Features by Using LGEI and DWT

Abstract:

This paper proposes a novel lip feature extraction algorithm that improves the efficiency and robustness of lip-reading systems. First, a Lip Gray Energy Image (LGEI) is used to smooth noise and improve the noise resistance of the system. Second, the Discrete Wavelet Transform (DWT) is used to extract salient visual speech information from the lip region by decorrelating spectral information. Last, lip features are obtained by downsampling the data from the second step; this resampling effectively reduces the amount of computation. Experimental results show that the method is highly discriminative, accurate, and computationally efficient, and that the precision rate can reach 96%.
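The three steps in the abstract can be illustrated with a short sketch. The preview does not give the exact definitions, so the code below rests on assumptions that are not confirmed by the source: the LGEI is taken to be the per-pixel mean of the grayscale lip-region frames of one utterance, the wavelet is a Haar wavelet, and the features are read from the approximation subband with a fixed stride. The function names and the parameters `levels` and `step` are hypothetical.

```python
import numpy as np


def lip_gray_energy_image(frames):
    """Assumed LGEI: per-pixel mean of a (T, H, W) stack of grayscale lip frames."""
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim != 3:
        raise ValueError("expected a (T, H, W) stack of grayscale frames")
    # Averaging over time suppresses frame-to-frame noise.
    return frames.mean(axis=0)


def haar_dwt2(image):
    """One level of a 2-D orthonormal Haar DWT (height and width must be even).

    Returns (ll, (lh, hl, hh)), where the first letter is the filter applied
    across column pairs and the second is the filter applied across row pairs.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    if h % 2 or w % 2:
        raise ValueError("image height and width must be even")
    s = np.sqrt(2.0)
    # Low-pass and high-pass across pairs of adjacent columns.
    lo = (image[:, 0::2] + image[:, 1::2]) / s
    hi = (image[:, 0::2] - image[:, 1::2]) / s
    # Low-pass and high-pass across pairs of adjacent rows.
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, (lh, hl, hh)


def extract_lip_features(frames, levels=2, step=2):
    """Assumed pipeline: LGEI, then repeated Haar DWT, then strided downsampling."""
    approx = lip_gray_energy_image(frames)
    for _ in range(levels):
        # Keep only the approximation subband at each level.
        approx, _details = haar_dwt2(approx)
    # Downsample the remaining coefficients to shorten the feature vector.
    return approx[::step, ::step].ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_frames = rng.random((10, 32, 48))  # synthetic stand-in for lip frames
    features = extract_lip_features(demo_frames, levels=2, step=2)
    print(features.shape)
```

With these choices, a 32 x 48 lip region shrinks to 8 x 12 coefficients after two DWT levels and to a 24-element feature vector after downsampling with a stride of 2, which shows how the resampling step cuts the amount of data passed to a classifier. The wavelet family, number of levels, and stride used in the paper may differ.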

Info:

Periodical:

Advanced Materials Research (Volumes 1079-1080)

Pages:

820-823

Online since:

December 2014

Copyright:

© 2015 Trans Tech Publications Ltd. All Rights Reserved
