Speech Recognition for Endoscopic Automatic Positioning System



This paper presents a novel system for minimally invasive surgery. The system uses an Endoscopic Automatic Positioner (EAP) controlled by a speech recognition engine to clamp and dynamically position the laparoscope. The motion instructions of the EAP are derived from the voice commands of a specific doctor, which are recognized by an improved algorithm named Normalized Average-Dynamic Time Warping (NA-DTW). An ARM-based embedded platform is designed to run NA-DTW on the Windows CE operating system. 1250 groups of experiments from 10 individual speakers demonstrate the performance of DTW. Compared with traditional algorithms, the enhanced algorithm improves the recognition rate from 96.6% to 99.76% and shortens the computation time by 51%. The results show that the enhanced algorithm is effective and can satisfy the real-time requirement of an embedded system.
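For readers unfamiliar with template matching by Dynamic Time Warping, the following is a minimal sketch of classic DTW, not the NA-DTW variant, whose details are not given in this preview. It assumes each utterance is already a sequence of equal-length feature vectors (for example, cepstral coefficients per frame). The function names `dtw_distance` and `recognize`, the command labels, and the normalization by the sum of the sequence lengths are illustrative assumptions, not taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors.

    The accumulated cost is divided by len(a) + len(b) so that scores for
    utterances of different lengths are comparable (an assumed, common
    normalization; the paper's NA-DTW normalization may differ).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    inf = math.inf
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical one-dimensional "features" for two made-up commands.
templates = {
    "up": [[0.0], [1.0], [2.0]],
    "down": [[2.0], [1.0], [0.0]],
}
# A slower rendition of the rising pattern still matches "up".
spoken = [[0.0], [0.0], [1.0], [2.0], [2.0]]
print(recognize(spoken, templates))
```

The full cost table takes O(n·m) time and memory per template, which is one reason speed-ups matter for a real-time embedded target such as the one described above.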



Advanced Materials Research (Volumes 588-589)

Edited by: Lawrence Lim




N. Ma et al., "Speech Recognition for Endoscopic Automatic Positioning System", Advanced Materials Research, Vols. 588-589, pp. 1296-1299, 2012

Online since: November 2012



