An Improved Method for Mathematical Formula Extraction in Printed English and Chinese Documents

Abstract:

Accurately locating mathematical formulas in scientific documents is a prerequisite for recognizing them. Most existing formula extraction methods target documents in a single language and do not adapt well to documents in other languages. This paper describes an improved method that extracts formulas from both Chinese and English documents. First, run-number features are used to identify the document's language; then, according to the differences between Chinese and English documents, the corresponding features and parameters are chosen for formula extraction. Experimental results show that this method improves the robustness of formula extraction.
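
The abstract does not define the run-number feature in detail, so the following is only a minimal sketch under stated assumptions: a "run" is taken to be a maximal sequence of consecutive foreground pixels along one scan line of a binarized character image, and stroke-dense Chinese characters are assumed to produce more runs per scan line than Latin letters. The function names and the threshold of 2.0 are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def count_runs(line: Sequence[int]) -> int:
    """Count maximal runs of consecutive foreground (1) pixels in one scan line."""
    runs = 0
    previous = 0
    for pixel in line:
        if pixel == 1 and previous == 0:
            runs += 1  # a new run starts at each 0 -> 1 transition
        previous = pixel
    return runs


def mean_run_number(glyph: List[List[int]]) -> float:
    """Average run count over all rows and columns that contain ink."""
    row_counts = [count_runs(row) for row in glyph]
    col_counts = [count_runs(col) for col in zip(*glyph)]
    counts = [n for n in row_counts + col_counts if n > 0]
    return sum(counts) / len(counts) if counts else 0.0


def guess_language(glyphs: List[List[List[int]]], threshold: float = 2.0) -> str:
    """Label a set of binarized glyphs; the threshold is an illustrative assumption."""
    scores = [mean_run_number(g) for g in glyphs]
    average = sum(scores) / len(scores) if scores else 0.0
    return "chinese" if average > threshold else "english"


# Toy glyphs: a single vertical stroke versus a grid of separate strokes.
simple_glyph = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
dense_glyph = [[1, 0, 1, 0, 1], [0, 0, 0, 0, 0], [1, 0, 1, 0, 1]]
print(guess_language([simple_glyph]), guess_language([dense_glyph]))
```

In practice the threshold would have to be tuned on sample pages, and the language decision would then select the feature set and parameters used for formula extraction, as the abstract describes.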

Pages: 1174-1179

Online since: January 2010

Copyright: © 2010 Trans Tech Publications Ltd. All Rights Reserved
