A Method of the Post-Processing of Printed Formula Extraction

Abstract:

As an important step in a printed formula recognition system, formula extraction locates the formula fields in the layout images of printed documents, and its accuracy strongly influences the performance of formula recognition. However, automatic formula extraction inevitably makes errors because of the complexity of the formulas themselves and of the layouts in which they are situated. To address this problem, this paper presents a post-processing method that uses layout knowledge to correct errors in the output of a formula extraction algorithm. First, the geometrical features of the various layout fields are used to correct extraction errors. Then, syntax rules are used to check the boundary components of the different kinds of layout areas and to determine which field each component belongs to. Finally, the formula area is adjusted according to this information.
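The abstract does not give the concrete features or rules, so the following Python sketch only illustrates the general shape of such a three-step post-processing pass. Everything in it is an assumption made for illustration: the `Box` and `Component` types, the `MATH_SYMBOLS` set, the `max_gap` threshold, the 0.5 overlap ratio, and the toy syntax check are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical symbol set standing in for the paper's (unspecified) syntax rules.
MATH_SYMBOLS = {"+", "-", "=", "<", ">", "(", ")", "/", "∑", "∫"}


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (y grows downwards)."""
    left: float
    top: float
    right: float
    bottom: float

    def union(self, other: "Box") -> "Box":
        return Box(min(self.left, other.left), min(self.top, other.top),
                   max(self.right, other.right), max(self.bottom, other.bottom))


@dataclass(frozen=True)
class Component:
    """A recognized layout component: its text and its bounding box."""
    text: str
    box: Box


def vertical_overlap_ratio(a: Box, b: Box) -> float:
    """Fraction of a's height that overlaps b vertically."""
    overlap = min(a.bottom, b.bottom) - max(a.top, b.top)
    height = a.bottom - a.top
    return max(0.0, overlap) / height if height > 0 else 0.0


def is_adjacent(formula: Box, comp: Box, max_gap: float) -> bool:
    """Step 1 (geometry): comp shares the formula's line and lies close to it."""
    gap = max(comp.left - formula.right, formula.left - comp.right, 0.0)
    return gap <= max_gap and vertical_overlap_ratio(comp, formula) >= 0.5


def looks_mathematical(comp: Component) -> bool:
    """Step 2 (toy syntax check): an operator or a single letter/digit.

    A real rule set would be far richer; this one would also accept
    the English article "a", for example.
    """
    return comp.text in MATH_SYMBOLS or (len(comp.text) == 1 and comp.text.isalnum())


def adjust_formula_area(formula: Box, neighbours: list[Component],
                        max_gap: float = 10.0) -> Box:
    """Step 3: grow the formula box over qualifying boundary components."""
    adjusted = formula
    remaining = list(neighbours)
    changed = True
    while changed:  # repeat until no further component can be absorbed
        changed = False
        for comp in list(remaining):
            if is_adjacent(adjusted, comp.box, max_gap) and looks_mathematical(comp):
                adjusted = adjusted.union(comp.box)
                remaining.remove(comp)
                changed = True
    return adjusted


formula = Box(100, 50, 200, 70)
neighbours = [
    Component("=", Box(205, 52, 215, 68)),   # operator next to the formula
    Component("x", Box(220, 52, 230, 68)),   # variable next to the operator
    Component("the", Box(60, 52, 95, 68)),   # ordinary word on the same line
]
print(adjust_formula_area(formula, neighbours))
```

In this sketch the extracted formula box is extended to cover the neighbouring `=` and `x`, while the word `the` stays in the text field because it fails the syntax check even though it passes the geometric one.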

Info:

Pages: 2878-2881

Online since: December 2012

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
