Detecting and Normalizing Formulas in Electronic Literature Resources

Abstract:

Formulas appear in many kinds of documents and in different formats. Extracting them and normalizing them into a single form is a precondition for mathematical retrieval. This paper presents a method for extracting and converting formulas in Word documents for mathematical expression retrieval. First, the mathematical expressions in Word documents are detected by processing OLE objects. Then, matching rules for formula format conversion are defined. Finally, the extracted mathematical expressions in OMML format are converted into LaTeX format according to the defined rules and stored in a txt file. In addition, formulas that exist in MathType format are stored as bitmap files and converted into LaTeX documents through a formula recognition and reconstruction module. Experiments show the effectiveness of the designed approach.
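To make the OMML-to-LaTeX step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the formulas are stored as `m:oMath` elements inside `word/document.xml` of a .docx package, reads them directly from the XML rather than through OLE objects, and covers only a handful of illustrative rules (fraction, superscript, subscript, radical). The function names `to_latex`, `extract_formulas` and `make_sample_docx` are hypothetical, and the paper's actual matching rules are not given in this abstract.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by Office Open XML documents.
M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tag(name):
    """Return the fully qualified ElementTree tag for an OMML element name."""
    return f"{{{M}}}{name}"


def children_to_latex(node):
    """Convert all children of a node and concatenate the results."""
    if node is None:
        return ""
    return "".join(to_latex(child) for child in node)


def to_latex(node):
    """Apply a small, illustrative set of OMML -> LaTeX matching rules."""
    if node.tag == tag("t"):  # literal text of a math run
        return node.text or ""
    if node.tag == tag("f"):  # fraction: numerator / denominator
        num = children_to_latex(node.find(tag("num")))
        den = children_to_latex(node.find(tag("den")))
        return f"\\frac{{{num}}}{{{den}}}"
    if node.tag == tag("sSup"):  # superscript
        base = children_to_latex(node.find(tag("e")))
        sup = children_to_latex(node.find(tag("sup")))
        return f"{{{base}}}^{{{sup}}}"
    if node.tag == tag("sSub"):  # subscript
        base = children_to_latex(node.find(tag("e")))
        sub = children_to_latex(node.find(tag("sub")))
        return f"{{{base}}}_{{{sub}}}"
    if node.tag == tag("rad"):  # radical with optional degree
        deg = children_to_latex(node.find(tag("deg")))
        body = children_to_latex(node.find(tag("e")))
        return f"\\sqrt[{deg}]{{{body}}}" if deg else f"\\sqrt{{{body}}}"
    # Any other element (runs, property elements, ...): descend into children.
    return children_to_latex(node)


def extract_formulas(source):
    """Return one LaTeX string per m:oMath element in a .docx file or file object."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return [to_latex(math) for math in root.iter(tag("oMath"))]


def make_sample_docx():
    """Build an in-memory package holding only document.xml (demo data, not a full Word file)."""
    xml = f"""<w:document xmlns:w="{W}" xmlns:m="{M}"><w:body><w:p>
<m:oMath>
<m:f><m:num><m:r><m:t>a</m:t></m:r></m:num><m:den><m:r><m:t>b</m:t></m:r></m:den></m:f>
<m:r><m:t>+</m:t></m:r>
<m:sSup><m:e><m:r><m:t>x</m:t></m:r></m:e><m:sup><m:r><m:t>2</m:t></m:r></m:sup></m:sSup>
</m:oMath>
</w:p></w:body></w:document>"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", xml)
    return buffer.getvalue()


if __name__ == "__main__":
    for formula in extract_formulas(io.BytesIO(make_sample_docx())):
        print(formula)
```

Writing the returned strings to a text file, one formula per line, would correspond to the txt storage step mentioned above. Formulas embedded as MathType objects are not covered by this sketch, since the abstract handles them through a separate image recognition path.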

Pages: 875-880

Online since: March 2015

Copyright: © 2015 Trans Tech Publications Ltd. All Rights Reserved
