Detecting and Normalizing Formulas in Electronic Literature Resources
Abstract:
Formulas appear in many kinds of documents and in different formats. Extracting them and normalizing them into a single form is a precondition for mathematical retrieval. This paper presents a method for extracting and converting formulas in Word documents for mathematical expression retrieval. First, the mathematical expressions in Word documents are detected by processing OLE objects. Then, the matching rules for formula format conversion are defined. Finally, the extracted mathematical expressions in OMML format are converted into LaTeX format according to the defined rules and stored in a txt file. In addition, formulas in MathType format are stored as bitmaps and converted into LaTeX documents by a formula recognition and reconstruction module. Experiments show the effectiveness of the designed approach.
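To make the OMML-to-LaTeX step concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the formulas are stored as OMML inside a .docx package (read directly from `word/document.xml`) rather than reached through OLE object processing, and it covers only a handful of constructs (runs, fractions, superscripts, subscripts, radicals). The function names `omml_to_latex` and `extract_formulas` are hypothetical, as is the small rule set itself.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of Office Math Markup Language (OMML) elements.
M = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def tag(name):
    """Return the namespace-qualified tag for an OMML element name."""
    return f"{{{M}}}{name}"


def children_to_latex(node):
    """Convert every child of `node` and concatenate the results."""
    return "".join(omml_to_latex(child) for child in node)


def arg(node, name):
    """Convert the named argument child (e.g. num, den, e, sup), if present."""
    child = node.find(tag(name))
    return children_to_latex(child) if child is not None else ""


def omml_to_latex(node):
    """Apply illustrative matching rules to one OMML element (assumed rule set)."""
    if node.tag == tag("r"):      # text run: keep the literal text
        return "".join(t.text or "" for t in node.findall(tag("t")))
    if node.tag == tag("f"):      # fraction
        return f"\\frac{{{arg(node, 'num')}}}{{{arg(node, 'den')}}}"
    if node.tag == tag("sSup"):   # superscript
        return f"{{{arg(node, 'e')}}}^{{{arg(node, 'sup')}}}"
    if node.tag == tag("sSub"):   # subscript
        return f"{{{arg(node, 'e')}}}_{{{arg(node, 'sub')}}}"
    if node.tag == tag("rad"):    # radical, with optional degree
        deg = arg(node, "deg")
        body = arg(node, "e")
        return f"\\sqrt[{deg}]{{{body}}}" if deg else f"\\sqrt{{{body}}}"
    if node.tag.endswith("Pr"):   # property elements carry no formula content
        return ""
    return children_to_latex(node)  # containers and unhandled elements: recurse


def extract_formulas(docx):
    """Return one LaTeX string per m:oMath element in a .docx path or file object."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return [omml_to_latex(math) for math in root.iter(tag("oMath"))]
```

Writing the strings returned by `extract_formulas` to a text file, one per line, would mirror the txt output the abstract mentions. Formulas embedded as MathType objects are not OMML, so this sketch does not see them; the abstract handles those through a separate recognition step.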
Info:
Pages:
875-880
Online since:
March 2015
Copyright:
© 2015 Trans Tech Publications Ltd. All Rights Reserved