An OCR Post-Processing Method Based on Dictionary Matching and Matrix Transforming

Abstract:

This paper describes a dictionary-based post-processing method for Chinese and Japanese character recognition. Analysis of OCR output shows that some segmentation and recognition errors do not conform to lexical rules: the recognizer simply returns characters whose glyphs resemble the scanned text. Such errors can be handled with the Fix Length Segmentation Matching based on Dictionary and the Glyph Code Matrix Transforming. With this processing, most inaccurate recognitions can be corrected, and the experimental results show that the method is an effective way to improve the recognition rate for Chinese and Japanese characters.
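The abstract does not give the details of either step, so the following is only a minimal sketch of the general idea of fixed-length dictionary matching combined with look-alike substitution, not the authors' algorithm. The names `DICTIONARY`, `SIMILAR`, `correct_window`, and `post_process` are hypothetical, the two-character window length is an assumption, and the hand-written look-alike table merely stands in for the glyph code matrix step, which is not specified here.

```python
from typing import Dict, Set

# Hypothetical toy data: a tiny word list and a table of visually similar characters.
DICTIONARY: Set[str] = {"日本", "東京"}
SIMILAR: Dict[str, Set[str]] = {"曰": {"日"}, "末": {"本"}}


def correct_window(window: str, dictionary: Set[str],
                   similar: Dict[str, Set[str]]) -> str:
    """Return the window itself, or a dictionary word obtained by
    replacing exactly one character with a look-alike."""
    if window in dictionary:
        return window
    for i, ch in enumerate(window):
        for alt in sorted(similar.get(ch, ())):
            candidate = window[:i] + alt + window[i + 1:]
            if candidate in dictionary:
                return candidate
    return window  # no correction found


def post_process(text: str, dictionary: Set[str],
                 similar: Dict[str, Set[str]], length: int = 2) -> str:
    """Scan the text with a fixed-length window (assumed length: 2).
    Windows that match the dictionary, directly or after one look-alike
    substitution, are emitted as a unit; otherwise one character is
    copied through and the window advances by one."""
    out = []
    i = 0
    while i < len(text):
        window = text[i:i + length]
        if len(window) == length:
            fixed = correct_window(window, dictionary, similar)
            if fixed in dictionary:
                out.append(fixed)
                i += length
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


corrected = post_process("曰本の東京", DICTIONARY, SIMILAR)
```

In this toy run the misrecognized "曰本" is replaced by the dictionary word "日本", while "の" and the already valid "東京" pass through unchanged. A real system would need a full lexicon and a principled similarity measure between glyphs.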

Pages: 1861-1865

Online since: September 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
