Recent Advances in Script Identification

Abstract:

The world's writing systems use a wide variety of scripts. Almost every country has its own languages and scripts, which differ from one another in various respects. Identifying the scripts present in a multilingual, multi-script document is therefore essential. In recent years, various approaches to script identification have been developed and have achieved promising results. This paper presents an overview of script identification organized under three categories: script systems, extracted features, and classification methods. Earlier research and the future prospects of the field are discussed. Research in this area is not yet satisfactory, and more work remains to be done.

Pages: 734-740

Online since: August 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
