Research on Uighur Web Pages Classification


Abstract:

Classifying Uighur web pages is important for Uighur information processing. In this paper, we propose a classification approach that combines two methods to assign Uighur web pages to predefined classes. The first method classifies a page based on the column navigator of the web page. The second method classifies a page by its content, using a feature dictionary for each class. Based on the proposed approach, we design a classification system for Uighur web pages. The experimental results show that the system performs well on Uighur web page classification. It is useful for the construction of high-quality Uighur corpora, for Uighur information retrieval, and for Uighur text mining.
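The two-method combination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the abstract does not say how the two methods are combined, so the sketch assumes the column navigator label is tried first and the dictionary-based content method is used as a fallback. The names `COLUMN_TO_CLASS`, `CLASS_FEATURES`, and `classify_page`, the class labels, and the English placeholder words are all hypothetical, and the input is assumed to be already segmented into tokens.

```python
from collections import Counter
from typing import Dict, List, Optional, Set

# Hypothetical mapping from a page's navigation-column label to a class.
COLUMN_TO_CLASS: Dict[str, str] = {
    "news": "news",
    "sports": "sports",
    "health": "health",
}

# Hypothetical per-class feature dictionary (placeholder English words).
CLASS_FEATURES: Dict[str, Set[str]] = {
    "news": {"government", "report", "election"},
    "sports": {"match", "team", "goal"},
    "health": {"doctor", "hospital", "medicine"},
}


def classify_by_column(column_label: Optional[str]) -> Optional[str]:
    """Return the class implied by the navigation column, if it is known."""
    if column_label is None:
        return None
    return COLUMN_TO_CLASS.get(column_label.strip().lower())


def classify_by_content(tokens: List[str]) -> Optional[str]:
    """Score each class by how often its dictionary words occur in the page."""
    counts = Counter(tokens)
    scores = {
        cls: sum(counts[word] for word in words)
        for cls, words in CLASS_FEATURES.items()
    }
    best = max(scores, key=scores.get)
    # No dictionary word matched at all: leave the page unclassified.
    return best if scores[best] > 0 else None


def classify_page(column_label: Optional[str], tokens: List[str]) -> Optional[str]:
    """Assumed combination: column navigator first, content as fallback."""
    return classify_by_column(column_label) or classify_by_content(tokens)
```

For example, `classify_page("Sports", [])` resolves through the column label alone, while `classify_page(None, ["doctor", "hospital", "team"])` falls back to the feature dictionary. A real system would need Uighur word segmentation and a weighted dictionary rather than raw counts.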



Pages: 657-662


Online since: February 2014


Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved

