Chinese Semantic Word-Formation Analysis Using FKP-MCO Classifier Based on Layered and Weighted GED

Abstract:

For Chinese information processing, automatic classification of semantic word-formation patterns over a large-scale database can markedly improve unregistered-word identification, automatic lexicography, semantic analysis, and other applications. However, owing to noise, anomalies, nonlinear characteristics, class imbalance, and other uncertainties in word-formation data, the predictive performance of the multi-criteria optimization classifier (MCOC) and other traditional data mining approaches degrades rapidly. In this paper we put forward a novel MCOC with fuzzification, kernel, and penalty factors (FKP-MCOC) based on layered and weighted graph edit distance (GED). First, the layered and weighted GEDs between each semantic word-formation graph and a set of prototype graphs are calculated and used as the dissimilarity measure. Next, the normalized GEDs are embedded into a new feature vector space. Finally, an FKP-MCO classifier is built on this feature vector space to predict the patterns of semantic word-formation. Our experimental results on Chinese word-formation analysis, together with a comparison against a support vector machine (SVM), show that the proposed approach increases the separation between different patterns and improves the prediction of the semantic pattern of a new compound word.
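
The embedding step described in the abstract (distances to prototype graphs become the coordinates of a feature vector) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the graph encoding, the function names `ged` and `embed`, the toy part-of-speech and relation labels, the brute-force search, and the normalization by a delete-all/insert-all upper bound. The paper's layering and weighting scheme is not given in this preview, so two per-operation weights (`w_node`, `w_edge`) stand in for it, and the FKP-MCO classifier itself is not reproduced.

```python
from itertools import permutations

# A graph is a pair (nodes, edges):
#   nodes: {node_id: label}, edges: {(src_id, dst_id): label}

def ged(g1, g2, w_node=1.0, w_edge=1.0):
    """Exact edit distance between two small labeled digraphs.

    Brute force over all node mappings, so only usable for tiny graphs.
    Every node operation costs w_node and every edge operation w_edge
    (an assumed weighting, not the paper's).
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    ids1 = list(nodes1)
    # Each g1 node maps to a distinct g2 node, or to None (deletion).
    targets = list(nodes2) + [None] * len(ids1)
    best = float("inf")
    for image in set(permutations(targets, len(ids1))):
        f = dict(zip(ids1, image))
        cost = 0.0
        for u, v in f.items():
            if v is None or nodes1[u] != nodes2[v]:
                cost += w_node  # node deletion or relabeling
        used = {v for v in image if v is not None}
        cost += w_node * (len(nodes2) - len(used))  # node insertions
        matched = set()
        for (a, b), label in edges1.items():
            e2 = (f[a], f[b])
            if e2 in edges2:
                matched.add(e2)
                if edges2[e2] != label:
                    cost += w_edge  # edge relabeling
            else:
                cost += w_edge  # edge deletion
        cost += w_edge * (len(edges2) - len(matched))  # edge insertions
        best = min(best, cost)
    return best

def embed(graph, prototypes, w_node=1.0, w_edge=1.0):
    """Map a graph to a vector of normalized distances to the prototypes.

    Each distance is divided by the cost of deleting the whole graph and
    inserting the whole prototype, which bounds the distance from above,
    so every coordinate lies in [0, 1].
    """
    nodes, edges = graph
    vector = []
    for proto in prototypes:
        p_nodes, p_edges = proto
        bound = (w_node * (len(nodes) + len(p_nodes))
                 + w_edge * (len(edges) + len(p_edges)))
        d = ged(graph, proto, w_node, w_edge)
        vector.append(d / bound if bound else 0.0)
    return vector

# Hypothetical toy prototypes for two word-formation patterns.
modifier_head = ({0: "A", 1: "N"}, {(0, 1): "modifier"})
verb_object = ({0: "V", 1: "N"}, {(0, 1): "object"})

# A new two-morpheme compound, encoded the same way.
word = ({0: "V", 1: "N"}, {(0, 1): "object"})

vector = embed(word, [modifier_head, verb_object])
print(vector)
```

The resulting vector can be passed to any classifier that works on fixed-length features. For graphs of realistic size the brute-force search would need to be replaced by an approximate GED method.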

Info:

Pages:

3044-3050

Online since:

January 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
