User Oriented Semi-Supervised Document Clustering

Abstract:

In many text mining applications, documents need to be clustered according to users' demands. However, traditional document clustering methods based on unsupervised learning cannot meet this demand. This paper proposes a new clustering approach that addresses the problem. The main contributions are: (1) expressing the user requirement as a topic with multiple attributes; (2) annotating topic semantics with an ontology, calculating the dissimilarity between topic semantics, and building a dissimilarity matrix. Experiments show that the new approach is effective.
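The abstract does not say which dissimilarity measure or ontology is used. As a minimal sketch only, the second contribution can be illustrated with a toy single-rooted ontology and a path-length dissimilarity, meaning the number of edges between two concepts through their lowest common ancestor. The ontology contents, the measure, and the names `PARENT`, `ancestors`, `dissimilarity` and `build_dissimilarity_matrix` are all assumptions made for illustration, not the authors' method.

```python
# Hypothetical toy ontology (child concept -> parent concept); not from the paper.
# It is assumed to have a single root ("topic") so any two concepts share an ancestor.
PARENT = {
    "sports": "topic",
    "finance": "topic",
    "football": "sports",
    "tennis": "sports",
    "stocks": "finance",
}


def ancestors(concept):
    """Return the path from a concept up to the root, starting with the concept itself."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def dissimilarity(a, b):
    """Assumed measure: number of edges between a and b via their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    # First concept on a's upward path that also lies on b's upward path.
    common = next(c for c in path_a if c in path_b)
    return path_a.index(common) + path_b.index(common)


def build_dissimilarity_matrix(topics):
    """Build the symmetric pairwise dissimilarity matrix for a list of topic concepts."""
    return [[dissimilarity(a, b) for b in topics] for a in topics]


topics = ["football", "tennis", "stocks"]
matrix = build_dissimilarity_matrix(topics)
for topic, row in zip(topics, matrix):
    print(topic, row)
```

A matrix of this kind could then be passed to any clustering routine that accepts precomputed pairwise dissimilarities. How the paper combines it with the user-specified topic attributes is not stated in the abstract.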

Info:

Pages:

1523-1526

Online since:

September 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
