Improving Frequent-Term Based Text Clustering with Word Belief Network

Yong Zhang; Rui Fang Liu; Rui Yang Luo

doi:10.4028/www.scientific.net/AMM.411-414.207


Abstract:

The frequent-term based text clustering (FTC) algorithm can be applied to a news topic clustering system to help users quickly locate topics and articles of interest. However, it is difficult to set the support threshold for mining association rules. This paper attempts to build a word belief network that satisfies the basic rules of small worlds, so that the FTC algorithm can be improved using small-world characteristics and text clustering can be performed quickly. The paper also proposes incorporating an inverted index into the algorithm, which simplifies programming and improves operational efficiency. Experimental results verify that the system can find current hot news topics efficiently and that users can locate the document collections they are interested in.
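To illustrate how an inverted index can support frequent-term clustering, here is a minimal Python sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the toy documents, and the `min_support` value are all assumptions made for illustration. It shows only the step of finding frequent term sets and their covers (the documents containing every term in the set), and omits both the word belief network and FTC's overlap-based selection of final clusters.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Hypothetical tokenizer: lowercase and split on whitespace.
        for term in set(text.lower().split()):
            index[term].add(doc_id)
    return dict(index)


def frequent_term_sets(index, n_docs, min_support, max_size=2):
    """Return {term set: cover} for term sets whose cover reaches min_support.

    The cover of a term set is the intersection of its terms' posting lists,
    so no pass over the raw documents is needed once the index exists.
    """
    min_count = min_support * n_docs
    # Frequent single terms are read directly off the posting-list lengths.
    result = {
        frozenset([term]): cover
        for term, cover in index.items()
        if len(cover) >= min_count
    }
    frequent_terms = sorted(next(iter(ts)) for ts in result)
    # Larger term sets are built only from frequent single terms.
    for size in range(2, max_size + 1):
        for combo in combinations(frequent_terms, size):
            cover = set.intersection(*(index[t] for t in combo))
            if len(cover) >= min_count:
                result[frozenset(combo)] = cover
    return result


# Toy corpus (made up for this example).
docs = [
    "stock market falls",
    "stock market rallies",
    "football final tonight",
    "football final result",
]
index = build_inverted_index(docs)
clusters = frequent_term_sets(index, len(docs), min_support=0.5)
for terms, cover in clusters.items():
    print(sorted(terms), sorted(cover))
```

Each frequent term set serves as a candidate cluster description, with its cover as the candidate cluster. The result depends directly on `min_support`, which is the threshold the abstract describes as difficult to set.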


Info:

Periodical: Applied Mechanics and Materials (Volumes 411-414)

Pages: 207-214

DOI: https://doi.org/10.4028/www.scientific.net/AMM.411-414.207


Online since: September 2013

Authors: Yong Zhang, Rui Fang Liu, Rui Yang Luo

Keywords: FTC Algorithm, Inverted Index, Small Worlds, Topic Clustering, Word Belief Network

