Document Clustering Based on Fuzzy Similarity

Jing Li Zhou; Xue Jun Nie; Lei Hua Qin; Jian Feng Zhu

doi:10.4028/www.scientific.net/AMM.29-32.2620

Abstract:

This paper proposes a novel fuzzy similarity measure based on the relationships between terms and categories. These relationships are represented by a term-category matrix in which each element is the membership degree of a term in a category, computed from term frequency-inverse document frequency (TF-IDF) weights and the fuzzy relationships between documents and categories. The fuzzy similarity accounts for documents that belong to multiple categories and is computed using fuzzy operators. Experimental results show that the proposed fuzzy similarity outperforms other common similarity measures both in the reliability of the derived clustering results and in clustering accuracy.
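The abstract does not give the paper's formulas, so the following is only a rough sketch of how such a pipeline could look, not the authors' method. It assumes a smoothed TF-IDF weighting, a term-category membership computed as a TF-IDF-weighted average of document-category memberships, and a min/max (fuzzy Jaccard) ratio as the fuzzy operator pair. All function names and the toy data are hypothetical.

```python
import math

def tfidf_weights(docs):
    """Return one {term: weight} dict per document (smoothed IDF, an assumption)."""
    n_docs = len(docs)
    doc_freq = {}
    for tokens in docs:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    weights = []
    for tokens in docs:
        counts = {}
        for term in tokens:
            counts[term] = counts.get(term, 0) + 1
        weights.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return weights

def term_category_matrix(weights, doc_memberships):
    """Membership of each term in each category, as a TF-IDF-weighted
    average of document-category memberships (assumed aggregation)."""
    n_cats = len(doc_memberships[0])
    numer, denom = {}, {}
    for w, mu in zip(weights, doc_memberships):
        for term, value in w.items():
            row = numer.setdefault(term, [0.0] * n_cats)
            for c in range(n_cats):
                row[c] += value * mu[c]
            denom[term] = denom.get(term, 0.0) + value
    return {term: [x / denom[term] for x in row] for term, row in numer.items()}

def document_profile(w, matrix, n_cats):
    """Project a document's term weights onto categories via the matrix."""
    total = sum(w.values())
    profile = [0.0] * n_cats
    for term, value in w.items():
        row = matrix.get(term)
        if row is None:
            continue  # term unseen when the matrix was built
        for c in range(n_cats):
            profile[c] += value * row[c]
    return [p / total for p in profile] if total else profile

def fuzzy_similarity(a, b):
    """Fuzzy Jaccard: sum of minima (intersection) over sum of maxima (union)."""
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / union if union > 0 else 0.0

# Toy data (hypothetical): three tokenized documents, two categories.
docs = [["fuzzy", "cluster"], ["fuzzy", "set"], ["oil", "shale"]]
doc_memberships = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]

weights = tfidf_weights(docs)
matrix = term_category_matrix(weights, doc_memberships)
profiles = [document_profile(w, matrix, 2) for w in weights]
print(fuzzy_similarity(profiles[0], profiles[1]))
print(fuzzy_similarity(profiles[0], profiles[2]))
```

Because each document is compared through its category profile rather than its raw term vector, two documents with partial membership in several categories can still score as similar even when they share few terms, which is the situation the abstract highlights.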

Info:

Periodical: Applied Mechanics and Materials (Volumes 29-32)

Pages: 2620-2626

DOI: https://doi.org/10.4028/www.scientific.net/AMM.29-32.2620

Online since: August 2010

Authors: Jing Li Zhou, Xue Jun Nie, Lei Hua Qin, Jian Feng Zhu

Keywords: Document Clustering, Fuzzy Similarity, Mutual Information (MI)

