Document Clustering Based on Fuzzy Similarity
This paper proposes a novel fuzzy similarity measure based on the relationships between terms and categories. These relationships are represented by a term-category matrix in which each element denotes the membership degree of a term in a category, computed from term frequency-inverse document frequency (TF-IDF) weights and the fuzzy relationships between documents and categories. The fuzzy similarity measure accounts for documents that belong to multiple categories and is computed using fuzzy operators. The experimental results show that the proposed fuzzy similarity surpasses other common similarity measures both in the reliability of the derived document clustering results and in document clustering accuracy.
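The abstract does not give the formulas, so the following Python sketch is only an illustration of the pipeline it describes, not the authors' method. It makes several assumptions: a smoothed TF-IDF weighting, a term-category membership defined as the TF-IDF-weighted average of the document-category memberships, the max operator as fuzzy union, and a fuzzy Jaccard ratio (sum of minima over sum of maxima) as the similarity. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return weights


def term_category_matrix(weights, doc_memberships):
    """mu[term][k]: TF-IDF-weighted average of document memberships in category k.

    This formula is an assumption made for illustration only.
    """
    n_cats = len(doc_memberships[0])
    num = {}
    den = Counter()
    for w, mem in zip(weights, doc_memberships):
        for t, x in w.items():
            row = num.setdefault(t, [0.0] * n_cats)
            for k in range(n_cats):
                row[k] += x * mem[k]
            den[t] += x
    return {t: [v / den[t] for v in row] for t, row in num.items()}


def document_profile(doc, mu, n_cats):
    """Fuzzy union (max) of the category memberships of the document's terms."""
    profile = [0.0] * n_cats
    for t in set(doc):
        for k, v in enumerate(mu.get(t, [0.0] * n_cats)):
            profile[k] = max(profile[k], v)
    return profile


def fuzzy_similarity(a, b):
    """Fuzzy Jaccard: sum of minima (intersection) over sum of maxima (union)."""
    union = sum(max(x, y) for x, y in zip(a, b))
    if union == 0:
        return 0.0
    return sum(min(x, y) for x, y in zip(a, b)) / union


# Toy data (hypothetical): three tokenized documents and their fuzzy
# memberships in two categories; a document may belong to both.
docs = [["fuzzy", "cluster", "term"], ["fuzzy", "logic"], ["soccer", "goal"]]
doc_memberships = [[0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]

mu = term_category_matrix(tfidf_weights(docs), doc_memberships)
profiles = [document_profile(d, mu, 2) for d in docs]
print(fuzzy_similarity(profiles[0], profiles[1]))
print(fuzzy_similarity(profiles[0], profiles[2]))
```

Under these assumptions, the first two documents, which share a term and have similar category memberships, score higher than the first and third. The resulting pairwise similarities could then be passed to any clustering algorithm that accepts a similarity matrix.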
J. L. Zhou et al., "Document Clustering Based on Fuzzy Similarity", Applied Mechanics and Materials, Vols. 29-32, pp. 2620-2626, 2010