Research on Improved Clustering Algorithm on Web Usage Mining Based on Scientific Analysis of Web Materials

Bin Li; Jin Yang; Cai Ming Liu; Jian Dong Zhang; Yan Zhang

doi:10.4028/www.scientific.net/AMM.63-64.863

Abstract:

Clustering analysis is an important method in Web usage mining for studying Web users' browsing behavior and identifying potential customers. Traditional user clustering algorithms are not sufficiently accurate. In this paper, we present two improved user clustering algorithms based on the associated matrix of users' hits recorded while they browse a website. From this matrix, an improved Hamming distance matrix is generated by defining either the minimum norm or the generalized relative Hamming distance between any two vectors. Clusters of similar users are then obtained by setting a threshold value. In the final step of the algorithm, the clustering results are verified by defining a Similar Index for the clusters and applying a sub-algorithm. Test examples show that the new algorithms are more accurate than the old one, and results on real log data indicate that the improved algorithms are practical.
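The pipeline outlined in the abstract (hit matrix, pairwise distance matrix, threshold-based grouping) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: `relative_distance` is a hypothetical stand-in for the generalized relative Hamming distance (sum of absolute hit differences divided by the sum of element-wise maxima), the sample `hits` matrix is invented, and grouping by connected components under the threshold is one plausible reading of "setting the threshold value". The Similar Index verification step is not sketched because the abstract does not define it.

```python
from itertools import combinations


def relative_distance(u, v):
    """Assumed stand-in for a generalized relative Hamming distance.

    Returns a value in [0, 1]: 0 for identical hit vectors,
    1 for vectors with no page in common.
    """
    denom = sum(max(a, b) for a, b in zip(u, v))
    if denom == 0:
        return 0.0  # two users with no hits at all are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


def distance_matrix(hits):
    """Build the symmetric user-by-user distance matrix from the hit matrix."""
    n = len(hits)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = relative_distance(hits[i], hits[j])
    return dist


def threshold_clusters(dist, threshold):
    """Group users whose distance is at most `threshold`.

    Clusters are the connected components of the graph that links
    every pair of users closer than the threshold (an assumption).
    """
    n = len(dist)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and dist[i][j] <= threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(group))
    return clusters


# Invented example: rows are users, columns are pages, entries are hit counts.
hits = [
    [5, 0, 3, 0],
    [4, 0, 3, 0],
    [0, 6, 0, 2],
    [0, 5, 0, 3],
]
clusters = threshold_clusters(distance_matrix(hits), threshold=0.3)
print(clusters)  # [[0, 1], [2, 3]]
```

With this invented data, users 0 and 1 share the same pages and fall into one cluster, as do users 2 and 3. Raising the threshold merges more users into each cluster, and lowering it splits them apart.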

Info:

Periodical:

Applied Mechanics and Materials (Volumes 63-64)

Pages:

863-867

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.63-64.863

Online since:

June 2011

Authors:

Bin Li, Jin Yang, Cai Ming Liu, Jian Dong Zhang, Yan Zhang

Keywords:

Generalized Hamming Distance, User Clustering Algorithm, Web Usage Mining
