Near-Replicas of Web Pages Eliminating Repetitive Algorithms Based on MD5
Abstract:
The growth of the internet and the exponential increase in network information have produced a large number of duplicated pages on the network, which reduce retrieval recall and precision and lower retrieval efficiency. The accuracy of the web therefore influences the quality of a search engine. On the basis of a structural text description, this paper proposes an improved duplicate-elimination algorithm for near-replicas based on MD5. Experiments show that the method is effective at improving both recall and precision.
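The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of MD5-based near-replica elimination, not the authors' method. It assumes that a page's text is split into sentence-like blocks, that each normalized block is hashed with MD5, and that two pages count as near-replicas when the overlap of their digest sets exceeds a threshold. The function names, the splitting rule, the Jaccard measure, and the 0.8 default threshold are all hypothetical choices made for illustration.

```python
from __future__ import annotations

import hashlib
import re


def block_fingerprints(text: str) -> set[str]:
    """Return the MD5 digest of every sentence-like block in the text."""
    # Assumption: blocks are delimited by sentence punctuation or newlines.
    blocks = re.split(r"[.!?\n]+", text.lower())
    # Collapse whitespace so trivial formatting changes keep the same digest.
    normalized = (" ".join(block.split()) for block in blocks)
    return {
        hashlib.md5(block.encode("utf-8")).hexdigest()
        for block in normalized
        if block
    }


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two pages' block digests (assumed measure)."""
    fa, fb = block_fingerprints(a), block_fingerprints(b)
    if not fa and not fb:
        return 1.0
    return len(fa & fb) / len(fa | fb)


def deduplicate(pages: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a page only if it is not a near-replica of a page already kept."""
    kept: list[str] = []
    for page in pages:
        if all(similarity(page, other) < threshold for other in kept):
            kept.append(page)
    return kept
```

Hashing blocks rather than whole pages is what lets such a scheme catch near-replicas: a single whole-page MD5 digest changes completely after any one-character edit, so it can only detect exact copies.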
Info:
Pages: 1752-1756
Online since: June 2012
Copyright: © 2012 Trans Tech Publications Ltd. All Rights Reserved