Web Object Mining Using Entropy Increasing Rate
Abstract:
In this paper, we propose a new method for web object extraction based on entropy theory that takes both tag structure and content patterns into account when detecting objects. First, it calculates the content entropy of each node in the HTML tag tree. Then, it uses the entropy increasing rate to capture the characteristics of the object region and to identify the minimal subtree that contains the objects. Finally, a set of heuristics is applied to make the extraction more accurate. Experimental evaluation shows that the method can improve the overall effectiveness of object mining.
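The abstract does not give the formulas, so the following is only a minimal sketch of how the first two steps might look, under stated assumptions. It assumes that "content entropy" is the Shannon entropy of the whitespace-separated tokens found under a node, and that the "entropy increasing rate" is the gap between a node's entropy and the mean entropy of its children. Both definitions, the `Node` class, and all function names are hypothetical illustrations, not the authors' actual method.

```python
import math
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one node of an HTML tag tree."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def tokens(node: Node) -> List[str]:
    """Collect whitespace-separated tokens from a node and its descendants."""
    out = node.text.split()
    for child in node.children:
        out.extend(tokens(child))
    return out


def content_entropy(node: Node) -> float:
    """Shannon entropy (bits) of the token distribution under a node.

    Assumption: the paper's 'content entropy' is approximated this way.
    """
    counts = Counter(tokens(node))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_increasing_rate(node: Node) -> float:
    """Node entropy minus the mean entropy of its children.

    Assumption: an illustrative definition; the paper's may differ.
    Leaf nodes have no children, so their rate is defined as 0 here.
    """
    if not node.children:
        return 0.0
    child_mean = sum(content_entropy(c) for c in node.children) / len(node.children)
    return content_entropy(node) - child_mean


# A toy tree: a navigation bar plus a list of three product-like records.
nav = Node("div", "home home home")
items = [Node("li", t) for t in ("book alpha 10", "book beta 12", "book gamma 15")]
ul = Node("ul", children=items)
body = Node("body", children=[nav, ul])

for n in (nav, ul, body):
    print(n.tag, round(content_entropy(n), 3), round(entropy_increasing_rate(n), 3))
```

In this toy tree, the list of records gains entropy relative to its individual items because each record contributes different content. That is the kind of signal the abstract suggests is used to locate an object region. How the minimal subtree is chosen from such scores, and which heuristics refine the result, is not stated in the abstract and is left out here.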
Info:
Pages:
2602-2606
Online since:
November 2011
Copyright:
© 2012 Trans Tech Publications Ltd. All Rights Reserved