Semi-Supervised Classification with Co-Training for Deep Web
The main obstacles in Web page classification are the scarcity of labeled data and the cost of labeling unlabeled data. In this paper we apply co-training, a semi-supervised machine learning method, to the classification of Deep Web query interfaces in order to boost classifier performance. A Bayes classifier and a Maximum Entropy classifier cooperate to incorporate unlabeled data alongside labeled data incrementally during training. Our experimental results show that the proposed approach achieves promising performance.
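The abstract does not spell out the training procedure, so the following is only a minimal sketch of one way two such classifiers might cooperate, not the authors' implementation. It assumes a multinomial Naive Bayes model as the "Bayes" classifier, a small softmax-regression model as the Maximum Entropy classifier, query interfaces represented as lists of field-label tokens, and a single shared feature view. The names `co_train`, `predict`, `per_round`, and `threshold`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter, defaultdict

def _softmax(scores):
    m = max(scores.values())
    z = sum(math.exp(s - m) for s in scores.values())
    return {c: math.exp(s - m) / z for c, s in scores.items()}

class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with Laplace smoothing."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc)
        self.vocab = {t for doc in docs for t in doc}
        return self

    def predict_proba(self, doc):
        n = sum(self.prior.values())
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            logp[c] = math.log(self.prior[c] / n) + sum(
                math.log((self.counts[c][t] + 1) / total)
                for t in doc if t in self.vocab)  # unseen tokens are skipped
        return _softmax(logp)

class MaxEnt:
    """Maximum Entropy (softmax regression) on binary token-presence features."""
    def __init__(self, epochs=50, lr=0.5):
        self.epochs, self.lr = epochs, lr

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.w = {c: defaultdict(float) for c in self.classes}
        for _ in range(self.epochs):
            for doc, y in zip(docs, labels):
                p = self.predict_proba(doc)
                for c in self.classes:
                    grad = (1.0 if c == y else 0.0) - p[c]  # log-likelihood gradient
                    for t in set(doc):
                        self.w[c][t] += self.lr * grad
        return self

    def predict_proba(self, doc):
        return _softmax({c: sum(self.w[c].get(t, 0.0) for t in set(doc))
                         for c in self.classes})

def co_train(labeled, unlabeled, rounds=5, per_round=2, threshold=0.6):
    """Assumed loop: each model labels its most confident pool items for both."""
    docs = [d for d, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    nb, me = NaiveBayes(), MaxEnt()
    for _ in range(rounds):
        nb.fit(docs, labels)
        me.fit(docs, labels)
        added = False
        for clf in (nb, me):
            scored = []
            for i, doc in enumerate(pool):
                p = clf.predict_proba(doc)
                y = max(p, key=p.get)
                scored.append((p[y], i, y))
            scored.sort(reverse=True)
            picks = [(i, y) for conf, i, y in scored[:per_round] if conf >= threshold]
            for i, y in sorted(picks, reverse=True):  # pop high indices first
                docs.append(pool.pop(i))
                labels.append(y)
                added = True
        if not added:  # pool exhausted or nothing confident enough
            break
    return nb.fit(docs, labels), me.fit(docs, labels)

def predict(models, doc):
    """Average the class probabilities of the cooperating models."""
    avg = {c: sum(m.predict_proba(doc)[c] for m in models) / len(models)
           for c in models[0].classes}
    return max(avg, key=avg.get)

# Toy usage with made-up query-interface field labels.
labeled = [(["title", "author", "isbn"], "book"),
           (["from", "to", "depart"], "flight")]
unlabeled = [["author", "publisher", "price"], ["depart", "return", "passengers"],
             ["title", "publisher"], ["to", "return"]]
models = co_train(labeled, unlabeled)
print(predict(models, ["publisher", "price"]))
```

Adding each model's confident predictions to a shared labeled set follows the general co-training idea; a stricter two-view variant would give each classifier a different feature subset, which the abstract neither confirms nor rules out.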
W. Fang and Z. M. Cui, "Semi-Supervised Classification with Co-Training for Deep Web", Key Engineering Materials, Vols. 439-440, pp. 183-188, 2010