Active Learning Based on Diversity Maximization

Abstract:

In many practical data mining applications, unlabeled training examples are readily available, but labeled ones are expensive to obtain. Active learning, one paradigm for combining labeled and unlabeled data to improve performance, has therefore attracted much attention. In this paper, we propose a new active learning approach based on diversity maximization. Unlike the well-known co-testing algorithm, our method does not require two different views. Comparative studies with other active learning methods demonstrate the effectiveness of the proposed approach.

Pages: 2548-2552

Online since: August 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
