On Optimization of Multi-Class Logistic Regression Classifier

Abstract:

The classical multi-class logistic regression classifier uses Newton's method to optimize its loss function, which suffers from expensive computation and an unstable iteration process. In this work, we apply two state-of-the-art optimization techniques, conjugate gradient (CG) and BFGS, to train the multi-class logistic regression classifier, and we experimentally compare them with Newton's method on the classification accuracy of 20 datasets. The results show that CG and BFGS achieve better classification accuracy than Newton's method. Moreover, CG and BFGS have lower time complexity than Newton's method. Finally, we observe that CG and BFGS demonstrate similar performance.
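The comparison described in the abstract can be reproduced in miniature with a generic optimizer library. Below is a minimal sketch, not the authors' implementation: it minimizes an L2-regularized softmax (multi-class logistic) loss with scipy.optimize.minimize, trying CG, BFGS, and Newton-CG (a practical stand-in for Newton's method). The Iris dataset, regularization strength, and zero initialization are assumptions chosen for illustration.

```python
# Minimal sketch (not the authors' code): compare the three optimizers
# from the abstract on a softmax (multi-class logistic) loss.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import load_iris  # assumed stand-in dataset

X, y = load_iris(return_X_y=True)
X = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
n, d = X.shape
k = y.max() + 1
Y = np.eye(k)[y]                              # one-hot labels, shape (n, k)
lam = 1e-2                                    # L2 strength (assumed)

def loss(w):
    """Averaged cross-entropy plus L2 penalty on the weight matrix."""
    W = w.reshape(d, k)
    Z = X @ W
    Z = Z - Z.max(axis=1, keepdims=True)      # stabilize the softmax
    logp = Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))
    return -(Y * logp).sum() / n + 0.5 * lam * (W ** 2).sum()

def grad(w):
    """Gradient: X^T (softmax(XW) - Y) / n + lam * W, flattened."""
    W = w.reshape(d, k)
    Z = X @ W
    Z = Z - Z.max(axis=1, keepdims=True)
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    return ((X.T @ (P - Y)) / n + lam * W).ravel()

w0 = np.zeros(d * k)
for method in ("CG", "BFGS", "Newton-CG"):    # Newton-CG stands in for Newton's method
    res = minimize(loss, w0, jac=grad, method=method)
    W = res.x.reshape(d, k)
    acc = (np.argmax(X @ W, axis=1) == y).mean()
    print(f"{method:10s} loss={res.fun:.4f} train-acc={acc:.3f}")
```

Since no Hessian is supplied, SciPy's Newton-CG approximates Hessian-vector products from finite differences of the gradient; supplying the exact Hessian would recover the classical Newton iteration whose cost the paper critiques.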

Info:

Periodical:

Advanced Materials Research (Volumes 694-697)

Pages:

2746-2750

Online since:

May 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
