The Research of High-Dimensional Big Data Dimension Reduction Strategy

Abstract:

As data dimensionality increases, many mining algorithms designed for low-dimensional data no longer give satisfactory results. High-dimensional data also carries a large amount of redundant information, which greatly reduces mining efficiency and increases the complexity of the mining algorithm. Feature selection is an efficient way to address this problem, because it can remove many irrelevant and redundant features. This paper applies the idea of differential evolution to feature subset extraction on the basis of the Lars algorithm and proposes a new feature selection method, the DE-Lars algorithm. Experiments show that the DE-Lars algorithm improves the precision of dimension reduction and effectively addresses the "curse of dimensionality".
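The abstract does not give the details of DE-Lars, so the following is only a rough sketch of the general idea of searching for a feature subset with differential evolution; it is not the authors' algorithm. Several things here are assumptions made for illustration: the Lars step is replaced by a plain correlation-based relevance score to keep the example dependency-free, candidate subsets are encoded as real vectors thresholded at 0.5, and the names `pearson`, `subset_score`, `de_select`, and the `penalty` value are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def subset_score(mask, relevance, penalty=0.3):
    """Assumed fitness: reward relevance, charge a fixed cost per kept feature."""
    return sum(r - penalty for r, keep in zip(relevance, mask) if keep)

def de_select(relevance, pop_size=20, generations=60, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over vectors in [0, 1]^d; a component > 0.5 keeps that feature."""
    rng = random.Random(seed)
    d = len(relevance)

    def decode(vec):
        return [v > 0.5 for v in vec]

    pop = [[rng.random() for _ in range(d)] for _ in range(pop_size)]
    fit = [subset_score(decode(vec), relevance) for vec in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(d)  # guarantees at least one mutated component
            trial = []
            for j in range(d):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            f = subset_score(decode(trial), relevance)
            if f >= fit[i]:  # greedy one-to-one replacement
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]

# Synthetic data: features 0 and 1 drive the target, features 2..5 are noise.
rng = random.Random(42)
rows = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
target = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.1) for r in rows]
relevance = [abs(pearson(col, target)) for col in zip(*rows)]

mask, score = de_select(relevance)
print(mask, round(score, 3))
```

In this toy fitness each feature contributes independently, so the search is easy; a method built on Lars would presumably score subsets through a regression fit instead, which this sketch does not attempt.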

Pages:

121-126

Online since:

January 2015

Copyright:

© 2015 Trans Tech Publications Ltd. All Rights Reserved
