The Research of High Efficient Data Mining Algorithms for Massive Data Sets

Abstract:

Data mining is the extraction of potentially useful, previously unknown information and knowledge from large quantities of implicit, incomplete, random data. With the rapid advancement of modern information technology, the volume of accumulated data is growing sharply, often on the scale of terabytes (TB). Extracting meaningful information from such large amounts of data has become a major problem that must be tackled. For mining massive data sets, distributed parallel processing and incremental processing are effective solutions.

Pages: 3901-3904

Online since: May 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
