PTList: Mining XML Data Stream Using Paging Schema

Abstract:

To handle unbounded XML data streams and large XML documents, we present PTList, a method for mining frequent subtrees in XML using a paging schema. PTList divides the XML data stream into pages, manages nodes and candidate frequent subtrees that span page boundaries, mines frequent subtrees page by page, selects frequent subtrees according to a per-page minimum support, and prunes branches based on a decaying factor. PTList mines the XML data stream within a bounded support error, improves memory utilization, and speeds up the mining process.
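The abstract does not give PTList's data structures, so the following is only a minimal Python sketch of the general idea of paged mining with decay, not the authors' algorithm. It makes several assumptions that are not in the source: a page is a fixed number of elements, a "pattern" is just a parent-child tag pair rather than a general subtree, support from earlier pages is multiplied by the decay factor at each page boundary, and the names `mine_paged`, `page_size`, `page_min_support`, `decay`, and `prune_below` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def mine_paged(xml_bytes, page_size=8, page_min_support=2,
               decay=0.5, prune_below=1.0):
    """Toy paged miner (illustrative assumption, not PTList itself).

    Patterns are (parent_tag, child_tag) edges, not general subtrees.
    Returns a dict mapping each surviving pattern to its decayed support.
    """
    totals = {}       # pattern -> decayed support carried across pages
    page = Counter()  # pattern -> count inside the current page
    seen = 0          # elements consumed in the current page
    stack = []        # tags of open elements; kept across page boundaries

    def close_page():
        # Decay support from earlier pages, then prune patterns that fell
        # below the threshold and did not reappear in this page.
        for pat in list(totals):
            totals[pat] *= decay
            if totals[pat] < prune_below and pat not in page:
                del totals[pat]
        # Admit patterns meeting the page minimum support, and update
        # patterns that are already being tracked.
        for pat, n in page.items():
            if n >= page_min_support or pat in totals:
                totals[pat] = totals.get(pat, 0.0) + n
        page.clear()

    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            if stack:
                # The parent may have been opened in an earlier page.
                page[(stack[-1], elem.tag)] += 1
            stack.append(elem.tag)
            seen += 1
            if seen == page_size:
                close_page()
                seen = 0
        else:
            stack.pop()
            elem.clear()  # release the element's content to save memory
    if seen:
        close_page()  # flush the final, partially filled page
    return totals
```

Calling `mine_paged` on a small document with repeated `book` elements and a `page_size` of 8 keeps the edges that occur at least twice in a page and drops the ones that occur once. This shows in miniature why page-by-page selection bounds memory at the cost of some error in the reported support.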

Info:

Periodical:

Advanced Materials Research (Volumes 403-408)

Pages:

1888-1891

Online since:

November 2011

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
