Mining Repetitive Sequential Patterns without Overlapping from Sequence Database

Abstract:

Taking the repetitive property of sequences into account can help analysts capture more useful information. However, most existing repetitive sequence mining algorithms target DNA or genome data, and very little research addresses mining such patterns from sequence databases. In this paper, we (1) propose a method to determine unambiguously the number of times a sequence appears in a data sequence; (2) propose a method to keep the support of a repetitive sequence within [0, 100%], so that users can set a minimum support threshold in the traditional way; and (3) propose an algorithm, RptGSP, that efficiently mines such repetitive patterns in a sequence database by improving the classic GSP algorithm. Experimental results show that RptGSP is very efficient.
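The abstract does not spell out how occurrences are counted or how support is normalized, so the following is only a minimal sketch of one plausible reading, not the RptGSP method itself. It assumes each element of a data sequence is a single item (GSP proper works on itemsets), counts occurrences with a greedy left-to-right scan so that no two occurrences overlap, and bounds support by dividing by the largest number of non-overlapping occurrences that could fit, `len(s) // len(pattern)` per data sequence. The function names `count_non_overlapping` and `repetitive_support` and this normalization are hypothetical choices made for illustration.

```python
from typing import Hashable, Iterable, Sequence


def count_non_overlapping(data: Sequence[Hashable], pattern: Sequence[Hashable]) -> int:
    """Count occurrences of `pattern` as a gapped subsequence of `data`.

    Assumption (not from the paper): occurrences are matched greedily from
    left to right, and a new occurrence may only start after the previous
    one has ended, so no two occurrences overlap.
    """
    if len(pattern) == 0:
        raise ValueError("pattern must not be empty")
    count = 0
    matched = 0  # number of pattern items matched in the current occurrence
    for item in data:
        if item == pattern[matched]:
            matched += 1
            if matched == len(pattern):
                count += 1
                matched = 0  # start looking for the next occurrence
    return count


def repetitive_support(database: Iterable[Sequence[Hashable]],
                       pattern: Sequence[Hashable]) -> float:
    """Return a support value in [0, 1] for `pattern` over `database`.

    Assumption (not from the paper): the denominator is the maximum number
    of non-overlapping occurrences that could fit, len(s) // len(pattern)
    for each data sequence s, which keeps the ratio at or below 1.
    """
    sequences = list(database)
    total = sum(count_non_overlapping(s, pattern) for s in sequences)
    bound = sum(len(s) // len(pattern) for s in sequences)
    return total / bound if bound else 0.0


if __name__ == "__main__":
    db = ["ABAB", "AB", "CC"]
    print(repetitive_support(db, "AB"))
```

Because a pattern of length k can occur without overlap at most `len(s) // k` times in a data sequence `s`, the ratio never exceeds 1, so a minimum support threshold can still be given as a percentage.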

Info:

Pages: 2097-2100

Online since: September 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
