Maximal Frequent Item Sequences Mining
This work proposes a new fast algorithm finding maximal frequent item sequences from transaction database. Itemset is defined as item sequence (IS) for mining. Two lists called ISL (Item Sequence List) and FISL (Frequent Item Sequence List) are created by scanning database once for dividing n-IS into two categories depending on whether the IS to achieve minimum support number (n is the number of attributes). Sub item sequences (SIS) whose n-superset is in ISL are generated by recursion to make sure that each k-SIS appeared before its (k+1)-superset. As current k-SIS being joined to FISL, its (k-1)-SIS are pruned (k range from 2 to n-1). At last, all SISs whose n-superset is in FISL are pruned from FISL. We compare our new algorithm and FP-Growth by experiment to prove its superiority.
L. J. Zhou and Z. Zhang, "Maximal Frequent Item Sequences Mining", Advanced Materials Research, Vols. 108-111, pp. 1211-1216, 2010