Where academic tradition
meets the exciting future

Probabilistic Iterative Expansion of Candidates in Mining Frequent Itemsets

Attila Gyenesei, Jukka Teuhola, Probabilistic Iterative Expansion of Candidates in Mining Frequent Itemsets. In: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations , 90, 43 - 49, 2003.

Abstract:

A simple new algorithm is suggested for frequent itemset mining, using item probabilities as the basis for generating candidates. The method first finds all the frequent items, and then generates an estimate of the frequent sets, assuming item independence. The candidates are stored in a trie where each path from the root to a node represents one candidate itemset. The method expands the trie iteratively, until all frequent itemsets are found. Expansion is based on scanning through the data set in each iteration cycle, and extending the subtries based on observed node frequencies. Trie probing can be restricted to only those nodes which possibly need extension. The number of candidates is usually quite moderate; for dense datasets 2-4 times the number of final frequent itemsets, for non-dense sets somewhat more. In practical experiments the method has been observed to make clearly fewer passes than the well-known Apriori method. As for speed, our non-optimised implementation is in some cases faster, in some others slower than the comparison methods.

BibTeX entry:

@INPROCEEDINGS{inpGyTe03a,
  title = {Probabilistic Iterative Expansion of Candidates in Mining Frequent Itemsets},
  booktitle = {Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations },
  author = {Gyenesei, Attila and Teuhola, Jukka},
  volume = {90},
  pages = {43 - 49},
  year = {2003},
}

Belongs to TUCS Research Unit(s): Algorithmics and Computational Intelligence Group (ACI)

Edit publication