
BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Mining Top-K Frequent Itemsets from Data Streams

Citation Type: Journal Article

Publication Year: 2006

Abstract: Frequent pattern mining on data streams is of interest recently. However, it is not easy for users to determine a proper frequency threshold. It is more reasonable to ask users to set a bound on the result size. We study the problem of mining top K frequent itemsets in data streams. We introduce a method based on the Chernoff bound with a guarantee of the output quality and also a bound on the memory usage. We also propose an algorithm based on the Lossy Counting Algorithm. In most of the experiments of the two proposed algorithms, we obtain perfect solutions and the memory space occupied by our algorithms is very small. Besides, we also propose the adapted approach of these two algorithms in order to handle the case when we are interested in mining the data in a sliding window. The experiments show that the results are accurate.

User Submitted?: No

Authors: Fu, Ada Wai-Chee; Wong, Raymond Chi-Wing

Periodical (Full): Data Mining and Knowledge Discovery

Issue: 2

Volume: 13

Pages: 193-217

Data Collections: IPUMS USA

Topics: Methodology and Data Collection

Countries:
