Full Citation
Title: Separating Structure from Interestingness
Citation Type: Book, Section
Publication Year: 2004
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Condensed representations of pattern collections have been recognized to be important building blocks of inductive databases, a promising theoretical framework for data mining, and recently they have been studied actively. However, there has not been much research on how condensed representations should actually be represented. In this paper we propose a general approach to build condensed representations of pattern collections. The approach is based on separating the structure of the pattern collection from the interestingness values of the patterns. We also study the concrete case of representing the frequent sets and their (approximate) frequencies following this approach: we discuss the trade-offs in representing the frequent sets by the maximal frequent sets, the minimal infrequent sets and their combinations, and investigate the problem of approximating the frequencies from samples by giving new upper bounds on sample complexity based on frequent closed sets and describing how convex optimization can be used to improve and score the obtained samples.
User Submitted?: No
Authors: Mielikäinen, Taneli
Editors: Honghua Dai, Ramakrishnan Srikant, Chengqi Zhang
Pages:
Volume Title: Advances in Knowledge Discovery and Data Mining: 8th Pacific-Asia Conference, PAKDD 2004, Sydney, Australia, May 26-28, 2004. Proceedings
Publisher: Springer-Verlag
Publisher Location: Heidelberg
Volume:
Edition:
Data Collections: IPUMS USA
Topics: Methodology and Data Collection
Countries: