Full Citation
Title: An Algorithm for Mining Implicit Itemset Pairs based on Differences of Correlations
Citation Type: Journal Article
Publication Year: 2005
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Given a transaction database as a global set of transactions and a local database obtained by conditioning the global database, we consider pairs of itemsets whose degrees of correlation are higher in the local database than in the global one. The problem of finding paired itemsets with high correlation in a single database is known as Discovery of Correlation, and it has been studied because highly correlated itemsets are characteristic of the database. However, even noncharacteristic paired itemsets are meaningful, provided the degree of correlation increases significantly in the local database compared with the global one. They can be implicit, hidden evidence that something particular to the local database occurs, even though they were not previously recognized as characteristic. From this viewpoint, we have proposed measuring the significance of paired itemsets by the difference between the two correlations before and after conditioning the global database, and have defined the notion of DC pairs, whose degrees of difference of correlation are high. Since the measure for DC pairs is nonmonotonic, the DC pair mining problem is difficult. For this problem, we have presented an algorithm for mining DC pairs. The algorithm can find DC pairs efficiently to some degree; however, it must be improved in order to tackle more complicated problems. We discuss a method for improving our system.
Url: https://eprints.lib.hokudai.ac.jp/dspace/handle/2115/5590
User Submitted?: No
Authors: Taniguchi, Tsuyoshi; Haraguchi, Makoto
Periodical (Full): Discovery Science
Issue:
Volume: 3735
Pages: 227-240
Data Collections: IPUMS USA
Topics: Population Data Science
Countries: United States