BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Factors Affecting the Performance of Parallel Mining of Minimal Unique Itemsets on Diverse Architectures

Citation Type: Working Paper

Publication Year: 2008

Abstract: Three parallel implementations of a divide and conquer search algorithm (called SUDA2) for finding minimal unique itemsets are compared. The identification of minimal unique itemsets is used by national statistics agencies for statistical disclosure assessment. The first parallel implementation adapts SUDA2 to a Symmetric Multi-Processor (SMP) cluster using the Message Passing Interface (MPI), which we call an MPI cluster; the second optimises the code for the Cray MTA2 (a shared-memory, multi-threaded architecture); and the third uses a heterogeneous group of workstations connected by LAN. Each implementation considers the parallel structure of SUDA2, and how the subsearch computation times and sequence of subsearches affect load balancing. All three approaches scale with the number of processors, enabling SUDA2 to handle larger problems than before. For example, the MPI implementation is able to achieve nearly two orders of magnitude improvement with 132 processors. Performance results are given for a number of datasets.

Key words: Performance; Itemset Mining; Divide and Conquer Algorithm; Load Balancing; Parallel Architecture

User Submitted?: No

Authors: Haglin, D J.; Manning, A M.; Mayes, K R.; et al.; Feo, J.; Gurd, J R.

Series Title:

Publication Number:

Institution:

Pages:

Publisher Location:

Data Collections: IPUMS USA

Topics: Other

Countries:
