Full Citation
Title: Mining atypical groups for a target quantitative attribute
Citation Type: Conference Paper
Publication Year: 2008
ISBN: 978-1-4244-1673-8
ISSN:
DOI: 10.1109/ICCIS.2008.4670867
NSFID:
PMCID:
PMID:
Abstract: An important task in data analysis is the understanding of unexpected or atypical behaviors in a group of individuals. Which categories of individuals earn the higher salaries or, on the contrary, which ones earn the lower salaries? We present the problem of how data concerning atypical groups can be mined compared with a target quantitative attribute, like for instance the attribute "salary", and in particular for the high and low values of a user-defined interval. Our search therefore focuses on conjunctions of attributes whose distribution differs significantly from the learning set for the interval's high and low values of the target attribute. Such atypical groups can be found by adapting an existing measure, the intensity of inclination. This measure frees us from the transformation step of quantitative attributes, that is to say the step of discretization followed by a complete disjunctive coding. Thus, we propose an algorithm for mining such groups using pruning rules in order to reduce the complexity of the problem. This algorithm has been developed and integrated into the WEKA software for knowledge extraction. Finally we give an example of data extraction from the American census database IPUMS.
Url: http://ieeexplore.ieee.org/document/4670867/
User Submitted?: No
Authors: Guillaume, Sylvie; Guillochon, Florian
Conference Name: 2008 IEEE Conference on Cybernetics and Intelligent Systems
Publisher Location: Chengdu, China
Data Collections: IPUMS USA
Topics: Other
Countries: