BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: A Hierarchical Approach to Anomalous Subgroup Discovery

Citation Type: Miscellaneous

Publication Year: 2023

Abstract: Understanding peculiar and anomalous behavior of machine learning models for specific data subgroups is a fundamental building block of model performance and fairness evaluation. The analysis of these data subgroups can provide useful insights into a model's inner workings and highlight its potentially discriminatory behavior. Current approaches to subgroup exploration ignore the presence of hierarchies in the data and can only be applied to discretized attributes. The discretization process required for continuous attributes may significantly affect the identification of relevant subgroups. We propose a hierarchical subgroup exploration technique to identify anomalous subgroup behavior at multiple granularity levels, along with a technique for the hierarchical discretization of data attributes. The hierarchical discretization produces, for each continuous attribute, a hierarchy of intervals. The subsequent hierarchical exploration can exploit data hierarchies, selecting for each attribute the optimal granularity to identify subgroups that are both anomalous and large enough to be statistically and practically significant. Compared to non-hierarchical approaches, we show that our hierarchical approach is more powerful in identifying anomalous subgroups and more stable with respect to discretization and exploration parameters.

Url: https://luca.dealfaro.com/papers/23/ICDE_Divergence_Discretization_Optimization.pdf

User Submitted?: No

Authors: Pastor, Eliana; Baralis, Elena; de Alfaro, Luca

Publisher: Politecnico di Torino

Data Collections: IPUMS CPS

Topics: Population Data Science

Countries:
