BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Proportional k-Interval Discretization for Naive-Bayes Classifiers

Citation Type: Conference Paper

Publication Year: 2001

DOI: 10.1007/3-540-44795-4_48

Abstract: This paper argues that two commonly used discretization approaches, fixed k-interval discretization and entropy-based discretization, have sub-optimal characteristics for naive-Bayes classification. This analysis leads to a new discretization method, Proportional k-Interval Discretization (PKID), which adjusts the number and size of discretized intervals to the number of training instances and thus seeks an appropriate trade-off between the bias and variance of the probability estimation for naive-Bayes classifiers. We justify PKID in theory and test it on a wide cross-section of datasets. Our experimental results suggest that, in comparison to its alternatives, PKID provides naive-Bayes classifiers with competitive classification performance for smaller datasets and better classification performance for larger datasets.
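
Illustrative sketch (not part of the catalog record): the abstract says PKID ties both the number and the size of intervals to the number of training instances. A minimal Python sketch of that idea follows. It assumes the rule that both quantities are set to roughly the square root of n; the abstract itself does not state this. The function names `pkid_cut_points` and `discretize` are hypothetical, and the equal-frequency binning is a simplification rather than the authors' implementation.

```python
import bisect
import math


def pkid_cut_points(values):
    """Return equal-frequency cut points for one numeric attribute.

    Assumption: the number of intervals t is floor(sqrt(n)), so each
    interval also holds about sqrt(n) training values.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    t = max(1, math.isqrt(n))  # assumed number of intervals
    cuts = []
    for i in range(1, t):
        idx = i * n // t  # index of the first value in interval i
        low, high = ordered[idx - 1], ordered[idx]
        if low != high:  # never place a cut between identical values
            cuts.append((low + high) / 2)
    return cuts


def discretize(value, cuts):
    """Map a numeric value to the index of the interval it falls in."""
    return bisect.bisect_right(cuts, value)


if __name__ == "__main__":
    training = list(range(1, 17))  # 16 training values
    cuts = pkid_cut_points(training)
    print(cuts)
    print([discretize(v, cuts) for v in (1, 5, 16)])
```

Under this assumption, more training data yields both more intervals and more instances per interval, which is the bias and variance trade-off the abstract refers to.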

Url: https://link.springer.com/chapter/10.1007/3-540-44795-4_48

User Submitted?: No

Authors: Yang, Ying; Webb, Geoffrey I.

Conference Name: 12th European Conference on Machine Learning, Freiburg

Publisher Location: Germany

Data Collections: IPUMS USA

Topics: Methodology and Data Collection, Other

Countries: