Full Citation
Title: Proportional k-Interval Discretization for Naive-Bayes Classifiers
Citation Type: Conference Paper
Publication Year: 2001
ISBN:
ISSN:
DOI: 10.1007/3-540-44795-4_48
NSFID:
PMCID:
PMID:
Abstract: This paper argues that two commonly-used discretization approaches, fixed k-interval discretization and entropy-based discretization, have sub-optimal characteristics for naive-Bayes classification. This analysis leads to a new discretization method, Proportional k-Interval Discretization (PKID), which adjusts the number and size of discretized intervals to the number of training instances, and thus seeks an appropriate trade-off between the bias and variance of the probability estimation for naive-Bayes classifiers. We justify PKID in theory, as well as test it on a wide cross-section of datasets. Our experimental results suggest that in comparison to its alternatives, PKID provides naive-Bayes classifiers competitive classification performance for smaller datasets and better classification performance for larger datasets.
Url: https://link.springer.com/chapter/10.1007/3-540-44795-4_48
User Submitted?: No
Authors: Yang, Ying; Webb, Geoffrey I.
Conference Name: 12th European Conference on Machine Learning, Freiburg
Publisher Location: Germany
Data Collections: IPUMS USA
Topics: Methodology and Data Collection, Other
Countries: