Conference proceeding
Precise Bayes Classifier: Summary of Results
2021 IEEE International Conference on Data Mining (ICDM), Vol.2021-, pp.649-658
12/2021
DOI: 10.1109/ICDM51629.2021.00076
Abstract
The Bayes Classifier has been shown to achieve the minimal classification error while also producing interpretable predictions. However, it requires knowledge of the underlying distributions of the predictors, a requirement that is almost never satisfied in practice. Naive Bayes classifiers and their variants estimate this classifier by assuming independence among the predictors. This restrictive assumption hinders both the accuracy and the interpretability of these classifiers, because the calculated probabilities become less reliable. Moreover, it is argued in the literature that interpretability comes at the expense of accuracy, and vice versa. In this paper, we are motivated by the accurate and interpretable nature of the Bayes Classifier. We propose Precise Bayes, a computationally efficient estimation of the Bayes Classifier based on a new formulation. Our method makes no assumptions about either independence or the underlying distributions. We derive a new theoretical minimal error rate for our formulation and show that the error rate of Precise Bayes approaches this limit as the number of learned samples increases. Moreover, the calculated posterior probabilities are actual empirical probabilities, obtained by counting observations and outcomes. This makes the predictions of Precise Bayes fully explainable. Our evaluations on generated and real datasets validate our theoretical claims on prediction error rate and computational efficiency.
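The idea of posterior probabilities "obtained by counting observations and outcomes" can be illustrated with a small sketch. This is not the Precise Bayes formulation from the paper, which the abstract does not specify. It is a generic counting-based estimate of P(label | predictors) over exact predictor tuples. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_counts(rows, labels):
    """Count how often each label occurs with each exact predictor tuple."""
    counts = defaultdict(Counter)
    for x, y in zip(rows, labels):
        counts[tuple(x)][y] += 1
    return counts


def empirical_posterior(counts, x):
    """Return {label: count / total} for predictor tuple x, or {} if unseen."""
    seen = counts.get(tuple(x))
    if not seen:
        return {}
    total = sum(seen.values())
    return {label: n / total for label, n in seen.items()}


def predict(counts, x):
    """Pick the label with the highest empirical posterior, or None if unseen."""
    post = empirical_posterior(counts, x)
    return max(post, key=post.get) if post else None


# Toy data (hypothetical): two categorical predictors and a binary outcome.
rows = [("sunny", "hot"), ("sunny", "hot"), ("sunny", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]

counts = fit_counts(rows, labels)
posterior = empirical_posterior(counts, ("sunny", "hot"))
prediction = predict(counts, ("sunny", "hot"))
```

Each probability here is a ratio of observed counts, so a prediction can be explained by pointing to the matching training rows. A naive table like this returns nothing for predictor combinations it has never seen. How Precise Bayes handles that case is not described in the abstract.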
Details
- Title: Subtitle
- Precise Bayes Classifier: Summary of Results
- Creators
- Amin Vahedian - University of Wisconsin–Whitewater
- Xun Zhou - University of Iowa
- Resource Type
- Conference proceeding
- Publication Details
- 2021 IEEE International Conference on Data Mining (ICDM), Vol.2021-, pp.649-658
- Publisher
- IEEE
- DOI
- 10.1109/ICDM51629.2021.00076
- ISSN
- 1550-4786
- eISSN
- 2374-8486
- Language
- English
- Date published
- 12/2021
- Academic Unit
- Business Analytics
- Record Identifier
- 9984380484002771