Preprint
Optimal Compression for Minimizing Classification Error Probability: an Information-Theoretic Approach
ArXiv.org
11/03/2022
DOI: 10.48550/arXiv.2211.02012
Abstract
We formulate the problem of performing optimal data compression under the
constraint that the compressed data can still be used for accurate classification
in machine learning. We show that this translates to a problem of minimizing the
mutual information between the data and its compressed version under the constraint
that the classification error probability is small when the compressed data are
used for machine learning. We then provide analytical and computational methods to
characterize the optimal trade-off between data compression and classification
error probability. First, we provide an analytical characterization for the
optimal compression strategy for data with binary labels. Second, for data with
multiple labels, we formulate a set of convex optimization problems whose
numerical solutions yield the optimal trade-off between classification error
and compression efficiency. We further show that our formulations improve on
information-bottleneck methods in classification performance.
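The two quantities traded off in the abstract can be illustrated with a minimal sketch. It is not the authors' method. It assumes finite alphabets, a compressed variable Z that depends on the label Y only through the data X, and a maximum a posteriori classifier applied to Z. The function names `mutual_information` and `bayes_error` and the toy distributions are hypothetical, chosen only for illustration.

```python
import numpy as np

def mutual_information(p_x, p_z_given_x):
    """I(X;Z) in bits, for a marginal p_x of shape [nx] and a
    compression kernel p_z_given_x of shape [nx, nz] (rows sum to 1)."""
    p_xz = p_x[:, None] * p_z_given_x          # joint distribution of (X, Z)
    p_z = p_xz.sum(axis=0)                     # marginal of Z
    product = p_x[:, None] * p_z[None, :]      # product of the marginals
    mask = p_xz > 0                            # skip zero-probability pairs
    return float(np.sum(p_xz[mask] * np.log2(p_xz[mask] / product[mask])))

def bayes_error(p_xy, p_z_given_x):
    """Error probability of the MAP classifier that predicts Y from Z,
    assuming Z depends on Y only through X (assumption of this sketch)."""
    p_zy = p_z_given_x.T @ p_xy                # joint of (Z, Y), shape [nz, ny]
    return float(1.0 - p_zy.max(axis=1).sum())

# Toy example (assumed): X uniform on {0, 1, 2, 3}, binary label Y = X // 2.
p_x = np.full(4, 0.25)
p_xy = np.zeros((4, 2))
for x in range(4):
    p_xy[x, x // 2] = 0.25

identity = np.eye(4)                                        # no compression
merge = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)   # keep only the label
constant = np.ones((4, 1))                                  # discard everything

for name, kernel in [("identity", identity), ("merge", merge), ("constant", constant)]:
    rate = mutual_information(p_x, kernel)
    err = bayes_error(p_xy, kernel)
    print(f"{name}: I(X;Z) = {rate:.2f} bits, error = {err:.2f}")
```

In this toy case the kernel that merges inputs sharing a label halves the mutual information relative to no compression while keeping the error at zero, whereas discarding everything drives the mutual information to zero at the cost of chance-level error.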
Details
- Title: Subtitle
- Optimal Compression for Minimizing Classification Error Probability: an Information-Theoretic Approach
- Creators
- Jingchao Gao, Ao Tang, Weiyu Xu
- Resource Type
- Preprint
- Publication Details
- ArXiv.org
- DOI
- 10.48550/arXiv.2211.02012
- ISSN
- 2331-8422
- Language
- English
- Date posted
- 11/03/2022
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9984311558202771