An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers

Hui Xie; Jirong Yi; Weiyu Xu; Raghu Mudumbai

doi:10.1109/ISIT.2019.8849757

Conference proceeding

2019 IEEE International Symposium on Information Theory (ISIT), Vol.2019-, pp.1977-1981

07/2019

Abstract

We present a simple hypothesis about a compression property of artificial intelligence (AI) classifiers and present theoretical arguments to show that this hypothesis successfully accounts for the observed fragility of AI classifiers to small adversarial perturbations. We also propose a new method for detecting when small input perturbations cause classifier errors, and show theoretical guarantees for the performance of this detection method. We present experimental results with a voice recognition system to demonstrate this method. The ideas in this paper are motivated by a simple analogy between AI classifiers and the standard Shannon model of a communication system.

Artificial intelligence

Decoding

Noise measurement

Perturbation methods

Speech recognition

Standards

Training

Details

Title: An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers
Creators: Hui Xie - University of Iowa
Jirong Yi - University of Iowa
Weiyu Xu - University of Iowa
Raghu Mudumbai - University of Iowa
Resource Type: Conference proceeding
Publication Details: 2019 IEEE International Symposium on Information Theory (ISIT), Vol.2019-, pp.1977-1981
Publisher: IEEE
DOI: 10.1109/ISIT.2019.8849757
ISSN: 2157-8095
eISSN: 2157-8117
Language: English
Date published: 07/2019
Academic Unit: Electrical and Computer Engineering
Record Identifier: 9984197525902771
