Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning
Conference proceeding

Jirong Yi, Raghu Mudumbai and Weiyu Xu
Conference record - Asilomar Conference on Signals, Systems, & Computers, pp.183-187
10/27/2024
DOI: 10.1109/IEEECONF60004.2024.10942758

Abstract

We address the theoretical problem of designing an optimal adversarial attack on a decision system to maximize degradation in system performance, as measured by the mutual information between the degraded signal and the target label. Motivated by adversarial examples in machine learning classifiers, we use an information-theoretic approach to establish conditions where adversarial vulnerability is inevitable. We derive optimal adversarial attacks for both discrete and continuous signals and demonstrate that minimizing mutual information becomes significantly harder with multiple redundant copies of the input signal. This finding supports the "feature compression" hypothesis as a basis for the adversarial vulnerability of deep learning classifiers. We also present computational results validating our theoretical findings.
Keywords: Information Theory; Machine Learning; adversarial attack; Computers; Deep learning; Degradation; Mutual information; Rate distortion; System performance
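
Illustrative note: the abstract's claim that redundancy makes mutual information harder to reduce can be seen in a toy setting. The sketch below is not the paper's derived optimal attack. It assumes a uniform binary label, a signal made of n identical copies of that label, and a hypothetical attacker who flips each copy independently with probability p. Under those assumptions the number of ones in the corrupted signal is a sufficient statistic for the label, so the mutual information can be computed exactly from two binomial distributions.

```python
from math import comb, log2


def binom_pmf(n, q):
    """Probability of k ones among n copies, each equal to one with probability q."""
    return [comb(n, k) * q**k * (1 - q) ** (n - k) for k in range(n + 1)]


def entropy(pmf):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in pmf if p > 0)


def mutual_information(n, p):
    """I(label; corrupted signal) in bits for the toy model.

    Assumptions (not taken from the paper): the label is uniform on {0, 1},
    the clean signal is n copies of the label, and each copy is flipped
    independently with probability p.
    """
    given0 = binom_pmf(n, p)        # count of ones when the label is 0
    given1 = binom_pmf(n, 1 - p)    # count of ones when the label is 1
    marginal = [0.5 * a + 0.5 * b for a, b in zip(given0, given1)]
    # I(Y; K) = H(K) - H(K | Y)
    return entropy(marginal) - 0.5 * entropy(given0) - 0.5 * entropy(given1)


if __name__ == "__main__":
    # Same per-copy flip probability, increasing redundancy.
    for n in (1, 3, 5, 9):
        print(n, mutual_information(n, 0.3))
```

In this toy model, a fixed per-copy flip probability leaves more information about the label as n grows, so an attacker with the same per-copy budget removes less of it. That matches the direction of the abstract's statement, though the paper's own attack model and distortion constraint may differ.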
