Conference proceeding
TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, Vol.36(4), pp.3877-3884
AAAI Conference on Artificial Intelligence
06/28/2022
DOI: 10.1609/aaai.v36i4.20303
Abstract
This paper focuses on a new and challenging setting for hard-label adversarial attacks on text data: taking the query budget into account. Although existing approaches can successfully generate adversarial examples in the hard-label setting, they rely on the idealized assumption that the victim model does not restrict the number of queries. In real-world applications, however, the query budget is usually tight. Moreover, existing hard-label adversarial attack techniques use a genetic algorithm to optimize discrete text data, maintaining a number of adversarial candidates during optimization, which can lead to low-quality adversarial examples in the tight-budget setting. To solve this problem, we propose a new method named TextHoaxer, which formulates the budgeted hard-label adversarial attack task on text data as a gradient-based optimization problem over a perturbation matrix in the continuous word embedding space. Compared with genetic algorithm-based optimization, our solution uses only a single initialized adversarial example as the adversarial candidate for optimization, which significantly reduces the number of queries. The optimization is guided by a new objective function consisting of three terms: a semantic similarity term, a pair-wise perturbation constraint, and a sparsity constraint. The semantic similarity term and the pair-wise perturbation constraint ensure high semantic similarity of adversarial examples at both the comprehensive text level and the individual word level, while the sparsity constraint explicitly restricts the number of perturbed words, which also helps enhance the quality of the generated text.
We conduct extensive experiments on eight text datasets against three representative natural language models, and the experimental results show that TextHoaxer can generate high-quality adversarial examples with higher semantic similarity and a lower perturbation rate under the tight-budget setting.
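The three-term objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not give the exact terms, so the mean-pooled cosine similarity (standing in for a sentence encoder), the squared-norm word-level penalty, the row-norm sparsity penalty, and the names `objective`, `lam_pair`, and `lam_sparse` are all assumptions made for illustration. The sketch only evaluates the loss for a candidate perturbation matrix; it does not perform the gradient-based optimization or query a victim model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(rows):
    """Mean-pool a list of word vectors into a single text vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def objective(original, perturbation, lam_pair=0.1, lam_sparse=0.1):
    """Hypothetical three-term loss (lower is better).

    original, perturbation: lists of per-word vectors with the same shape.
    """
    perturbed = [
        [o + p for o, p in zip(o_row, p_row)]
        for o_row, p_row in zip(original, perturbation)
    ]
    # Text-level term: similarity of mean-pooled embeddings (assumed stand-in
    # for a sentence encoder).
    similarity = cosine(mean_vector(original), mean_vector(perturbed))
    # Per-word perturbation magnitudes.
    norms = [math.sqrt(sum(p * p for p in p_row)) for p_row in perturbation]
    # Word-level term: keep each perturbed word close to its original (assumed form).
    pairwise = sum(n * n for n in norms)
    # Sparsity term: sum of row norms, which favors leaving whole words
    # untouched (assumed form).
    sparsity = sum(norms)
    return -similarity + lam_pair * pairwise + lam_sparse * sparsity


if __name__ == "__main__":
    words = [[1.0, 0.0], [0.0, 1.0]]
    untouched = [[0.0, 0.0], [0.0, 0.0]]
    one_word = [[0.1, 0.0], [0.0, 0.0]]
    print(objective(words, untouched))
    print(objective(words, one_word))
```

Under these assumptions, an all-zero perturbation gives the lowest loss, and any change to a word is penalized at both the text level and the word level. An attack would have to trade that penalty against keeping the example adversarial.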
Details
- Creators
- Muchao Ye - Pennsylvania State University
- Chenglin Miao - University of Georgia
- Ting Wang - Pennsylvania State University
- Fenglong Ma - Pennsylvania State University
- Publisher
- Association for the Advancement of Artificial Intelligence
- ISSN
- 2159-5399
- eISSN
- 2374-3468
- Number of pages
- 8
- Grant note
- 1951729; 1953813; 1953893 / National Science Foundation (NSF)
- 1953893; 1953813 / NSF, Directorate for Computer & Information Science & Engineering (CISE), Division of Computer and Network Systems
- 1951729 / NSF, Directorate for Computer & Information Science & Engineering (CISE), Division of Information & Intelligent Systems
- Language
- English
- Academic Unit
- Computer Science
- Record Identifier
- 9984696706302771