TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text

Muchao Ye; Chenglin Miao; Ting Wang; Fenglong Ma

doi:10.1609/aaai.v36i4.20303

Conference proceeding

Open access

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, Vol.36(4), pp.3877-3884

AAAI Conference on Artificial Intelligence

06/28/2022

Files and links (1)

https://doi.org/10.1609/aaai.v36i4.20303

Published (Version of record) Open Access

Abstract

This paper focuses on a new and challenging setting for hard-label adversarial attacks on text data: one that takes the query budget into account. Although existing approaches can successfully generate adversarial examples in the hard-label setting, they rely on the idealized assumption that the victim model does not restrict the number of queries. In real-world applications, however, the query budget is usually tight or limited. Moreover, existing hard-label adversarial attack techniques use a genetic algorithm to optimize discrete text data, maintaining a number of adversarial candidates during optimization, which can lead to low-quality adversarial examples in the tight-budget setting. To solve this problem, we propose a new method named TextHoaxer, which formulates the budgeted hard-label adversarial attack task on text data as a gradient-based optimization problem over a perturbation matrix in the continuous word embedding space. Compared with genetic algorithm-based optimization, our solution uses only a single initialized adversarial example as the adversarial candidate for optimization, which significantly reduces the number of queries. The optimization is guided by a new objective function consisting of three terms: a semantic similarity term, a pair-wise perturbation constraint, and a sparsity constraint. The semantic similarity term and the pair-wise perturbation constraint ensure high semantic similarity of adversarial examples at both the overall text level and the individual word level, while the sparsity constraint explicitly restricts the number of perturbed words, which also helps improve the quality of the generated text. We conduct extensive experiments on eight text datasets against three representative natural language models, and the experimental results show that TextHoaxer can generate high-quality adversarial examples with higher semantic similarity and lower perturbation rate under the tight-budget setting.
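
The abstract names the three terms of the objective but does not give their formulas. As a rough illustration only, the sketch below evaluates one possible three-term loss over a per-word perturbation matrix. Everything in it is an assumption made for this note and not taken from the paper: the function names, the weights `lam_pair` and `lam_sparse`, the use of cosine similarity between mean embeddings as a stand-in for text-level semantic similarity, and the choice of norms. It only scores a given perturbation; it does not perform the gradient-based search or query a victim model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def objective(orig, perturbation, lam_pair=0.1, lam_sparse=0.1):
    """Hypothetical three-term loss (lower is better); not the paper's formula.

    orig, perturbation: one embedding row per word, same shape.
    """
    perturbed = [[e + p for e, p in zip(e_row, p_row)]
                 for e_row, p_row in zip(orig, perturbation)]
    # Text-level term: assumed stand-in, cosine of mean word embeddings.
    sim = cosine(mean_vector(orig), mean_vector(perturbed))
    # Word-level term: squared distance between each original/perturbed pair.
    norms = [math.sqrt(sum(p * p for p in row)) for row in perturbation]
    pairwise = sum(n * n for n in norms)
    # Sparsity term: sum of per-word norms, which favors leaving words untouched.
    sparsity = sum(norms)
    return (1.0 - sim) + lam_pair * pairwise + lam_sparse * sparsity


def perturbed_word_count(perturbation, tol=1e-9):
    """Number of words whose perturbation row is non-zero."""
    return sum(1 for row in perturbation if any(abs(p) > tol for p in row))


orig = [[1.0, 0.0], [0.0, 1.0]]
no_change = [[0.0, 0.0], [0.0, 0.0]]
one_word = [[0.0, 0.0], [3.0, 4.0]]
loss_no_change = objective(orig, no_change)
loss_one_word = objective(orig, one_word)
```

In this toy setup the unperturbed text scores a loss of essentially zero, while perturbing one word raises all three terms, so a minimizer is pushed toward small changes concentrated on few words.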

Computer Science

Technology

Computer Science, Artificial Intelligence

Science & Technology

Details

Title: TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text
Creators: Muchao Ye - Pennsylvania State University
Chenglin Miao - University of Georgia
Ting Wang - Pennsylvania State University
Fenglong Ma - Pennsylvania State University
Resource Type: Conference proceeding
Publication Details: THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, Vol.36(4), pp.3877-3884
Publisher: Association for the Advancement of Artificial Intelligence
Series: AAAI Conference on Artificial Intelligence
DOI: 10.1609/aaai.v36i4.20303
ISSN: 2159-5399
eISSN: 2374-3468
Number of pages: 8
Grant note: National Science Foundation (NSF), Directorate for Computer & Information Science & Engineering (CISE): 1953893 and 1953813 (Division of Computer and Network Systems); 1951729 (Division of Information & Intelligent Systems)
Language: English
Date published: 06/28/2022
Academic Unit: Computer Science
Record Identifier: 9984696706302771

Metrics

3 Record Views

13 Times Cited - Web of Science