PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text
Conference proceeding   Open access

Muchao Ye, Jinghui Chen, Chenglin Miao, Han Liu, Ting Wang and Fenglong Ma
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp.3093-3104
ACM Conferences
KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
08/06/2023
DOI: 10.1145/3580305.3599461
URL: https://doi.org/10.1145/3580305.3599461
Published (Version of record) Open Access

Abstract

Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings remains challenging under the following constraints: black box (the inner structure of the victim model is unknown), hard label (the attacker only has access to the top-1 prediction results), and semantic preservation (the perturbation needs to preserve the original semantics). In this paper, we present PAT, a novel adversarial attack method that operates under all of these constraints. Specifically, PAT explicitly models the adversarial and non-adversarial prototypes and uses them to measure semantic changes for replacement selection in the hard-label black-box setting, so as to generate high-quality samples. In each iteration, PAT finds original words that can be replaced back and selects better candidate words for perturbed positions in a geometry-aware manner guided by this estimation, which maximally improves the perturbation construction and minimally impacts the original semantics. Extensive evaluation with benchmark datasets and state-of-the-art models shows that PAT outperforms existing text adversarial attacks in terms of both attack effectiveness and semantic preservation. Moreover, we validate the efficacy of PAT against industry-leading natural language processing platforms in real-world settings.
Security and privacy
