Journal article
AE-Net: Fine-grained sketch-based image retrieval via attention-enhanced network
Pattern recognition, Vol.122, p.108291
02/2022
DOI: 10.1016/j.patcog.2021.108291
Abstract
• A novel FG-SBIR model with Attention-enhanced Network (AE-Net) is established, which pays more attention to the fine-grained details of the sketches and images.
• We introduce three modules, i.e., the Residual Channel Attention module, Local Self-attention mechanism, and Spatial Sequence Transformer to mine the fine-grained details of the sketches and images in all dimensions.
• Mutual Loss is proposed to improve the traditional Triplet Loss and restrain the distance relations among the sketches/images in a single modality.
In this paper, we investigate the task of Fine-grained Sketch-based Image Retrieval (FG-SBIR), which uses hand-drawn sketches as input queries to retrieve the relevant images at the fine-grained instance level. Because sketches and images come from different modalities, the similarity computation needs to consider both fine-grained and cross-modal characteristics. Existing solutions focus only on fine-grained details or spatial contexts, while ignoring the channel context and spatial sequence information. To address these problems, we propose a novel deep FG-SBIR model that infers attention maps along the channel and spatial dimensions, improves the channel attention and spatial attention modules, and uses a Transformer to enhance the model's ability to construct and understand spatial sequence information. We focus not only on the correlation information between the two modalities of sketch and image, but also on the discrimination information inside each single modality. Mutual Loss is proposed to enhance the traditional triplet loss and to promote the internal discrimination ability of the model within a single modality. Extensive experiments show that our AE-Net obtains promising results on Sketchy, which is the largest public dataset available for FG-SBIR at present.
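For readers unfamiliar with the triplet loss the abstract refers to, the following is a minimal sketch of the standard cross-modal form only: it pulls a sketch embedding toward its matching image and pushes it away from a non-matching one. The function names, the Euclidean distance, and the margin value are illustrative assumptions. This record does not give the formulation of the paper's Mutual Loss, so it is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(sketch, pos_image, neg_image, margin=1.0):
    """Standard triplet (hinge) loss: max(0, d(s, p) - d(s, n) + margin).

    sketch    : embedding of the query sketch (anchor)
    pos_image : embedding of the matching image
    neg_image : embedding of a non-matching image
    margin    : hypothetical margin value, chosen for illustration
    """
    d_pos = euclidean(sketch, pos_image)
    d_neg = euclidean(sketch, neg_image)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    near = [3.0, 4.0]   # distance 5 from the anchor
    far = [6.0, 8.0]    # distance 10 from the anchor
    # The matching image is already closer by more than the margin.
    print(triplet_loss(anchor, near, far))
    # The matching image is farther than the non-matching one.
    print(triplet_loss(anchor, far, near))
```

The loss is zero once the matching image is closer to the sketch than the non-matching image by at least the margin. According to the abstract, Mutual Loss additionally constrains distances among samples within a single modality, which this sketch does not model.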
Details
- Title: Subtitle
- AE-Net: Fine-grained sketch-based image retrieval via attention-enhanced network
- Creators
- Yangdong Chen - Fudan University
- Zhaolong Zhang - Fudan University
- Yanfei Wang - Fudan University
- Yuejie Zhang - Fudan University
- Rui Feng - Fudan University
- Tao Zhang - Shanghai University of Finance and Economics
- Weiguo Fan - University of Iowa
- Resource Type
- Journal article
- Publication Details
- Pattern recognition, Vol.122, p.108291
- Publisher
- Elsevier Ltd
- DOI
- 10.1016/j.patcog.2021.108291
- ISSN
- 0031-3203
- eISSN
- 1873-5142
- Language
- English
- Date published
- 02/2022
- Academic Unit
- Business Analytics
- Record Identifier
- 9984380544702771