Journal article
Balanced single-shot object detection using cross-context attention-guided network
Pattern recognition, Vol.122, p.108258
02/2022
DOI: 10.1016/j.patcog.2021.108258
Abstract
• A light but effective Cross-context Attention-guided Network is proposed to maximize the balance between accuracy and speed for real-world object detection applications.
• Cross-context Attention Mechanism (CCAM) is established to take the multi-region context information into consideration simultaneously, such as cross-region, adjacent-region, channel region, and spatial region.
• Three attention mechanisms are introduced to guide the network optimization for better learning region focusing.
In real-world application scenarios, object detection usually faces two technical challenges: achieving high accuracy and achieving high speed. Although the latest detection frameworks based on anchor-free detection have achieved outstanding performance, they cannot be widely used in real-world scenarios due to their model complexity and slow speed. In this paper, inspired by the cross-context attention mechanism of the human visual system, we propose a light but effective single-shot detection framework using a Cross-context Attention-guided Network (CCAGNet) to balance accuracy and speed. CCAGNet uses an attention-guided mechanism to highlight the interaction of object-synergy regions and to suppress non-object-synergy regions by combining the Cross-context Attention Mechanism (CCAM), the Receptive Field Attention Mechanism (RFAM), and the Semantic Fusion Attention Mechanism (SFAM). The main contribution of our work is a novel attention mechanism that takes the context information of channel, spatial, cross-, and adjacent-regions into consideration simultaneously. Extensive experiments on public benchmark datasets demonstrate the feasibility and effectiveness of our method. To the best of our knowledge, CCAGNet achieves state-of-the-art performance on both PascalVOC and MSCOCO, with an excellent trade-off between accuracy and speed among single-shot detectors. In particular, the Average Precision (AP) metric for small object detection on MSCOCO is significantly improved, by 17.0%.
Details
- Title: Subtitle
- Balanced single-shot object detection using cross-context attention-guided network
- Creators
- Shuyu Miao - Fudan University; Shanshan Du - Fudan University; Rui Feng - Fudan University; Yuejie Zhang - Fudan University; Huayu Li - Fudan University; Tianbi Liu - Fudan University; Lin Zheng - Antea Group; Weiguo Fan - University of Iowa
- Resource Type
- Journal article
- Publication Details
- Pattern recognition, Vol.122, p.108258
- Publisher
- Elsevier Ltd
- DOI
- 10.1016/j.patcog.2021.108258
- ISSN
- 0031-3203
- eISSN
- 1873-5142
- Language
- English
- Date published
- 02/2022
- Academic Unit
- Business Analytics
- Record Identifier
- 9984380390502771