DMA-YOLO: multi-scale object detection method with attention mechanism for aerial images

Ya-Ling LI, Yong FENG*, Ming-Liang ZHOU, Xian-cai XIONG, Yong-heng WANG, Bao-hua QIANG

*Corresponding author for this work

Research output: Journal Publications > Journal Article (refereed) > peer-review

2 Citations (Scopus)

Abstract

Unmanned aerial vehicles are increasingly popular due to their ease of operation, low noise, and portability. However, existing object detection methods perform poorly at detecting small targets in aerial images, where objects may be densely arranged or sparsely distributed. To tackle this issue, we enhanced the general-purpose object detector YOLOv5 and introduced a multi-scale detection method called Detach-Merge Attention YOLO (DMA-YOLO). Specifically, we proposed a Detach-Merge Convolution (DMC) module and embedded it into the backbone network to maximize feature retention. Furthermore, we embedded the Bottleneck Attention Module (BAM) into the detection head to suppress interference from complex background information without significantly increasing computational complexity. To represent and process multi-scale features more effectively, we added an extra detection head and restructured the neck network as a Bi-directional Feature Pyramid Network (BiFPN). Finally, we adopted SCYLLA-IoU (SIoU) as the loss function to speed up convergence and improve detection precision. A series of experiments on the VisDrone2019 and UAVDT datasets demonstrated the effectiveness of DMA-YOLO. Code is available at https://github.com/Yaling-Li/DMA-YOLO.
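
The abstract does not spell out how the BiFPN neck combines features from different scales. As a rough illustration only, the sketch below shows the "fast normalized fusion" rule from the original BiFPN design, in which each input gets a learnable non-negative weight and the weighted sum is divided by the total weight. It is not taken from the DMA-YOLO code: the function name, the `eps` value, and the use of flat Python lists in place of resized feature maps are all assumptions made for readability.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with normalized, non-negative weights.

    Hypothetical helper: flat lists stand in for feature maps that have
    already been resized to a common resolution.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU keeps every (learnable) weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when all weights are clipped to 0.
    total = sum(clipped) + eps

    # Weighted sum at each position, divided by the total weight.
    return [
        sum(w * f[k] for w, f in zip(clipped, features)) / total
        for k in range(length)
    ]


if __name__ == "__main__":
    top_down = [0.2, 0.4, 0.6]     # assumed: feature from a coarser level
    same_level = [1.0, 1.0, 1.0]   # assumed: feature from the current level
    print(fast_normalized_fusion([top_down, same_level], [0.5, 1.5]))
```

In a trained network the weights would be parameters updated by backpropagation; here they are fixed numbers so the arithmetic is easy to follow.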

Original language: English
Pages (from-to): 4505-4518
Number of pages: 14
Journal: Visual Computer
Volume: 40
Issue number: 6
Publication status: Published - Jun 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023.

Keywords

  • Aerial images
  • Attention mechanism
  • Object detection
  • YOLOv5
