Sparse LiDAR and Binocular Stereo Fusion Network for 3D Object Detection

Weiqing YAN, Kaiqi SU, Jinlai REN*, Runmin CONG, Shuai LI, Shuigen WANG

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings | Conference paper (refereed) | Research | peer-review

Abstract

3D object detection is an essential task in autonomous driving and virtual reality. Existing approaches largely rely on expensive LiDAR sensors, which provide the accurate depth information needed for high performance. Although much cheaper stereo cameras have been introduced as a promising alternative, a notable performance gap remains. In this paper, we explore leveraging sparse LiDAR and stereo images obtained from low-cost sensors for 3D object detection. We propose a novel end-to-end multi-modal attention fusion framework for 3D object detection that effectively integrates the complementary information in sparse LiDAR and stereo images. Instead of directly fusing the LiDAR and stereo modalities, we introduce a deep attention feature fusion module that enables interactions between intermediate layers of the LiDAR and stereo image pathways by exploring the interdependencies of channel features. The fused features connect upsampled higher-layer features with lower-layer features from the stereo image pathway and the sparse LiDAR pathway. Hence, the fused features carry high-level semantics at higher resolution, which benefits the subsequent object detection network. We report detailed experiments on the KITTI benchmark and achieve state-of-the-art performance among methods based on low-cost sensors.
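
The abstract describes fusion driven by channel interdependencies, followed by connecting upsampled higher-layer features with lower-layer features from both pathways. The NumPy sketch below illustrates that general idea only. It assumes a squeeze-and-excitation style channel gate, nearest-neighbour upsampling, and concatenation as the "connection"; none of these choices is confirmed by the abstract, and all function names, shapes, and weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fuse(lidar_feat, stereo_feat, w1, w2):
    """Fuse two (C, H, W) maps with per-channel attention weights (assumed SE-style gate)."""
    # Stack both modalities along the channel axis: (2C, H, W).
    stacked = np.concatenate([lidar_feat, stereo_feat], axis=0)
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = stacked.mean(axis=(1, 2))
    # Excite: a two-layer bottleneck models interdependencies between channels.
    hidden = np.maximum(w1 @ squeezed, 0.0)   # ReLU
    weights = sigmoid(w2 @ hidden)            # one weight in (0, 1) per channel
    # Rescale every channel of both modalities by its weight.
    return stacked * weights[:, None, None]

def upsample_nearest(feat, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) map along H and W."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def connect_levels(high_fused, low_lidar, low_stereo):
    """Concatenate upsampled high-level fused features with low-level features."""
    return np.concatenate([upsample_nearest(high_fused), low_lidar, low_stereo], axis=0)

# Toy shapes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 6, 4
lidar_hi = rng.standard_normal((C, H, W))
stereo_hi = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((2 * C // r, 2 * C)) * 0.1
w2 = rng.standard_normal((2 * C, 2 * C // r)) * 0.1

fused_hi = channel_attention_fuse(lidar_hi, stereo_hi, w1, w2)
lidar_lo = rng.standard_normal((C, 2 * H, 2 * W))
stereo_lo = rng.standard_normal((C, 2 * H, 2 * W))
out = connect_levels(fused_hi, lidar_lo, stereo_lo)
print(fused_hi.shape, out.shape)
```

In a trained network the gate weights would be learned jointly with the detector; here they are random, so only the shapes and data flow are meaningful.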

Original language: English
Title of host publication: Pattern Recognition and Computer Vision: 5th Chinese Conference, PRCV 2022, Shenzhen, China, November 4–7, 2022, Proceedings, Part III
Editors: Shiqi YU, Zhaoxiang ZHANG, Pong C. YUEN, Junwei HAN, Tieniu TAN, Yike GUO, Jianhuang LAI, Jianguo ZHANG
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 41-55
Number of pages: 15
ISBN (Electronic): 9783031189135
ISBN (Print): 9783031189128
Publication status: Published - 2022
Externally published: Yes
Event: 5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022 - Shenzhen, China
Duration: 4 Nov 2022 – 7 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13536
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022
Country/Territory: China
City: Shenzhen
Period: 4/11/22 – 7/11/22

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.

Keywords

  • 3D object detection
  • Low cost
  • Sparse LiDAR
  • Stereo images
