Abstract
In video salient object detection (VSOD), how best to exploit information from the appearance and motion modalities remains a central question. The two-stream structure, which pairs an RGB appearance stream with an optical flow motion stream, is a widely used pipeline for VSOD, but existing methods usually either use motion features to guide appearance features unidirectionally or fuse the features of the two modalities adaptively but blindly. As a result, they underperform in diverse scenarios because their learning schemes are neither comprehensive nor specific. In this paper, following a more secure modeling philosophy, we investigate the importance of the appearance and motion modalities more comprehensively and propose a VSOD network with up-and-down parallel symmetry, named PSNet. Two parallel branches with different dominant modalities achieve complete video saliency decoding with the cooperation of the Gather Diffusion Reinforcement (GDR) module and the Cross-modality Refinement and Complement (CRC) module. Finally, the Importance Perception Fusion (IPF) module fuses the features from the two parallel branches according to their different importance in different scenarios. Experiments on four benchmark datasets demonstrate that our method achieves desirable and competitive performance.
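The idea of fusing two branch outputs according to their importance can be illustrated with a minimal sketch. This is not the paper's IPF module, whose internals the abstract does not describe. It assumes each branch yields a 2D feature map of the same size, and it uses the global mean of each map as a stand-in for a learned importance score. The names `softmax2`, `global_mean`, and `importance_weighted_fusion` are hypothetical.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]


def softmax2(a: float, b: float) -> Tuple[float, float]:
    """Numerically stable softmax over two scores."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    total = ea + eb
    return ea / total, eb / total


def global_mean(feature_map: FeatureMap) -> float:
    """Average of all values in a 2D feature map."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def importance_weighted_fusion(
    appearance_branch: FeatureMap, motion_branch: FeatureMap
) -> Tuple[FeatureMap, Tuple[float, float]]:
    """Fuse two same-sized feature maps with scene-dependent weights.

    Assumption: the global mean of each map is a placeholder for the
    importance score that a trained network would predict.
    """
    w_app, w_mot = softmax2(
        global_mean(appearance_branch), global_mean(motion_branch)
    )
    fused = [
        [w_app * a + w_mot * m for a, m in zip(row_a, row_m)]
        for row_a, row_m in zip(appearance_branch, motion_branch)
    ]
    return fused, (w_app, w_mot)
```

The two weights always sum to one, so the fused map is a convex combination of the branch outputs, and the branch with the higher score dominates the result.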
Original language | English |
---|---|
Pages (from-to) | 402-414 |
Number of pages | 13 |
Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
Volume | 7 |
Issue number | 2 |
Early online date | 18 Nov 2022 |
Publication status | Published - Apr 2023 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Funding
This work was supported in part by the National Key R&D Program of China under Grant 2021ZD0112100, in part by the Beijing Nova Program under Grant Z201100006820016, in part by the National Natural Science Foundation of China under Grants 62002014, U1936212, 62120106009, and 62001302, in part by the Beijing Natural Science Foundation under Grant 4222013, in part by the Hong Kong Innovation and Technology Commission (Inno HK Project CIMDA), in part by the Hong Kong GRF-RGC General Research Fund under Grants 11209819 (CityU 9042816) and 11203820 (CityU 9042598), in part by the Young Elite Scientist Sponsorship Program of the China Association for Science and Technology under Grant 2020QNRC001, in part by the CAAI-Huawei MindSpore Open Fund, and in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2019A1515111205.
Keywords
- Importance perception
- Parallel symmetric structure
- Salient object detection
- Video sequence
Fingerprint
Dive into the research topics of 'PSNet: Parallel Symmetric Network for Video Salient Object Detection'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Adaptive Dynamic Range Enhancement Oriented to High Dynamic Display (面向高動態顯示的自適應動態範圍增強)
KWONG, S. T. W. (PI), KUO, C.-C. J. (CoI), WANG, S. (CoI) & ZHANG, X. (CoI)
Research Grants Council (HKSAR)
1/01/21 → 31/12/24
Project: Grant Research
- Intelligent Ultra High Definition Video Encoder Optimization for Future Versatile Video Coding (用于未来多功能视频编码的智能超高清视频编码器优化)
KWONG, S. T. W. (PI), ZHOU, M. (CoI), KUO, C.-C. J. (CoI) & WANG, S. (CoI)
Research Grants Council (HKSAR)
1/01/20 → 30/06/23
Project: Grant Research