Saliency detection is a fundamental and challenging task in computer vision that aims to distinguish the most conspicuous objects or regions in an image. Existing deep-learning methods mainly rely on the entire image to learn global context information for saliency detection, which loses spatial relations and leads to ambiguity in the predicted saliency maps. In this paper, we propose a novel deep sub-region network (DSR-Net) equipped with a sequence of sub-region dilated blocks (SRDB) that aggregate multi-scale salient context information from multiple sub-regions, so that the global context of the whole image and the local contexts of the sub-regions are fused together, making the saliency prediction more accurate. Our SRDB separates the input feature map at different layers of a convolutional neural network (CNN) into different sub-regions and then applies a parallel atrous spatial pyramid pooling (ASPP) module to refine the feature maps of each sub-region. Experiments on five widely used saliency benchmark datasets demonstrate that our network outperforms recent state-of-the-art saliency detectors quantitatively and qualitatively on all the benchmarks.
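The sub-region idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the grid size, dilation rates, kernels, or fusion rule, so the single-channel feature map, the `grid` parameter, the averaging of parallel branches, and the additive fusion of local and global context below are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def dilated_conv3x3(x, kernel, rate):
    """Single-channel 3x3 dilated (atrous) convolution with zero padding.

    The output has the same shape as x.
    """
    h, w = x.shape
    padded = np.pad(x, rate)  # zero-pad by `rate` on every side
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            # Tap (i, j) samples the input at offset ((i-1)*rate, (j-1)*rate).
            out += kernel[i, j] * padded[i * rate:i * rate + h,
                                         j * rate:j * rate + w]
    return out


def parallel_aspp(x, kernels, rates):
    """Run several dilated branches in parallel and average them (assumed fusion)."""
    branches = [dilated_conv3x3(x, k, r) for k, r in zip(kernels, rates)]
    return np.mean(branches, axis=0)


def sub_region_block(feat, grid, kernels, rates):
    """Hypothetical sub-region block.

    Splits `feat` into grid x grid sub-regions, refines each one with the
    parallel dilated branches, reassembles the result, and adds a global
    branch computed on the whole map (additive fusion is an assumption).
    """
    h, w = feat.shape
    assert h % grid == 0 and w % grid == 0, "feature map must tile evenly"
    sh, sw = h // grid, w // grid
    local = np.zeros_like(feat, dtype=float)
    for gi in range(grid):
        for gj in range(grid):
            rows = slice(gi * sh, (gi + 1) * sh)
            cols = slice(gj * sw, (gj + 1) * sw)
            local[rows, cols] = parallel_aspp(feat[rows, cols], kernels, rates)
    global_ctx = parallel_aspp(feat, kernels, rates)
    return local + global_ctx


# Usage: an 8x8 feature map, a 2x2 grid, and three branches with assumed rates.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8))
rates = (1, 2, 4)
kernels = [rng.standard_normal((3, 3)) for _ in rates]
out = sub_region_block(feat, grid=2, kernels=kernels, rates=rates)
print(out.shape)  # (8, 8)
```

Because each sub-region is zero-padded independently, the local branch sees only context inside its own region, while the global branch sees the whole map; this is one simple way to combine the two kinds of context the abstract mentions.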
|Number of pages||14|
|Journal||IEEE Transactions on Circuits and Systems for Video Technology|
|Early online date||20 Apr 2020|
|Publication status||Published - Feb 2021|
Bibliographical note: This work was supported by National Natural Science Foundation of China (Grant No. 61671399), National Natural Science Foundation of China (Grant No. 61902275), the Fundamental Research Funds for the Central Universities (Grant No. 20720190012), the Interdisciplinary Research Scheme of the Dean’s Research Fund 2018-19 (FLASS/DRF/IDS-3) of The Education University of Hong Kong, and HKIBS Research Seed Fund 2019/20 (190-009) of Lingnan University, Hong Kong.
- Saliency detection
- deep subregion learning
- parallel atrous spatial pyramid pooling (ASPP) modules
- region dilated blocks