TY - JOUR
T1 - Multi-task SE-Network for Image Splicing Localization
AU - ZHANG, Yulan
AU - ZHU, Guopu
AU - WU, Ligang
AU - KWONG, Sam
AU - ZHANG, Hongli
AU - ZHOU, Yicong
PY - 2022/7
Y1 - 2022/7
N2 - Image splicing can easily be used for illegal activities such as falsifying propaganda for political purposes and reporting false news, which may have negative impacts on society. Hence, it is important to detect spliced images and localize the spliced regions. In this work, we propose a multi-task squeeze and excitation network (SE-Network) for splicing localization. The proposed network consists of two streams, a label mask stream and an edge-guided stream, both of which adopt a convolutional encoder-decoder architecture. Information from the edge-guided stream is passed to the label mask stream to improve the discrimination between features of the spliced and host regions. This work makes three main contributions. First, image edges, along with label masks and mask edges, are exploited to provide more comprehensive supervision for the localization of spliced regions. Second, the low-level feature maps extracted from shallow layers are fused with the high-level feature maps from deep layers to provide more reliable features for splicing localization. Finally, several squeeze and excitation attention modules are incorporated into the network to recalibrate the fused features and enhance feature expression. Extensive experiments show that the proposed multi-task SE-Network clearly outperforms existing splicing localization methods on two synthetic splicing datasets and four benchmark splicing datasets.
AB - Image splicing can easily be used for illegal activities such as falsifying propaganda for political purposes and reporting false news, which may have negative impacts on society. Hence, it is important to detect spliced images and localize the spliced regions. In this work, we propose a multi-task squeeze and excitation network (SE-Network) for splicing localization. The proposed network consists of two streams, a label mask stream and an edge-guided stream, both of which adopt a convolutional encoder-decoder architecture. Information from the edge-guided stream is passed to the label mask stream to improve the discrimination between features of the spliced and host regions. This work makes three main contributions. First, image edges, along with label masks and mask edges, are exploited to provide more comprehensive supervision for the localization of spliced regions. Second, the low-level feature maps extracted from shallow layers are fused with the high-level feature maps from deep layers to provide more reliable features for splicing localization. Finally, several squeeze and excitation attention modules are incorporated into the network to recalibrate the fused features and enhance feature expression. Extensive experiments show that the proposed multi-task SE-Network clearly outperforms existing splicing localization methods on two synthetic splicing datasets and four benchmark splicing datasets.
KW - Image forensics
KW - image splicing localization
KW - low-level feature fusion
KW - multi-task learning
KW - squeeze and excitation attention module
UR - http://www.scopus.com/inward/record.url?scp=85118571361&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2021.3123829
DO - 10.1109/TCSVT.2021.3123829
M3 - Journal Article (refereed)
SN - 1051-8215
VL - 32
SP - 4828
EP - 4840
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 7
ER -