TY - JOUR
T1 - Multi-level Feature Fusion Network for Shadow Removal Detection
AU - FU, Xiwen
AU - ZHU, Guopu
AU - ZHANG, Hongli
AU - ZHANG, Xinpeng
AU - HO, Anthony T. S.
AU - KWONG, Sam
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2025/2/19
Y1 - 2025/2/19
N2 - Much work has been done on shadow removal for image manipulation. As a result, detecting shadow removal has become critical for revealing traces of image manipulation. However, only a few works address shadow removal detection, and they cannot accurately localize the image regions where shadows have been removed. In this paper, we present a novel model called Multi-level Feature Fusion Network (MFF-Net) for shadow removal detection. MFF-Net consists of two parts: a dual-branch feature extraction encoder and a dense prediction decoder. The encoder anchors the approximate position of the manipulated regions, while the decoder progressively fills in the details of the estimated shadow masks by integrating multi-level information. In the encoder, a global modeling branch captures long-range dependencies, while a local feature extraction branch extracts local structural information. The features extracted by the two branches are integrated by a feature fusion module. In the decoder, a multi-scale feature upsampling module upsamples the input features and integrates them with the low-level features obtained from the encoder. Meanwhile, a cross-attention mechanism guides the multi-level feature fusion process. Finally, features of different resolutions are used to estimate the shadow masks in a coarse-to-fine manner. Extensive experiments on shadow removal detection demonstrate the superiority of MFF-Net over state-of-the-art methods. The source code of MFF-Net is publicly available at https://github.com/HITFuxiwen/MFF-Net.
AB - Much work has been done on shadow removal for image manipulation. As a result, detecting shadow removal has become critical for revealing traces of image manipulation. However, only a few works address shadow removal detection, and they cannot accurately localize the image regions where shadows have been removed. In this paper, we present a novel model called Multi-level Feature Fusion Network (MFF-Net) for shadow removal detection. MFF-Net consists of two parts: a dual-branch feature extraction encoder and a dense prediction decoder. The encoder anchors the approximate position of the manipulated regions, while the decoder progressively fills in the details of the estimated shadow masks by integrating multi-level information. In the encoder, a global modeling branch captures long-range dependencies, while a local feature extraction branch extracts local structural information. The features extracted by the two branches are integrated by a feature fusion module. In the decoder, a multi-scale feature upsampling module upsamples the input features and integrates them with the low-level features obtained from the encoder. Meanwhile, a cross-attention mechanism guides the multi-level feature fusion process. Finally, features of different resolutions are used to estimate the shadow masks in a coarse-to-fine manner. Extensive experiments on shadow removal detection demonstrate the superiority of MFF-Net over state-of-the-art methods. The source code of MFF-Net is publicly available at https://github.com/HITFuxiwen/MFF-Net.
KW - image manipulation
KW - feature fusion
KW - image forensics
KW - shadow removal detection
UR - http://www.scopus.com/inward/record.url?scp=85218789770&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2025.3543526
DO - 10.1109/TCSVT.2025.3543526
M3 - Journal Article (refereed)
SN - 1051-8215
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -