TSAN: Synthesized View Quality Enhancement via Two-Stream Attention Network for 3D-HEVC

Zhaoqing PAN, Weijie YU, Jianjun LEI, Nam LING, Sam KWONG

Research output: Journal Publications › Journal Article (refereed) › peer-review

37 Citations (Scopus)


In a three-dimensional video system, the texture and depth videos are jointly encoded, and Depth Image Based Rendering (DIBR) is then used to synthesize views. However, compression distortion in the texture and depth videos, together with the disocclusion problem in DIBR, degrades the visual quality of the synthesized view. To address this problem, this paper proposes a Two-Stream Attention Network (TSAN)-based synthesized view quality enhancement method for 3D-High Efficiency Video Coding (3D-HEVC). First, the shortcomings of the view synthesis technique and of traditional convolutional neural networks are analyzed. Second, based on these analyses, a TSAN with two information extraction streams is proposed to enhance the quality of the synthesized view: the global information extraction stream learns contextual information, and the local information extraction stream extracts texture information from the rendered image. Third, a Multi-Scale Residual Attention Block (MSRAB) is proposed, which efficiently detects features at different scales and adaptively refines them by considering interdependencies among spatial dimensions. Extensive experimental results show that the proposed method achieves significantly better performance than the state-of-the-art methods.
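As a rough illustration of the multi-scale residual attention idea named in the abstract, the following toy sketch may help. It is not the authors' MSRAB: the abstract gives no layer details, so fixed 3x3 and 5x5 mean filters stand in for learned convolutions, a sigmoid of the fused response stands in for the learned spatial attention, and the names `box_filter` and `msrab_sketch` are hypothetical.

```python
import math


def box_filter(img, k):
    """Mean filter with a k x k window (k odd).

    Stand-in for a learned k x k convolution (an assumption of this sketch).
    Border pixels average only the part of the window inside the image.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def msrab_sketch(feat):
    """Toy multi-scale residual block with per-pixel (spatial) attention.

    feat: 2D list of floats, a single-channel feature map.
    Returns a feature map of the same size.
    """
    h, w = len(feat), len(feat[0])
    fine = box_filter(feat, 3)    # small receptive field
    coarse = box_filter(feat, 5)  # larger receptive field
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            fused = 0.5 * (fine[y][x] + coarse[y][x])  # merge the two scales
            attn = sigmoid(fused)                      # spatial weight in (0, 1)
            out[y][x] = feat[y][x] + attn * fused      # residual connection
    return out


if __name__ == "__main__":
    demo = [[float((x + y) % 3) for x in range(6)] for y in range(4)]
    for row in msrab_sketch(demo):
        print(" ".join(f"{v:.3f}" for v in row))
```

In a trained network the filters and the attention map would be learned from data, and the block would operate on many channels. The sketch only shows the data flow: parallel branches at two scales, a spatial weighting of the fused features, and a skip connection back to the input.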
Original language: English
Pages (from-to): 345-358
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 1
Early online date: 5 Feb 2021
Publication status: Published - Jan 2022
Externally published: Yes


Keywords

  • 3D-HEVC
  • convolutional neural networks
  • quality enhancement
  • Video coding
  • view synthesis

