Abstract
A future video is the 2D projection of a 3D scene with predicted camera and object motion. Accurate future video prediction therefore inherently requires an understanding of the 3D motion and geometry of a scene. In this paper, we propose an RGBD scene forecasting model with 3D motion decomposition. We predict ego-motion and foreground motion, which are combined to generate a future 3D dynamic scene; this scene is then projected onto the 2D image plane to synthesize future motion, RGB images, and depth maps. Semantic maps can optionally be integrated. Experimental results on the KITTI and Driving datasets show that our model outperforms other state-of-the-art methods in forecasting future RGBD dynamic scenes.
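The rigid (ego-motion) part of the pipeline described above can be sketched as follows: back-project a depth map to a 3D point cloud with the camera intrinsics, apply a predicted camera motion, and reproject into the future image plane. This is a minimal illustrative sketch, not the paper's implementation; the function name, the use of NumPy, and the pinhole-camera assumptions are ours.

```python
import numpy as np

def reproject_with_ego_motion(depth, K, R, t):
    """Sketch of forecasting the static background under predicted ego-motion.

    depth : (H, W) depth map of the current frame
    K     : (3, 3) camera intrinsics (pinhole model, an assumption)
    R, t  : predicted camera rotation (3, 3) and translation (3,)

    Returns the future pixel coordinates (2, H, W) of each current pixel
    and the future depth (H, W).
    """
    h, w = depth.shape
    # Build the homogeneous pixel grid [u, v, 1]^T.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)
    # Back-project each pixel to 3D: X = D * K^{-1} [u, v, 1]^T.
    pts3d = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    # Apply the predicted ego-motion to obtain next-frame 3D points.
    pts3d_next = R @ pts3d + t.reshape(3, 1)
    # Project back onto the image plane of the future frame.
    proj = K @ pts3d_next
    z = proj[2]
    uv_next = proj[:2] / z
    return uv_next.reshape(2, h, w), z.reshape(h, w)
```

Foreground (object) motion would be handled by a separate predicted transform per dynamic region before reprojection; splatting the moved points into a regular image grid then yields the synthesized future RGB and depth.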
Original language | English |
---|---|
Title of host publication | Proceedings : 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 |
Publisher | IEEE Computer Society |
Pages | 7665-7674 |
Number of pages | 10 |
ISBN (Electronic) | 9781728132938 |
DOIs | |
Publication status | Published - Jun 2019 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Keywords
- Image and Video Synthesis
- RGBD sensors and analytics