Towards Implicit Text-Guided 3D Shape Generation

Zhengzhe LIU, Yi WANG, Xiaojuan QI*, Chi-Wing FU*

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › peer-reviewed

53 Citations (Scopus)

Abstract

In this work, we explore the challenging task of generating 3D shapes from text. Going beyond existing works, we propose a new approach for text-guided 3D shape generation that produces high-fidelity shapes with colors matching the given text description. This work makes several technical contributions. First, we decouple the shape and color predictions for learning features in both texts and shapes, and propose a word-level spatial transformer to correlate word features from the text with spatial features from the shape. Second, we design a cyclic loss to encourage consistency between text and shape, and introduce a shape IMLE (implicit maximum likelihood estimation) module to diversify the generated shapes. Further, we extend the framework to enable text-guided shape manipulation. Extensive experiments on the largest existing text-shape benchmark [10] demonstrate the superiority of our approach. The code and models are available at https://github.com/liuzhengzhe/Towards-Implicit-Text-Guided-Shape-Generation.
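For readers unfamiliar with the components named in the abstract, here is a minimal, hypothetical PyTorch sketch of the word-level spatial transformer idea: spatial features from the shape branch attend over per-word text features, so each spatial location is modulated by the words most relevant to it. The class name WordSpatialAttention, the feature dimensions, and the residual layout are illustrative assumptions, not the authors' exact architecture; their implementation is in the linked repository.

    import torch
    import torch.nn as nn

    class WordSpatialAttention(nn.Module):
        # Hypothetical sketch: cross-attention from spatial (shape) features
        # to per-word text features, as in a word-level spatial transformer.
        def __init__(self, dim: int = 256, num_heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, spatial_feats: torch.Tensor, word_feats: torch.Tensor) -> torch.Tensor:
            # spatial_feats: (B, N, D) features at N spatial locations of the shape
            # word_feats:    (B, T, D) per-word features from the text encoder
            attended, _ = self.attn(query=spatial_feats, key=word_feats, value=word_feats)
            return self.norm(spatial_feats + attended)  # residual connection + norm

    # Toy usage: 8 spatial tokens conditioned on a 5-word description.
    block = WordSpatialAttention(dim=256)
    fused = block(torch.randn(2, 8, 256), torch.randn(2, 5, 256))
    print(fused.shape)  # torch.Size([2, 8, 256])

The cyclic loss can likewise be sketched, under the assumption that the input text and the re-encoded generated shape are embedded into a shared feature space and penalized for drifting apart; the cosine formulation below is an illustrative choice, not necessarily the paper's exact loss.

    import torch.nn.functional as F

    def cyclic_consistency_loss(text_embed: torch.Tensor, shape_embed: torch.Tensor) -> torch.Tensor:
        # text_embed, shape_embed: (B, D) embeddings of the text and of the
        # re-encoded generated shape; penalize low cosine similarity.
        return (1.0 - F.cosine_similarity(text_embed, shape_embed, dim=-1)).mean()

The shape IMLE follows the general implicit maximum likelihood estimation recipe: draw several latent codes per training sample, generate a shape from each, and backpropagate only through the generation nearest to the ground truth, which encourages coverage of multiple plausible shapes for a single description.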
Original language: English
Title of host publication: Proceedings: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Publisher: IEEE Computer Society
Pages: 17875-17885
Number of pages: 11
ISBN (Electronic): 9781665469463
Publication status: Published - 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Funding

This work is supported by the Research Grants Council of the Hong Kong Special Administrative Region (Project Nos. 14201921 and 27209621).

Keywords

  • 3D from single images
  • Vision + language
