ISS: Image As Stepping Stone for Text-Guided 3D Shape Generation

Zhengzhe LIU, Peng DAI, Ruihui LI, Xiaojuan QI, Chi-Wing FU

Research output: Other Conference Contributions › Conference Paper (other) › Research › peer-review

7 Citations (Scopus)

Abstract

Text-guided 3D shape generation remains challenging due to the absence of large paired text-shape datasets, the substantial semantic gap between these two modalities, and the structural complexity of 3D shapes. This paper presents a new framework called Image as Stepping Stone (ISS) for the task, introducing 2D images as a stepping stone to connect the two modalities and to eliminate the need for paired text-shape data. Our key contribution is a two-stage feature-space-alignment approach that maps CLIP features to shapes by harnessing a pre-trained single-view reconstruction (SVR) model with multi-view supervision: we first map the CLIP image feature to the detail-rich shape space in the SVR model, then map the CLIP text feature to the shape space and optimize the mapping by encouraging CLIP consistency between the input text and the rendered images. Further, we formulate a text-guided shape stylization module to dress up the output shapes with novel structures and textures. Beyond existing works on 3D shape generation from text, our new approach is general for creating shapes in a broad range of categories, without requiring paired text-shape data. Experimental results show that our approach outperforms state-of-the-art methods and our baselines in terms of fidelity and consistency with text. Further, our approach can stylize the generated shapes with both realistic and fantasy structures and textures. Code is available at https://github.com/liuzhengzhe/ISS-Image-as-Stepping-Stone-for-Text-Guided-3D-Shape-Generation.
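As a rough illustration of the "CLIP consistency between the input text and the rendered images" idea in the abstract, the sketch below scores a text feature against features of several rendered views. The abstract does not state the exact loss, so the form used here (one minus the mean cosine similarity) and the function names `cosine_similarity` and `clip_consistency_loss` are assumptions for illustration only. The vectors are plain lists standing in for CLIP embeddings; no CLIP model or renderer is involved.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def clip_consistency_loss(
    text_feature: Sequence[float],
    rendered_view_features: Sequence[Sequence[float]],
) -> float:
    """Hypothetical consistency loss: 1 - mean cosine similarity.

    text_feature stands in for a CLIP text embedding, and each entry of
    rendered_view_features stands in for the CLIP image embedding of one
    rendered view of the generated shape. Lower values mean the views
    agree more closely with the text.
    """
    if not rendered_view_features:
        raise ValueError("at least one rendered view is required")
    similarities = [
        cosine_similarity(text_feature, view) for view in rendered_view_features
    ]
    return 1.0 - sum(similarities) / len(similarities)


if __name__ == "__main__":
    text = [1.0, 0.0]
    views = [[2.0, 0.0], [0.0, 3.0]]  # one aligned view, one orthogonal view
    print(clip_consistency_loss(text, views))
```

In a real training loop this quantity would be computed on differentiable tensors so that gradients can flow back to the text-to-shape mapper; the plain-Python version only shows what is being measured.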
Original language: English
Number of pages: 12
Publication status: Published - 2023
Externally published: Yes
Event: 11th International Conference on Learning Representations - Kigali, Rwanda
Duration: 1 May 2023 to 5 May 2023

Conference

Conference: 11th International Conference on Learning Representations
Abbreviated title: ICLR 2023
Country/Territory: Rwanda
City: Kigali
Period: 1/05/23 to 5/05/23

Bibliographical note

Publisher Copyright:
© 2023 11th International Conference on Learning Representations, ICLR 2023. All rights reserved.

Funding

The work has been supported in part by the Research Grants Council of the Hong Kong Special Administrative Region (Project no. CUHK 14206320), General Research Fund of Hong Kong (No. 17202422), Hong Kong Research Grant Council - Early Career Scheme (Grant No. 27209621), and National Natural Science Foundation of China (No. 62202151).
