Abstract
Progress in large-scale 3D generative models has been impeded by the significant resources required for training and by challenges such as inefficient representations. This paper introduces Make-A-Shape, a novel 3D generative model trained at a vast scale on 10 million publicly available shapes. We first develop a wavelet-tree representation that encodes high-resolution signed distance function (SDF) shapes with minimal loss, leveraging our newly proposed subband coefficient filtering scheme. We then design a subband coefficient packing scheme to facilitate diffusion-based generation and a subband adaptive training strategy for effective training on the large-scale dataset. Our generative framework is versatile: it can be conditioned on various input modalities such as images, point clouds, and voxels, enabling a variety of downstream applications, e.g., unconditional generation, completion, and conditional generation. Our approach clearly surpasses existing baselines in delivering high-quality results and can efficiently generate shapes within two seconds for most conditioning inputs.
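To make the wavelet-based encoding idea above concrete, the following is a minimal illustrative sketch, not the authors' implementation: it decomposes a dense SDF grid with PyWavelets and zeroes all but the largest-magnitude detail coefficients as a simplified stand-in for the paper's subband coefficient filtering scheme. The wavelet choice (`bior4.4`), decomposition level, and keep ratio are assumptions chosen for illustration.

```python
# Minimal sketch (assumed, not the paper's code) of wavelet-based SDF encoding
# with coefficient filtering, using PyWavelets on a dense 3D SDF grid.
import numpy as np
import pywt

def encode_sdf(sdf_grid, wavelet="bior4.4", level=3, keep_ratio=0.05):
    """Decompose a 3D SDF grid and keep only the largest-magnitude detail coefficients."""
    coeffs = pywt.wavedecn(sdf_grid, wavelet=wavelet, level=level)
    coarse, details = coeffs[0], coeffs[1:]
    # Pool all detail coefficients to pick a single global magnitude threshold.
    all_detail = np.concatenate([d[key].ravel() for d in details for key in d])
    thresh = np.quantile(np.abs(all_detail), 1.0 - keep_ratio)
    filtered = [{key: np.where(np.abs(d[key]) >= thresh, d[key], 0.0) for key in d}
                for d in details]
    return [coarse] + filtered

def decode_sdf(coeffs, wavelet="bior4.4"):
    """Reconstruct the SDF grid from the (filtered) wavelet coefficients."""
    return pywt.waverecn(coeffs, wavelet=wavelet)

# Toy example: a 64^3 SDF of a sphere of radius 0.5, encoded then reconstructed.
x, y, z = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5
recon = decode_sdf(encode_sdf(sdf))[:64, :64, :64]  # crop possible boundary padding
# Reports the loss introduced by discarding 95% of the detail coefficients.
print("max reconstruction error:", np.abs(recon - sdf).max())
```

This sketch covers only the encode/filter/decode idea; the paper's subband coefficient packing for diffusion and its adaptive training strategy are not reflected here.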
Original language | English |
---|---|
Title of host publication | Proceedings of the 41st International Conference on Machine Learning, ICML 2024 |
Editors | Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, Felix Berkenkamp |
Publisher | ML Research Press |
Pages | 20660-20681 |
Number of pages | 22 |
Publication status | Published - 2024 |
Externally published | Yes |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Publisher | ML Research Press |
Volume | 235 |
ISSN (Print) | 2640-3498 |
Bibliographical note
Publisher Copyright: Copyright 2024 by the author(s)
Funding
This work is supported by the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No.: CUHK 14201921].