
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

  • Runsong ZHU
  • Shi QIU
  • Zhengzhe LIU
  • Ka-Hei HUI
  • Qianyi WU
  • Pheng-Ann HENG
  • Chi-Wing FU

Research output: Book Chapters | Papers in Conference Proceedings | Conference paper (refereed) | Research | peer-review

Abstract

Lifting multi-view 2D instance segmentation to a radiance field has proven effective for enhancing 3D understanding. Existing methods for addressing multi-view inconsistency in 2D segmentations either use linear assignment for end-to-end lifting, which yields inferior results, or adopt a two-stage solution, which is limited by complex pre- or post-processing. In this work, we design a new object-aware lifting approach to realize an end-to-end lifting pipeline based on the 3D Gaussian representation, such that we can jointly learn Gaussian-level point features and a global object-level codebook across multiple views. To start, we augment each Gaussian point with an additional Gaussian-level feature learned using a contrastive loss to encode instance information. Importantly, we introduce a learnable object-level codebook to account for individual objects in the scene for an explicit object-level understanding, and we associate the encoded object-level features with the Gaussian-level point features for segmentation predictions. Further, we formulate an association learning module and a noisy-label filtering module for effective and robust codebook learning. We conduct experiments on three benchmarks: the LERF-Masked, Replica, and Messy Rooms datasets. Both qualitative and quantitative results show that our approach clearly outperforms existing methods in terms of segmentation quality and time efficiency.
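
As a rough illustration of the association idea described in the abstract, the following Python sketch assigns each Gaussian-level feature to an entry of an object-level codebook by taking a softmax over dot-product similarities. This is only one plausible reading: the abstract does not specify the similarity measure, the temperature, or the training losses, so the function names (`associate`, `predict_labels`), the `temperature` parameter, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def associate(features, codebook, temperature=0.1):
    """Return, per Gaussian, a probability distribution over codebook entries.

    Assumption: association is a softmax over scaled dot products.
    The paper may use a different formulation.
    """
    return [
        softmax([dot(f, c) / temperature for c in codebook])
        for f in features
    ]


def predict_labels(features, codebook, temperature=0.1):
    """Pick the most likely codebook entry (object id) for each Gaussian."""
    probs = associate(features, codebook, temperature)
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Toy data (hypothetical): three Gaussians with 2-D features, two objects.
gaussian_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
object_codebook = [[1.0, 0.0], [0.0, 1.0]]

probabilities = associate(gaussian_features, object_codebook)
labels = predict_labels(gaussian_features, object_codebook)
print(labels)
```

In this toy setting the first two Gaussians are closest to the first codebook entry and the third to the second. In a learned pipeline, both the features and the codebook would be optimized jointly rather than fixed by hand.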

Original language: English
Title of host publication: Proceedings : 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Pages: 3656-3665
Number of pages: 10
ISBN (Electronic): 9798331543648
DOIs
Publication status: Published - Jun 2025
Event: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025 - Nashville, United States
Duration: 11 Jun 2025 to 15 Jun 2025

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
ISSN (Print): 1063-6919

Conference

Conference: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025
Country/Territory: United States
City: Nashville
Period: 11/06/25 to 15/06/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.

Funding

This work is supported by the InnoHK Clusters of the Hong Kong SAR Government via the Hong Kong Centre for Logistics Robotics; and the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project CUHK 14200824.

Keywords

  • 3d scene segmentation
  • end-to-end
  • gaussian splatting
