SuperCodec: A Neural Speech Codec with Selective Back-Projection Network

  • Youqiang ZHENG
  • Weiping TU*
  • Li XIAO
  • Xinmeng XU

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings, Conference paper (refereed), Research, peer-review

Abstract

Neural speech coding is a rapidly developing topic, where state-of-the-art approaches now exhibit better compression performance than conventional methods. Despite significant progress, existing methods still struggle to preserve and reconstruct fine details, especially at low bitrates. In this study, we introduce SuperCodec, a neural speech codec that achieves state-of-the-art performance at low bitrates. It employs a novel back projection method with selective feature fusion for augmented representation. Specifically, we propose to use Selective Up-sampling Back Projection (SUBP) and Selective Down-sampling Back Projection (SDBP) modules to replace the standard up- and down-sampling layers at the encoder and decoder, respectively. Experimental results show that our method outperforms existing neural speech codecs across various bitrates. Specifically, our proposed method can achieve higher quality reconstructed speech at 1 kbps than Lyra V2 at 3.2 kbps and Encodec at 6 kbps.

Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024: Proceedings
Publisher: IEEE
Pages: 566-570
Number of pages: 5
ISBN (Electronic): 9798350344851
ISBN (Print): 9798350344868
Publication status: Published - 2024
Externally published: Yes
Event: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Seoul, Korea, Republic of
Duration: 14 Apr 2024 to 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISSN (Print): 1520-6149

Conference

Conference: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 to 19/04/24

Bibliographical note

The numerical calculations in this paper were performed on the supercomputing system in the Supercomputing Center of Wuhan University.

Keywords

  • back-projection
  • neural codec
  • speech coding
