CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding

  • Youqiang ZHENG
  • Li XIAO
  • Weiping TU*
  • Yuhong YANG
  • Xinmeng XU
  • *Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings > Conference paper (refereed) > Research > peer-review

Abstract

Recently, speech codecs based on neural networks have been shown to outperform traditional methods. However, codec architectures that combine a traditional codec with a neural vocoder still carry redundancy in the traditional parameter quantization. In this paper, we propose a novel framework named CQNV, which combines the coarsely quantized parameters of a traditional parametric codec, to reduce the bitrate, with a neural vocoder, to improve the quality of the decoded speech. Furthermore, we introduce a parameter processing module into the neural vocoder so that it makes better use of the bitstream of traditional speech coding parameters, further improving the quality of the reconstructed speech. In the experiments, both subjective and objective evaluations demonstrate the effectiveness of the proposed CQNV framework. Specifically, the proposed method achieves higher-quality reconstructed speech at 1.1 kbps than Lyra and Encodec at 3 kbps.
Original language: English
Title of host publication: 24th Annual Conference of the International Speech Communication Association, Interspeech 2023: Proceedings
Publisher: International Speech Communication Association
Pages: 171-175
Number of pages: 5
Volume: 2023-August
Publication status: Published - 2023
Externally published: Yes
Event: 24th Annual Conference of the International Speech Communication Association, Interspeech 2023 - Convention Centre Dublin, Dublin, Ireland
Duration: 20 Aug 2023 to 24 Aug 2023

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
ISSN (Print): 2308-457X

Conference

Conference: 24th Annual Conference of the International Speech Communication Association, Interspeech 2023
Country/Territory: Ireland
City: Dublin
Period: 20/08/23 to 24/08/23

Bibliographical note

Publisher Copyright:
© 2023 International Speech Communication Association. All rights reserved.

Funding

This work was supported in part by the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034), and the Fundamental Research Funds for the Central Universities (No. 2042023kf1033).

Keywords

  • coarse quantization
  • low bitrate
  • neural vocoder
  • speech coding
