CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding

Abstract
Neural-network-based speech codecs have recently been shown to outperform traditional methods. However, architectures that combine a traditional codec with a neural vocoder retain the redundancy of traditional parameter quantization. In this paper, we propose a novel framework named CQNV, which combines coarsely quantized parameters from a traditional parametric codec, reducing the bitrate, with a neural vocoder, improving the quality of the decoded speech. Furthermore, we introduce a parameter processing module into the neural vocoder that better exploits the bitstream of traditional speech coding parameters, further improving the quality of the reconstructed speech. In the experiments, both subjective and objective evaluations demonstrate the effectiveness of the proposed CQNV framework; in particular, our method achieves higher-quality reconstructed speech at 1.1 kbps than Lyra and Encodec at 3 kbps.
| Original language | English |
|---|---|
| Title of host publication | 24th Annual Conference of the International Speech Communication Association, Interspeech 2023: Proceedings |
| Publisher | International Speech Communication Association |
| Pages | 171-175 |
| Number of pages | 5 |
| Volume | 2023-August |
| DOIs | |
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | 24th Annual Conference of the International Speech Communication Association, Interspeech 2023, Convention Centre Dublin, Dublin, Ireland, 20 Aug 2023 – 24 Aug 2023 |
Publication series
| Name | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
|---|---|
| Publisher | International Speech Communication Association |
| ISSN (Print) | 2308-457X |
Conference
| Conference | 24th Annual Conference of the International Speech Communication Association, Interspeech 2023 |
|---|---|
| Country/Territory | Ireland |
| City | Dublin |
| Period | 20/08/23 → 24/08/23 |
Bibliographical note
Publisher Copyright: © 2023 International Speech Communication Association. All rights reserved.
Funding
This work was supported in part by the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034), and the Fundamental Research Funds for the Central Universities (No. 2042023kf1033).
Keywords
- coarse quantization
- low bitrate
- neural vocoder
- speech coding