Abstract
Speech enhancement is a fundamental way to separate and generate clean speech from adverse environments where the received speech is seriously corrupted by noise. This paper applies a novel progressive network to speech enhancement using a multi-stage structure, in which each stage contains a channel attention block followed by a dilated encoder-decoder convolutional network with gated linear units. In addition, each stage generates a prediction that is refined by a supervised attention block, and a fusion block is inserted between the original inputs and the outputs of the previous stage. The multi-stage architecture sequentially invokes multiple deep-learning networks, and its key ingredient is the information exchange between stages, which allows more flexible and robust outputs to be generated. Experimental results show that the proposed architecture consistently outperforms recent state-of-the-art models in terms of both PESQ and STOI scores.
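The stage-to-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the block definitions, so the parameter-free channel attention, the averaging `fuse` function, and the placeholder `stage` below are all assumptions made for illustration, and the supervised attention block is omitted entirely. Only the gated linear unit (`a * sigmoid(b)`) follows its standard definition.

```python
import math
from typing import List

# Features are laid out as [channels][frames]; this layout is an assumption.
Features = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(x: Features) -> Features:
    """Rescale each channel by a sigmoid of its mean.

    Squeeze-and-excitation style without learned weights (hypothetical).
    """
    out = []
    for channel in x:
        weight = sigmoid(sum(channel) / len(channel))
        out.append([weight * v for v in channel])
    return out


def gated_linear_unit(a: Features, b: Features) -> Features:
    """Element-wise GLU: a * sigmoid(b)."""
    return [
        [av * sigmoid(bv) for av, bv in zip(a_row, b_row)]
        for a_row, b_row in zip(a, b)
    ]


def fuse(noisy: Features, previous: Features) -> Features:
    """Hypothetical fusion block: average the original input and the
    previous stage's estimate."""
    return [
        [0.5 * (n + p) for n, p in zip(n_row, p_row)]
        for n_row, p_row in zip(noisy, previous)
    ]


def stage(x: Features) -> Features:
    """Placeholder stage: channel attention followed by a gated unit.

    Stands in for the dilated encoder-decoder network, which is not
    reproduced here.
    """
    attended = channel_attention(x)
    return gated_linear_unit(attended, x)


def progressive_enhance(noisy: Features, num_stages: int = 3) -> List[Features]:
    """Run the stages in sequence and return every stage's estimate."""
    estimates = []
    current = noisy
    for _ in range(num_stages):
        estimate = stage(current)
        estimates.append(estimate)
        # Information exchange: the next stage sees the original input
        # fused with this stage's output.
        current = fuse(noisy, estimate)
    return estimates
```

Calling `progressive_enhance(noisy, num_stages=3)` on a `[channels][frames]` list returns one estimate per stage, each with the same shape as the input. Returning all intermediate estimates mirrors the idea that every stage produces its own prediction.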
| Original language | English |
|---|---|
| Title of host publication | 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021: Proceedings |
| Publisher | International Speech Communication Association |
| Pages | 2263-2267 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781713836902 |
| Publication status | Published - Sept 2021 |
| Externally published | Yes |
| Event | 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021 - Brno, Czech Republic. Duration: 30 Aug 2021 → 3 Sept 2021 |
Publication series
| Name | International Conference on Spoken Language Processing, Proceedings |
|---|---|
| Publisher | International Speech Communication Association |
| ISSN (Electronic) | 2958-1796 |
Conference
| Conference | 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021 |
|---|---|
| Country/Territory | Czech Republic |
| City | Brno |
| Period | 30/08/21 → 3/09/21 |
Bibliographical note
Publisher Copyright: © 2021 ISCA.
Keywords
- Channel attention
- Cross-stage feature fusion
- Encoder-decoder convolutional network
- Multi-stage progressive network
- Speech enhancement
- Supervised attention
Fingerprint
Dive into the research topics of 'Multi-stage progressive speech enhancement network'. Together they form a unique fingerprint.