Multi-stage progressive speech enhancement network

  • Xinmeng XU
  • Yang WANG
  • Dongxiang XU
  • Yiyuan PENG
  • Cong ZHANG
  • Jie JIA
  • Binbin CHEN

Research output: Book Chapters | Papers in Conference Proceedings | Conference paper (refereed) | Refereed Conference Paper | peer-review

Abstract

Speech enhancement is a fundamental way to separate and generate clean speech in adverse environments where the received speech is seriously corrupted by noise. This paper applies a novel progressive network to speech enhancement using a multi-stage structure, where each stage contains a channel attention block followed by a dilated encoder-decoder convolutional network with gated linear units. In addition, each stage generates a prediction that is refined by a supervised attention block. Furthermore, a fusion block is inserted between the original inputs and the outputs of the previous stage. The multi-stage architecture sequentially invokes multiple deep-learning networks, and its key ingredient is the information exchange between different stages. As a result, more flexible and robust outputs can be generated. Experimental results show that the proposed architecture obtains consistently better performance than recent state-of-the-art models in terms of both PESQ and STOI scores.
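
The stage-wise data flow described in the abstract can be sketched roughly as follows. This is only an illustrative toy in plain Python, not the authors' model: the function names, the averaging used for fusion, the weight-free channel attention, and the single gated linear unit standing in for the dilated encoder-decoder are all assumptions, and the supervised attention block is omitted entirely.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_linear_unit(values, gates):
    """GLU: elementwise values * sigmoid(gates)."""
    return [v * sigmoid(g) for v, g in zip(values, gates)]

def channel_attention(features):
    """Rescale each channel by a sigmoid of its mean.

    Assumption: a weight-free stand-in for a learned channel attention block.
    """
    weights = [sigmoid(sum(ch) / len(ch)) for ch in features]
    return [[w * x for x in ch] for w, ch in zip(weights, features)]

def fuse(original, previous):
    """Hypothetical fusion block: average the original input and the previous output."""
    return [[0.5 * (a + b) for a, b in zip(ch_o, ch_p)]
            for ch_o, ch_p in zip(original, previous)]

def stage(features):
    """One stage: channel attention, then a GLU standing in for the encoder-decoder."""
    attended = channel_attention(features)
    return [gated_linear_unit(ch, ch) for ch in attended]

def progressive_enhance(noisy, num_stages=3):
    """Run the stages in sequence, feeding each one a fusion of the
    original input and the previous stage's prediction."""
    outputs = []
    current = noisy
    for _ in range(num_stages):
        prediction = stage(current)
        outputs.append(prediction)
        current = fuse(noisy, prediction)
    return outputs

# Toy input: 2 channels, 2 time steps each (hypothetical values).
noisy = [[0.0, 1.0], [2.0, -1.0]]
outs = progressive_enhance(noisy, num_stages=3)
```

Here `outs` holds one prediction per stage, each with the same shape as the input; the point of the sketch is only that every stage after the first sees both the original noisy input and the previous stage's output.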
Original language: English
Title of host publication: 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021: Proceedings
Publisher: International Speech Communication Association
Pages: 2263-2267
Number of pages: 5
ISBN (Electronic): 9781713836902
DOIs
Publication status: Published - Sept 2021
Externally published: Yes
Event: 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021 - Brno, Czech Republic
Duration: 30 Aug 2021 to 3 Sept 2021

Publication series

Name: International Conference on Spoken Language Processing, Proceedings
Publisher: International Speech Communication Association
ISSN (Electronic): 2958-1796

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021
Country/Territory: Czech Republic
City: Brno
Period: 30/08/21 to 3/09/21

Bibliographical note

Publisher Copyright:
Copyright © 2021 ISCA.

Keywords

  • Channel attention
  • Cross-stage feature fusion
  • Encoder-decoder convolutional network
  • Multi-stage progressive network
  • Speech enhancement
  • Supervised attention
