Improving Speech Enhancement by Integrating Inter-Channel and Band Features with Dual-branch Conformer

  • Jizhen LI
  • Xinmeng XU
  • Weiping TU*
  • Yuhong YANG
  • Rong ZHU

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); Research; peer-review

Abstract

Recent speech enhancement methods based on convolutional neural networks (CNNs) and transformers have been shown to capture time-frequency (T-F) information in spectrograms effectively. However, the correlations among the channels of speech features remain unexplored. In theory, the channel maps of speech features, each obtained by a different convolution kernel, contain information at different scales and are strongly correlated. To fill this gap, we propose a novel dual-branch architecture, the channel-aware dual-branch conformer (CADB-Conformer), which explores the long-range time and frequency correlations among different channels, respectively, to extract channel-relation-aware time-frequency information. Ablation studies on the DNS-Challenge 2020 dataset demonstrate the importance of leveraging channel features and the significance of channel-relation-aware T-F information for speech enhancement. Extensive experiments also show that the proposed model outperforms recent methods at an attractive computational cost.
Original language: English
Title of host publication: 25th Annual Conference of the International Speech Communication Association, Interspeech 2024: Proceedings
Publisher: International Speech Communication Association
Pages: 1720-1724
Number of pages: 5
Publication status: Published - 2024
Externally published: Yes

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN (Print): 2308-457X

Bibliographical note

Publisher Copyright:
© 2024 International Speech Communication Association. All rights reserved.

Funding

This work was supported in part by the National Natural Science Foundation of China (No. 62071342, No. 62171326), the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034) and the Fundamental Research Funds for the Central Universities (No. 2042023kf1033).

Keywords

  • attention mechanism
  • dual-branch architecture
  • inter-channel
  • speech enhancement
