
AN EFFICIENT AND INTERPRETABLE SPEECH ENHANCEMENT NETWORK VIA DEEP DICTIONARY LEARNING

  • Xinmeng XU
  • Yiqun ZHANG
  • Weiping TU*
  • Yuhong YANG

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings, Conference paper (refereed), Research, peer-review

Abstract

Speech enhancement is a vital and highly ill-posed problem for many downstream speech tasks. Although existing deep learning based speech enhancement methods achieve state-of-the-art results, most of these models lack interpretability. This deficiency leads to unsatisfactory speech enhancement performance in many complex scenarios. To tackle this problem, we integrate dictionary learning and sparse coding into deep learning networks for speech enhancement and present a deep dictionary learning based speech enhancement network (DicLSENet). Specifically, the proposed DicLSENet strictly follows the principle of dictionary learning, learns the priors for both the representation coefficients and the dictionaries, and adaptively adjusts the dictionary for each input. Experimental results show that the proposed model outperforms state-of-the-art fully deep learning based methods at an attractive computational cost.
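
For readers unfamiliar with the classical technique that the abstract builds on, the following is a minimal sketch of sparse coding over a fixed dictionary, solved with the iterative shrinkage-thresholding algorithm (ISTA). It is not the DicLSENet architecture, which is not described in this record. The function names, the random dictionary, the signal sizes, and the penalty weight `lam` are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of the L1 norm: shrink every entry toward zero by tau.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(x, D, z, lam):
    # Lasso objective: 0.5 * ||x - D z||^2 + lam * ||z||_1
    return 0.5 * np.sum((x - D @ z) ** 2) + lam * np.sum(np.abs(z))

def ista_sparse_code(x, D, lam=0.1, n_iter=200):
    # Hypothetical helper (not from the paper): estimate sparse coefficients z
    # such that D @ z approximates x, using plain ISTA with step size 1 / L.
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the data-term gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)   # gradient of 0.5 * ||x - D z||^2
        z = soft_threshold(z - grad / L, lam / L)
    return z

# Toy data (assumed): a random unit-norm dictionary and a 5-sparse code.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0, keepdims=True)
z_true = np.zeros(128)
z_true[rng.choice(128, size=5, replace=False)] = rng.standard_normal(5)
x = D @ z_true + 0.01 * rng.standard_normal(64)

lam = 0.05
z_hat = ista_sparse_code(x, D, lam=lam)
x_hat = D @ z_hat  # reconstruction from the sparse code
```

In this classical setting the dictionary `D` is fixed. The abstract, by contrast, states that the proposed network learns priors for both the coefficients and the dictionaries and adapts the dictionary to each input.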
Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024: Proceedings
Publisher: IEEE
Pages: 10481-10485
Number of pages: 5
ISBN (Electronic): 9798350344851
ISBN (Print): 9798350344868
Publication status: Published - 2024
Externally published: Yes

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Funding

This work is supported by the National Natural Science Foundation of China (No. 62071342, No. 62171326), the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034), and the Fundamental Research Funds for the Central Universities (No. 2042023kf1033).

Keywords

  • dictionary learning
  • sparse coding
  • speech enhancement
