Abstract
Speech enhancement is a vital and highly ill-posed problem for many downstream speech tasks. Although existing deep learning based speech enhancement methods achieve state-of-the-art results, most of these models lack interpretability. This deficiency leads to unsatisfactory speech enhancement performance in many complex scenarios. To tackle this problem, we integrate dictionary learning and sparse coding into deep networks for speech enhancement and present a deep dictionary learning based speech enhancement network (DicLSENet). Specifically, the proposed DicLSENet strictly follows the principle of dictionary learning, learns priors for both the representation coefficients and the dictionaries, and adaptively adjusts the dictionary for each input. Experimental results show that the proposed model outperforms state-of-the-art fully deep learning based methods at an attractive computational cost.
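For readers unfamiliar with the sparse coding step that dictionary learning builds on, the following is a minimal sketch of the classical formulation only, not the DicLSENet architecture. It assumes a fixed random dictionary `D`, an L1 penalty weight `lam`, and the standard ISTA iteration; all names, sizes, and values are hypothetical and chosen purely for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of the L1 norm: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(y, D, x, lam):
    """Classical sparse coding cost: 0.5 * ||y - D x||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - D @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(y, D, lam=0.1, n_iter=200):
    """Estimate sparse coefficients x for signal y under a fixed dictionary D."""
    # Step size 1/L, where L is the squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)          # gradient of the data-fit term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Illustrative data (assumption): a random dictionary with unit-norm atoms
# and a signal built from three of those atoms plus a little noise.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0, keepdims=True)
x_true = np.zeros(128)
x_true[[5, 40, 99]] = [1.5, -2.0, 1.0]
y = D @ x_true + 0.01 * rng.standard_normal(64)

lam = 0.05
x_hat = ista(y, D, lam=lam, n_iter=500)
y_hat = D @ x_hat  # reconstruction from the sparse code
print("nonzero coefficients:", int(np.count_nonzero(x_hat)))
```

In this classical setting the dictionary is fixed and the sparsity prior is hand-chosen. The abstract describes a network that instead learns the priors and adapts the dictionary for each input, which this sketch does not attempt to reproduce.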
| Original language | English |
|---|---|
| Title of host publication | 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024: Proceedings |
| Publisher | IEEE |
| Pages | 10481-10485 |
| Number of pages | 5 |
| ISBN (Electronic) | 9798350344851 |
| ISBN (Print) | 9798350344868 |
| Publication status | Published - 2024 |
| Externally published | Yes |
Publication series
| Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
|---|---|
| ISSN (Print) | 1520-6149 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Funding
This work is supported by the National Natural Science Foundation of China (No. 62071342, No. 62171326), the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034), and the Fundamental Research Funds for the Central Universities (No. 2042023kf1033).
Keywords
- dictionary learning
- sparse coding
- speech enhancement
Title
An Efficient and Interpretable Speech Enhancement Network via Deep Dictionary Learning