Abstract
We propose MiT-Net, a novel mix-transformer neural network with a pyramid encoder that operates in the time domain, for acoustic echo cancellation (AEC). MiT-Net formulates AEC as a supervised speech separation problem, in which near-end speech is separated from a single microphone recording and sent to the far end. It consists of two key components. First, a pyramid encoder with a coarse-to-fine structure extracts the latent correlations between the double-end signals and fuses them in a multiscale manner. Second, a mix-transformer, which combines local and global attention in parallel, leverages both local and global speech information for separation. Experimental results show that the proposed method outperforms recent AEC methods on objective evaluation metrics. In addition, exploring the correlation between local and global speech features with the mix-transformer significantly improves system performance and is more robust than the conventional transformer.
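To make the idea of running local and global attention in parallel more concrete, the following is a minimal sketch in plain Python. It is not the MiT-Net implementation: the function names, the window size, the use of raw frames as queries, keys, and values (no learned projections), and the averaging of the two branches are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(frames, query_idx, key_indices):
    """Scaled dot-product attention of one query frame over selected frames.

    Assumption: frames act directly as queries, keys, and values.
    """
    query = frames[query_idx]
    scale = math.sqrt(len(query))
    scores = [
        sum(a * b for a, b in zip(query, frames[j])) / scale
        for j in key_indices
    ]
    weights = softmax(scores)
    return [
        sum(w * frames[j][d] for w, j in zip(weights, key_indices))
        for d in range(len(query))
    ]


def mix_attention(frames, window=1):
    """Hypothetical mix of a local and a global attention branch.

    The local branch sees only frames within `window` steps of the query;
    the global branch sees every frame. Both branches run on the same input.
    """
    n = len(frames)
    mixed = []
    for i in range(n):
        local_keys = list(range(max(0, i - window), min(n, i + window + 1)))
        global_keys = list(range(n))
        local_out = attend(frames, i, local_keys)
        global_out = attend(frames, i, global_keys)
        # Assumption: the two branches are merged by a simple average.
        mixed.append([(l + g) / 2 for l, g in zip(local_out, global_out)])
    return mixed


if __name__ == "__main__":
    toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
    for row in mix_attention(toy_frames, window=1):
        print(row)
```

In this sketch, a window that covers the whole sequence makes the local branch identical to the global one, so the mix reduces to ordinary global attention. A trained model would presumably use learned projections and a learned way of combining the branches.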
| Original language | English |
|---|---|
| Title of host publication | ICASSP 2023: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings |
| Publisher | IEEE |
| Number of pages | 5 |
| ISBN (Electronic) | 9781728163277 |
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Rhodes Island, Greece. Duration: 4 Jun 2023 → 10 Jun 2023 |
Publication series
| Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
|---|---|
| Volume | 2023-June |
| ISSN (Print) | 1520-6149 |
Conference
| Conference | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
|---|---|
| Country/Territory | Greece |
| City | Rhodes Island |
| Period | 4/06/23 → 10/06/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Funding
This work was supported in part by the National Natural Science Foundation of China (No. 62071342, No. 62171326), the Special Fund of Hubei Luojia Laboratory (No. 220100019), the Hubei Province Technological Innovation Major Project (No. 2021BAA034), and the Fundamental Research Funds for the Central Universities (No. 2042022kf0001).
Keywords
- Acoustic echo cancellation
- local and global information
- mix-transformer
- multiscale fusion
Fingerprint
Dive into the research topics of 'Improving Acoustic Echo Cancellation by Mixing Speech Local and Global Features with Transformer'. Together they form a unique fingerprint.