Singapore Institute of Technology
On the Nature and Potential of Deep Noise Suppression Embeddings

journal contribution
posted on 2025-06-02, 02:13 authored by Ian McLoughlin, Zhongqiang Ding, Bowen Zhang, Evelyn Kurniawati, Benjamin Premkumar Annamalai, Sasiraj Somarajan, Song Yan

Deep noise suppression (DNS) and AI-based speech denoising architectures learn a regression task of transforming noisy speech into clean speech. Logically, the task can be accomplished either by learning noise characteristics to identify and remove noise, or by learning speech characteristics to strengthen speech with respect to noise. Architectures that employ the former approach require a good noise model, whereas architectures that employ the latter require a strong speech model. Denoising can then be accomplished by using a noise model to mask noisy parts of the input, or by using the speech model to enhance the speech parts of the input, with both guided by appropriate training data and loss functions. Modern DNS systems are powerful and compact networks that use psychoacoustically inspired objective functions to learn their internal representations. We demonstrate that they effectively combine both approaches. This is despite having neither noise nor speech labels in the training data; hence these latent representations are unsupervised. This paper explores embeddings from two recent high-performance DNS architectures to determine how they model both noise and speech across layers. Results reveal strong clustering for both speech and noise, plus significant speaker characteristic separation. This leads to a new understanding that both architectures can learn, in an unsupervised fashion, to have speaker and noise discrimination abilities. These abilities have strong potential to be exploited for related speaker and noise-based machine learning tasks.

History

Journal/Conference/Book title

Circuits, Systems, and Signal Processing

Publication date

2025-05-17

Version

  • Post-print

Rights statement

This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s00034-025-03138-1
