An effective perturbation based semi-supervised learning method for sound event detection

Zheng, Xu; Song, Yan; Yan, Jie; Dai, Li-Rong; Liu, L.; McLoughlin, Ian; Liu, Lin

doi:10.21437/Interspeech.2020-2329

conference contribution

posted on 2024-04-03, 05:49 authored by Xu Zheng, Yan Song, Jie Yan, Li-Rong Dai, Liu, L., Ian McLoughlin, Lin Liu

Mean teacher based methods are increasingly achieving state-of-the-art performance for large-scale weakly labeled and unlabeled sound event detection (SED) tasks in recent DCASE challenges. By penalizing inconsistent predictions under different perturbations, mean teacher methods can exploit large-scale unlabeled data in a self-ensembling manner. In this paper, an effective perturbation based semi-supervised learning (SSL) method is proposed based on the mean teacher method. Specifically, a new independent component (IC) module is proposed to introduce perturbations for different convolutional layers, designed as a combination of batch normalization and dropblock operations. The proposed IC module can reduce correlation between neurons to improve performance. A global statistics pooling based attention module is further proposed to explicitly model inter-dependencies between the time-frequency domain and channels, using statistics information (e.g. mean, standard deviation, max) along different dimensions. This can provide an effective attention mechanism to adaptively re-calibrate the output feature map. Experimental results on Task 4 of the DCASE2018 challenge demonstrate the superiority of the proposed method, achieving about 39.8% F1-score, outperforming the previous winning system’s 32.4% by a significant margin.
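The abstract names two mechanisms that a small example may make more concrete: the mean teacher's self-ensembling (a teacher model whose weights track an exponential moving average of the student's, plus a penalty on inconsistent predictions) and the pooling of mean, standard deviation and max statistics to re-calibrate a feature map. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation. All function names, the `alpha` value, the mean-squared-error form of the consistency penalty, and the `stat_weights` vector (a stand-in for learned parameters) are hypothetical, and the gating shown covers only the channel dimension, whereas the paper also pools along time-frequency dimensions.

```python
import math


def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Move each teacher weight toward the matching student weight (EMA)."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between the two models' predictions.

    Assumption: MSE is used here for illustration; the source does not
    state which distance the authors use.
    """
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)


def channel_statistics(feature_map):
    """Return (mean, std, max) for each channel of a [channel][time][freq] list."""
    stats = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        stats.append((mean, math.sqrt(variance), max(values)))
    return stats


def recalibrate(feature_map, stat_weights, bias=0.0):
    """Scale each channel by a sigmoid gate computed from its statistics.

    stat_weights is a hypothetical 3-vector standing in for learned parameters.
    """
    gated = []
    for channel, stats in zip(feature_map, channel_statistics(feature_map)):
        score = sum(w * s for w, s in zip(stat_weights, stats)) + bias
        gate = 1.0 / (1.0 + math.exp(-score))
        gated.append([[gate * v for v in row] for row in channel])
    return gated
```

In a training loop of this kind, `ema_update` would typically be called after each student optimizer step, and `consistency_loss` would be evaluated on student and teacher outputs obtained under different perturbations of the same input, so that unlabeled clips also contribute a training signal.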

Journal/Conference/Book title

Annual Conference of the International Speech Communication Association, INTERSPEECH, October 25–29, 2020, Shanghai, China.

Publication date

2020-10-25

Rights statement

Zheng, X., Song, Y., Yan, J., Dai, L.-R., McLoughlin, I., Liu, L. (2020) An Effective Perturbation Based Semi-Supervised Learning Method for Sound Event Detection. Proc. Interspeech 2020, 841-845, doi: 10.21437/Interspeech.2020-2329.

Keywords

sound event detection; semi-supervised learning; independent component analysis; statistics pooling

Licence

In Copyright
