Variance Normalised Features for Language and Dialect Discrimination
This paper proposes novel features for automated language and dialect identification that aim to improve discriminative power by ensuring that each element of the feature vector makes a normalised contribution to inter-class variance. The method first computes inter- and intra-class frequency variance statistics, then distributes the overall spectral variance across spectral regions sized so that each contains a near-equal share of the variance difference. Spectral features are average-pooled within each region to obtain variance normalised features (VNFs). The proposed VNFs are low-complexity drop-in replacements for MFCC, SDC, PLP or other input features used for speech-related tasks. In this paper, they are evaluated against MFCCs in three types of system on two data-constrained language and dialect identification tasks. VNFs comfortably outperform MFCCs at most dimension sizes, and perform particularly well in the most challenging data-constrained condition, the 3 s utterance length in the LID task.
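The three steps described above (variance statistics, equal-variance region sizing, average pooling) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact variance statistic, so the per-bin score below is assumed to be the inter-class variance minus the mean intra-class variance, clipped at zero. The function names `variance_scores`, `region_edges` and `pool_regions`, and the input layout (one spectral frame per row, one class label per frame), are hypothetical choices made for illustration.

```python
import numpy as np


def variance_scores(spectra, labels):
    """Per-frequency-bin score: inter-class minus intra-class variance.

    ASSUMPTION: this difference is one plausible reading of the abstract,
    not a statistic confirmed by the paper.
    spectra: (n_frames, n_bins) array, labels: (n_frames,) array.
    """
    classes = np.unique(labels)
    class_means = np.stack([spectra[labels == c].mean(axis=0) for c in classes])
    class_vars = np.stack([spectra[labels == c].var(axis=0) for c in classes])
    inter = class_means.var(axis=0)   # variance of the class means per bin
    intra = class_vars.mean(axis=0)   # mean within-class variance per bin
    # Clip negatives and add a small floor so the cumulative sum increases.
    return np.clip(inter - intra, 0.0, None) + 1e-12


def region_edges(scores, n_regions):
    """Split the bins into contiguous regions holding near-equal total score."""
    n_bins = len(scores)
    cum = np.cumsum(scores)
    targets = cum[-1] * np.arange(1, n_regions) / n_regions
    inner = np.searchsorted(cum, targets) + 1
    edges = [0]
    for k, e in enumerate(inner, start=1):
        lo = edges[-1] + 1               # every region keeps at least one bin
        hi = n_bins - (n_regions - k)    # leave one bin per remaining region
        edges.append(int(min(max(e, lo), hi)))
    edges.append(n_bins)
    return np.array(edges)


def pool_regions(spectra, edges):
    """Average-pool spectral bins within each region: (n_frames, n_regions)."""
    return np.stack(
        [spectra[:, a:b].mean(axis=1) for a, b in zip(edges[:-1], edges[1:])],
        axis=1,
    )


if __name__ == "__main__":
    # Synthetic data: two classes that differ only in the lowest four bins.
    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(40, 16))
    labels = np.repeat([0, 1], 20)
    spectra[labels == 1, :4] += 5.0

    edges = region_edges(variance_scores(spectra, labels), n_regions=6)
    features = pool_regions(spectra, edges)
    print(edges, features.shape)
```

Under this reading, bins with a large variance difference end up in narrow regions and low-scoring bins are merged into wide ones, so each pooled feature carries a comparable share of the discriminative variance. The region boundaries would be estimated once on training data and then reused unchanged at test time.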