Triplet-center loss based deep embedding learning method for speaker verification
In this work, we introduce an effective loss function, the triplet-center loss, to improve the performance of deep embedding learning methods for speaker verification (SV). The triplet-center loss is a combination of the triplet loss and the center loss, so it inherits the advantages of both. Compared with the widely used softmax loss, its main advantage is that it learns a center for each class and requires that the distance between a sample and its own class center be smaller than its distances to the centers of other classes. To evaluate the performance of the triplet-center loss, we conduct extensive experiments on a noisy and unconstrained dataset, VoxCeleb. The results show that the triplet-center loss significantly improves SV performance. Specifically, it reduces the equal error rate (EER) relative to the softmax loss by 11.6% and 10.4% with cosine scoring and a PLDA backend, respectively.
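The constraint described above can be illustrated with a minimal sketch of the loss computation. Assuming squared Euclidean distances and a hinge with margin `m` (the function name, signature, and margin value below are illustrative, not the authors' implementation), each sample's distance to its own class center is pushed below its distance to the nearest other-class center:

```python
import numpy as np

def triplet_center_loss(embeddings, labels, centers, margin=1.0):
    """Sketch of a triplet-center loss: for each embedding, penalize
    cases where the squared distance to its own class center is not
    smaller than the distance to the nearest other-class center by
    at least `margin`. All names here are illustrative assumptions."""
    losses = []
    for x, y in zip(embeddings, labels):
        # squared Euclidean distance from this sample to every class center
        d = np.sum((centers - x) ** 2, axis=1)
        d_pos = d[y]                      # distance to own class center
        d_neg = np.min(np.delete(d, y))   # nearest other-class center
        losses.append(max(0.0, d_pos + margin - d_neg))
    return float(np.mean(losses))

# Well-separated samples incur zero loss; a sample equidistant from
# two centers is penalized by the margin.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
good = triplet_center_loss(np.array([[0.1, 0.0], [10.0, 10.1]]), [0, 1], centers)
bad = triplet_center_loss(np.array([[5.0, 5.0]]), [0], centers)
```

In training, both the embeddings and the class centers would be updated by gradient descent; this sketch only shows the forward loss value.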