Singapore Institute of Technology

Leveraging Large Language Models for Speech De-Identification

journal contribution
posted on 2025-05-28, 07:46 authored by Rong Tong, Priyanshu Dhingra, Satyam Agrawal, Chandra Sekar Veerappan, Eng-Siong Chng

This paper presents a novel approach to address the scarcity of labeled data in speech de-identification, a critical task for protecting personal privacy. By leveraging a large language model, we propose a fully automated data augmentation strategy that generates synthetic speech text data enriched with diverse personally identifiable information (PII) entities. This augmented dataset is then used to train the speech de-identification models, significantly improving their performance on spoken language. To further enhance de-identification accuracy, we explore both pipeline and end-to-end models. While the pipeline approach sequentially applies speech recognition and named entity recognition (NER), the end-to-end model jointly learns these tasks. Our experimental results demonstrate the effectiveness of our data augmentation strategy and the superiority of the end-to-end model in improving PII detection accuracy and robustness.
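
As a rough illustration of the pipeline approach described in the abstract (speech recognition followed by entity tagging and masking), the sketch below may help. It is not the authors' implementation: `toy_pii_tagger`, `redact`, and `deidentify_pipeline` are hypothetical names, the regex rules merely stand in for a trained NER model, and the `asr` argument is assumed to be any callable that returns a transcript string.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, PII label)


def toy_pii_tagger(text: str) -> List[Span]:
    """Hypothetical stand-in for a trained NER model (regex rules only)."""
    patterns = {
        "PHONE": r"\b\d{8}\b",                # assumes 8-digit phone numbers
        "EMAIL": r"\b[\w.]+@[\w.]+\.\w+\b",   # deliberately simplistic
    }
    spans = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def redact(text: str, spans: List[Span]) -> str:
    """Replace each tagged span with its label; spans must not overlap."""
    pieces = []
    cursor = 0
    for start, end, label in spans:
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def deidentify_pipeline(
    audio: bytes,
    asr: Callable[[bytes], str],
    tagger: Callable[[str], List[Span]],
) -> str:
    """Pipeline de-identification: ASR first, then tagging on the transcript."""
    transcript = asr(audio)
    return redact(transcript, tagger(transcript))
```

Because the tagger only ever sees the recognizer's output, any recognition error on a name or number propagates into the masking step. An end-to-end model, which learns both tasks jointly, is not bound by that hand-off.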

History

Journal/Conference/Book title

International Journal of Asian Language Processing (IJALP)

Publication date

2025-02

Version

  • Published

Corresponding author

Rong Tong

Project ID

  • 15875 (R-R12-A405-0009) Automatic speech de-identification on Singapore English speech
