Singapore Institute of Technology

Speech de-identification data augmentation leveraging large language model

conference contribution
posted on 2024-10-28, 03:01 authored by Priyanshu Dhingra, Satyam Agrawal, Chandra Sekar Veerappan, Ho Thi-Nga, Eng-Siong Chng, Rong Tong

This work addresses the challenge of limited real-world speech data in speech de-identification, the process of removing Personally Identifiable Information (PII). We formulate speech de-identification as a named entity recognition (NER) task specifically for spoken English. To overcome data scarcity and enhance NER performance, we propose a data augmentation approach that leverages a large language model to generate synthetic speech-style text data enriched with diverse PII entities. The generated data is then annotated for PII semi-automatically, through an iterative process that uses a customized NER model. Our analysis demonstrates that this data augmentation strategy significantly improves NER performance on spoken language text. Furthermore, to gain deeper insight into the specific errors made during NER, we analyze performance using alternative evaluation metrics.
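
To make the NER formulation in the abstract concrete, the following is a minimal sketch of how a token-level tag sequence could drive de-identification of a spoken-style transcript. It is not the authors' implementation: the BIO tagging scheme, the label names NAME and PHONE, the placeholder format, and the helper functions bio_to_spans and deidentify are all assumptions made for illustration, and the tags are written by hand rather than predicted by a model.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags into (start, end_exclusive, label) spans.

    Hypothetical helper: the BIO scheme is an assumption, not taken
    from the paper.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and label != tag[2:]
        )
        if starts_new:
            # Close any open span, then open a new one at this token.
            if label is not None:
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag == "O":
            # An outside tag closes the open span, if there is one.
            if label is not None:
                spans.append((start, i, label))
            start, label = None, None
        # An I- tag that continues the current label needs no action.
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


def deidentify(tokens: List[str], tags: List[str]) -> str:
    """Replace each tagged PII span with a single [LABEL] placeholder."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    out = []
    prev_end = 0
    for start, end, label in bio_to_spans(tags):
        out.extend(tokens[prev_end:start])
        out.append(f"[{label}]")
        prev_end = end
    out.extend(tokens[prev_end:])
    return " ".join(out)


# Hand-written example; NAME and PHONE are illustrative labels only.
tokens = "my name is john tan and my number is nine one two".split()
tags = [
    "O", "O", "O", "B-NAME", "I-NAME", "O",
    "O", "O", "O", "B-PHONE", "I-PHONE", "I-PHONE",
]
print(deidentify(tokens, tags))
```

In a full pipeline the tags would come from a trained NER model rather than being supplied by hand. Collapsing a multi-token span into one placeholder is a design choice of this sketch; it avoids revealing how many tokens the original entity contained.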

Funding

AcRF Tier 1

History

Journal/Conference/Book title

2024 International Conference on Asian Language Processing (IALP)

Publication date

2024-08-04

Version

  • Pre-print

Rights statement

© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Corresponding author

tong.rong@singaporetech.edu.sg

Project ID

  • 15875 Automatic speech de-identification on Singapore English speech
  • 16081 Multimodal visual acuity testing with speech and touch panel
