Singapore Institute of Technology

SpeeDF - A Speech De-identification Framework

conference contribution
posted on 2024-09-13, 08:08 authored by Chandra Sekar Veerappan, Priyanshu Dhingra, Zhengkui Wang, Rong Tong

This paper proposes SpeeDF, a novel three-step framework for anonymizing speech data, with a particular focus on Singaporean English (Singlish). SpeeDF addresses the protection of less-studied Personally Identifiable Information (PII), such as NRIC and passport numbers, which traditional de-identification methods often overlook. Unlike approaches focused solely on entity extraction, SpeeDF combines automatic speech recognition (ASR), named entity recognition (NER), and information anonymization. This combined approach is intended to redact PII thoroughly while preserving the naturalness and usability of the anonymized speech data for research and downstream applications.
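
The three steps named in the abstract (ASR, NER, anonymization) can be illustrated with a minimal text-only sketch. This is not the authors' implementation: the ASR step is stubbed with a fixed, made-up transcript, the NER step is replaced by a simple regular expression for NRIC-like identifiers (assumed here to be a prefix letter, seven digits, and a trailing letter, with no checksum validation), and anonymization is done on the transcript rather than on the audio. All function names, the sample sentence, and the sample ID are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: an NRIC-like ID is a prefix letter, 7 digits, and a trailing letter.
# The checksum letter is NOT validated in this sketch.
NRIC_PATTERN = re.compile(r"\b[STFGM]\d{7}[A-Z]\b", re.IGNORECASE)


@dataclass
class Entity:
    label: str  # PII category, e.g. "NRIC"
    start: int  # character offset where the entity begins
    end: int    # character offset just past the entity
    text: str   # matched surface form


def transcribe(audio_path: str) -> str:
    """Step 1 (ASR), stubbed: a real system would run a speech recognizer here.

    Assumption: the recognizer already renders spoken digits as numerals.
    """
    return "my name is Alex and my NRIC is S1234567D"


def find_pii(transcript: str) -> list[Entity]:
    """Step 2 (NER), simplified: a regex stands in for a trained NER model."""
    return [
        Entity("NRIC", m.start(), m.end(), m.group())
        for m in NRIC_PATTERN.finditer(transcript)
    ]


def anonymize(transcript: str, entities: list[Entity]) -> str:
    """Step 3: replace each detected span with a category placeholder."""
    redacted = transcript
    # Replace from right to left so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        redacted = redacted[:ent.start] + f"[{ent.label}]" + redacted[ent.end:]
    return redacted


def deidentify(audio_path: str) -> str:
    """Chain the three steps: ASR, PII detection, anonymization."""
    transcript = transcribe(audio_path)
    return anonymize(transcript, find_pii(transcript))


if __name__ == "__main__":
    print(deidentify("sample.wav"))
```

Because the regex covers only ID-shaped tokens, the personal name in the sample transcript is left untouched; covering names and other entity types is where a trained NER model would be needed.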

History

Publication date

2024-12-02

Corresponding author

Rong Tong

Project ID

  • 15875 Automatic speech de-identification on Singapore English speech
