A Flat-Span Contrastive Learning Method for Nested Named Entity Recognition

journal contribution
posted on 2025-05-28, 08:24, authored by Rong Tong, Kun Zhang, Liu Yaodi, Chenxi Cai, Dianying Chen, Xiaohe Wu

In Natural Language Processing (NLP), one entity frequently contains another, i.e., entities are nested. However, the most commonly used methods can handle only flat entities, not nested ones. To solve this problem, this paper proposes a flat-span contrastive learning method for nested Named Entity Recognition (NER), which consists of two sub-modules: a flat NER module and a candidate span classification module. The flat NER module recognizes the outermost entities: a star-transformer captures the long-range dependencies of sentences, and a Conditional Random Field (CRF) decodes the outermost entity spans; contrastive learning with the InfoNCE loss function is introduced to increase the difference between entity spans and non-entity spans. In the candidate span classification module, we enumerate all possible candidate spans within the outermost entities to better distinguish entity spans from non-entity spans. Finally, to improve model performance and reduce error propagation, we jointly train the flat NER and candidate span classification modules through multi-task learning. Experimental results on the GENIA, GermEval2014, and JNLPBA datasets verify the effectiveness of our model, and ablation experiments further demonstrate the effectiveness of its components.
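
The abstract names the InfoNCE loss but does not spell out its formulation on this page, so the following is a minimal PyTorch sketch of one plausible setup: an entity-span representation serves as the anchor, another entity span as the positive, and non-entity spans as the negatives. The function name, tensor shapes, and temperature value are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn.functional as F

    def info_nce_span_loss(anchor, positive, negatives, temperature=0.1):
        # anchor:    (d,)   representation of an entity span (hypothetical shapes)
        # positive:  (d,)   representation of another entity span
        # negatives: (N, d) representations of non-entity spans
        anchor = F.normalize(anchor, dim=-1)
        positive = F.normalize(positive, dim=-1)
        negatives = F.normalize(negatives, dim=-1)
        pos_sim = (anchor * positive).sum() / temperature  # scalar similarity
        neg_sim = negatives @ anchor / temperature         # (N,) similarities
        logits = torch.cat([pos_sim.view(1), neg_sim])     # positive sits at index 0
        # InfoNCE reduces to cross-entropy that picks the positive out of all candidates
        return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

    # toy usage with random 32-dimensional span vectors
    loss = info_nce_span_loss(torch.randn(32), torch.randn(32), torch.randn(8, 32))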
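
The enumeration step of the candidate span classification module is described more concretely, so a small pure-Python sketch follows: given the outermost entity spans predicted by the flat NER module, it generates every sub-span as a candidate. The helper name and the inclusive (start, end) token-offset convention are assumptions for illustration.

    def enumerate_candidate_spans(outer_spans, max_len=None):
        # outer_spans: list of (start, end) token offsets, end inclusive
        # returns a sorted, de-duplicated list of candidate (start, end) spans
        candidates = set()
        for start, end in outer_spans:
            for i in range(start, end + 1):
                for j in range(i, end + 1):
                    if max_len is None or j - i + 1 <= max_len:
                        candidates.add((i, j))
        return sorted(candidates)

    # an outermost entity covering tokens 2..4 yields six candidate spans:
    # (2,2) (2,3) (2,4) (3,3) (3,4) (4,4)
    print(enumerate_candidate_spans([(2, 4)]))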

History

Journal/Conference/Book title

International Journal of Asian Language Processing, Vol. 35, No. 01, 2450013 (2025)

Publication date

2025-02

Version

  • Pre-print

Corresponding author

Kun Zhang

Project ID

  • 15875 (R-R12-A405-0009) Automatic speech de-identification on Singapore English speech
