A Two-Stage Boundary-Enhanced Contrastive Learning approach for nested named entity recognition
In Natural Language Processing (NLP), entities often contain other entities. However, most current Named Entity Recognition (NER) methods can only recognize flat entities and ignore nested ones. To address this problem, we propose a Two-Stage Boundary-Enhanced Contrastive Learning (TSBECL) model for nested NER. The model comprises a flat NER module for identifying the outermost entities and a candidate span classification module. We design a word embedding contrastive learning method to effectively balance the semantic information of static word embeddings and dynamic BERT embeddings. Because explicit boundary information improves entity recognition in the flat NER module, we use a Gated Recurrent Unit (GRU) to predict the heads and tails of entities. In the candidate span classification module, all possible inner candidate spans are generated from each recognized outermost entity. To improve span classification, we design a contrastive learning method that strengthens the similarity between candidate spans and their corresponding entity types. We also apply random oversampling for data augmentation to alleviate the class imbalance among candidate spans. Finally, multi-task learning captures the dependence between outermost entities and inner candidate spans, mitigates error propagation, and improves recognition performance. Experimental results show that the proposed model matches or outperforms the baseline models, achieving F1 scores of 82.78%, 87.51%, 77.74%, and 75.74% on the GENIA, ACE2005, GermEval2014, and JNLPBA datasets, respectively.
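The second stage described above enumerates every inner span contained in a recognized outermost entity. A minimal sketch of that enumeration step (the function name, `max_len` cap, and inclusive token indices are illustrative assumptions, not the paper's exact implementation):

```python
def candidate_spans(entity_start, entity_end, max_len=None):
    """Enumerate all candidate inner spans (inclusive token indices)
    contained in a recognized outermost entity, excluding the
    outermost span itself. `max_len` optionally caps span length."""
    spans = []
    for i in range(entity_start, entity_end + 1):
        for j in range(i, entity_end + 1):
            if (i, j) == (entity_start, entity_end):
                continue  # the outermost entity was handled in stage one
            if max_len is None or j - i + 1 <= max_len:
                spans.append((i, j))
    return spans
```

For an outermost entity spanning tokens 0..2, this yields the five proper sub-spans `(0,0), (0,1), (1,1), (1,2), (2,2)`, which the second-stage classifier would then label. Note how the number of candidates grows quadratically with entity length, which is one reason the abstract mentions class imbalance among candidate spans.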
History
Journal/Conference/Book title: Expert Systems with Applications, Volume 271, 1 May 2025, 126707
Publication date: 2025-01-28
Version: Pre-print