Journal of Shandong University (Engineering Science), 2024, Vol. 54, Issue (2): 96-102. doi: 10.6040/j.issn.1672-3961.0.2022.353
XIAO Wei (肖伟)1,2, ZHENG Gengsheng (郑更生)1,2*, CHEN Yujia (陈钰佳)1,2
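Abstract: Some entity categories in named entity recognition (NER) datasets have too few samples, so the model learns the features of those categories poorly and overall performance suffers. To address this problem, a NER method that incorporates a self-training model is proposed. A teacher model is first trained on an existing NER dataset. An improved text similarity function is then used to find the unlabeled text most similar to the original dataset, and the teacher model generates pseudo-labels for that text. Finally, the pseudo-labeled data is mixed with the labeled dataset to retrain a student model for the downstream NER task. Experimental results show that, compared with the baseline models, the method achieves better performance on the public datasets MSRA and CoNLL03 and on a legal entity recognition dataset.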
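The teacher-student data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lookup-table tagger stands in for a neural NER model, the Jaccard overlap is only a placeholder for the improved text similarity function (which the abstract does not specify), and all function names and the `top_k` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def train_tagger(sentences):
    """Fit a toy tagger (assumption): each token maps to its most frequent tag."""
    counts = defaultdict(Counter)
    for tokens, tags in sentences:
        for token, tag in zip(tokens, tags):
            counts[token][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def predict(model, tokens):
    """Tag tokens with the learned lookup; unseen tokens get 'O'."""
    return [model.get(token, "O") for token in tokens]


def jaccard(a, b):
    """Placeholder similarity (assumption): token-set overlap of two sentences."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def select_similar(labeled, unlabeled, top_k):
    """Keep the top_k unlabeled sentences closest to any labeled sentence."""
    scored = [
        (max(jaccard(u, tokens) for tokens, _ in labeled), u)
        for u in unlabeled
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [u for _, u in scored[:top_k]]


def self_train(labeled, unlabeled, top_k):
    """Teacher -> similarity search -> pseudo-labels -> student."""
    teacher = train_tagger(labeled)                       # step 1: teacher
    chosen = select_similar(labeled, unlabeled, top_k)    # step 2: search
    pseudo = [(u, predict(teacher, u)) for u in chosen]   # step 3: pseudo-labels
    student = train_tagger(labeled + pseudo)              # step 4: student
    return teacher, student
```

Because the toy tagger cannot generalize beyond its vocabulary, the student here only inherits the teacher's labels for newly seen tokens; the sketch shows the order of the four steps rather than any expected accuracy gain.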