
Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue (2): 44-49. doi: 10.6040/j.issn.1672-3961.0.2019.313

• Machine Learning and Data Mining •

A syntactic element recognition method based on deep neural network

Yanping CHEN1,2, Li FENG1,3,*, Yongbin QIN1,2, Ruizhang HUANG1,2

  1. School of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China
  2. Data Fusion and Analysis Laboratory (Guizhou University), Guiyang 550025, Guizhou, China
  3. Guizhou Intelligent Human-Computer Interaction Engineering Technology Research Center, Guiyang 550025, Guizhou, China
  • Received: 2019-06-16 Online: 2020-04-20 Published: 2020-04-16
  • Contact: Li FENG E-mail: ypench@gmail.com; gzu_fl931126@163.com
  • About the author: Yanping CHEN (born 1980), male, from Anshun, Guizhou; Ph.D., associate professor. His main research interests are data fusion and analysis, natural language processing, and knowledge discovery. E-mail: ypench@gmail.com
  • Supported by:
    Key Program of the Joint Funds of the National Natural Science Foundation of China (U1836205); Major Research Plan of the National Natural Science Foundation of China (91746116); Major Applied Basic Research Program of Guizhou Province (Qiankehe JZ [2014]2001); Major Science and Technology Special Program of Guizhou Province (Qiankehe Major Special Project [2017]3002); Natural Science Foundation of Guizhou Province (Qiankehe Jichu [2018]1035)

Abstract:

Traditional feature-based methods have difficulty capturing the structural information in Chinese sentences. To address this problem, a syntactic element recognition model based on a deep neural network, Bi-LSTM-Attention-CRF, is proposed. A Bi-LSTM network automatically extracts structural and semantic information from raw input sentences, an attention mechanism computes classification weights for the abstract semantic features, and a CRF layer constrains the output labels to produce the optimal label sequence. Comparative experiments show that the model effectively identifies syntactic elements in sentences, reaching an F1 score of 84.85% on the annotated data set.

Key words: syntactic elements, information extraction, deep neural network
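
The abstract states that the CRF layer constrains the output labels and outputs the optimal label sequence. One common way to realize such a decoding step is Viterbi search over per-token emission scores and label-to-label transition scores. The sketch below illustrates that idea only: the function name `viterbi_decode`, the BIO-style label set, and all score values are hypothetical and are not taken from the paper or its trained model.

```python
from typing import List, Sequence

NEG_INF = float("-inf")


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of label indices.

    emissions[t][j]   score of label j at token t (e.g. produced by an encoder)
    transitions[i][j] score of moving from label i to label j
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    score = list(emissions[0])      # best score of any path ending in each label
    backpointers = []               # backpointers[t-1][j] = best previous label
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final label to recover the path.
    path = [max(range(num_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO-style labels for a single element type (assumption).
LABELS = ["O", "B-SUB", "I-SUB"]
# Hypothetical transition scores: an "I-SUB" may not directly follow "O".
TRANSITIONS = [
    [0.0, 0.0, NEG_INF],  # from O
    [0.0, 0.0, 0.0],      # from B-SUB
    [0.0, 0.0, 0.0],      # from I-SUB
]
# Hypothetical emission scores for a three-token sentence.
EMISSIONS = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 1.5],  # a per-token argmax would pick I-SUB here
    [0.0, 0.0, 2.0],
]

decoded = [LABELS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

In this toy input, picking the best label independently for each token would place `I-SUB` directly after `O`. The transition scores rule that out, so the decoder is pushed toward a well-formed sequence instead.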

CLC number:

  • TP391

Fig. 1 Bi-LSTM-Attention-CRF syntactic element recognition model

Table 1 Model performance

Model                  Type      P/%    R/%    F1/%
CRF                    All Type  86.61  80.34  83.36
CRF                    SUB       83.75  72.93  77.97
CRF                    ADV       74.35  66.06  69.96
CRF                    RAI       85.79  80.21  82.91
CRF                    LOC       84.33  75.30  79.56
CRF                    TEM       78.61  71.20  74.73
Bi-LSTM-CRF            All Type  86.25  83.06  84.62
Bi-LSTM-CRF            SUB       82.14  79.66  80.88
Bi-LSTM-CRF            ADV       70.49  68.47  69.46
Bi-LSTM-CRF            RAI       89.39  83.69  86.45
Bi-LSTM-CRF            LOC       84.15  73.60  78.52
Bi-LSTM-CRF            TEM       81.77  78.17  79.92
Bi-LSTM-Attention-CRF  All Type  86.22  83.52  84.85
Bi-LSTM-Attention-CRF  SUB       82.19  78.74  80.43
Bi-LSTM-Attention-CRF  ADV       71.88  71.00  71.44
Bi-LSTM-Attention-CRF  RAI       87.65  81.97  87.71
Bi-LSTM-Attention-CRF  LOC       86.23  86.80  86.51
Bi-LSTM-Attention-CRF  TEM       81.87  75.41  78.51
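
The F1 values in Table 1 can be related to the precision and recall columns through the usual definition of F1 as their harmonic mean, F1 = 2PR / (P + R). The page does not state the formula, so treating it as the one used is an assumption. The short sketch below recomputes F1 for the All Type columns; the helper name `f1_score` is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P/%, R/%, reported F1/%) copied from the "All Type" columns of Table 1.
ALL_TYPE = {
    "CRF": (86.61, 80.34, 83.36),
    "Bi-LSTM-CRF": (86.25, 83.06, 84.62),
    "Bi-LSTM-Attention-CRF": (86.22, 83.52, 84.85),
}

for model, (p, r, reported) in ALL_TYPE.items():
    print(f"{model}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {reported:.2f}")
```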