山东大学学报 (工学版) (Journal of Shandong University (Engineering Science)), 2020, Vol. 50, Issue (6): 68-75. doi: 10.6040/j.issn.1672-3961.0.2020.236
MA Changxia1, ZHANG Chen2
Abstract: For intent classification and semantic slot filling based on pre-training and attention mechanisms, a joint model was proposed that combines bidirectional encoder representations from transformers (BERT) with bidirectional long short-term memory (BiLSTM), conditional random fields (CRF), and an attention mechanism. The model does not rely heavily on hand-labeled data or on domain-specific knowledge and resources, avoiding the weak generalization ability common in existing approaches. Experiments on the corpus of an autonomous bus information query system showed that the model achieved an intent classification accuracy of 98% and a semantic slot filling F1 score of 96.3%, both effective improvements.
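In the slot-filling head described above, a CRF layer decodes the per-token scores produced by the BiLSTM into the best-scoring tag sequence, which at inference time is done with Viterbi decoding. As a minimal illustrative sketch (not the paper's implementation; the score values and two-tag scheme below are hypothetical), the decoding step can be written in pure Python:

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence for one sentence.

    emissions[t][k]  : score of tag k at token t (e.g. from a BiLSTM)
    transitions[j][k]: score of moving from tag j to tag k (CRF parameters)
    Both are plain lists of floats; values are illustrative only.
    """
    n_tags = len(emissions[0])
    # scores[k] = best score of any path ending in tag k at the current token
    scores = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_scores, bp = [], []
        for k in range(n_tags):
            # best previous tag j when the current tag is k
            best_j = max(range(n_tags), key=lambda j: scores[j] + transitions[j][k])
            bp.append(best_j)
            new_scores.append(scores[best_j] + transitions[best_j][k] + emissions[t][k])
        scores = new_scores
        backpointers.append(bp)
    # backtrack from the best final tag to recover the full path
    best_last = max(range(n_tags), key=lambda k: scores[k])
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()
    return path, scores[best_last]
```

The transition scores are what distinguish the CRF from independent per-token softmax classification: they let the decoder penalize illegal tag sequences (e.g. an inside tag without a preceding begin tag in BIO-style slot labels) globally over the whole sentence.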