
Journal of Shandong University (Engineering Science) ›› 2020, Vol. 50 ›› Issue (6): 68-75. doi: 10.6040/j.issn.1672-3961.0.2020.236


Pre-training-based joint model for intent classification and slot filling in Chinese spoken language understanding

MA Changxia1, ZHANG Chen2

  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, Jiangsu, China;
  2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore
  • Published: 2020-12-15
  • About the author: MA Changxia (1975- ), female, associate professor, Ph.D.; her main research interests are machine learning, pattern recognition, and natural language understanding. E-mail: machangxia2002cn@aliyun.com
  • Supported by the Singapore National Research Foundation LTA Urban Mobility Grand Challenge, ST Kinetics Autonomous Bus Trial (UM01/002)


Abstract: Because intent classification and slot filling are closely correlated, we explored a joint model for the two tasks based on pre-training and the attention mechanism. The model combined bidirectional long short-term memory (BiLSTM), conditional random fields (CRF), and bidirectional encoder representations from transformers (BERT); its bidirectional encoding and self-attention let it work without relying heavily on hand-labeled data or domain-specific knowledge and resources, avoiding the weak generalization common to existing approaches. We compared the proposed architecture with state-of-the-art models and presented a new dialogue corpus from an autonomous bus information inquiry system (ABIIS). Experiments on this corpus demonstrated that the proposed architecture outperformed state-of-the-art baselines on both tasks, reaching an intent classification accuracy of 98% and a slot filling F1 of 96.3%.

Key words: intent classification, slot filling, pre-training, bidirectional encoder representations, multi-head attention

CLC number: TP391
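
The abstract describes the architecture only at a high level, so the following is a minimal, illustrative PyTorch sketch of a BERT + BiLSTM + CRF joint model of the kind it outlines. It is an assumed reconstruction, not the authors' released code: the class name, hyperparameters, checkpoint name, and the use of the Hugging Face transformers and pytorch-crf packages are all assumptions.

# Illustrative sketch only: an assumed reconstruction of the joint
# BERT + BiLSTM + CRF architecture described in the abstract, not the
# authors' released implementation.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pip install pytorch-crf

class JointIntentSlotModel(nn.Module):
    def __init__(self, num_intents: int, num_slot_tags: int, lstm_hidden: int = 256):
        super().__init__()
        # Pre-trained Chinese BERT encoder (checkpoint name is an assumption).
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        d = self.bert.config.hidden_size
        # BiLSTM over BERT's token representations.
        self.bilstm = nn.LSTM(d, lstm_hidden, batch_first=True, bidirectional=True)
        # Intent head reads the [CLS] position; slot head emits per-token CRF scores.
        self.intent_head = nn.Linear(d, num_intents)
        self.slot_head = nn.Linear(2 * lstm_hidden, num_slot_tags)
        self.crf = CRF(num_slot_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, intent_labels=None, slot_labels=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state                   # (batch, seq_len, d)
        lstm_out, _ = self.bilstm(tokens)                # (batch, seq_len, 2*lstm_hidden)
        slot_emissions = self.slot_head(lstm_out)        # CRF emission scores
        intent_logits = self.intent_head(tokens[:, 0])   # [CLS] representation
        mask = attention_mask.bool()

        if intent_labels is not None and slot_labels is not None:
            # Joint training loss: negative CRF log-likelihood for the slot
            # sequence plus cross-entropy for the intent.
            slot_loss = -self.crf(slot_emissions, slot_labels, mask=mask, reduction="mean")
            intent_loss = nn.functional.cross_entropy(intent_logits, intent_labels)
            return slot_loss + intent_loss

        # Inference: Viterbi decoding for slot tags, argmax for the intent.
        return intent_logits.argmax(dim=-1), self.crf.decode(slot_emissions, mask=mask)

In this sketch the slot labels are BIO tags over the characters of the query: for a hypothetical ABIIS utterance such as "去机场的班车几点发车" ("when does the shuttle to the airport leave"), the intent label could be a schedule query and "机场" would carry B-/I- destination tags. The tag inventory is invented for illustration, since the abstract does not list the corpus's labels.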
[1] CHEN H, LIU X, YIN D, et al. A survey on dialogue systems: recent advances and new frontiers[J]. ACM SIGKDD Explorations Newsletter, 2017, 19(2): 25-35.
[2] MCCALLUM A, FREITAG D, PEREIRA F. Maximum entropy Markov models for information extraction and segmentation[C] // Proc of the 17th International Conference on Machine Learning. Stanford, USA: Morgan Kaufmann, 2000: 591-598.
[3] MCCALLUM A, LI W. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons[C] // Proc of CoNLL at HLT-NAACL 2003. Edmonton, Canada: ACL, 2003: 188-191.
[4] LAFFERTY J, MCCALLUM A, PEREIRA F. Conditional random fields: probabilistic models for segmenting and labeling sequence data[C] // Proc of the 18th International Conference on Machine Learning. San Francisco, USA: Morgan Kaufmann, 2001: 282-289.
[5] TAKEUCHI K, COLLIER N. Use of support vector machines in extended named entity recognition[C] // Proc of the 6th Conference on Natural Language Learning. Stroudsburg, USA: ACL, 2002: 1-7.
[6] NADEAU D, SEKINE S. A survey of named entity recognition and classification[J]. Lingvisticae Investigationes, 2007, 30(1): 3-26.
[7] RATINOV L, ROTH D. Design challenges and misconceptions in named entity recognition[C] // Proc of CoNLL-2009. Boulder, USA: ACL, 2009: 147-155.
[8] HU Zhiting, MA Xuezhe, LIU Zhengzhong, et al. Harnessing deep neural networks with logic rules[C] // Proc of ACL 2016. Berlin, Germany: ACL, 2016: 2410-2420.
[9] MA Xuezhe, HOVY E. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF[C] // Proc of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany: ACL, 2016: 1064-1074.
[10] CHIU J, NICHOLS E. Named entity recognition with bidirectional LSTM-CNNs[J]. Transactions of the Association for Computational Linguistics, 2016, 4: 357-370.
[11] LIU Liyuan, SHANG Jingbo, XU F, et al. Empower sequence labeling with task-aware neural language model[C] // Proc of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, USA: AAAI, 2018: 5253-5260.
[12] SURENDRAN D, LEVOW G A. Dialog act tagging with support vector machines and hidden Markov models[C] // Proc of INTERSPEECH 2006. Pittsburgh, USA: ISCA, 2006: 1950-1953.
[13] ALI S A, SULAIMAN N, MUSTAPHA A N. Improving accuracy of intention-based response classification using decision tree[J]. Information Technology Journal, 2009, 8(6): 923-928.
[14] NIIMI Y, OKU T, NISHIMOTO T, et al. A rule-based approach to extraction of topics and dialog acts in a spoken dialog system[C] // Proc of EUROSPEECH 2001. Aalborg, Denmark: ISCA, 2001: 2185-2188.
[15] MIKOLOV T, KARAFIAT M, BURGET L, et al. Recurrent neural network based language model[C] // Proc of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH). Chiba, Japan: ISCA, 2010: 1045-1048.
[16] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[17] GERS F, SCHMIDHUBER J, CUMMINS F. Learning to forget: continual prediction with LSTM[J]. Neural Computation, 2000, 12(10): 2451-2471.
[18] YAO Kaisheng, PENG Baolin, ZHANG Yu, et al. Spoken language understanding using long short-term memory neural networks[C] // Proc of the IEEE Spoken Language Technology Workshop. California, USA: IEEE, 2014: 189-194.
[19] SHI Yangyang, YAO Kaisheng, TIAN Le, et al. Deep LSTM based feature mapping for query classification[C] // Proc of NAACL-HLT 2016. California, USA: ACL, 2016: 1501-1511.
[20] ZHANG Xiaodong, WANG Houfeng. A joint model of intent determination and slot filling for spoken language understanding[C] // Proc of the 25th International Joint Conference on Artificial Intelligence. New York, USA: IJCAI, 2016: 2993-2999.
[21] PETERS M, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C] // Proc of NAACL-HLT 2018, Volume 1 (Long Papers). New Orleans, USA: ACL, 2018: 2227-2237.
[22] MIKOLOV T, SUTSKEVER I, CHEN Kai, et al. Distributed representations of words and phrases and their compositionality[J]. Advances in Neural Information Processing Systems, 2013, 26: 3111-3119.
[23] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. OpenAI, 2018.
[24] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C] // Proc of NAACL-HLT 2019. Minneapolis, USA: ACL, 2019: 4171-4186.
[25] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C] // Proc of the 31st Conference on Neural Information Processing Systems. Long Beach, USA: NIPS, 2017: 6000-6010.
[26] DYER C, BALLESTEROS M, WANG Ling, et al. Transition-based dependency parsing with stack long short-term memory[C] // Proc of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing, China: ACL, 2015: 334-343.