Journal of Shandong University(Engineering Science) ›› 2020, Vol. 50 ›› Issue (6): 68-75.doi: 10.6040/j.issn.1672-3961.0.2020.236


Pre-training-based joint model for intent classification and slot filling in Chinese spoken language understanding

MA Changxia1, ZHANG Chen2   

  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, Jiangsu, China;
    2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore
  Published: 2020-12-15

Abstract: Because intent classification and slot filling are closely correlated tasks, we explored a joint model for the two based on pre-training and an attention mechanism. The proposed model combined bidirectional long short-term memory (BiLSTM), conditional random fields (CRF), and bidirectional encoder representations from transformers (BERT), which supported bidirectional and self-attention mechanisms without relying heavily on hand-crafted features or domain-specific knowledge and resources. We compared the performance of the proposed architecture with state-of-the-art models. Experiments on a benchmark dataset demonstrated that the proposed architecture outperformed state-of-the-art approaches on both tasks. Furthermore, we presented a new dialogue corpus from an autonomous bus information inquiry system (ABIIS), on which our method yielded effective improvements in intent classification accuracy and slot filling F1 score over a state-of-the-art baseline.
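The slot-filling side of such a joint model typically ends in a CRF layer, whose best label sequence is recovered with Viterbi decoding over per-token emission scores and label-transition scores. As a minimal illustrative sketch (not the authors' code; the toy scores below are invented for the example):

```python
# Minimal sketch: Viterbi decoding over CRF emission/transition scores,
# as used to pick the best slot-label sequence for a token sequence.
import numpy as np

def viterbi_decode(emissions, transitions):
    """emissions: (T, K) per-token label scores; transitions: (K, K)
    score of moving from label i to label j. Returns the best label path."""
    T, K = emissions.shape
    score = emissions[0].copy()            # best score ending in each label
    backptr = np.zeros((T, K), dtype=int)  # back-pointers for path recovery
    for t in range(1, T):
        # cand[i, j] = best path ending in label i, then label j at step t
        cand = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # follow back-pointers from the best final label
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Toy example: 3 tokens, 2 slot labels, uniform transitions
emissions = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 0.5]])
transitions = np.zeros((2, 2))
print(viterbi_decode(emissions, transitions))  # → [0, 1, 0]
```

In the joint setting, the same shared encoder states feed both this sequence decoder and a pooled classifier for the utterance-level intent.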

Key words: intent classification, slot filling, pre-training, bidirectional encoder representation, multi-head attention
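The multi-head attention named above is the core operation of BERT's encoder layers: queries, keys, and values are projected, split into heads, attended with scaled dot products, and re-joined. A numpy sketch under assumed shapes (illustrative only, not the paper's implementation):

```python
# Illustrative sketch: scaled dot-product multi-head self-attention,
# the building block of BERT-style encoders.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """X: (T, d_model); Wq/Wk/Wv/Wo: (d_model, d_model).
    Splits the projections into num_heads heads of size d_model/num_heads."""
    T, d = X.shape
    dh = d // num_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # reshape (T, d) -> (heads, T, dh)
    split = lambda M: M.reshape(T, num_heads, dh).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(dh)  # (heads, T, T)
    heads = softmax(scores) @ Vh                       # (heads, T, dh)
    concat = heads.transpose(1, 0, 2).reshape(T, d)    # re-join heads
    return concat @ Wo

rng = np.random.default_rng(0)
T, d, h = 4, 8, 2
X = rng.standard_normal((T, d))
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) for _ in range(4))
out = multi_head_attention(X, Wq, Wk, Wv, Wo, h)
print(out.shape)  # → (4, 8)
```

Because attention attends in both directions over the whole utterance, it supplies the bidirectional context the abstract credits to BERT without hand-crafted features.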

CLC Number: TP391