Journal of Shandong University(Engineering Science) ›› 2020, Vol. 50 ›› Issue (6): 68-75.doi: 10.6040/j.issn.1672-3961.0.2020.236


Pre-trained based joint model for intent classification and slot filling in Chinese spoken language understanding

MA Changxia1, ZHANG Chen2   

  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, Jiangsu, China;
    2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore
  Published: 2020-12-15

Abstract: Because intent classification and slot filling are closely correlated tasks, we explored a joint model for the two based on pre-training and an attention mechanism. The proposed model combined bidirectional long short-term memory (BiLSTM), conditional random fields (CRF) and bidirectional encoder representations from transformers (BERT), which supported bidirectional and self-attention mechanisms without relying heavily on hand-crafted features or domain-specific knowledge and resources. We compared the performance of the proposed architecture with state-of-the-art models; experiments demonstrated that it outperformed those approaches on both tasks. Furthermore, we presented a new dialogue corpus collected from an autonomous bus information inquiry system (ABIIS), on which our method yielded effective improvements in intent classification accuracy and slot filling F1 over a state-of-the-art baseline.
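The paper itself gives no code; as a rough, self-contained illustration of the CRF decoding step mentioned in the abstract, the sketch below runs Viterbi decoding over per-token slot-label scores. In the paper's model these emission scores would come from the BERT+BiLSTM encoder; here both the scores and the label set are hypothetical hand-made numbers, not values from the paper.

```python
# Minimal Viterbi decoding sketch for a CRF slot-filling layer.
# Emission scores would come from the BERT+BiLSTM encoder in the paper's
# model; the numbers below are hypothetical, purely for illustration.

def viterbi_decode(emissions, transitions):
    """Return (best label sequence, its score).

    emissions:   list of per-token score lists, one entry per label
    transitions: transitions[i][j] = score of moving from label i to label j
    """
    num_labels = len(emissions[0])
    # score[j] = best score of any path ending in label j at the current step
    score = list(emissions[0])
    back = []  # backpointers, one list per step after the first
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(num_labels):
            # pick the best previous label i for current label j
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        back.append(ptrs)
    # trace the best path backwards through the pointers
    best_last = max(range(num_labels), key=lambda j: score[j])
    path = [best_last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, score[best_last]

# Tiny example with 3 hypothetical labels: 0=O, 1=B-dest, 2=I-dest
emissions = [[2.0, 0.5, 0.1],    # token "to"
             [0.2, 3.0, 0.5],    # token "airport"  -> likely B-dest
             [0.1, 0.3, 2.5]]    # token "south"    -> likely I-dest
transitions = [[0.5, 0.5, -2.0],  # O -> I-dest penalized (invalid in BIO)
               [0.0, -1.0, 1.0],  # B-dest -> I-dest favored
               [0.0, 0.0, 0.5]]
labels, best = viterbi_decode(emissions, transitions)
# labels == [0, 1, 2], i.e. O, B-dest, I-dest
```

The transition matrix is what distinguishes a CRF layer from per-token softmax classification: it lets the decoder penalize label sequences that are invalid under the BIO scheme (e.g. I-dest directly after O) even when the per-token scores favor them.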

Key words: intent classification, slot filling, pre-train, bidirectional encoder representation, multi-head attention

CLC Number: TP391