Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (3): 15-21. doi: 10.6040/j.issn.1672-3961.0.2020.249


Adaptive multi-domain sentiment analysis based on knowledge distillation

YANG Xiuyuan, PENG Tao, YANG Liang*, LIN Hongfei   

  1. College of Computer Science and Technology, Dalian University of Technology, Dalian 116023, Liaoning, China
  • Online: 2021-06-20 Published: 2021-06-24

Abstract: An adaptive multi-domain knowledge distillation framework was proposed for sentiment analysis; it accelerated inference and reduced the number of model parameters while preserving model performance. When distilling knowledge for each specific domain, the distillation covered the word embedding layer, the encoding layer (attention distillation and hidden state distillation), and the output prediction layer, among other aspects, so that the student model learned knowledge at every level from the domain-specific teacher model. A mechanism was also proposed to selectively learn how important the teacher model of each domain was for the given data, which further improved the accuracy of the predictions. Experimental results on multiple public datasets showed that single-domain knowledge distillation increased model accuracy by an average of 2.39%, and multi-domain knowledge distillation increased it by an average of 0.5%. Compared with single-domain knowledge distillation, the framework enhanced the generalization ability of the student model and improved its performance.
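The distillation components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss functions or the weighting scheme, so the sketch assumes mean squared error for the embedding, attention, and hidden-state terms, a temperature-scaled soft cross-entropy for the prediction layer, and a softmax over per-teacher scores as the adaptive importance weights. It also assumes student and teacher tensors share the same shapes. All function names, dictionary keys, and parameters are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Softmax over the last axis with temperature scaling."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.mean(diff ** 2))


def soft_cross_entropy(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of student predictions against the teacher's soft labels."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-np.mean(np.sum(p_teacher * log_p_student, axis=-1)))


def single_teacher_loss(student, teacher, temperature=1.0):
    """Assumed sum of the four terms: embedding, attention, hidden state, prediction."""
    return (
        mse(student["embedding"], teacher["embedding"])
        + mse(student["attention"], teacher["attention"])
        + mse(student["hidden"], teacher["hidden"])
        + soft_cross_entropy(student["logits"], teacher["logits"], temperature)
    )


def multi_domain_loss(student, teachers, teacher_scores, temperature=1.0):
    """Weight each domain teacher's loss by a softmax over (assumed learnable) scores."""
    weights = softmax(teacher_scores)
    losses = np.array(
        [single_teacher_loss(student, t, temperature) for t in teachers]
    )
    return float(np.dot(weights, losses)), weights


def random_outputs(rng, batch=2, seq_len=4, dim=8, heads=2, classes=2):
    """Hypothetical stand-in for the layer outputs of a student or teacher model."""
    return {
        "embedding": rng.normal(size=(batch, seq_len, dim)),
        "attention": rng.normal(size=(batch, heads, seq_len, seq_len)),
        "hidden": rng.normal(size=(batch, seq_len, dim)),
        "logits": rng.normal(size=(batch, classes)),
    }


rng = np.random.default_rng(0)
student = random_outputs(rng)
teachers = [random_outputs(rng) for _ in range(3)]  # one teacher per domain
loss, weights = multi_domain_loss(student, teachers, teacher_scores=[0.2, 1.0, -0.5])
print("weighted loss:", loss)
print("teacher weights:", weights)
```

In this sketch the teacher scores are fixed numbers; in a trainable setting they would be parameters updated alongside the student, which is one plausible reading of "selectively learning the importance of the teacher model".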

Key words: knowledge distillation, adaptive, multi-domain, sentiment analysis, deep learning

CLC Number: TP391