
Journal of Shandong University (Engineering Science) ›› 2024, Vol. 54 ›› Issue (4): 51-58. doi: 10.6040/j.issn.1672-3961.0.2023.125

• Machine Learning and Data Mining •

Low time complexity short text classification based on fusion of BERT and broad learning

CHEN Xiaojiang1,2, YANG Xiaoqi2, CHEN Guanghao3, LIU Wuying4,5*

  1. Information Department, Jieyang Campus of Guangdong Open University, Jieyang 522095, Guangdong, China;
    2. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China;
    3. Department of Software Engineering, Software Engineering Institute of Guangzhou, Guangzhou 510990, Guangdong, China;
    4. Shandong Key Laboratory of Language Resources Development and Application, Ludong University, Yantai 264025, Shandong, China;
    5. Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou 510420, Guangdong, China
  • Published: 2024-08-20
  • About the authors: CHEN Xiaojiang (1995— ), male, born in Jieyang, Guangdong, teaching assistant, master's degree; his main research interest is natural language processing. E-mail: 774847467@qq.com. *Corresponding author: LIU Wuying (1980— ), male, born in Jiujiang, Jiangxi, professor, master's supervisor, Ph.D.; his main research interests are computational linguistics and natural language processing. E-mail: wyliu@ldu.edu.cn
  • Supported by:
    the New Liberal Arts Research and Reform Practice Project of the Ministry of Education (2021060049); the Postgraduate Education and Teaching Reform Research Project of Shandong Province (SDYJG21185); the Key Project of Undergraduate Teaching Reform Research of Shandong Province (Z2021323); the Youth Fund Project of Humanities and Social Sciences Research of the Ministry of Education (20YJC740062); the Shanghai Philosophy and Social Sciences "13th Five-Year" Planning Project (2019BYY028); the Planning Fund Project of Humanities and Social Sciences Research of the Ministry of Education (20YJAZH069); the Science and Technology Program Project of Guangzhou (202201010061)


Abstract: To address the low efficiency and limited accuracy of short text classification, an efficient and high-accuracy text classification model hybridizing bidirectional encoder representations from transformers with a broad learning classifier (hybrid bidirectional encoder representations from transformers and broad learning, BERT-BL) was proposed. BERT (bidirectional encoder representations from transformers) was first fine-tuned to update its parameters. The fine-tuned BERT then mapped each short text to a corresponding word vector matrix, which was fed into the broad learning (BL) classifier to complete the classification task. Experimental results showed that the BERT-BL model achieved the best accuracy on all three public datasets, required dozens of times less time than the baseline models support vector machine (SVM), long short-term memory (LSTM), minimum p-norm broad learning (p-BL) and BERT, and its training did not require a high-performance GPU. Comparative analysis showed that the BERT-BL model not only performed well on short text tasks but also saved substantial training time.
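
The pipeline described above has two stages: a fine-tuned BERT serves as a feature extractor, and a broad learning (BL) classifier is then trained in closed form by ridge regression over random feature and enhancement nodes, which is why the second stage needs neither gradient descent nor a GPU. The Python sketch below is a minimal illustration of that idea under stated assumptions, not the authors' implementation: the checkpoint name bert-base-chinese, the pooling of each text to its [CLS] vector (the paper feeds a word vector matrix into BL), and the node counts are all illustrative choices.

    # Minimal sketch of a two-stage BERT-BL pipeline (assumptions noted above).
    import numpy as np
    import torch
    from transformers import BertModel, BertTokenizer

    # Stage 1: a (fine-tuned) BERT used as a frozen feature extractor.
    # "bert-base-chinese" is an illustrative checkpoint, not the paper's.
    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    bert = BertModel.from_pretrained("bert-base-chinese").eval()

    @torch.no_grad()
    def embed(texts, max_len=64):
        """Map short texts to fixed-size vectors; pooled to the [CLS]
        vector here for simplicity."""
        enc = tokenizer(list(texts), padding=True, truncation=True,
                        max_length=max_len, return_tensors="pt")
        return bert(**enc).last_hidden_state[:, 0, :].numpy()

    # Stage 2: a plain broad learning system (BLS) classifier. Random feature
    # and enhancement nodes; output weights solved in closed form by ridge
    # regression, so this stage trains without gradient descent or a GPU.
    class BroadLearningClassifier:
        def __init__(self, n_feature=200, n_enhance=400, reg=1e-2, seed=0):
            self.n_feature, self.n_enhance, self.reg = n_feature, n_enhance, reg
            self.rng = np.random.default_rng(seed)

        def _nodes(self, X):
            Z = X @ self.Wf + self.bf            # linear random feature nodes
            H = np.tanh(Z @ self.We + self.be)   # nonlinear enhancement nodes
            return np.hstack([Z, H])             # stacked node matrix A

        def fit(self, X, y):
            self.classes_ = np.unique(y)
            Y = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot
            d = X.shape[1]
            self.Wf = self.rng.standard_normal((d, self.n_feature)) / np.sqrt(d)
            self.bf = self.rng.standard_normal(self.n_feature)
            self.We = self.rng.standard_normal(
                (self.n_feature, self.n_enhance)) / np.sqrt(self.n_feature)
            self.be = self.rng.standard_normal(self.n_enhance)
            A = self._nodes(X)
            # Closed-form ridge solution: W = (A^T A + reg*I)^(-1) A^T Y
            self.W = np.linalg.solve(A.T @ A + self.reg * np.eye(A.shape[1]),
                                     A.T @ Y)
            return self

        def predict(self, X):
            return self.classes_[np.argmax(self._nodes(X) @ self.W, axis=1)]

    # Usage: embed the corpus once, then fit the BLS in one closed-form solve.
    # X_train = embed(train_texts)
    # clf = BroadLearningClassifier().fit(X_train, np.asarray(train_labels))
    # preds = clf.predict(embed(test_texts))

The single closed-form solve of W = (AᵀA + λI)⁻¹AᵀY over the stacked node matrix A, rather than iterative backpropagation, is the plausible source of the training-time savings the abstract reports.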

Key words: short text classification, BERT-BL, BERT, broad learning, high accuracy

CLC number: TP391