Journal of Shandong University (Engineering Science) ›› 2024, Vol. 54 ›› Issue (4): 51-58. doi: 10.6040/j.issn.1672-3961.0.2023.125
CHEN Xiaojiang1,2, YANG Xiaoqi2, CHEN Guanghao3, LIU Wuying4,5*
Abstract: To address the low efficiency and limited accuracy of short-text classification, an efficient and accurate text classification model was proposed that combines bidirectional encoder representations from transformers with a broad learning classifier (hybrid bidirectional encoder representations from transformers and broad learning, BERT-BL). The bidirectional encoder representations from transformers (BERT) model was first fine-tuned to update its parameters. The fine-tuned BERT then mapped each short text to a word-vector matrix, which was fed into the broad learning (BL) classifier to complete the classification task. Experimental results showed that the BERT-BL model achieved the best accuracy on all three public datasets, and its training required only a small fraction (tens of times less) of the time needed by the baseline models of support vector machine (SVM), long short-term memory (LSTM), minimum p-norm broad learning (p-BL), and BERT, without requiring a high-performance GPU. Comparative analysis indicated that the BERT-BL model not only performed well on short-text tasks but also saved substantial training time.
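The pipeline described in the abstract can be sketched as follows. This is a minimal, illustrative broad learning system (BLS) classifier in the style of Chen and Liu's formulation, not the paper's exact BERT-BL implementation: the BERT sentence embeddings are assumed to be precomputed into a feature matrix `X`, and the node counts, activation, and regularization constant are arbitrary choices for the sketch. The key property the abstract relies on is visible in the code: the output weights are obtained in closed form by a ridge-regularized pseudo-inverse, so no gradient descent or GPU is needed.

```python
import numpy as np

def broad_learning_fit(X, Y, n_feature_nodes=20, n_enhance_nodes=100,
                       reg=1e-3, seed=0):
    """Fit a minimal broad learning classifier.

    X: (n, d) feature matrix, e.g. fine-tuned BERT sentence embeddings.
    Y: (n, c) one-hot label matrix.
    """
    rng = np.random.default_rng(seed)
    # Mapped feature nodes: random linear projections of the input.
    Wf = rng.standard_normal((X.shape[1], n_feature_nodes))
    Z = X @ Wf
    # Enhancement nodes: nonlinear expansion of the mapped features.
    We = rng.standard_normal((n_feature_nodes, n_enhance_nodes))
    H = np.tanh(Z @ We)
    # The "broad" layer concatenates both node groups.
    A = np.hstack([Z, H])
    # Output weights via regularized pseudo-inverse: one linear solve,
    # no iterative training and no GPU.
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W

def broad_learning_predict(model, X):
    Wf, We, W = model
    Z = X @ Wf
    A = np.hstack([Z, np.tanh(Z @ We)])
    return np.argmax(A @ W, axis=1)
```

A usage pattern matching the abstract would replace the random `X` below with the `[CLS]`/pooled embeddings produced by the fine-tuned BERT for each short text.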