Journal of Shandong University (Engineering Science) ›› 2024, Vol. 54 ›› Issue (1): 45-51. doi: 10.6040/j.issn.1672-3961.0.2023.168
• Machine Learning and Data Mining •
马坤,刘筱云,李乐平,纪科,陈贞翔,杨波
MA Kun, LIU Xiaoyun, LI Leping, JI Ke, CHEN Zhenxiang, YANG Bo
Abstract: To address the problem that multi-label text classification ignores label co-occurrence characteristics when capturing label relationships, an adaptive label feature learning (ALFL) method based on statistical features was proposed for detecting content marketing articles. A labeled latent Dirichlet allocation model with adaptive topic priors (LDATP) was constructed: according to the label set of each text, the model was constrained to the full set of marketing topics corresponding to that label set when generating topic-word probability distributions. A label information integration network (LIIN) was then built to learn label-related information from the topic-word probability distributions and the graph structure of the labels, yielding label embedding representations. Finally, information interaction between the text space and the label space was performed to capture the semantic features used to identify marketing articles. Experimental results showed that the statistical-feature-based ALFL method achieved a recall of 80.92% and a precision of 88.14%, outperforming the baseline models in predictive accuracy.
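The adaptive-prior idea behind LDATP — restricting each document's topic distribution to the topics associated with its label set — can be sketched as a per-document asymmetric Dirichlet prior. The sketch below is a minimal illustration under assumptions, not the paper's implementation: the mapping `label_to_topics` and the prior magnitudes `base_alpha` / `label_alpha` are hypothetical, and a full model would embed this prior inside Gibbs sampling or variational inference rather than a single draw.

```python
import numpy as np

def adaptive_topic_prior(doc_labels, label_to_topics, num_topics,
                         base_alpha=0.01, label_alpha=1.0):
    """Build a per-document Dirichlet prior over topics.

    Topics tied to the document's label set receive a large prior mass
    (label_alpha); all other topics keep a near-zero base prior, so
    inference is effectively constrained to label-consistent topics.
    """
    alpha = np.full(num_topics, base_alpha)
    for label in doc_labels:
        for t in label_to_topics[label]:
            alpha[t] = label_alpha
    return alpha

# Hypothetical mapping: each label owns two marketing-related topics.
label_to_topics = {"ad": [0, 1], "promo": [2, 3], "neutral": [4, 5]}

# A document labeled {ad, promo} gets prior mass only on topics 0-3.
alpha = adaptive_topic_prior({"ad", "promo"}, label_to_topics, num_topics=6)
theta = np.random.default_rng(0).dirichlet(alpha)  # one doc-topic draw
```

Because the off-label components of `alpha` are tiny, nearly all of the sampled topic mass falls on the document's own label topics, which is the constraint the abstract describes.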