
Journal of Shandong University (Engineering Science), 2024, 54(1): 45-51. DOI: 10.6040/j.issn.1672-3961.0.2023.168

• Machine Learning and Data Mining •

Adaptive label information learning for intention detection

MA Kun, LIU Xiaoyun, LI Leping, JI Ke, CHEN Zhenxiang, YANG Bo

  1. School of Information Science and Engineering, University of Jinan, Jinan 250022, Shandong, China
  • Published: 2024-02-01
  • About the author: MA Kun (1981- ), male, from Jinan, Shandong; Ph.D., associate professor, and master's supervisor; his main research interests include big data and cloud computing. E-mail: ise_mak@ujn.edu.cn
  • Supported by: the National Natural Science Foundation of China (61772231), the Natural Science Foundation of Shandong Province (ZR2022LZH016), and the Key Research and Development Program of Shandong Province (Major Science and Technology Innovation Project) (2021CXGC010103)

Abstract: To address the problem that multi-label text classification ignores label co-occurrence when capturing label relationships, an adaptive label feature learning (ALFL) method based on statistical features was proposed for detecting content-marketing articles. A labeled latent Dirichlet allocation model with adaptive topic priors (LDATP) was built, in which, for each text, the full set of marketing topics corresponding to its label set constrained the model when generating topic-word probability distributions. A label information integration network (LIIN) was then built to learn label-related information from the topic-word distributions and the graph structure of the labels, yielding label embedding representations. Finally, information interaction between the text space and the label space was performed to capture the semantic features used to identify marketing articles. Experimental results showed that the statistics-based ALFL achieved a recall of 80.92% and an accuracy of 88.14%, outperforming the baseline models in prediction accuracy.
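As a rough illustration of the pipeline described in the abstract, the sketch below wires a topic-word distribution (standing in for LDATP output) into a graph-based label embedding layer and a label-text attention step (standing in for LIIN and the text-label interaction). All module names, layer sizes, and the GCN-style update are illustrative assumptions made for this sketch, not the authors' implementation.

```python
# Hypothetical sketch of the label-side pipeline: label embeddings are
# initialized from topic-word distributions, refined over a label
# co-occurrence graph, and combined with text features via attention
# to produce multi-label scores.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelGraphLayer(nn.Module):
    """One round of label-information propagation over a row-normalized
    label co-occurrence matrix (a GCN-style update, assumed here)."""

    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.linear = nn.Linear(dim_in, dim_out)

    def forward(self, label_emb: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj: (L, L) normalized co-occurrence; label_emb: (L, d)
        return F.relu(self.linear(adj @ label_emb))


class LabelTextInteraction(nn.Module):
    """Attention between label embeddings and token features, followed by
    per-label scoring; a stand-in for the text-label interaction step."""

    def __init__(self, text_dim: int, label_dim: int):
        super().__init__()
        self.proj = nn.Linear(text_dim, label_dim)
        self.score = nn.Linear(label_dim, 1)

    def forward(self, tokens: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # tokens: (B, T, text_dim); labels: (L, label_dim)
        h = self.proj(tokens)                          # (B, T, label_dim)
        attn = torch.softmax(h @ labels.T, dim=1)      # (B, T, L) attention over tokens
        ctx = torch.einsum("btl,btd->bld", attn, h)    # (B, L, label_dim) label-specific text view
        logits = self.score(ctx * labels).squeeze(-1)  # (B, L) per-label logits
        return logits


if __name__ == "__main__":
    B, T, L = 4, 32, 10                                    # batch, tokens, labels (toy sizes)
    text_dim, label_dim, vocab = 128, 64, 500

    # Stand-ins for LDATP output and label co-occurrence statistics:
    topic_word = torch.rand(L, vocab)                      # one topic-word distribution per label
    label_init = nn.Linear(vocab, label_dim)(topic_word)   # project to the embedding space
    cooc = torch.rand(L, L)
    adj = cooc / cooc.sum(dim=1, keepdim=True)             # row-normalized co-occurrence graph

    label_emb = LabelGraphLayer(label_dim, label_dim)(label_init, adj)
    tokens = torch.rand(B, T, text_dim)                    # token features from any text encoder
    logits = LabelTextInteraction(text_dim, label_dim)(tokens, label_emb)
    probs = torch.sigmoid(logits)                          # multi-label probabilities, shape (B, L)
    print(probs.shape)
```

The sigmoid over per-label logits is the usual multi-label reading: each label is scored independently, so in this sketch label co-occurrence enters only through the graph-propagated label embeddings.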

Keywords: multi-label text classification, label co-occurrence, topic model, graph structure, label embedding

CLC number: TP391