Journal of Shandong University (Engineering Science) ›› 2024, Vol. 54 ›› Issue (1): 45-51. doi: 10.6040/j.issn.1672-3961.0.2023.168

• Machine Learning and Data Mining •

Adaptive label information learning for intention detection

MA Kun, LIU Xiaoyun, LI Leping, JI Ke, CHEN Zhenxiang, YANG Bo

  1. School of Information Science and Engineering, University of Jinan, Jinan 250022, Shandong, China
  • Published: 2024-02-01
  • About the author: MA Kun (born 1981), male, from Jinan, Shandong; associate professor, master's supervisor, Ph.D. His main research interests include big data and cloud computing. E-mail: ise_mak@ujn.edu.cn
  • Funding:
    National Natural Science Foundation of China (61772231); Natural Science Foundation of Shandong Province (ZR2022LZH016); Key Research and Development Program of Shandong Province (Major Innovation Project) (2021CXGC010103)

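Abstract: Multi-label text classification methods often ignore label co-occurrence when capturing relations between labels. To address this problem, an adaptive label feature learning (ALFL) method based on statistical features is proposed for detecting content marketing articles. First, a labeled latent Dirichlet allocation model with adaptive topic priors (LDATP) is constructed: according to the label set of each text, all marketing topics corresponding to that label set constrain the model as it generates the topic-word probability distribution. Second, a label information integration network (LIIN) is constructed, which uses the topic-word probability distribution and the graph structure of the labels to learn label-related information and obtain label embeddings. Finally, information is exchanged between the text space and the label space to capture the semantic features used to identify marketing articles. Experimental results show that the statistics-based ALFL method reaches a recall of 80.92% and an accuracy of 88.14%, outperforming the other baseline models and giving higher predictive accuracy.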
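The two statistical ingredients named in the abstract, label co-occurrence and a topic prior restricted to a document's label set, can be illustrated with a minimal sketch. This is not the authors' LDATP or LIIN implementation: the function names, the toy label sets, and the value `alpha=0.1` are all hypothetical, and the prior shown is the plain Labeled-LDA-style restriction rather than the adaptive prior the paper proposes.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence(label_sets):
    """Count how often each label occurs and how often each label pair co-occurs."""
    single = Counter()
    pair = Counter()
    for labels in label_sets:
        unique = sorted(set(labels))          # ignore repeated labels in one document
        single.update(unique)
        pair.update(combinations(unique, 2))  # unordered pairs, stored in sorted order
    return single, pair


def conditional_edges(single, pair):
    """Directed edge weights for a label graph: weight(a -> b) = count(a, b) / count(a)."""
    edges = {}
    for (a, b), n in pair.items():
        edges[(a, b)] = n / single[a]
        edges[(b, a)] = n / single[b]
    return edges


def label_restricted_prior(doc_labels, all_labels, alpha=0.1):
    """Labeled-LDA-style prior: topics outside the document's label set get zero mass.

    alpha is an assumed hyperparameter, not a value taken from the paper.
    """
    return [alpha if label in doc_labels else 0.0 for label in all_labels]


# Hypothetical toy corpus: one label set per document.
docs = [
    ["ad", "finance"],
    ["ad", "health"],
    ["ad", "finance"],
    ["news"],
]
single, pair = label_cooccurrence(docs)
edges = conditional_edges(single, pair)
prior = label_restricted_prior({"ad", "finance"}, ["ad", "finance", "health", "news"])
```

The edge weights are asymmetric on purpose: in the toy data every "finance" document is also labeled "ad", but only two of the three "ad" documents are labeled "finance". A graph-based label encoder could consume such weights as its adjacency matrix.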

Keywords: multi-label text classification, label co-occurrence, topic model, graph structure, label embedding

CLC number: TP391