Algorithm for two-sided collaborative filtering multimodal contrastive representation enhancement recommender

doi:10.6040/j.issn.1672-3961.0.2024.284

Journal of Shandong University (Engineering Science), 2026, Vol. 56, Issue 3: 106-117.

CHEN Yu¹, MENG Guangting¹, ZONG Chen¹, YUAN Weihua¹,²*, WANG Jiening³, WANG Xing¹

1. School of Computer and Artificial Intelligence, Shandong Jianzhu University, Jinan 250101, Shandong, China;
2. Computational Intelligence Center, Shandong Jianzhu University, Jinan 250101, Shandong, China;
3. School of Architecture and Urban Planning, Shandong Jianzhu University, Jinan 250101, Shandong, China

Published: 2026-06-09
About the authors: CHEN Yu (1998- ), female, from Zhangjiakou, Hebei, master's student; main research interest: recommender systems. E-mail: chenyu19980507@163.com. *Corresponding author: YUAN Weihua (1977- ), female, from Qingdao, Shandong, professor, master's supervisor, Ph.D.; main research interests: recommender systems and machine learning. E-mail: huahua_qingdao@sdjzu.edu.cn
Funding:
National Natural Science Foundation of China (62176142, 62177031); Natural Science Foundation of Shandong Province (ZR2022MF334); Shandong Province Undergraduate Education Reform Project (M2022245); Shandong Province High-Quality Professional Degree Teaching Case Library Construction Project (SDYAL2025085)

Abstract

Abstract: Existing multimodal recommender systems have three main shortcomings. First, they do not fully exploit the latent associations between multimodal data and interaction data, which weakens key features. Second, they ignore two sources of noise: item content unrelated to user interests, and incidental factors in interaction behavior. Third, they use static fusion that assigns equal weight to every modality, so they cannot track changes in user interests and the learned representations are insufficiently discriminative. To address these shortcomings, a user and item two-sided collaborative filtering multimodal contrastive representation enhancement (TCFCRE) recommender is proposed. TCFCRE uses contrastive learning to strengthen key features and to mine the latent associations between multimodal data and interaction data. To reduce the impact of noise on user representation learning, a cross-modal user representation alignment module exploits the consistency of user features across modalities to extract users' true interests. In addition, a mask matrix built from user-item multimodal relations generates an augmented view, and contrastive learning is applied to reduce the noise in implicit feedback. To overcome the inability of static fusion to distinguish modality importance and to adapt to changing interests, a multimodal dynamic fusion module adaptively computes a fusion weight for each modality representation. Extensive experiments on three public datasets show that TCFCRE achieves significant performance improvements over several state-of-the-art baseline models.

Key words: multimodal recommendation, graph neural network, representation learning, contrastive learning, representation enhancement
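
The abstract names two mechanisms, adaptive fusion weights per modality and contrastive alignment of user representations across modalities, but this page does not give their formulas. The following is a minimal sketch of one common way such components are built, not the authors' formulation. The scoring rule in `dynamic_fusion` (dot product with a query vector, then softmax), the InfoNCE form of the loss, the temperature value, and all function and variable names are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_fusion(modal_reps, query):
    """Fuse per-modality vectors with adaptive (non-uniform) weights.

    Assumed scoring rule: each modality vector is scored by its dot
    product with `query` (for example a user's ID embedding), and the
    scores are softmax-normalized into fusion weights.
    """
    weights = softmax([dot(rep, query) for rep in modal_reps])
    dim = len(modal_reps[0])
    fused = [sum(w * rep[i] for w, rep in zip(weights, modal_reps))
             for i in range(dim)]
    return fused, weights


def info_nce(view_a, view_b, temperature=0.2):
    """InfoNCE loss between two aligned lists of vectors.

    Row i of view_a and row i of view_b form the positive pair (for
    example the same user seen through two modalities); every other
    row of view_b acts as a negative for that anchor.
    """
    loss = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(view_a)


# Toy data: two users, each with a visual and a textual representation.
visual = [[1.0, 0.0], [0.0, 1.0]]
textual = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce(visual, textual)
misaligned_loss = info_nce(visual, [textual[1], textual[0]])
fused, weights = dynamic_fusion([visual[0], textual[0]], query=[1.0, 0.0])
print(aligned_loss, misaligned_loss, weights)
```

In this toy setup the loss is lower when each user's visual and textual vectors are paired correctly than when the pairs are swapped, which is the signal a contrastive alignment objective optimizes. The fusion weights sum to one and favor the modality that scores higher against the query, in contrast to the equal weights of static fusion.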

CLC number: TP391.3

CHEN Yu, MENG Guangting, ZONG Chen, YUAN Weihua, WANG Jiening, WANG Xing. Algorithm for two-sided collaborative filtering multimodal contrastive representation enhancement recommender[J]. Journal of Shandong University (Engineering Science), 2026, 56(3): 106-117.

