Journal of Shandong University (Engineering Science), 2023, Vol. 53, Issue (6): 47-55. doi: 10.6040/j.issn.1672-3961.0.2022.381
• Machine Learning and Data Mining •
郑顺,王绍卿*,刘玉芳,李可可,孙福振
ZHENG Shun, WANG Shaoqing*, LIU Yufang, LI Keke, SUN Fuzhen
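Abstract: The BERT (bidirectional encoder representations from transformers) encoder has three problems in its masking process: masking artificially introduces noise, a masking ratio that is too small can fail to mask any item in a short interaction sequence, and a masking ratio that is too large makes the model hard to train. To address them, a contrastive learning method that changes the masking scheme of the BERT encoder is proposed. The method supplies the model with three types of learning samples so that training imitates the human learning process, which leads to better results. The proposed algorithm was compared with baseline models on three public datasets and outperformed them in most cases; on the MovieLens-1M dataset, HR@5 and NDCG@5 improved by 9.68% and 10.55%, respectively. These results indicate that changing the masking scheme of the BERT encoder, together with the new contrastive learning method, can effectively improve the encoding accuracy of the BERT encoder and thus the accuracy of recommendation.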
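The short-sequence problem named in the abstract can be made concrete with a small calculation. The sketch below assumes that each item is masked independently with probability equal to the masking ratio, which is a common way to implement random masking but is not confirmed by the abstract as the scheme used in the paper. The function names `prob_no_item_masked` and `expected_masked` are hypothetical and chosen only for illustration.

```python
def prob_no_item_masked(seq_len: int, mask_ratio: float) -> float:
    """Chance that a sequence of seq_len items ends up with no masked item.

    Assumption (not taken from the paper): every item is masked
    independently with probability mask_ratio.
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    return (1.0 - mask_ratio) ** seq_len


def expected_masked(seq_len: int, mask_ratio: float) -> float:
    """Expected number of masked items under the same assumption."""
    return seq_len * mask_ratio


if __name__ == "__main__":
    # Illustrative lengths and ratios only; they are not settings from the paper.
    for seq_len in (3, 5, 20, 50):
        for mask_ratio in (0.15, 0.5):
            p_none = prob_no_item_masked(seq_len, mask_ratio)
            n_mask = expected_masked(seq_len, mask_ratio)
            print(
                f"len={seq_len:2d} ratio={mask_ratio:.2f} "
                f"P(no mask)={p_none:.3f} E[#masked]={n_mask:.2f}"
            )
```

Under this assumption, short sequences are left untouched with substantial probability at small ratios, while large ratios hide most of the context the encoder needs, which is the trade-off the abstract describes.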