Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue 2: 43-50. doi: 10.6040/j.issn.1672-3961.2.2015.047

• Machine Learning and Data Mining •

Entropy-based collaborative filtering algorithm

ZHANG Jia1, LIN Yaojin1, LIN Menglei1, LIU Jinghua1, LI Huizong2

  1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China;
  2. School of Economics and Management, Anhui University of Science and Technology, Huainan 232001, Anhui, China
  • Received: 2015-05-18  Online: 2016-04-20  Published: 2015-05-18
  • About the author: ZHANG Jia (1991- ), male, born in Hengyang, Hunan; master's student; main research interests: data mining and information processing. E-mail: zhangjia_gl@163.com
  • Supported by:
    National Natural Science Foundation of China (61303131, 61379021); Natural Science Foundation of Fujian Province (2013J01028); Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (13YJCZH077); Outstanding Young Scientific Research Talent Training Program for Universities of Fujian Province (JA14192)

Abstract: In recommender systems, recommendation quality is restricted by the sparsity of user rating data. To address this problem, an entropy-based collaborative filtering algorithm was proposed. First, user entropy was defined to reflect a user's rating distribution and rating tendency. Then, a large-margin method was used to calculate the margin distance between the active user and other users, and this distance was combined with the active user's entropy to determine the neighbor selection range. Finally, the neighbor set of the active user was obtained by considering both user entropy and the similarity between users, which reduced the influence of data sparsity on the recommendation results. Experimental results on two data sets showed that the proposed algorithm could effectively improve recommendation quality.

Key words: data sparsity, similarity, large margin, neighbor selection, collaborative filtering, entropy
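
The abstract does not give the formula for user entropy, so the following is only an illustrative sketch. It assumes the standard Shannon entropy computed over the relative frequencies of the rating values a user has given; the function name `user_entropy` and the sample rating lists are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def user_entropy(ratings):
    """Shannon entropy (in bits) of the distribution of a user's rating values.

    Assumption: the paper's exact definition is not stated in the abstract;
    this uses the textbook formula H = sum(p * log2(1 / p)).
    """
    if not ratings:
        return 0.0  # no ratings, no information about the user's tendency
    counts = Counter(ratings)
    total = len(ratings)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical users on a 1-5 rating scale.
constant_user = [4, 4, 4, 4]       # always gives the same score
spread_user = [1, 2, 3, 4, 5]      # uses every value equally often

print(user_entropy(constant_user))
print(user_entropy(spread_user))
```

Under this assumption, a user who always gives the same score has entropy zero, while a user who spreads ratings evenly over the scale reaches the maximum, which is one plausible way to quantify the "rating distribution and tendency" that the abstract mentions.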

CLC number: TP391