Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue (2): 66-75. doi: 10.6040/j.issn.1672-3961.0.2019.304

• Machine Learning & Data Mining •

Entity recommendation based on normalized similarity measure of meta graph in heterogeneous information network

Wenkai ZHANG, Ke YU, Xiaofei WU

  1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2019-06-13 Online: 2020-04-20 Published: 2020-04-16
  • Supported by:
    National Natural Science Foundation of China (61601046); National Natural Science Foundation of China (61171098); 111 Project of China (B08004); EU FP7 IRSES Project (612212)

Abstract:

Building on the promising results of meta graphs in heterogeneous information networks (HIN), a normalized similarity measure of meta graph (NSMG) was proposed. NSMG combines the implicit feedback matrix with PathSim (meta path-based similarity) to counter the bias of similarity measures toward large-degree entities. Yelp-HIN and Amazon-HIN, the heterogeneous information networks built from the Yelp and Amazon datasets respectively, were constructed, and different types of meta graphs and their normalized similarity measures were defined. Matrix decomposition and a factorization machine were used to combine the similarities obtained on the different meta graphs. The experimental results showed that, on very sparse datasets, the proposed NSMG-based method outperformed the entity recommendation methods commonly used in HIN.
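As a rough illustration of the normalization idea, the sketch below implements plain PathSim as defined by Sun et al. [7]: for a symmetric meta path with commuting matrix M, the similarity of x and y is 2·M[x][y] / (M[x][x] + M[y][y]). It is not the NSMG measure itself, whose exact formula is not given on this page. The toy business-category matrix and all function names are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def pathsim(commuting):
    """PathSim over a symmetric meta path: 2*M[x][y] / (M[x][x] + M[y][y])."""
    n = len(commuting)
    sim = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            denom = commuting[x][x] + commuting[y][y]
            if denom > 0:
                sim[x][y] = 2.0 * commuting[x][y] / denom
    return sim


# Hypothetical toy data: 3 businesses x 3 categories.
# Business 2 is a large-degree entity that belongs to every category.
business_category = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]

# Commuting matrix of the symmetric path Business-Category-Business.
commuting = matmul(business_category, transpose(business_category))
similarity = pathsim(commuting)
```

In this toy case the raw path counts from business 0 to business 1 and to business 2 are tied, while the normalized score favors business 1, whose degree matches that of business 0.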

Key words: heterogeneous information networks, meta graph, normalized similarity measure, entity recommendation, matrix decomposition, factorization machine

CLC Number: 

  • TP311

Fig.1 Schema of Yelp-HIN network

Fig.2 Schema of Amazon-HIN network

Fig.3 Meta graphs in Yelp-HIN/Amazon-HIN

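Table 1 Meta paths in Yelp-HIN and Amazon-HIN

| HIN | Meta path |
| --- | --- |
| Yelp-HIN | M3: User-Business-Category-Business |
| Yelp-HIN | M4: User-Business-State-Business |
| Yelp-HIN | M5: User-Business-City-Business |
| Yelp-HIN | M6: User-Business-Star-Business |
| Yelp-HIN | M7: User-Business-Category-Business-Category-Business |
| Yelp-HIN | M8: User-Business-State-Business-State-Business |
| Yelp-HIN | M9: User-Review-Topic-Review-User-Business |
| Yelp-HIN | M10: User-Review-Business-Review-User-Business |
| Amazon-HIN | M3: User-Product-Category-Product |
| Amazon-HIN | M4: User-Product-Brand-Product |
| Amazon-HIN | M5: User-Product-Category-Product-Category-Product |
| Amazon-HIN | M6: User-Product-Brand-Product-Brand-Product |
| Amazon-HIN | M7: User-Review-Topic-Review-User-Product |
| Amazon-HIN | M8: User-Review-Product-Review-User-Product |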
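The number of path instances along a meta path such as User-Business-Category-Business can be obtained by multiplying the adjacency matrices of its relations in order. The sketch below shows this standard counting step on hypothetical toy matrices. The variable names and data are assumptions for illustration, and the result is a raw count, not the normalized measure proposed in the paper.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


# Hypothetical implicit feedback: 2 users x 3 businesses (1 = user visited business).
user_business = [
    [1, 0, 0],
    [0, 1, 1],
]

# Hypothetical attribute relation: 3 businesses x 2 categories.
business_category = [
    [1, 0],
    [1, 1],
    [0, 1],
]

# User-Business-Category-Business: chain the relations left to right.
# Entry [u][b] counts the path instances that lead from user u to business b.
user_category = matmul(user_business, business_category)
path_counts = matmul(user_category, transpose(business_category))
```

Here user 1 reaches business 1 through three distinct path instances, two via business 1 itself and one via business 2.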

Fig.4 Symmetrical part and user implicit feedback part of the meta graph

Fig.5 Unified representation of the item-symmetric meta graph M1

Fig.6 Unified representation of the user-symmetric meta graph M2

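Table 2 Statistics of Yelp-50k and Amazon-50k

| Dataset | Users | Businesses | Categories | Star levels | Cities | States | Brands | Check-ins | Check-ins per user | Check-ins per business |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Yelp-50k | 13,663 | 8,164 | 716 | 9 | 154 | 13 | - | 37,659 | 2.76 | 4.61 |
| Amazon-50k | 20,345 | 8,146 | 488 | - | - | - | 640 | 29,350 | 1.44 | 3.60 |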
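The last two columns of Table 2 follow directly from the counts in the same row. The short check below recomputes them. The dictionary keys are illustrative names, and "items" stands for the business (or product) count.

```python
# Counts copied from Table 2; key names are illustrative.
stats = {
    "Yelp-50k": {"users": 13663, "items": 8164, "check_ins": 37659},
    "Amazon-50k": {"users": 20345, "items": 8146, "check_ins": 29350},
}


def per_entity(check_ins, entities):
    """Average number of check-ins per entity, rounded to two decimals."""
    return round(check_ins / entities, 2)


ratios = {
    name: (per_entity(s["check_ins"], s["users"]), per_entity(s["check_ins"], s["items"]))
    for name, s in stats.items()
}
```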

Table 3 Sparsity comparison of datasets

| Dataset | Density/% |
| --- | --- |
| Yelp-50k | 0.037 |
| Amazon-50k | 0.017 |
| IMDb-MovieLens-100k | 6.988 |
| Yelp[19] | 0.086 |
| Douban | 0.630 |

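Table 4 Performance comparison

| Method | Yelp-50k RMSE | Yelp-50k improvement of NSMG/% | Amazon-50k RMSE | Amazon-50k improvement of NSMG/% |
| --- | --- | --- | --- | --- |
| RSVD | 2.5996 | 52.6 | 2.6395 | 55.5 |
| HeteRec | 1.7320 | 28.9 | 1.9360 | 39.3 |
| FMG | 1.2473 | 1.2 | 1.1850 | 1.3 |
| NSMG | 1.2318 | - | 1.1741 | - |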
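The improvement column is not defined on this page. Assuming it is the relative RMSE reduction of NSMG over each baseline, (baseline - NSMG) / baseline × 100, the sketch below recomputes the Yelp-50k column from the RMSE values in Table 4. The function and variable names are illustrative.

```python
def improvement_pct(baseline_rmse, proposed_rmse):
    """Relative RMSE reduction of the proposed method over a baseline, in percent."""
    return (baseline_rmse - proposed_rmse) / baseline_rmse * 100.0


# Yelp-50k RMSE values copied from Table 4.
yelp_baselines = {"RSVD": 2.5996, "HeteRec": 1.7320, "FMG": 1.2473}
yelp_nsmg = 1.2318

yelp_improvement = {
    method: round(improvement_pct(rmse, yelp_nsmg), 1)
    for method, rmse in yelp_baselines.items()
}
```

Under this assumption the recomputed Yelp-50k values agree with the reported 52.6, 28.9 and 1.2.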

Fig.7 Performance comparison on each type of meta graph

Fig.8 Performance with single meta graph on Yelp-HIN and Amazon-HIN

Fig.9 Parameter tuning on Yelp-50k and Amazon-50k

1 ZHAO H, YAO Q M, LI J D, et al. Meta-graph based recommendation fusion over heterogeneous information networks[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Halifax, Canada: ACM, 2017: 635-644.
2 SHI C, LI Y T, ZHANG J W, et al. A survey of heterogeneous information network analysis[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(1): 17-37. doi: 10.1109/TKDE.2016.2598561
3 JEH G, WIDOM J. Scaling personalized web search[C]//Proceedings of the 12th international conference on World Wide Web. Budapest, Hungary: ACM, 2003: 271-279.
4 ZHENG Yuyan, TIAN Ying, SHI Chuan. Method of entity set expansion based on frequent pattern under meta path[J]. Journal of Software, 2018, 29(10): 2915-2930. (in Chinese)
5 HUANG Liwei, LI Deyi, MA Yutao, et al. A meta path-based link prediction model for heterogeneous information networks[J]. Chinese Journal of Computers, 2014, 37(4): 848-858. (in Chinese)
6 SHENG Quanwei, WANG Yibai, GAO Yang. Research on improved algorithm for collaborative prediction of heterogeneous links[J]. Computer Engineering and Applications, 2017, 53(15): 155-163. (in Chinese)
7 SUN Y Z, HAN J W, YAN X F, et al. PathSim: meta path-based top-k similarity search in heterogeneous information networks[J]. Proceedings of the VLDB Endowment, 2011, 4(11): 992-1003.
8 SHI C, KONG X N, HUANG Y, et al. HeteSim: a general framework for relevance measure in heterogeneous networks[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(10): 2479-2492. doi: 10.1109/TKDE.2013.2297920
9 HUANG Z P, ZHENG Y D, CHENG R, et al. Meta structure: computing relevance in large heterogeneous information networks[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, USA: ACM, 2016: 1595-1604.
10 YU X, REN X, SUN Y Z, et al. Recommendation in heterogeneous information networks with implicit user feedback[C]//Proceedings of the 7th ACM conference on Recommender systems. Hong Kong, China: ACM, 2013: 347-350.
11 YU X, REN X, SUN Y Z, et al. Personalized entity recommendation: a heterogeneous information network approach[C]//Proceedings of the 7th ACM international conference on Web search and data mining. New York, USA: ACM, 2014: 283-292.
12 SHI C, LIU J, ZHUANG F Z, et al. Integrating heterogeneous information via flexible regularization framework for recommendation[J]. Knowledge and Information Systems, 2016, 49(3): 835-859. doi: 10.1007/s10115-016-0925-0
13 ZHENG J, LIU J, SHI C, et al. Dual similarity regularization for recommendation[C]// Pacific-Asia Conference on Knowledge Discovery and Data Mining. Auckland, New Zealand: Springer, 2016: 542-554.
14 JAMALI M, LAKSHMANAN L. HeteroMF: recommendation in heterogeneous information networks using context dependent factor models[C]// Proceedings of the 22nd International Conference on World Wide Web. Rio de Janeiro, Brazil: ACM, 2013: 643-654.
15 XIE F, CHEN L, YE Y, et al. A weighted meta-graph based approach for mobile application recommendation on heterogeneous information networks[C]//International Conference on Service-Oriented Computing. Hangzhou, China: Springer, 2018: 404-420.
16 ZHENG J, LIU J, SHI C, et al. Dual similarity regularization for recommendation[C]// Pacific-Asia Conference on Knowledge Discovery and Data Mining. Auckland, New Zealand: Springer, 2016: 542-554.
17 KOREN Y, BELL R, VOLINSKY C. Matrix factorization techniques for recommender systems[J]. Computer, 2009, 42(8): 30-37. doi: 10.1109/MC.2009.263
18 SHI C, ZHOU C, KONG X N, et al. HeteRecom: a semantic-based recommendation system in heterogeneous networks[C]//Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Beijing, China: ACM, 2012: 1552-1555.
19 SHI C, ZHANG Z Q, LUO P, et al. Semantic path based personalized recommendation on weighted heterogeneous information networks[C]//Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. Melbourne, Australia: ACM, 2015: 453-462.
20 BURKE R, VAHEDIAN F, MOBASHER B. Hybrid recommendation in heterogeneous networks[C]//International Conference on User Modeling, Adaptation, and Personalization. Aalborg, Denmark: Springer, 2014: 49-60.
21 WANG Yu, WU Yanjun, WU Jingzheng, et al. Multi-dimensional tag recommender model via heterogeneous networks[J]. Journal of Software, 2017, 28(10): 2611-2624. (in Chinese)
22 WANG Yong, DENG Yongheng, LI Xiaoguang. Asymmetric recommendation algorithm based on user preference[J]. Computer Engineering and Applications, 2018, 54(23): 1-6. doi: 10.3778/j.issn.1002-8331.1809-0322 (in Chinese)
23 DAI Lin, MENG Xiangwu, ZHANG Yujie, et al. A restaurant recommendation model with multiple information fusion[J]. Journal of Software, 2019, 30(9): 2869-2885. (in Chinese)
24 YUAN M, LIN Y. Model selection and estimation in regression with grouped variables[J]. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2006, 68(1): 49-67. doi: 10.1111/j.1467-9868.2005.00532.x
25 LI H, LIN Z. Accelerated proximal gradient methods for nonconvex programming[C]//Advances in Neural Information Processing Systems. Montreal, Canada: NIPS, 2015: 379-387.