Journal of Shandong University (Engineering Science) ›› 2014, Vol. 44 ›› Issue (6): 26-31. doi: 10.6040/j.issn.1672-3961.1.2014.116

LDA-based link prediction in social network

LU Wenyang (卢文羊), XU Jiayi (徐佳一), YANG Yubin (杨育彬)

  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, Jiangsu, China
  • Received: 2014-03-31 Revised: 2014-11-14 Online: 2014-12-20 Published: 2014-03-31
  • Corresponding author: YANG Yubin (1977-), male, from Ganzhou, Jiangxi; professor, Ph.D. (postdoctoral). Research interests: digital media understanding and intelligent processing technologies and their applications; cloud-computing-based massive data mining algorithms and application systems; social network analysis and visualization. E-mail: yangyubin@nju.edu.cn
  • About the author: LU Wenyang (1992-), male, from Suqian, Jiangsu; master's student. Research interests: data mining and social network analysis. E-mail: luwy007@gmail.com
  • Supported by:
    Program for New Century Excellent Talents in University, Ministry of Education (NCET-11-0213); National Natural Science Foundation of China (61273257, 61035003, 61021062); Jiangsu Province Six Talent Peaks Program (2013-XXRJ-018)

Abstract: Traditional link prediction methods for social networks ignore the text content of nodes. To address this problem, a collaborative evolutionary link prediction algorithm based on the Latent Dirichlet Allocation (LDA) topic model is proposed. The algorithm uses the LDA model to analyze the text content of each node and extract a topic distribution vector for it; the dot product of two topic distribution vectors measures the similarity between the corresponding nodes' text. The content similarity matrix is then added to the adjacency matrix, and the similarities between nodes are computed from the combined matrix. Finally, the k most similar nodes are selected as the prediction result. Experimental results show that the proposed algorithm performs well on sparse networks.

Key words: network evolution, social network, link prediction, topic model, Latent Dirichlet Allocation
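
The steps named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the per-node topic distributions have already been produced by an LDA model (the `topic_dist` values below are hypothetical toy data). It also assumes cosine similarity between rows of the combined matrix as the final node similarity, since the abstract does not name the measure, and it assumes that existing links and self-pairs are excluded from the candidates. The function name `predict_links` is likewise hypothetical.

```python
import numpy as np


def predict_links(adjacency, topic_dist, k):
    """For each node, return the indices of the k most similar unlinked nodes."""
    A = np.asarray(adjacency, dtype=float)
    theta = np.asarray(topic_dist, dtype=float)

    # Text similarity: dot product of topic distribution vectors.
    text_sim = theta @ theta.T

    # Combine content similarity with link structure by matrix addition.
    combined = A + text_sim

    # Assumption: node similarity is the cosine similarity between rows
    # of the combined matrix (the abstract does not specify the measure).
    norms = np.linalg.norm(combined, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = combined / norms
    node_sim = unit @ unit.T

    # Assumption: self-pairs and existing links are not valid predictions.
    excluded = np.eye(len(A), dtype=bool) | (A > 0)
    node_sim[excluded] = -np.inf

    # Indices of the k highest-scoring candidates per node, best first.
    return np.argsort(-node_sim, axis=1)[:, :k]


# Toy example: 4 nodes, 2 topics, two existing links (0-1 and 2-3).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
topic_dist = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
    [0.9, 0.1],
    [0.1, 0.9],
])
print(predict_links(adjacency, topic_dist, k=1))
```

Because the text similarity term is dense, every pair of nodes with overlapping topics receives a nonzero score even when the adjacency matrix has very few edges, which is one plausible reading of why the abstract reports good results on sparse networks. If `k` exceeds the number of unlinked candidates for a node, the trailing indices in that row correspond to excluded pairs and should be discarded.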


