
Journal of Shandong University (Engineering Science) ›› 2012, Vol. 42 ›› Issue (2): 18-22.

• Machine Learning and Data Mining •

Two-stage semi-supervised clustering algorithm based on affinity propagation

ZHANG You-xin, WANG Li-hong

  1. School of Computer Science & Technology, Yantai University, Yantai 264005, China
  • Received: 2011-08-21 Online: 2012-04-20 Published: 2011-08-21
  • About the author: ZHANG You-xin (1987- ), male, from Dongying, Shandong; master's student; main research interest: data mining. E-mail: zhangyouxinboy@126.com
  • Funding:

    Supported by the National Natural Science Foundation of China (61170224) and the Natural Science Foundation of Shandong Province (2009ZRB019CE)

Abstract:

The affinity propagation (AP) clustering algorithm is sensitive to the preference value, and the preference value that yields the optimal clustering is difficult to determine. To overcome this limitation, a two-stage semi-supervised clustering algorithm based on affinity propagation (2SAP) was proposed. In the first stage, semi-supervised clustering based on affinity propagation (SAP) was run on the whole data set to obtain the exemplar set; in the second stage, SAP was run again on the exemplar set to obtain the final clusters. Experimental results on real data sets showed that 2SAP achieved higher CRI and FCRI values than SAP and the parallel computation of semi-supervised clustering algorithm based on affinity propagation (PSAP), and that its coefficients of dispersion were lower, which indicates that 2SAP is less sensitive to the preference value.

Key words: affinity propagation, preference value, semi-supervised clustering, prior knowledge, pairwise constraints
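
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain affinity propagation stands in for SAP (the handling of pairwise constraints is omitted because the abstract does not specify it), and the function names, the negative squared Euclidean similarity, the median default preference, the damping factor, and the iteration count are all assumptions made for illustration.

```python
import statistics


def neg_sq_dist(p, q):
    # Assumed similarity: negative squared Euclidean distance.
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def affinity_propagation(points, preference=None, damping=0.5, iters=200):
    """Plain AP (stand-in for SAP). Returns (exemplar indices, labels),
    where labels[i] is the index of the exemplar chosen for point i."""
    n = len(points)
    if n == 1:
        return [0], [0]
    S = [[neg_sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    if preference is None:
        # Assumed default: median of the off-diagonal similarities.
        preference = statistics.median(
            S[i][k] for i in range(n) for k in range(n) if i != k
        )
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            for k in range(n):
                best = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [R[k][k] + A[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:
        # Fallback so the sketch always returns at least one exemplar.
        exemplars = [max(range(n), key=score.__getitem__)]
    labels = [
        i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
        for i in range(n)
    ]
    return exemplars, labels


def two_stage_ap(points, pref1=None, pref2=None):
    """Stage 1 clusters all points; stage 2 clusters the stage-1 exemplars."""
    ex1, lab1 = affinity_propagation(points, pref1)
    ex_points = [points[k] for k in ex1]
    ex2_local, lab2_local = affinity_propagation(ex_points, pref2)
    # Map each stage-1 exemplar to its final exemplar (original indices).
    merge = {ex1[j]: ex1[lab2_local[j]] for j in range(len(ex1))}
    final_exemplars = [ex1[j] for j in ex2_local]
    labels = [merge[label] for label in lab1]
    return final_exemplars, labels
```

Calling `two_stage_ap(points)` on a list of coordinate tuples returns the indices of the final exemplars and, for each point, the index of the final exemplar it is assigned to. The second stage can only merge stage-1 clusters, never split them, which is one plausible reading of why the final result would depend less on the first-stage preference value.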
