
Journal of Shandong University (Engineering Science) ›› 2015, Vol. 45 ›› Issue (3): 1-6. doi: 10.6040/j.issn.1672-3961.3.2014.127

• Machine Learning and Data Mining •

Twice clustering method based on variable granularity

ZHU Hong1,2, DING Shifei2

  1. School of Medical Information, Xuzhou Medical College, Xuzhou 221005, Jiangsu, China;
  2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
  • Received: 2014-10-08 Revised: 2015-05-11 Online: 2015-06-20 Published: 2014-10-08
  • Corresponding author: DING Shifei (1963- ), male, born in Qingdao, Shandong, professor, Ph.D.; main research interests: intelligent information processing and data mining. E-mail: ShifeiDing@cumt.edu.cn
  • About the author: ZHU Hong (1970- ), female, born in Xuzhou, Jiangsu, associate professor, Ph.D.; main research interests: data mining and granular computing. E-mail: zhuhongwin@126.com
  • Supported by:
    National Natural Science Foundation of China (61379101); Natural Science Foundation of Jiangsu Province (BK20130209); Natural Science Foundation of the Jiangsu Higher Education Institutions of China (14KJB520039)

Abstract: To overcome the shortcomings of any single clustering algorithm, granular computing was combined with clustering algorithms, and a twice clustering method based on variable granularity and a clustering network (VGTC) was proposed. The first clustering searches for an appropriate clustering layer in order to discover the local structure of the data; the second clustering builds on that layer to complete the clustering of the whole domain. The novelty of VGTC is that the granularity of clustering is adjusted by changing the parameters of the clustering algorithm, so that the advantages of two clustering algorithms are combined through granular computing. Taking the adaptive variable-granularity twice clustering method based on the k-means and hierarchical clustering algorithms (KHVGTC) as an example of VGTC, the accuracy and efficiency of the VGTC algorithm were verified by theoretical analysis and experiments.

Key words: twice clustering, clustering network, clustering layer, VGTC, granular computing, KHVGTC

CLC Number: 

  • TP181
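
The two-stage idea described in the abstract (a first clustering that forms fine granules, then a second clustering over those granules) can be illustrated with a minimal sketch. This is not the authors' KHVGTC implementation: the function names, the farthest-point seeding, the single-linkage merge rule, and the hand-set parameters `k_fine` and `k_final` are all assumptions made for illustration, and the adaptive choice of granularity that the paper refers to is not modeled here.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50):
    """Plain Lloyd k-means with deterministic farthest-point seeding (assumed)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # An empty granule keeps its previous center.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def agglomerate(centers, k_final):
    """Single-linkage merging of granule centers until k_final groups remain."""
    groups = [[i] for i in range(len(centers))]
    while len(groups) > k_final:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = min(dist(centers[i], centers[j])
                        for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a].extend(groups.pop(b))  # b > a, so index a stays valid
    return {i: c for c, g in enumerate(groups) for i in g}

def twice_cluster(points, k_fine, k_final):
    """Stage 1: k-means granules. Stage 2: hierarchical merge of the granules."""
    fine_labels, centers = kmeans(points, k_fine)
    granule_to_cluster = agglomerate(centers, k_final)
    return [granule_to_cluster[l] for l in fine_labels]

# Two elongated groups on a line: four fine granules are merged into two clusters.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0), (13, 0)]
labels = twice_cluster(points, k_fine=4, k_final=2)
print(labels)
```

In this sketch, `k_fine` plays the role of the granularity parameter: a larger value gives finer granules in the first pass, and the second pass decides how those granules are grouped into the final clusters.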