Journal of Shandong University (Engineering Science), 2015, Vol. 45, Issue 2: 1-9. doi: 10.6040/j.issn.1672-3961.1.2014.095

• Machine Learning and Data Mining •

A clustering ensemble algorithm based on co-evolution

DONG Hongbin, ZHANG Guangjiang, PANG Jinwei, HAN Qilong

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China
  • Received: 2014-03-26 Revised: 2015-03-17 Online: 2015-04-20 Published: 2014-03-26
  • About the author: DONG Hongbin (1963-), male, born in Harbin, Heilongjiang; professor, Ph.D. Research interests: artificial intelligence, co-evolutionary algorithms, data mining, and machine learning. E-mail: donghongbin@hrbeu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (60973075, 61272186, 61472095); Fundamental Research Funds for the Central Universities, Harbin Engineering University (HEUCFl00607)

Abstract: To address the poor generalization of a single clustering algorithm, ensemble learning, which can significantly improve the generalization ability of a learning system, was applied to clustering. A co-evolutionary clustering ensemble algorithm based on particle swarm optimization (PSO) and a genetic algorithm (GA), named CEGPCE, was proposed. PSO gave the algorithm fast convergence, while the global search of GA widened the search range; together they improved both the clustering performance and the convergence speed. Experiments on several UCI data sets verified the effectiveness of CEGPCE.

Key words: particle swarm optimization, genetic algorithm, co-evolutionary clustering ensemble, clustering, clustering ensemble, co-evolution

CLC Number: TP18