Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (4): 24-34. doi: 10.6040/j.issn.1672-3961.0.2020.398

Adaptive semi-supervised neighborhood clustering algorithm

ZHU Hengdong1, MA Yingcang1*, DAI Xuezhen2

  1. School of Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China;
  2. Xi'an Traffic Engineering College, Xi'an 710300, Shaanxi, China
  • Published: 2021-08-18
  • About the authors: ZHU Hengdong (b. 1997), male, from Changde, Hunan; master's student; main research interest: machine learning. E-mail: ZhuHengDong1997@163.com. *Corresponding author: MA Yingcang (b. 1972), male, from Heyang, Shaanxi; professor, master's supervisor, Ph.D.; main research interests: artificial intelligence and machine learning. E-mail: mayingcang@126.com
  • Funding:
    National Natural Science Foundation of China (61976130); Key Research and Development Program of Shaanxi Province (2018KW-021); Natural Science Foundation of Shaanxi Province (2020JQ-923)

Abstract: To make full use of supervision information in guiding the clustering process, an adaptive semi-supervised neighborhood clustering algorithm (SSCAN) is proposed. A supervision matrix is combined with a distance measure to construct a reasonable similarity matrix. The supervision information is then exploited further by combining a label information matrix with a manifold regularization term to adjust the model and improve the clustering result. Experiments on several data sets, with comparisons against other clustering algorithms, show that SSCAN makes full use of the supervision information and improves clustering accuracy.

Key words: semi-supervised learning, manifold regularization term, label information, clustering, distance measure
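
The abstract does not give the formulas, so the following is only a minimal sketch of one plausible way to combine pairwise supervision with a distance measure when building a similarity matrix. It is not the SSCAN formulation itself. The function name `supervised_adaptive_similarity`, the must-link/cannot-link encoding of the supervision, the `far` penalty, and the choice of `k` are all assumptions made for illustration. The weights follow the usual adaptive-neighbor closed form, in which each point assigns similarity only to its `k` nearest neighbors and each row sums to 1.

```python
import numpy as np

def supervised_adaptive_similarity(X, must_link=(), cannot_link=(), k=2):
    """Row-stochastic similarity from supervision-adjusted squared distances.

    Hypothetical helper for illustration; requires at least k + 2 samples.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sum(diff ** 2, axis=2)
    # Assumed supervision rule: must-link pairs become distance 0,
    # cannot-link pairs are pushed to a large finite distance.
    far = 10.0 * D.max() + 1.0
    for i, j in must_link:
        D[i, j] = D[j, i] = 0.0
    for i, j in cannot_link:
        D[i, j] = D[j, i] = far
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor

    S = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors, plus the (k+1)-th one as the cut-off distance.
        idx = np.argsort(D[i])[:k + 1]
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()
        if denom > 1e-12:
            S[i, idx[:k]] = (d[k] - d[:k]) / denom
        else:
            S[i, idx[:k]] = 1.0 / k  # all candidates equally distant
    return S

# Toy data: two groups of three points; samples 0 and 1 are known to share
# a cluster, samples 2 and 3 are known to be in different clusters.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
S = supervised_adaptive_similarity(X, must_link=[(0, 1)], cannot_link=[(2, 3)], k=2)
```

The resulting `S` is not symmetric in general; a graph-based method would typically use `(S + S.T) / 2` before forming a Laplacian.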

CLC number:

  • TP181