
Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (4): 24-34. doi: 10.6040/j.issn.1672-3961.0.2020.398


Adaptive semi-supervised neighborhood clustering algorithm

ZHU Hengdong1, MA Yingcang1*, DAI Xuezhen2

  1. School of Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China;
  2. Xi'an Traffic Engineering College, Xi'an 710300, Shaanxi, China
  • Published: 2021-08-18
  • About the authors: ZHU Hengdong (born 1997), male, from Changde, Hunan; master's student; main research interest: machine learning. E-mail: ZhuHengDong1997@163.com. *Corresponding author: MA Yingcang (born 1972), male, from Heyang, Shaanxi; professor, master's supervisor, Ph.D.; main research interests: artificial intelligence and machine learning. E-mail: mayingcang@126.com
  • Funding:
    National Natural Science Foundation of China (61976130); Key Research and Development Program of Shaanxi Province (2018KW-021); Natural Science Foundation of Shaanxi Province (2020JQ-923)

Abstract: To make full use of supervision information in guiding the clustering process, an adaptive semi-supervised neighborhood clustering algorithm (SSCAN) is proposed. A supervision matrix is combined with a distance measure to construct a reasonable similarity matrix. The supervision information is further exploited by combining a label information matrix with a manifold regularization term to adjust the model and improve the clustering results. Experiments on a variety of data sets, with comparisons against other clustering algorithms, show that SSCAN makes full use of the supervision information and improves clustering accuracy.

Key words: semi-supervised learning, manifold regularization term, label information, clustering, distance measure
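The abstract names two ingredients without giving formulas: a similarity matrix built from distances adjusted by supervision, and a label information matrix coupled with a manifold regularization term. The Python sketch below illustrates one common way such pieces fit together; it is not the SSCAN objective from the paper. All function names, the must-link/cannot-link distance adjustment, the adaptive-neighbor closed form for the weights, and the parameters `k`, `big`, and `mu` are assumptions made for illustration.

```python
import numpy as np

def supervised_distances(X, labels, big=1e6):
    """Squared Euclidean distances adjusted by pairwise supervision.

    Assumption: labels[i] >= 0 is a known class, labels[i] == -1 is unlabeled.
    """
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    known = labels >= 0
    both = known[:, None] & known[None, :]
    same = both & (labels[:, None] == labels[None, :])
    D[same] = 0.0           # labeled pair, same class: pull together
    D[both & ~same] = big   # labeled pair, different classes: push apart
    np.fill_diagonal(D, 0.0)
    return D

def adaptive_neighbor_graph(D, k):
    """Similarity matrix in which each point weights its k nearest neighbors.

    Each row is nonnegative and sums to 1 before symmetrization.
    Requires at least k + 2 points.
    """
    n = D.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                    # never pick the point itself
        idx = np.argsort(d)[:k + 1]      # k neighbors plus the (k+1)-th
        dk = d[idx]
        denom = k * dk[k] - dk[:k].sum()
        if denom > 1e-12:
            S[i, idx[:k]] = (dk[k] - dk[:k]) / denom
        else:                            # all k+1 distances equal
            S[i, idx[:k]] = 1.0 / k
    return (S + S.T) / 2.0

def propagate_labels(S, labels, n_classes, mu=1e3):
    """Minimize tr(F^T L F) + tr((F - Y)^T U (F - Y)) over F.

    L is the graph Laplacian of S (the manifold regularization term),
    Y is the one-hot label information matrix, and U weights labeled rows.
    """
    n = S.shape[0]
    L = np.diag(S.sum(axis=1)) - S
    Y = np.zeros((n, n_classes))
    idx = np.flatnonzero(labels >= 0)
    Y[idx, labels[idx]] = 1.0
    U = np.diag(np.where(labels >= 0, mu, 0.0))
    # Small ridge keeps the system solvable if a component has no labels.
    F = np.linalg.solve(L + U + 1e-9 * np.eye(n), U @ Y)
    return F.argmax(axis=1)

if __name__ == "__main__":
    # Two well-separated rows of points, two labeled points per class.
    X = np.array([[i, 0.0] for i in range(10)] + [[i, 100.0] for i in range(10)])
    labels = -np.ones(20, dtype=int)
    labels[[0, 9]] = 0
    labels[[10, 19]] = 1
    S = adaptive_neighbor_graph(supervised_distances(X, labels), k=3)
    print(propagate_labels(S, labels, n_classes=2))
```

In this sketch the supervision enters twice: once when labeled pairs reshape the distances that define the graph, and once when the labeled rows of `Y` anchor the propagated assignments. A large `mu` keeps labeled points close to their given class, while the Laplacian term smooths assignments along graph edges.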

CLC number: TP181