Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (4): 24-34. doi: 10.6040/j.issn.1672-3961.0.2020.398
ZHU Hengdong1, MA Yingcang1*, DAI Xuezhen2
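Abstract: To make full use of supervised information in guiding the clustering process, an adaptive semi-supervised neighborhood clustering algorithm (SSCAN) is proposed. A supervision matrix is combined with a distance metric to construct a reasonable similarity matrix. To exploit the supervised information further, a label information matrix is combined with a manifold regularization term to adjust the model and improve the clustering result. Experiments on several datasets, with comparisons against other clustering algorithms, show that SSCAN makes full use of the supervised information and improves clustering accuracy.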
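The abstract does not give the formulas, so the following is only a minimal sketch of the first idea (folding supervision into the distances before building a neighborhood similarity matrix), not the SSCAN formulation itself. Everything in it is an assumption made for illustration: the function name `supervised_similarity`, the use of `-1` to mark unlabeled points, the rule that same-label pairs get distance 0 and different-label pairs get a large constant `big`, and the closed-form k-nearest-neighbor weights borrowed from adaptive-neighbor clustering. It also assumes more than `k + 1` samples.

```python
import numpy as np

def supervised_similarity(X, labels, k=3, big=1e6):
    """Hypothetical helper: neighborhood similarity from supervision-adjusted distances.

    X      : (n, d) data matrix.
    labels : length-n integer labels, -1 meaning "unlabeled" (assumed convention).
    k      : number of neighbors kept per point (requires n > k + 1).
    big    : distance assigned to pairs with different known labels (assumed rule).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]

    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Supervision masks: pairs where both labels are known.
    known = labels >= 0
    both = known[:, None] & known[None, :]
    same = both & (labels[:, None] == labels[None, :])
    diff = both & (labels[:, None] != labels[None, :])

    D[same] = 0.0              # same label: pull together
    D[diff] = big              # different labels: push apart
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor

    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[:k + 1]   # k nearest neighbors plus the (k+1)-th
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()
        if denom <= 1e-12:
            S[i, idx[:k]] = 1.0 / k      # degenerate ties: uniform weights
        else:
            S[i, idx[:k]] = (d[k] - d[:k]) / denom  # each row sums to 1

    return (S + S.T) / 2.0               # symmetrize

# Usage: six points on a line, three of them labeled.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
labels = [0, -1, 1, -1, -1, 0]
S = supervised_similarity(X, labels, k=2)
print(np.round(S, 3))
```

In this toy input, points 0 and 5 are far apart but share a label, so they end up connected, while point 2 carries a different label and receives no weight to either of them. The label information matrix and manifold regularization term mentioned in the abstract are not sketched here, since the abstract does not specify them.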