山东大学学报 (工学版) ›› 2022, Vol. 52 ›› Issue (2): 118-127.doi: 10.6040/j.issn.1672-3961.0.2021.310
卢建云1,2,张蔚3,李林2
LU Jianyun1,2, ZHANG Wei3, LI Lin2
摘要: 为提高逆k最近邻在度量局部密度时的区分度,提出动态逆k最近邻概念。利用泊松概率密度函数拟合逆k最近邻分布,并计算累积动态逆k最近邻局部密度;基于动态局部密度对数据对象进行排序,利用逆k最近邻域扩展算法生成聚类结构;依据动态局部密度和欧式距离设计聚类决策图,根据决策图找出聚类结构中的类间间断点,利用间断点将聚类结构直接划分成独立的类簇。将本研究提出的聚类结构划分聚类(cluster structure partition clustering,CSPC)算法与DBSCAN、DPC和RNN-DBSCAN算法在人工和真实数据集上进行试验对比,CSCP在人工和真实数据集上的评价指标F1平均分别提高8.8%和8.2%,评价指标标准互信息平均分别提高11.6%和7.3%。试验结果表明CSPC算法取得了更好的聚类结果。
中图分类号:
| [1] 章永来, 周耀鉴.聚类算法综述[J].计算机应用, 2019, 39(7):1869-1882. ZHANG Yonglai, ZHOU Yaojian. Review ofclustering algorithms[J].Journal of Computer Applications, 2019, 39(7):1869-1882. [2] UTTARKABAT S, SUNKARA N D, PATRA B K. RSOD: efficient technique for outlier detection using reverse nearest neighbors statistics[C] //Proceedings of the 4th International Conference on Computational Intel-ligence and Networks(CINE).Kolkata, India:IEEE, 2020:1-6. [3] KURSUN O. Spectral clustering with reverse soft k-nearest neighbor density estimation[C] // Proceedings of the 2010 International Joint Conference on Neural Networks(IJCNN). Barcelona, Spain: IEEE, 2010: 1-8. [4] BRYANT A, COIL K. RNN-DBSCAN: a density-based clustering algorithm using reverse nearest neighbor density estimates[J].IEEE Transactions on Knowledge and Data Engineering, 2018, 30(6):1109-1121. [5] VADAPALLI S, VALLURI S R, KARLAPALEM K. A simple yet effective data clustering algorithm[C] //Proceedings of the Sixth International Conference on Data Mining(ICDM’06). Hong Kong, China: IEEE, 2006:1108-1112. [6] WU Q, ZHANG Q, SUN R, et al. Adaptive density peak clustering based on dimensional-free and reverse k-nearest neighbors[J]. Information Technology and Control, 2020, 49(3):395-411. [7] LIU Y, LIU D, YU F, et al. A double-density clustering method based on "nearest to first in" strategy[J]. Symmetry-Basel, 2020, 12(5):747-764. [8] LU J, ZHU Q. An Effective algorithm based on density clustering framework[J]. IEEE Access, 2017:4991-5000. [9] MUTHUKRISHNAN S. Influence sets based on reverse nearest neighbor queries[C] //Proceedings of the International Conference on Management of Data.Dallas, United States: ACM, 2000:201-212. [10] ESTER M, KRIEGEL H P, SANDER J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C] // Proceedings of the 2nd International Conference on Knowledge Discovery(KDD).Portland, United States:AAAI press, 1996:226-231. [11] SCITO VS KI R, SABO K. DBSCAN-like clustering method for various data densities[J].Pattern Analysis and Applications, 2020, 23(2):541-554. [12] CHEN Y, ZHOU L, BOUGUILA N, et al. BLOCK-DBSCAN: fast clustering for large scale data[J].Pattern Recognition, 2021, 51(6):3939-3953. [13] ANKERST M, BREUNIG M, KRIEGELH P, et al. OPTICS: ordering points to identify the clustering structure[C] //Proceedings of the International Conference on Management of Data. Nanjing, China: ACM, 1999:49-60. [14] RODRIGUEZ A, LAIO A. Clustering by fast search and find of density peaks[J]. Science, 2014, 344(6191):1492-1496. [15] 刘颖莹,刘培玉,王智昊,等.一种基于密度峰值发现的文本聚类算法[J].山东大学学报(理学版), 2016, 51(1):65-70. LIU Yingying, LIU Peiyu, WANG Zhihao, et al. A text clustering algorithm based on find of density peaks [J].Journal of Shandong University(Natural Science), 2016, 51(1):65-70. [16] DIAO Q, DAI Y, AN Q, et al. Clustering by detecting density peaks and assigning points by similarity-first search based on weighted k-nearest neighbors graph[J]. Complexity, 2020:1-17. [17] ABBAS M, EL-ZOGHABI A, SHOUKRY A. DenMune: density peak based clustering using mutual nearest neighbors[J]. Pattern Recognition, 2020:1-18. [18] 杨天鹏,徐鲲鹏,陈黎飞.非均匀数据的变异系数聚类算法[J].山东大学学报(工学版), 2018,48(3):140-146. YANG Tianpeng, XU Kunpeng, CHEN Lifei. Coef-ficient of variation clustering algorithm for non-uniform data[J]. Journal of Shandong University(Engineering Science), 2018, 48(3):140-146. [19] 卢建云,朱庆生,吴全旺.一种启发式确定聚类数方法[J].小型微型计算机系统, 2018, 39(7):1381-1385. LU Jinayun, ZHU Qingsheng, WU Quanwang. Heuristic method of determining the number of clusters[J]. Journal of Chinese Computer Systems, 2018, 39(7):1381-1385. [20] 纪霞,姚晟,赵鹏.相对邻域与剪枝策略优化的密度峰值聚类算法[J].自动化学报,2020,46(3):562-575. JI Xia, YAO Cheng, ZHAO Peng. Relative neigh-borhood and pruning strategy optimized density peaks clustering algorithm [J].Acta Automatica Sinica, 2020, 46(3):562-575. [21] BAKOUCH H, KACHOUR M, NADARAJAH S. An extended Poisson distribution[J]. Communications in Statistics-Thoeryand Methods, 2013, 45(22):6746-6764. [22] University of Eastern Finland. Clustering datasets: Shape sets[DB/OL].[2022-02]. https://cs.joensuu.fi/sipu/datasets/. [23] LICHMAN M. UCI machine learning repository 2013. [DB/OL]. [2022-02]. http://archive.ics.uci.edu/ml. [24] LIU X, CHENG H M, ZHANG Z Y, Evaluation of community detection methods[J].IEEE Transactions on Knowledge and Data Engineering, 2020, 32(9):1736-1746. |
| [1] | 邓彬, 张宗包, 赵文猛, 罗新航, 吴秋伟. 基于云边协同和图神经网络的电动汽车充电站负荷预测方法[J]. 山东大学学报 (工学版), 2025, 55(5): 62-69. |
| [2] | 李二超, 张智钊. 在线动态订单需求车辆路径规划[J]. 山东大学学报 (工学版), 2024, 54(5): 62-73. |
| [3] | 杨巨成, 魏峰, 林亮, 贾庆祥, 刘建征. 驾驶员疲劳驾驶检测研究综述[J]. 山东大学学报 (工学版), 2024, 54(2): 1-12. |
| [4] | 肖伟, 郑更生, 陈钰佳. 结合自训练模型的命名实体识别方法[J]. 山东大学学报 (工学版), 2024, 54(2): 96-102. |
| [5] | 胡钢, 王乐萌, 卢志宇, 王琴, 徐翔. 基于节点多阶邻居递阶关联贡献度的重要性辨识[J]. 山东大学学报 (工学版), 2024, 54(1): 1-10. |
| [6] | 李家春,李博文,常建波. 一种高效且轻量的RGB单帧人脸反欺诈模型[J]. 山东大学学报 (工学版), 2023, 53(6): 1-7. |
| [7] | 樊禹江,黄欢欢,丁佳雄,廖凯,余滨杉. 基于云模型的老旧小区韧性评价体系[J]. 山东大学学报 (工学版), 2023, 53(5): 1-9, 19. |
| [8] | 李颖,王建坤. 基于监督图正则化和信息融合的轻度认知障碍分类方法[J]. 山东大学学报 (工学版), 2023, 53(4): 65-73. |
| [9] | 吴艳丽,刘淑薇,何东晓,王晓宝,金弟. 刻画多种潜在关系的泊松-伽马主题模型[J]. 山东大学学报 (工学版), 2023, 53(2): 51-60. |
| [10] | 余明骏,刁红军,凌兴宏. 基于轨迹掩膜的在线多目标跟踪方法[J]. 山东大学学报 (工学版), 2023, 53(2): 61-69. |
| [11] | 刘行,杨璐,郝凡昌. 基于多特征融合的手指静脉图像检索方法[J]. 山东大学学报 (工学版), 2023, 53(2): 118-126. |
| [12] | 刘方旭,王建,魏本征. 基于多空间注意力的小儿肺炎辅助诊断算法[J]. 山东大学学报 (工学版), 2023, 53(2): 135-142. |
| [13] | 于艺旋,杨耕,耿华. 连续复合运动的多模态层次化关键帧提取方法[J]. 山东大学学报 (工学版), 2023, 53(2): 42-50. |
| [14] | 黄华娟,程前,韦修喜,于楚楚. 融合Jaya高斯变异的自适应乌鸦搜索算法[J]. 山东大学学报 (工学版), 2023, 53(2): 11-22. |
| [15] | 张豪,李子凌,刘通,张大伟,陶建华. 融合社会学因素的模糊贝叶斯网技术预测模型[J]. 山东大学学报 (工学版), 2023, 53(2): 23-33. |
|