Journal of Shandong University (Engineering Science) ›› 2025, Vol. 55 ›› Issue (2): 88-96. doi: 10.6040/j.issn.1672-3961.0.2024.184

• Machine Learning and Data Mining •

Concept drift detection based on graph structure

ZHOU Yanbing, MA Shilun, WEN Yimin*

  1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
  • Published: 2025-04-15
  • About the authors: ZHOU Yanbing (born 2001), male, from Yiyang, Hunan; master's student; main research interest: machine learning. E-mail: 18074392274@163.com. *Corresponding author: WEN Yimin (born 1969), male, from Taojiang, Hunan; professor, doctoral supervisor, Ph.D.; main research interests: machine learning, data stream classification, media analysis, and data mining. E-mail: ymwen@guet.edu.cn
  • Funding:
    National Natural Science Foundation of China (62366011); Guangxi Key Research and Development Program (Guike AB21220023); Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2306)

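Abstract: Traditional concept drift detection methods rely solely on the error rate, which makes drift detection unreliable. To address this problem, a concept drift detection method based on graph structure is proposed. The method uses a k-associated optimal graph to represent the current data distribution, defines a drift rate for each sample to express the inconsistency between the classifier and the current data distribution, forms a bit stream from the drift rates, and runs a concept drift detector on that bit stream to detect concept drift. Comparison with and analysis against traditional error-rate-based concept drift detection methods show that the accuracy of the base classifier improves by 1% to 5% on synthetic datasets and by 1% to 2% on real-world datasets. The proposed method effectively improves the accuracy of concept drift detection and helps the base classifier adapt better to concept drift.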

Keywords: data mining, data stream, concept drift, graph structure, k-associated optimal graph

CLC number:

  • TP181
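
The last step described in the abstract, running a concept drift detector over a bit stream, can be illustrated with a minimal Python sketch. The abstract does not say how the drift rate is computed from the k-associated optimal graph or how drift rates are turned into bits, so the bit stream below is a synthetic stand-in, not the paper's signal. The detector is a simplified version of the Drift Detection Method (DDM) of Gama et al., chosen here only as a familiar example; the names `DDM` and `detect_drifts` and all parameter values are assumptions made for illustration.

```python
import math


class DDM:
    """Simplified Drift Detection Method over a stream of 0/1 bits.

    Illustrative only: the paper's own detector and bit definition
    are not given in the abstract.
    """

    def __init__(self, min_samples=30, drift_level=3.0):
        self.min_samples = min_samples  # bits to observe before testing
        self.drift_level = drift_level  # threshold in units of s_min
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0            # running mean of the bits seen so far
        self.p_min = math.inf   # p at the lowest observed p + s
        self.s_min = math.inf   # s at the lowest observed p + s

    def update(self, bit):
        """Consume one bit and return True if a drift is signalled."""
        self.n += 1
        self.p += (bit - self.p) / self.n
        s = math.sqrt(max(self.p * (1.0 - self.p), 0.0) / self.n)
        if self.n < self.min_samples:
            return False
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start learning the new concept from scratch
            return True
        return False


def detect_drifts(bits):
    """Return the stream indices at which the detector signals a drift."""
    detector = DDM()
    return [i for i, b in enumerate(bits) if detector.update(b)]


# Synthetic stand-in stream (assumption): one 1 in every 10 bits for the
# first 500 positions, then one 1 in every 2 bits afterwards.
bits = [1 if i % 10 == 0 else 0 for i in range(500)]
bits += [1 if i % 2 == 0 else 0 for i in range(500, 1000)]

drifts = detect_drifts(bits)
print("first drift signalled at index", drifts[0] if drifts else None)
```

In this sketch the detector only ever sees bits, so the same code would run unchanged whether each bit came from a classification error or from a thresholded drift rate; what the abstract proposes to change is the source of the bits, not the detector.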