Journal of Shandong University (Engineering Science), 2025, Vol. 55, Issue (2): 88-96. doi: 10.6040/j.issn.1672-3961.0.2024.184

• Machine Learning & Data Mining •

Concept drift detection based on graph structure

ZHOU Yanbing, MA Shilun, WEN Yimin*   

  1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
  • Published: 2025-04-15

Abstract: Traditional concept drift detection methods rely only on the error rate, which is not reliable enough for drift detection. To address this problem, a concept drift detection method based on graph structure is proposed. The method uses the k-associated optimal graph to represent the current data distribution and defines the drift rate of a sample to represent the inconsistency between the classifier and the current data distribution. The drift rate is used to form a bit stream, and a concept drift detector is run on that bit stream to detect concept drift. Compared with traditional concept drift detection methods that use the error rate, the proposed method improved the accuracy of the base classifier by 1%-5% on artificial datasets and by 1%-2% on real-world datasets. The proposed method can effectively improve the accuracy of concept drift detection and help base classifiers adapt better to concept drift.
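The final step described in the abstract, running a concept drift detector over a bit stream, can be illustrated with a minimal sketch. It assumes a DDM-style rule (track the running rate of 1-bits, remember its lowest level, and signal when the rate rises well above that level). The names `BitStreamDDM` and `detect_drifts`, the thresholds, and the synthetic stream are illustrative assumptions, not the authors' implementation. How each bit is derived from the drift rate and the k-associated optimal graph is defined in the paper and is not reproduced here.

```python
import math


class BitStreamDDM:
    """DDM-style detector over a 0/1 stream (hypothetical helper).

    A bit of 1 is assumed to mean "this sample looked inconsistent".
    """

    def __init__(self, min_samples=30, warn_k=2.0, drift_k=3.0):
        self.min_samples = min_samples  # samples needed before testing
        self.warn_k = warn_k            # warning level, in standard deviations
        self.drift_k = drift_k          # drift level, in standard deviations
        self.reset()

    def reset(self):
        self.n = 0                      # bits seen since the last reset
        self.ones = 0                   # 1-bits seen since the last reset
        self.p_min = float("inf")       # rate at the best point so far
        self.s_min = float("inf")       # its standard deviation

    def update(self, bit):
        """Consume one bit and return 'stable', 'warning', or 'drift'."""
        self.n += 1
        self.ones += bit
        if self.n < self.min_samples:
            return "stable"
        p = self.ones / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        # Remember the point where p + s was lowest.
        if p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Strict comparisons avoid a false alarm when s is exactly 0.
        if p + s > self.p_min + self.drift_k * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warn_k * self.s_min:
            return "warning"
        return "stable"


def detect_drifts(bits):
    """Return the 0-based positions at which a drift is signalled."""
    detector = BitStreamDDM()
    return [i for i, b in enumerate(bits) if detector.update(b) == "drift"]


# Synthetic stream: 10% ones for 200 samples, then only ones.
stream = [1 if i % 10 == 9 else 0 for i in range(200)] + [1] * 200
print(detect_drifts(stream))
```

In this sketch the detector is reset after each drift signal, which mirrors the common practice of retraining or replacing the base classifier once drift is detected.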

Key words: data mining, data stream, concept drift, graph structure, k-associated optimal graph

CLC Number: TP181