Journal of Shandong University(Engineering Science) ›› 2025, Vol. 55 ›› Issue (6): 1-12. doi: 10.6040/j.issn.1672-3961.0.2024.269

• Machine Learning & Data Mining •    

Mining Top-k frequent patterns for graphs based on subjective and objective metrics

HUANG Fang, WANG Xin*, GAO Guohai, SHEN Lingzhen, FU Xun, FANG Yu   

  1. School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, Sichuan, China
  • Published:2025-12-22

Abstract: To address the problem that traditional Top-k pattern mining results often fail to meet users' practical needs, a Top-k frequent pattern mining approach for graph data that integrates subjective and objective evaluation was proposed. Patterns were encoded with a representation technique based on minimum DFS codes. The graph patterns evaluation model (GPEM) was built on a siamese neural network, which learned the partial order relationships between pattern pairs and predicted users' subjective preference for patterns. A pattern interestingness evaluation function combining subjective and objective factors was designed to guide Top-k pattern mining. Experiments on six real graph datasets showed that GPEM outperformed other models on various metrics, reaching an accuracy of up to 93%.
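
The abstract does not give the form of the interestingness evaluation function, so the following is only a minimal sketch under stated assumptions: the objective factor is taken to be support normalized by the largest support, the subjective factor is a preference score in [0, 1] such as a model like GPEM might predict, and the two are combined by a weighted sum. The names `Pattern`, `interestingness`, `top_k`, and `weight` are hypothetical and do not come from the paper.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    code: str          # hypothetical stand-in for a minimum DFS code string
    support: int       # objective factor: how often the pattern occurs
    preference: float  # subjective factor in [0, 1], e.g. a predicted score


def interestingness(p: Pattern, max_support: int, weight: float = 0.5) -> float:
    """Assumed form: weighted sum of predicted preference and normalized support."""
    objective = p.support / max_support if max_support > 0 else 0.0
    return weight * p.preference + (1.0 - weight) * objective


def top_k(patterns, k, weight=0.5):
    """Return the k patterns with the highest combined score, best first."""
    max_support = max((p.support for p in patterns), default=0)
    return heapq.nlargest(
        k, patterns, key=lambda p: interestingness(p, max_support, weight)
    )


patterns = [
    Pattern("A-B", support=100, preference=0.25),
    Pattern("A-B-C", support=50, preference=1.0),
    Pattern("B-C", support=80, preference=0.5),
]
best = top_k(patterns, k=2, weight=0.5)
print([p.code for p in best])
```

With `weight=0.0` the ranking falls back to pure frequency, which makes it easy to see how much the subjective term changes the Top-k result.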

Key words: siamese neural network, frequent pattern mining, interestingness evaluation function

CLC Number: TP311