您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报 (工学版) ›› 2021, Vol. 51 ›› Issue (2): 122-128.doi: 10.6040/j.issn.1672-3961.0.2020.228

• • 上一篇    

基于紧凑效用列表的增量高效用模式挖掘方法

张春砚,韩萌*,孙蕊,杜诗语,申明尧   

  1. 北方民族大学计算机科学与工程学院, 宁夏 银川 750021
  • 发布日期:2021-04-16
  • 作者简介:张春砚(1995— ),女,河北张家口人,硕士研究生,主要研究方向为数据挖掘. E-mail:310300538@qq.com. *通信作者简介:韩萌(1982— ),女,河南商丘人,副教授,博士,硕士生导师,主要研究方向是数据挖掘. E-mail:2003051@nmu.edu.cn
  • 基金资助:
    国家自然科学基金资助项目(2062004);宁夏自然科学基金资助项目(2020AAC03216);北方民族大学研究生创新项目资助项目(YCX20061)

Incremental high utility pattern mining method based on compact utility list

ZHANG Chunyan, HAN Meng*, SUN Rui, DU Shiyu, SHEN Mingyao   

  1. College of Computer Science and Engineering, North Minzu University, Yinchuan 750021, Ningxia, China
  • Published:2021-04-16

摘要: 针对存在大量冗余数据等问题,提出紧凑增量高效用挖掘算法。采用HUI-trie结构和紧凑效用列表两种结构,前者用于更新高效用项集的效用,后者用于存储信息,而无需生成任何候选项。这两种结构使算法无需再次分析整个数据集,就可以将增加的数据反映到以前的分析结果中,更有效地处理增量数据集。试验结果表明,该算法在各种数据集上,运行时间平均提高38%,内存平均减少32%,具有一定的可扩展性。

关键词: 增量数据集, 高效用模式, 紧凑效用列表, 候选项集, 效用

Abstract: Aiming at the problem of large amounts of redundant data, a compact incremental high utility mining algorithm was proposed. The HUI-trie structure and a compact utility list were used. The former was used to update the utility of the high utility itemsets, and the latter was used to store information without generating any candidates. These two structures enabled the algorithm to reflect the increased data into the previous analysis results without reanalyzing the entire data set, and processed incremental data sets more effectively. The test results showed that the algorithm had an average increase of 38% in running time and an average reduction in memory of 32% on various data sets, and it had certain scalability.

Key words: incremental datasets, high utility pattern, compact utility list, candidate itemsets, utility

中图分类号: 

  • TP301.6
[1] KRISTIAN S, RUDOLF S. An approach to cluster separability in a partition[J]. Information Sciences, 2015, 305(1): 208-218.
[2] SÁ J, JARBAS J M, ANDRÉ R, et al. Shape class-ification using line segment statistics[J]. Information Sciences, 2015, 305(1): 349-356.
[3] LIU J, WANG K, FUNG B C M. Direct discovery of high utility itemsets without candidate generation[C] //Processdings of IEEE International Conference on Data Mining. Brussels, Belgium: IEEE, 2012, 1: 984-989.
[4] RYANG H, YUN U. High utility pattern mining over data streams with sliding window technique[J]. Expert Systems with Applications, 2016, 57(9): 214-231.
[5] FOURNIER-VIGER P, LIN J C, GUENICHE T, et al. Efficient incremental high utility itemset mining[C] //Processdings of Ase Bigdata & Socialinformatics. Kaohsiung, China: ACM, 2015, 53: 1-6.
[6] KRISHNAMOORTHY S. HMiner: efficiently mining high utility itemsets[J]. Expert Systems with Applications, 2017, 90(30): 168-183.
[7] FOURNIER-VIGER P, WU C W, ZIDA S, et al. FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning[J]. Lecture Notes in Computer Science, 2014, 8502: 83-92.
[8] LIU Y, LIAO W K, CHOUDHARY A. A two-phase algorithm for fast discovery of high utility itemsets[C] //Processdings of Pacific-asia Conference on Advances in Knowledge Discovery & Data Mining. Heidelberg, Germany: Springer-Verlag, 2005, 3518: 689-695.
[9] LIN C W, HONG T P, LU W H. An effective tree structure for mining high utility itemsets[J]. Expert Systems with Applications, 2011, 38(6): 7419-7424.
[10] HAN J, PEI J. Mining frequent patterns without can-didate generation[C] //Processdings of the ACM SIGMOD International Conference on Management of Data. Texas, USA: Sigmod Record, 2000, 29(2): 1-12.
[11] LIU M, QU J. Mining high utility itemsets without candidate generation[C] //Processdings of ACM Inter-national Conference on Information and Knowledge Management. Hawaii, USA: ACM, 2012: 55-64.
[12] ZIDA S, FOURNIER-VIGER P, LIN J C W, et al. EFIM: a highly efficient algorithm for high-utility itemset mining[C] //Processdings of Mexican Inter-national Conference on Artificial Intelligence. Cuernavaca, Mexico: Springer, 2015, 9413: 530-546.
[13] PENG A Y, KOH Y S, RIDDLE P. mHUIMiner: a fast high utility itemset mining algorithm for sparse datasets[C] //Processdings of PacificAsia Conference on Knowledge Discovery and Data Mining. Jeju, Korea: Springer, 2017: 196-207.
[14] LIN C W, GAN W, FOURNIER-VIGER P, et al. Efficient mining of high-utility itemsets using multiple minimum utility thresholds[J]. Knowledge Based Systems, 2016, 113(1): 100-115.
[15] GAN W, LIN C W, FOURNIER-VIGER P, et al. HUOPM: high-utility occupancy pattern mining[J]. IEEE Transactions on Cybernetics, 2020, 50(3): 1195-1208.
[16] GAN W, LIN C W, FOURNIER-VIGER P, et al. Mining high-utility itemsets with both positive and negative unit profits from uncertain databases[C] //Processdings of Pacific-Asia Conference on Knowledge Discovery and Data Mining. Jeju, Korea: Springer, 2017: 434-446.
[17] LIN C W, LAN G C, HONG T P. An incremental mining algorithm for high utility itemsets[J]. Expert Systems with Applications, 2012, 39(8): 7173-7180.
[18] LIN C W. Incrementally updating the discovered sequential patterns based on pre-large concept[J]. Intelligent Data Analysis, 2015, 19(5): 1071-1089.
[19] AHMED C F, TANBEER S K, JEONG B S, et al. Efficient tree structures for high utility pattern mining in incremental databases[J]. IEEE Transactions on Know-ledge and Data Engineering, 2009, 21(12): 1708-1721.
[20] YUN U, KIM D. Efficient algorithm for mining high average-utility itemsets in incremental transaction databases[J]. Applied Intelligence, 2017, 47(1): 1-18.
[21] YUN U, RYANG H, LEE G, et al. An efficient algorithm for mining high utility patterns from incremental databases with one database scan[J]. Knowledge Based Systems, 2017, 124(5): 188-206.
[22] LIN C W, GAN W, HONG T P, et al. Incrementally updating high-utility itemsets with transaction insertion[C] //Processdings of International Conference on Advanced Data Mining and Applications. Guilin, China: Springer, 2014, 8933: 44-56.
[1] 熊文涛,冯育强. 基于决策人满意度的区间UTA方法[J]. 山东大学学报(工学版), 2016, 46(2): 72-77.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!