山东大学学报 (工学版) ›› 2018, Vol. 48 ›› Issue (6): 82-88.doi: 10.6040/j.issn.1672-3961.0.2018.206
摘要:
针对Apriori算法需要频繁扫描事务数据库并且会产生大量候选项集的不足,提出一种改进的Apriori算法。采用矩阵压缩的思想,增加了3个向量,分别表示事务矩阵中各行各列1的个数,即事务项目数和项目支持数,以及重复的事务出现次数,从而减小矩阵规模,避免多次扫描数据库。在矩阵运算过程中,对矩阵中事务项目数和项目支持数进行排序并删除不满足条件的项集和非频繁项集,形成新的矩阵结构,提高空间效率。对改进后的算法进行性能分析和试验分析发现,该算法相对于Apriori算法具有更高的效率,同时可以更有效的挖掘出频繁项集。
中图分类号:
1 | ZHONG R, WANG H. Research of commonly used association rules mining algorithm in data mining[C]//International Conference on Internet Computing and Information Services. Hong Kong: IEEE Computer Society, 2011: 219-222. |
2 | AGRAWAL R, IMIELIENSKIN T, SWAMI A. Mining association rules between sets items in large databases[C]//Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data. Washington, USA: ACM Press, 1993: 207-216. |
3 | HAN J, PEI J, YIN Y. Mining frequent patterns without candidate generation[C]//Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas, USA: ACM Press, 2000: 1-12. |
4 |
陈兴蜀, 张帅, 童浩, 等. 基于布尔矩阵和MapReduce的FP-Growth算法[J]. 华南理工大学学报(自然科学版), 2014, 42 (1): 135- 141.
doi: 10.3969/j.issn.1000-565X.2014.01.023 |
CHEN Xingshu , ZHANG Shuai , TONG Hao , et al. FP-growth algorithm based on boolean matrix and mapreduce[J]. Journal of South China University of Technology(Natural Science Edition), 2014, 42 (1): 135- 141.
doi: 10.3969/j.issn.1000-565X.2014.01.023 |
|
5 | PARK J S, CHEN M S, YU P S. An effective hash-based algorithm for mining association rules[C]//Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data. San Jose, USA: ACM Press, 1995, 24(2): 175-186. |
6 |
PARK J S , CHEN M S , YU P S . Using a hash-based method with transaction trimming for mining association rules[J]. IEEE Transactions on Knowledge and Data Engineering, 1997, 9 (5): 813- 825.
doi: 10.1109/69.634757 |
7 |
赵学健, 孙知信, 袁源, 等. 一种正交链表存储的改进Apriori算法[J]. 小型微型计算机系统, 2016, 37 (10): 2291- 2295.
doi: 10.3969/j.issn.1000-1220.2016.10.028 |
ZHAO Xuejian , SUN Zhixin , YUAN Yuan , et al. An improved Apriori algorithm based on orthogonal list storage[J]. Journal of Chinese Computer Systems, 2016, 37 (10): 2291- 2295.
doi: 10.3969/j.issn.1000-1220.2016.10.028 |
|
8 | WANG Feng, LI Yonghua. An improved apriori algorithm based on the matrix[C]//Proceedings of 2008 International Seminar on Future BioMedical Information Engineering. Wuhan: IEEE Computer Society, 2008: 152-155. |
9 | LUO D , TAOSHEN L I , COMPUTER S O , et al. Research on improved apriori algorithm based on compressed matrix[J]. Computer Science, 2013, 40 (12): 75- 80. |
10 |
任伟建, 于博文. 基于矩阵约简的Apriori算法改进[J]. 计算机与现代化, 2015, (9): 1- 5.
doi: 10.3969/j.issn.1006-2475.2015.09.001 |
REN Weijian , YU Bowen . Improved apriori algorithm based on matrix reduction[J]. Computer & Modernization, 2015, (9): 1- 5.
doi: 10.3969/j.issn.1006-2475.2015.09.001 |
|
11 | 杨秋翔, 孙涵. 基于权值向量矩阵约简的Apriori算法[J]. 计算机工程与设计, 2018, 39 (3): 690- 693. |
YANG Qiuxiang , SUN Han . Apriori algorithm based on weight vector matrix reduction[J]. Computer Engineering & Design, 2018, 39 (3): 690- 693. | |
12 | 边根庆, 王月. 一种基于矩阵和权重改进的Apriori算法[J]. 微电子学与计算机, 2017, 34 (1): 136- 140. |
BIAN Genqing , WANG Yue . An improved Apriori algorithm based matrix and weight[J]. Microelectronics & Computer, 2017, 34 (1): 136- 140. | |
13 | 张文东, 尹金焕, 贾晓飞, 等. 基于向量的频繁项集挖掘算法研究[J]. 山东大学学报(理学版), 2011, 46 (3): 31- 34. |
ZHANG Wendong , YIN Jinhuan , JIA Xiaofei , et al. Research of a frequent itemsets mining algorithm based on vector[J]. Journal of Shandong University (Natural Science), 2011, 46 (3): 31- 34. | |
14 | 黄剑, 李明奇, 郭文强. 基于Hadoop的Apriori改进算法研究[J]. 计算机科学, 2017, 44 (7): 262- 266. |
HUANG Jian , LI Mingqi , GUO Wenqiang . Research of Apriori improved algorithm based on hadoop[J]. Computer Science, 2017, 44 (7): 262- 266. | |
15 | YU H , WEN J , WANG H , et al. An improved apriori algorithm based on the boolean matrix and hadoop[J]. Procedia Engineering, 2011, 15 (1): 1827- 1831. |
16 |
吕桃霞, 刘培玉. 一种基于矩阵的强关联规则生成算法[J]. 计算机应用研究, 2011, 28 (4): 1301- 1303.
doi: 10.3969/j.issn.1001-3695.2011.04.029 |
LYU Taoxia , LIU Peiyu . Algorithm for generating strong association rules based on matrix[J]. Application Research of Computers, 2011, 28 (4): 1301- 1303.
doi: 10.3969/j.issn.1001-3695.2011.04.029 |
|
17 | KHARE N, ADLAKHA N, PARDASANI K.R. An algorithm for mining multidimensional association rules using boolean matrix[C]//2010 International Conference on Recent Trends in Information, Telecommunication and Computing. Kerala, India: IEEE Computer Society, 2010: 95-99. |
18 | 何月顺.关联规则挖掘技术的研究及应用[D].南京:南京航空航天大学, 2010. |
HE Yueshun. Research and application of association rule mining technology[D]. Nanjing: Nanjing University of Aeronautics and Astronautics, 2010. | |
19 | LIU Huizhen, DAI Shangping, JIANG Hong. Quantitative association rules mining algorithm based on matrix[C]//International Conference on Computational Intelligence and Software Engineering. Wuhan: IEEE, 2009: 1-4. |
[1] | 姚华传, 王丽珍, 吴萍萍, 邹目权. AC_SAR:基于强关联规则的可行动分簇算法[J]. 山东大学学报(工学版), 2014, 44(6): 38-46. |
[2] | 宋威,刘文博,李晋宏. 基于动态裁剪频繁模式树的频繁项集并发挖掘算法[J]. 山东大学学报(工学版), 2011, 41(4): 49-55. |
|