您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报(工学版) ›› 2014, Vol. 44 ›› Issue (6): 38-46.doi: 10.6040/j.issn.1672-3961.2.2013.284

• 机器学习与数据挖掘 • 上一篇    下一篇

AC_SAR:基于强关联规则的可行动分簇算法

姚华传1, 王丽珍2, 吴萍萍2, 邹目权1   

  1. 1. 云南大学信息学院计算机科学与工程系, 云南 昆明 650091;
    2. 云南大学滇池学院计算机科学与工程系, 云南 昆明 650091
  • 收稿日期:2014-05-23 修回日期:2014-07-02 出版日期:2014-12-20 发布日期:2014-05-23
  • 通讯作者: 王丽珍(1962-),女,山东博兴人,教授,博导,博士,主要研究方向为数据库,数据挖掘和计算机算法.E-mail:lizhwang2005@126.com E-mail:lizhwang2005@126.com
  • 作者简介:姚华传(1977-)男,广东吴川人,硕士研究生,主要研究方向为数据挖掘.E-mail:djm13232@126.com
  • 基金资助:
    国家自然科学基金资助项目 (61063008,61272126,61262069);云南省应用基础研究基金资助项目(2010CD025);云南省教育厅基金资助项目(2012C103)

AC_SAR: actionable clustering algorithm based on strong association rule

YAO Huachuan1, WANG Lizhen2, WU Pingping2, ZOU Muquan1   

  1. 1. Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650091, Yunnan, China;
    2. Department of Computer Science and Engineering, Dianchi College, Yunnan University, Kunming 650091, Yunnan, China
  • Received:2014-05-23 Revised:2014-07-02 Online:2014-12-20 Published:2014-05-23

摘要: 提出一种基于强关联规则的可行动分簇算法(AC_SAR)。AC_SAR算法为每一个对象寻找关联性最强的对象,并通过反对称原则和可连接原则删除和合并相应规则,最终挖掘出涉及事务数据库中所有对象的多个连通子图(簇)。与传统算法相比,新算法无需设置阈值,没有冗余知识,算法的中间挖掘结果及最终生成的簇,能有效地解决诸多领域的实际问题。大量试验结果表明,该新算法具有较高的效率、准确性以及较强的可行动性。

关键词: 分簇, 可行动, 强关联原则, 可连接原则, 强关联规则, 反对称原则

Abstract: An actionable clustering algorithm based on strong association rules (AC_SAR) was proposed. The AC_SAR algorithm looked for strong associated objects for each object, and then some relevant rules were deleted and merged by the anti-symmetric principle and the connectivity principle. The connected sub-graphs (clusters) related to all objects in transaction database were discovered finally. Compared with the traditional algorithms, the AC_SAR algorithm did not need to set the thresholds by user, and there were not redundant rules in the results. Moreover, the intermediate mined results and the final generated clusters could solve the problems in many fields effectively. A large number of experiments showed that the AC_SAR algorithm had higher efficiency, higher accuracy, and stronger action.

Key words: clustering, actionable, strong relevance principle, connectivity principle, strong association rules, anti-symmetry principle

中图分类号: 

  • TP311
[1] HAN Jiawei, KAMBER M. 数据挖掘概念与技术 [M]. 2版.范明, 孟小峰, 译. 北京:机械工业出版社, 2007:146-183.
[2] 朱孝宇, 王理东, 汪光阳. 一种改进的Apriori挖掘关联规则算法[J].计算机技术与发展, 2006, 16(12):89-90. ZHU Xiaoyu, WANG Lidong, WANG Guangyang. An improvement of Apriori algorithm for mining association rules[J]. Computer Technology and Development, 2006, 16(12):89-90.
[3] 邵峰晶, 于忠清. 数据挖掘原理与算法[M]. 北京:中国水利水电出版社, 2003:123-135.
[4] 刘以安, 羊斌. 关联规则挖掘中对Apriori算法的一种改进研究[J]. 计算机应用, 2007, 27(2):418-420. LIU Yian, YANG Bin. Research of an improved Apriori algorithm in mining association rules[J]. Computer Applications, 2007, 27(2):418-420.
[5] 袁万莲, 郑诚, 翟明清. 一种改进的Apriori算法[J]. 计算机技术与发展, 2008,18(5):51-53. YUAN Wanlian, ZHENG Cheng, ZHAI Mingqing. An improved Apriori algorithm[J]. Computer Technology and Development, 2008, 18(5):51-53.
[6] HAN Jiawei, PEI Jian, YIN Yiwen. Mining frequent patterns without candidate generation:a frequent-pattern tree approach[C]//Proceedings of International Conference on Data Mining and Knowledge Discovery. New York, USA:ACM, 2004:53-87.
[7] 李志云,周国祥.一种基于MFP树的快速关联规则挖掘算法[J].计算机技术与发展,2007(6):94-100. LI Zhiyun, ZHOU Guoxiang. A fast association rule mining algorithm based on MFP tree[J]. Computer Technology and Development, 2007(6):94-100.
[8] 徐前方,阔建杰,李永春,等.一种具有时序特征的告警关联规则挖掘算法[J].微电子学与计算机,2007(24):23-26. XU Qianfang, KUO Jianjie, LI Yongchum, et al. An algorithm for mining time-series alarm association rules[J]. Microelectronics and Computer, 2007(24):23-26.
[9] 宋余庆,朱玉全,孙志辉,等.基于FP-Tree的最大频繁项目集挖掘及更新算法[J].软件学报,2003, 14(9):1586-1592. SONG Yuqing, ZHU Yuquan, SUN Zhihui, et al. An algorithm and its updating algorithm based on FP-Tree for mining maximum frequent itemsets[J]. Journal of Software, 2003, 14(9):1586-1592.
[10] 冯霞,李娟娟,闫冠男. 关联规则挖掘在航空安全报告分析中的应用[J]. 计算机工程与设计, 2011, 32(1):218-220. FENG Xia, LI Juanjuan, YAN Guannan. Applications of association rules mining in aviation safety reports analysis[J]. Computer Engineering and Design, 2011, 32(1):218-220.
[11] 钱冬云. 基于用户兴趣导向的关联规则数据挖掘[J]. 微计算机信息, 2007(21):207-209. QIAN Dongyun. Algorithms based user transmits of association rules in data mining[J]. Microcomputer Information, 2007(21):207-209.
[12] 戴臻,费洪晓,谢文彪,等. 基于特定模式树的用户行为关联规则挖掘算法[J]. 计算机系统应用, 2007(5):56-59. DAI Zhen, FEI Hongxiao, XIE Wenbiao, et al. The algorithm of users behavior associate rules mining based on specific pattern tree[J]. Computer Systems Applications, 2007(5):56-59.
[13] 宋江春,沈钧毅,宋擒豹. 一个基于关联规则的多层文档聚类算法[J].计算机应用, 2005, 25(7):1571-1572. SONG Jiangchun, SHEN Junyi, SONG Qinbao. Multi-level document clustering algorithm based on association rules[J]. Computer Applications, 2005, 25(7):1571-1572.
[14] 曾利军,李泽军,柳佳刚. 基于矩阵加权关联规则的区间模糊C均值聚类[J]. 计算机工程, 2010, 36(22):52-54. ZENG Lijun, LI Zejun, LIU Jiagang. Inter borough fuzzy C-means clustering based on matrix-weighted association rules[J]. Computer Engineering, 2010, 36(22):52-54.
[15] 苑森淼, 程晓青. 数量关联规则发现中的聚类方法研究[J],计算机学报, 2000, 23(8):866-871. YUAN Senmiao, CHENG Xiaoqing. Clustering method for mining quantitative association rules[J]. Chinese Journal of Computer, 2000, 23(8):866-871.
[16] 周霆, 张伟, 张泽洪. 基于关联规则的映射聚类算法[J]. 微电子学与计算机, 2006, 23(3):26-33. ZHOU Ting, ZHANG Wei, ZHANG Zehong. Association rules-based projected clustering algorithm[J]. Microelectronics and Computer, 2006, 23(3):26-33.
[17] 龙昊, 冯剑, 琳李曲. R-means:以关联规则为簇中心的文本聚类[J].计算机科学, 2005, 32(9):156-159. LONG Hao, FENG Jian, LIN Liqu. R-means:exploiting association rules as means for text clustering[J]. Computer Science, 2005, 32(9):156-159.
[18] Lotfi Admane, Karima Benatchba, Mouloud Koudi, et al. AntPart:an algorithm for the unsupervised classification problem using ants[J].Applied Mathematics and Computation, 2006, 180(1):16-28.
[19] DORIGO M, BONABEAU E, THERAULAZ G. Ant algorithms and stifmergy[J]. Future Generation Computer Systerns, 2000, 16(8):851-871.
[20] LUMER E, FAIETA B. Diversity and adaptation in populations of clustering ants[C]//Proceedings of the Third International Conference on Simulation of Adaptive Behavior:From Animals to Animals. MA:MIT Press, 1994:499-508.
[21] 吴斌,史忠植.一种基于蚁群算法的TSP问题分段求解算法[J].计算机学报, 2001(12):1328-1333. WU Bin, SHI Zhongzhi. An ant colony algorithm based partition algorithm for TSP[J]. Chinese Journal of Computers, 2001(12):1328-1333.
[22] 沙露,鲍培明,李尼格.基于蚁群系统的聚类算法研究[J]. 山东大学学报:工学版, 2010(3):13-18. SHA Lu, BAO Peiming, LI Nige. Clustering algorithm based on ant colony system[J]. Journal of Shandong University: Engineering Science, 2010(3):13-18.
[23] CAO Longbin, ZHANG Chengqi. Knowledge actionability:satisfying technical and business interestingness[J]. International Journal of Business Intelligence and Data Mining, 2007, 2(4):496-514.
[24] CAO Longbin, ZHANG Chengqi. Domain-driven actionable knowledge discovery in the real world[C]//Proceedings of the 10th Pacific-Asi Conf. on Advances in Knowledge Discovery and Data mining (PAKDD 2006). Berlin Heidelberg:Springer, LNAI 3918, 2006:821-830.
[25] CAO Longbing, ZHAO Yanchang, ZHANG Chengqi. Mining impact-targeted activity patterns in imbalanced data[J]. IEEE Transactions on Knowledge and Data Engineering, 2008, 20(8):1053-1066.
[1] 鲁松1,徐文春2,杨云2. 一种分环多跳的无线传感器网络分簇路由加权算法[J]. 山东大学学报(工学版), 2012, 42(4): 24-28.
[2] 蔡忠欣,张华忠 . 一种基于睡眠和网关选择机制的分簇协议[J]. 山东大学学报(工学版), 2008, 38(1): 56-60 .
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!