您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报 (工学版) ›› 2019, Vol. 49 ›› Issue (2): 122-130.doi: 10.6040/j.issn.1672-3961.0.2018.211

• 机器学习与数据挖掘 • 上一篇    

基于多目标协同进化遗传算法的规则提取方法

张中伟(),梅红岩*(),周军,贾慧萍   

  1. 辽宁工业大学电子与信息工程学院, 辽宁 锦州 121000
  • 收稿日期:2018-05-25 出版日期:2019-04-20 发布日期:2019-04-19
  • 通讯作者: 梅红岩 E-mail:zzwzzwcool@126.com;liaoning_mhy@126.com
  • 作者简介:张中伟(1992—),男,山西朔州人,硕士研究生,主要研究方向为数据挖掘与知识发现.E-mail:zzwzzwcool@126.com
  • 基金资助:
    辽宁省科学技术计划项目面上项目(2015020089);辽宁省科学技术计划项目面上项目(201602372)

A rule extraction method based on multi-objective co-evolutionarygenetic algorithm

Zhongwei ZHANG(),Hongyan MEI*(),Jun ZHOU,Huiping JIA   

  1. School of Electronics & Information Engineering, Liaoning University of Technology, Jinzhou 121000, Liaoning, China
  • Received:2018-05-25 Online:2019-04-20 Published:2019-04-19
  • Contact: Hongyan MEI E-mail:zzwzzwcool@126.com;liaoning_mhy@126.com
  • Supported by:
    辽宁省科学技术计划项目面上项目(2015020089);辽宁省科学技术计划项目面上项目(201602372)

摘要:

针对事务数据库中连续型数值属性难以划分且规则提取效率较低的问题,提出一种交叉、变异种群协同进化的量化关联规则提取方法。利用帕累托原理的非支配排序对种群个体进行优化。利用个体相似度的基因型、表现型控制交叉种群中个体的配对,对变异种群采用水平集概念进行分割,并针对个体优劣分别采取单点突变和多点突变两种突变方式增强个体多样性。利用精英种群保存交叉种群与变异种群中的优秀个体并对其求取帕累托最优解集。在不同数据集上的仿真结果表明,该算法获得规则在性能和数量上达到较好的均衡,且能够有效覆盖数据集,验证了算法的有效性和可行性。

关键词: 多目标, 协同进化, 非支配, 相似度, 水平集

Abstract:

Aiming to deal with the problem that continuous numerical attributes in transaction database were difficult to divide and the efficiency of rule extraction was low, a multi-objective co-evolutionary quantification association rule extraction method was proposed with the cooperative of crossover population and mutation population. The non-dominated sorting of the Pareto principle was used to optimize the individuals of population. Genotype and phenotype of individuals similarity were used to control the matching of individuals in the crossover population. The mutation population was segmented by the concept of level set, then, single point mutation and multiple point mutation were adopted according to the quality of individuals to enhance the individuals diversity. The pareto optimal solution set was obtained from the elite population which was used to preserve the excellent individuals in the crossover population and the mutation population. The simulation results on different datasets showed that the algorithm achieved a good balance of performance and quantity, and the data set was effectively covered, which verifies the effectiveness and feasibility of the algorithm.

Key words: multi-objective, co-evolutionary, non-dominated, similarity, level set

中图分类号: 

  • TP311.13

表1

染色体编码"

A1 A2 Ai AN
T1 LB1 UB1 T2 LB2 UB2 Ti LBi UBi TN LBN UBN

图1

交叉种群进化模型"

图2

突变种群进化模型"

图3

算法模型"

表2

遗传算法进化参数"

种群大小 PCx PCn Pm 迭代次数
50 0.99 0.5 0.2 200

表3

数据集描述"

数据集 记录个数 属性个数 目标属性
Basketball 96 5 Points per minute
Bodyfat 252 18 Body height
Quake 2 178 4 Richter

表4

Basketball数据集提取的部分规则"

规则序号 规则前件 规则后件 置信度 可理解度 兴趣度
1 assists_per_minute∈(0.177 4, 0.337 8) height∈(159.011 7, 193.104 2)
points_per_minute∈(0.087 3, 0.773 9)
1 0.792 5 0.302 3
2 assists_per_minute∈(0.148 1, 0.291 6) height∈(154.136 8, 188.847 5)
time_played∈(5.578 0, 43.171 3)
age∈(22.613 6, 36.458 2)
points_per_minute∈(0.214 7, 0.691 7)
0.641 5 0.898 2 0.352 2
3 Height∈(169.988 1, 185.075 1) assists_per_minute∈(0.154 6, 0.359 5)
age∈(22.613 6, 36.458 2)
points_per_minute∈(0.214 7, 0.691 7)
0.931 0 0.861 4 0.354 3

图4

置信度均值随迭代次数变化情况"

图5

可理解度均值随迭代次数变化情况"

图6

兴趣度均值随迭代次数变化情况"

表5

提取规则支持度及规则大小"

数据集 支持度/% 规则大小
MODENAR RPSO GAR MOGAR REM_MOCGA MODENAR RPSO GAR MOGAR REM_MOCGA
Basketball 37.20 36.44 36.69 50.82 46.93 3.21 3.21 3.38 3.24 3.83
Bodyfat 65.22 65.22 65.26 57.22 41.17 6.87 7.06 7.45 6.96 7.90
Quake 39.86 38.74 38.65 30.12 47.69 2.03 2.22 2.33 2.38 3.12

表6

提取规则数目及规则置信度"

数据集 提取规则数量 置信度/%
MODENAR RPSO MOGAR REM_MOCGA MODENAR RPSO MOGAR REM_MOCGA
Basketball 48.0 34.2 50 152.3 61±2.1 60±2.8 83 88±2.9
Bodyfat 52.4 46.4 84 311.4 62±3.2 61±1.8 85 85±2.3
Quake 55.4 46.4 44.87 332.6 63±2.8 63±2.8 82 92±3.4

表7

提取规则覆盖度"

%
数据集 MODENAR RPSO GAR MOGAR REM_MOCGA
Basketball 100.00 100.00 100.00 100.00 100.00
Bodyfat 86.11 86.11 86.00 93.52 97.22
Quake 88.9 87.92 87.5 91.07 99.68
1 SRIKANT R , AGRAWAL R . Mining quantitative association rules in large relational tables[J]. ACM SIGMOD Record, 1996, 25 (2): 1- 12.
doi: 10.1145/235968
2 HAN Jiawei , PEI Jian , YIN Yiwen , et al. Mining frequent patterns without candidate generation: a frequent-pattern tree approach[J]. Data Mining and Knowledge Discovery, 2004, 8 (1): 53- 87.
3 HU Yichung , CHEN Rueyshun , TZENG Gwohshiung . Discovering fuzzy association rules using fuzzy partition methods[J]. Knowledge-Based Systems, 2003, 16 (3): 137- 147.
doi: 10.1016/S0950-7051(02)00079-5
4 CAO Hui, SI Gangquan, ZHANG Yanbin, et al. A density-based quantitative attribute partition algorithm for association rule mining on industrial database[C]//2008 American Control Conference. Seattle, USA: IEEE, 2008: 75-80.
5 KAYAA M , ALHAJJB R . Genetic algorithm based framework for mining fuzzy aAssociation rules[J]. Fuzzy Sets & Systems, 2005, 152 (3): 587- 601.
6 周丽娟, 石倩, 葛学彬, 等. 基于聚类的模糊遗传挖掘算法的研究[J]. 计算机工程与应用, 2010, 46 (13): 118- 121.
doi: 10.3778/j.issn.1002-8331.2010.13.035
ZHOU Lijuan , SHI Qian , GE Xuebin , et al. Cluster-based evaluation in fuzzy-genetic data mining[J]. Computer Engineering & Applications, 2010, 46 (13): 118- 121.
doi: 10.3778/j.issn.1002-8331.2010.13.035
7 HONG Tzungpei , CHEN Chunhao , WU Yulung , et al. A GA-based fuzzy mining approach to achieve a trade-off between number of rules and suitability of membership functions[J]. Soft Computing, 2006, 10 (11): 1091- 1101.
doi: 10.1007/s00500-006-0046-x
8 ALCALÁ-FDEZ J , ALCALÁ R , GACTO M J , et al. Learning the membership function contexts for mining fuzzy association rules by using genetic algorithms[J]. Fuzzy Sets & Systems, 2009, 160 (7): 905- 921.
9 PALACIOS A M , PALACIOS J , LU IS , et al. Genetic learning of the membership functions for mining fuzzy association rules from low quality data[J]. Information Sciences, 2015, 295 (C): 358- 378.
10 JESUS M J D , GÁMEZ J A , GONZÁLEZ P , et al. On the discovery of association rules by means of evolutionary algorithms[J]. Wiley Interdisciplinary Reviews Data Mining & Knowledge Discovery, 2011, 1 (5): 397- 415.
11 ASHISH Ghosh , BHABESH Nath . Multi-objective rule mining using genetic algorithms[J]. Information Sciences, 2004, 163 (1/2/3): 123- 133.
12 ALATAS B , AKIN E . Rough particle swarm optimization and its applications in data mining[J]. Soft Computing, 2008, 12 (12): 1205- 1218.
doi: 10.1007/s00500-008-0284-1
13 ALATAS B , AKIN E , KARCI A . MODENAR: Multi-objective differential evolution algorithm for mining numeric association rules[J]. Applied Soft Computing, 2008, 8 (1): 646- 656.
14 MINAEI-BIDGOLI B , BARMAKI R , NASIRI M . Mining numerical association rules via multi-objective genetic algorithms[J]. Information Sciences, 2013, 233 (2): 15- 24.
15 DEB K , PRATAP A , AGARWAL S , et al. A fast and elitist multiobjective genetic algorithm: NSGA-Ⅱ[J]. IEEE Transactions Evolutionary Computation, 2002, 6 (2): 182- 197.
16 吴琼, 曾庆鹏. 基于多目标烟花算法的关联规则挖掘[J]. 模式识别与人工智能, 2017, 30 (4): 365- 376.
WU Qiong , ZENG Qingpeng . Association rules mining based on multi-objective fireworks optimization algorithm[J]. Pattern Recognition and Artificial Intelligence, 2017, 30 (4): 365- 376.
17 SHANG Ronghua , WANG Yuying , WANG Jia , et al. A multi-population cooperative coevolutionary algorithm for multi-objective capacitated arc routing problem[J]. Information Sciences, 2014, 277 (2): 609- 642.
18 MEHTA S B , CHAUDHURY S , BHATTACHARYYA A , et al. Tissue classification in magnetic resonance images through the hybrid approach of michigan and pittsburg genetic algorithm[J]. Applied Soft Computing, 2011, 11 (4): 3476- 3484.
doi: 10.1016/j.asoc.2011.01.021
19 NASIRI M , TAGHAVI L S , MINAEE B . Numeric multi-objective rule mining using simulated annealing algorithm[J]. Journal of Convergence Information Technology, 2011, 5 (1): 60- 68.
20 MATA J, ALVAREZ J, RIQUELME J. Discovering numeric association rules via evolutionary algorithm[C]// Advances in Knowledge Discovery and Data Mining. Berlin, Germany: Springer, 2002: 40-51.
[1] 邵孟伟,袁世飞,周宏志,王乃华. 基于BP神经网络和遗传算法的翅片管结构优化[J]. 山东大学学报 (工学版), 2025, 55(6): 76-82.
[2] 杜睿山,井远光,孟令东,张豪鹏. 基于改进多目标粒子群算法的储气库注气优化[J]. 山东大学学报 (工学版), 2024, 54(4): 42-50.
[3] 王丽娟,徐晓,丁世飞. 面向密度峰值聚类的高效相似度度量[J]. 山东大学学报 (工学版), 2024, 54(3): 12-21.
[4] 肖伟, 郑更生, 陈钰佳. 结合自训练模型的命名实体识别方法[J]. 山东大学学报 (工学版), 2024, 54(2): 96-102.
[5] 胡钢, 王乐萌, 卢志宇, 王琴, 徐翔. 基于节点多阶邻居递阶关联贡献度的重要性辨识[J]. 山东大学学报 (工学版), 2024, 54(1): 1-10.
[6] 余明骏,刁红军,凌兴宏. 基于轨迹掩膜的在线多目标跟踪方法[J]. 山东大学学报 (工学版), 2023, 53(2): 61-69.
[7] 刘笑,陈家炜,胡峻林. 用于亲属关系鉴别的成对约束组合度量学习[J]. 山东大学学报 (工学版), 2022, 52(2): 50-56.
[8] 赵康,田浩,马欢,杨冬. 基于复杂网络理论的多直流馈入受端电网优化分区方法[J]. 山东大学学报 (工学版), 2022, 52(1): 128-134.
[9] 宗欣露,杜佳圆. 基于多目标驱动人工蜂群算法的疏散仿真模型[J]. 山东大学学报 (工学版), 2021, 51(3): 1-6.
[10] 武慧虹,钱淑渠,刘衍民,徐国峰,郭本华. 精英克隆局部搜索的多目标动态环境经济调度差分进化算法[J]. 山东大学学报 (工学版), 2021, 51(1): 11-23.
[11] 谢晓兰,王琦. 一种基于多目标的容器云任务调度算法[J]. 山东大学学报 (工学版), 2020, 50(4): 14-21.
[12] 胡龙茂,胡学钢. 基于多维相似度和情感词扩充的相同产品特征识别[J]. 山东大学学报 (工学版), 2020, 50(2): 50-59.
[13] 孙润稼,朱海南,刘玉田. 基于偏好多目标优化和遗传算法的输电网架重构[J]. 山东大学学报 (工学版), 2019, 49(5): 17-23.
[14] 杨冬,王世文,王勇,陈博,郑天茹,周宁,肖天,赵雅文. 并网型风电场扩展光伏互补发电容量优化配置[J]. 山东大学学报 (工学版), 2019, 49(5): 44-51.
[15] 黄劲潮. 基于快速区域建议网络的图像多目标分割算法[J]. 山东大学学报(工学版), 2018, 48(4): 20-26.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 王素玉,艾兴,赵军,李作丽,刘增文 . 高速立铣3Cr2Mo模具钢切削力建模及预测[J]. 山东大学学报(工学版), 2006, 36(1): 1 -5 .
[2] 李 侃 . 嵌入式相贯线焊接控制系统开发与实现[J]. 山东大学学报(工学版), 2008, 38(4): 37 -41 .
[3] 孔祥臻,刘延俊,王勇,赵秀华 . 气动比例阀的死区补偿与仿真[J]. 山东大学学报(工学版), 2006, 36(1): 99 -102 .
[4] 陈瑞,李红伟,田靖. 磁极数对径向磁轴承承载力的影响[J]. 山东大学学报(工学版), 2018, 48(2): 81 -85 .
[5] 李可,刘常春,李同磊 . 一种改进的最大互信息医学图像配准算法[J]. 山东大学学报(工学版), 2006, 36(2): 107 -110 .
[6] 季涛,高旭,孙同景,薛永端,徐丙垠 . 铁路10 kV自闭/贯通线路故障行波特征分析[J]. 山东大学学报(工学版), 2006, 36(2): 111 -116 .
[7] 浦剑1 ,张军平1 ,黄华2 . 超分辨率算法研究综述[J]. 山东大学学报(工学版), 2009, 39(1): 27 -32 .
[8] 王丽君,黄奇成,王兆旭 . 敏感性问题中的均方误差与模型比较[J]. 山东大学学报(工学版), 2006, 36(6): 51 -56 .
[9] 孙殿柱,朱昌志,李延瑞 . 散乱点云边界特征快速提取算法[J]. 山东大学学报(工学版), 2009, 39(1): 84 -86 .
[10] 赵然杭,陈守煜 . 水资源数量与质量联合评价理论模型研究[J]. 山东大学学报(工学版), 2006, 36(3): 46 -50 .