Journal of Shandong University (Engineering Science), 2018, Vol. 48, Issue 3: 10-16. doi: 10.6040/j.issn.1672-3961.0.2017.405
YE Mingquan, GAO Lingyun, WAN Chunyuan
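Abstract: Gene expression data are high-dimensional, small-sample, and noisy, which makes tumor classification and diagnosis challenging. To improve classification accuracy, the artificial bee colony (ABC) algorithm is used to optimize the kernel function parameter and the penalty factor of a support vector machine (SVM), with classification accuracy serving as the fitness function of the classification model. The resulting method, ABC-SVM, is a gene expression data classification method based on ABC and SVM. Experiments were carried out on six public tumor gene expression datasets, and the results were compared with those of other classification methods. The results show that, starting from a small set of selected informative genes, ABC-SVM achieves higher tumor classification accuracy and predicts tumor sample types more effectively.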
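The search loop described in the abstract can be sketched roughly as follows. This is a minimal, generic artificial bee colony maximizer, not the authors' implementation: the function names, the colony settings (`n_sources`, `limit`, `n_cycles`), the log2-scaled search ranges for the penalty factor and kernel parameter, and above all `surrogate_accuracy` are assumptions made for illustration. In ABC-SVM the fitness would be the classification accuracy of an SVM trained with the candidate parameters; here a smooth stand-in function takes its place so that the sketch runs with the standard library alone.

```python
import math
import random


def abc_maximize(fitness, bounds, n_sources=10, limit=10, n_cycles=50, seed=0):
    """Maximize fitness(x) over box bounds with a basic artificial bee colony loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    fits = [fitness(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source
    best_f, best_x = -math.inf, None

    def remember_best():
        nonlocal best_f, best_x
        i = max(range(n_sources), key=fits.__getitem__)
        if fits[i] > best_f:
            best_f, best_x = fits[i], sources[i][:]

    def try_improve(i):
        # Move one coordinate of source i relative to a different random source k,
        # then keep the candidate only if it is fitter (greedy selection).
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        cand = sources[i][:]
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        cand[j] = min(max(cand[j], lo), hi)
        f = fitness(cand)
        if f > fits[i]:
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    remember_best()
    for _ in range(n_cycles):
        for i in range(n_sources):  # employed bees: one try per source
            try_improve(i)
        # Onlooker bees: pick sources with probability increasing in fitness.
        floor = min(fits)
        weights = [f - floor + 1e-12 for f in fits]
        for _ in range(n_sources):
            try_improve(rng.choices(range(n_sources), weights=weights)[0])
        remember_best()
        for i in range(n_sources):  # scout bees: abandon exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i] = fitness(sources[i])
                trials[i] = 0
    remember_best()
    return best_x, best_f


def surrogate_accuracy(params):
    # Hypothetical stand-in for SVM accuracy, NOT real data: a smooth bump that
    # peaks at log2(C) = 3, log2(gamma) = -5. Replace with a function that trains
    # an SVM with these parameters and returns its accuracy.
    log2_c, log2_gamma = params
    return math.exp(-((log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2) / 20.0)


# Assumed search box: log2 of the penalty factor C and of the kernel parameter gamma.
BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]

best_params, best_fitness = abc_maximize(surrogate_accuracy, BOUNDS)
print("log2(C), log2(gamma):", best_params, "fitness:", best_fitness)
```

Searching in log2 space is a design choice of this sketch, made because useful values of the penalty factor and kernel parameter typically span several orders of magnitude. To apply it to real data, `surrogate_accuracy` would be swapped for a function that decodes the two values, trains an SVM, and returns an accuracy estimate such as a cross-validated score.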