JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, Vol. 48, Issue (3): 10-16. doi: 10.6040/j.issn.1672-3961.0.2017.405


Gene expression data classification based on artificial bee colony and SVM

YE Mingquan, GAO Lingyun, WAN Chunyuan   

  1. Research Center of Health Big Data Mining and Applications, Wannan Medical College, Wuhu 241002, Anhui, China
  • Received: 2017-05-09 Online: 2018-06-20 Published: 2017-05-09

Abstract: Gene expression data are high-dimensional, small-sample, and noisy, which poses many challenges for tumor diagnosis. To classify tumor gene expression data more accurately, the kernel function parameters and penalty factor of a support vector machine (SVM) were optimized with the artificial bee colony (ABC) algorithm, using classification accuracy as the fitness function. On this basis, a new gene expression data classification method, named ABC-SVM, was proposed. Experiments were conducted on six public tumor gene expression datasets, and the method was compared with other classification methods. The results showed that ABC-SVM achieved higher classification accuracy with fewer informative genes and predicted the class of tumor samples more effectively.
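
The parameter search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it runs a basic ABC loop (employed bees, onlooker bees, scout bees) over two values, assumed here to be log2 of the penalty factor C and log2 of an RBF kernel parameter gamma. The function names `abc_maximize` and `surrogate_accuracy`, the search box, the colony size, and the `limit` threshold are all assumptions made for illustration. So that the sketch runs without a machine-learning library, a smooth stand-in function takes the place of the real fitness, which in the paper is SVM classification accuracy.

```python
import random


def abc_maximize(fitness, bounds, n_sources=10, limit=20, max_cycles=200, seed=0):
    """Maximize a non-negative `fitness` over a box with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    fits = [fitness(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source

    def try_improve(i):
        # Perturb one coordinate of source i relative to a random other source,
        # then keep the candidate only if it is fitter (greedy selection).
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        cand[d] = min(max(cand[d], lo), hi)
        f = fitness(cand)
        if f > fits[i]:
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    def current_best():
        i = max(range(n_sources), key=lambda j: fits[j])
        return sources[i][:], fits[i]

    best, best_fit = current_best()
    for _ in range(max_cycles):
        for i in range(n_sources):  # employed bees: one trial per source
            try_improve(i)
        # Onlooker bees: pick sources with probability proportional to fitness.
        weights = [f + 1e-12 for f in fits]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        cand, cand_fit = current_best()  # memorize the best before scouting
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
        for i in range(n_sources):  # scout bees: abandon exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i] = fitness(sources[i])
                trials[i] = 0
    cand, cand_fit = current_best()
    if cand_fit > best_fit:
        best, best_fit = cand, cand_fit
    return best, best_fit


def surrogate_accuracy(params):
    # Hypothetical stand-in for SVM accuracy: values lie in (0, 1] and peak
    # at log2(C) = 3, log2(gamma) = -5. Not derived from any real dataset.
    log2_c, log2_gamma = params
    return 1.0 / (1.0 + (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2)


# Assumed search box: log2(C) in [-5, 15], log2(gamma) in [-15, 3].
bounds = [(-5.0, 15.0), (-15.0, 3.0)]
best, best_fit = abc_maximize(surrogate_accuracy, bounds)
print("C = 2**%.2f, gamma = 2**%.2f, fitness = %.3f" % (best[0], best[1], best_fit))
```

To apply the sketch to real data, `surrogate_accuracy` would be replaced by a function that trains an SVM with C = 2**log2_c and gamma = 2**log2_gamma and returns its cross-validated accuracy, for example with scikit-learn's `SVC` and `cross_val_score`. Because the onlooker step weights sources by fitness, the fitness must stay non-negative, which accuracy satisfies.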

Key words: artificial bee colony, support vector machine, gene expression data, tumor classification, bioinformatics, intelligent optimization

CLC Number: 

  • TP391