
Journal of Shandong University (Engineering Science) ›› 2017, Vol. 47 ›› Issue (3): 34-42. doi: 10.6040/j.issn.1672-3961.0.2016.308


A feature selection method based on LS-SVM and fuzzy supplementary criterion

LI Sushu, WANG Shitong, LI Tao

  1. School of Digital Media, Jiangnan University, Wuxi 214122, Jiangsu, China
  • Received: 2016-11-07  Online: 2017-06-20  Published: 2016-11-07
  • About the first author: LI Sushu (1993— ), female, from Nantong, Jiangsu, China; master's candidate; her research interests include artificial intelligence and pattern recognition. E-mail: lss85318977@163.com
  • Supported by:
    the National Natural Science Foundation of China (61170122)

Abstract: Traditional feature selection algorithms rely on a single metric, which makes it difficult to balance generalization performance against dimension-reduction performance. To address this shortcoming, a new feature selection algorithm, LS-SVM-FSC (least squares support vector machines and fuzzy supplementary criterion), was proposed. A kernelized least squares support vector machine (LS-SVM) was trained on each single feature to classify the samples, a new fuzzy membership function was used to obtain each sample's fuzzy membership degree in its own class, and a fuzzy supplementary criterion was applied to select a feature subset with minimal redundancy and maximal relevance. Experiments on nine datasets showed that, compared with ten other feature selection methods and seven membership-determination methods, the proposed algorithm achieved high classification accuracy and strong dimension-reduction performance while retaining fast learning speed on high-dimensional datasets.

Key words: feature selection, fuzzy supplementary criterion, least squares support vector machines, classification, fuzzy membership function
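The pipeline described in the abstract (a per-feature LS-SVM classifier, a fuzzy membership for each sample, then greedy selection of features that add the most uncovered membership) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact membership function or supplementary criterion, so a sigmoid of the LS-SVM decision value stands in for the membership, and a max-union coverage gain (in the spirit of the fuzzy complementary criterion used by SVM-FuzCoC) stands in for the criterion; `gamma` and `sigma` are illustrative hyperparameters.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM dual linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    for the bias b and coefficients alpha. y holds labels in {-1, +1}."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y.astype(float)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]          # alpha, b

def lssvm_decision(X_train, alpha, b, X, sigma=1.0):
    """Decision value f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X, X_train, sigma) @ alpha + b

def fuzzy_membership(f, y):
    """Stand-in membership of each sample to its own class:
    a sigmoid of the signed margin y * f (assumption, not the paper's formula)."""
    return 1.0 / (1.0 + np.exp(-y * f))

def select_features(X, y, k=2, gamma=10.0, sigma=1.0):
    """Greedy selection: each step adds the feature whose per-feature
    memberships contribute the most coverage not already supplied by the
    max-union of the memberships of the features chosen so far."""
    n, d = X.shape
    mu = np.empty((d, n))
    for j in range(d):
        Xj = X[:, [j]]
        alpha, b = lssvm_train(Xj, y, gamma, sigma)
        mu[j] = fuzzy_membership(lssvm_decision(Xj, alpha, b, Xj, sigma), y)
    chosen, acc = [], np.zeros(n)
    for _ in range(k):
        gains = [np.maximum(mu[j] - acc, 0.0).sum() if j not in chosen else -1.0
                 for j in range(d)]
        best = int(np.argmax(gains))
        chosen.append(best)
        acc = np.maximum(acc, mu[best])   # fuzzy union of covered membership
    return chosen
```

On a toy two-feature problem where one column separates the classes and the other is noise, the informative column yields the larger membership coverage and is selected first; the greedy max-union update then discounts any feature whose coverage it already subsumes, which is what drives the minimal-redundancy behavior the abstract describes.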

CLC number: TP181