Journal of Shandong University (Engineering Science), 2017, Vol. 47, Issue (5): 195-202. doi: 10.6040/j.issn.1672-3961.0.2017.180
姚宇,冯健*,张化光,韩克镇
YAO Yu, FENG Jian*, ZHANG Huaguang, HAN Kezhen
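Abstract: To address the imbalance between positive and negative samples in training data sets, a weighted hyper-ellipsoidal support vector data description with negative samples (WNESVDD) is proposed. The method first introduces the Mahalanobis distance so that the distribution of the samples is taken into account. It then builds the model from both positive and negative samples and, following the idea of cost-sensitive learning, assigns different weights to samples of different classes. The results show that the proposed method effectively reduces the empty region enclosed by the decision boundary, adjusts the decision boundary more accurately, and markedly improves utilization of the data set. Experiments on University of California at Irvine (UCI) data sets and on semiconductor industrial process data show that the proposed method has strong anomaly detection capability and, compared with similar methods, produces markedly fewer missed alarms and false alarms.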
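The two ingredients named in the abstract, the Mahalanobis distance and class-dependent weights, can be illustrated with a minimal sketch. This is not the authors' WNESVDD: it only shows how a covariance-aware distance differs from a Euclidean one, and one possible cost-sensitive weighting. The function names, the restriction to 2-D data, and the inverse-frequency weighting rule are all assumptions made for illustration.

```python
import math


def mean_and_cov_2d(points):
    """Return the sample mean and 2x2 sample covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance of x from center for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - center[0], x[1] - center[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def class_weights(n_pos, n_neg):
    """Inverse-frequency weights (an assumed rule): the rarer class weighs more."""
    total = n_pos + n_neg
    return total / (2 * n_pos), total / (2 * n_neg)


if __name__ == "__main__":
    # Hypothetical positive-class samples stretched along the diagonal.
    positives = [(0.0, 0.0), (1.0, 1.2), (2.0, 1.8), (3.0, 3.1), (4.0, 3.9)]
    center, cov = mean_and_cov_2d(positives)

    along = (3.0, 3.0)   # lies in the direction the data spreads
    across = (3.0, 1.0)  # same Euclidean distance, but off the data's axis
    for name, pt in (("along", along), ("across", across)):
        euclid = math.dist(pt, center)
        print(name, round(euclid, 3), round(mahalanobis_sq(pt, center, cov), 3))

    print(class_weights(n_pos=95, n_neg=5))
```

Both test points are equally far from the center in Euclidean terms, yet the one lying across the data's main axis has a much larger Mahalanobis distance. This is why an ellipsoidal boundary can enclose less empty space than a spherical one.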