Journal of Shandong University (Engineering Science) ›› 2016, Vol. 46 ›› Issue (4): 21-27. doi: 10.6040/j.issn.1672-3961.1.2016.078

Anomaly detection in network traffic based on online feature selection

MO Xiaoyong, PAN Zhisong*, QIU Junyang, YU Yajun, JIANG Mingchu

  1. College of Command Information System, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
  • Received: 2016-03-01 Online: 2016-08-20 Published: 2016-03-01
  • Corresponding author: PAN Zhisong (born 1973), male, from Nanjing, Jiangsu, professor, Ph.D. (postdoctoral); main research interests: pattern recognition and machine learning. E-mail: panzs@nuaa.edu.cn
  • About the author: MO Xiaoyong (born 1993), male, from Tongren, Guizhou, master's student; main research interests: pattern recognition and machine learning. E-mail: mxyliulangmeng@126.com
  • Supported by:
    National Natural Science Foundation of China (61473149)

Abstract: Traditional batch feature selection methods face time and space limitations when dealing with large-scale backbone network traffic. To address these limitations, a network traffic anomaly detection method based on online feature selection (OFS) is proposed, which integrates the idea of online learning into a linear classification model. During feature selection, the classifier is first updated by online gradient descent and projected onto an L1 ball so that its norm stays bounded; a truncation function is then used to control the number of selected features. The results show that the proposed method makes good use of the time-sequence property of network traffic, reduces detection time while achieving accuracy close to that of batch methods, and meets the real-time requirements of network traffic anomaly detection. The proposed method offers a new approach to network traffic classification and anomaly detection.

Key words: network traffic, anomaly detection, time-sequence, online feature selection, batch learning
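
The per-sample update outlined in the abstract (online gradient step, projection onto an L1 ball, truncation) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the hinge loss, the Euclidean form of the projection, and the names `project_l1_ball`, `truncate`, `ofs_step`, `eta`, `radius`, and `num_features` are all assumptions that do not appear on this page.

```python
import math
from typing import List, Sequence


def project_l1_ball(w: Sequence[float], radius: float) -> List[float]:
    """Euclidean projection of w onto the L1 ball of the given radius (assumed form)."""
    if sum(abs(v) for v in w) <= radius:
        return list(w)
    # Sort magnitudes in descending order and search for the soft threshold theta.
    magnitudes = sorted((abs(v) for v in w), reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(magnitudes, start=1):
        cumulative += value
        candidate = (cumulative - radius) / k
        if value > candidate:
            theta = candidate  # the last k that passes the test defines theta
    # Shrink every coordinate toward zero by theta, keeping its sign.
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in w]


def truncate(w: Sequence[float], num_features: int) -> List[float]:
    """Keep the num_features largest-magnitude weights and set the rest to zero."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:num_features])
    return [v if i in keep else 0.0 for i, v in enumerate(w)]


def ofs_step(w: Sequence[float], x: Sequence[float], y: int,
             eta: float = 0.2, radius: float = 1.0,
             num_features: int = 2) -> List[float]:
    """One online round for a sample (x, y) with label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    if margin < 1.0:
        # Hinge loss is positive (an assumed loss): take a gradient step,
        # then project so that the L1 norm of the classifier stays bounded.
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
        w = project_l1_ball(w, radius)
    return truncate(w, num_features)


# Toy stream of flow records, processed one at a time in arrival order.
stream = [([1.0, 0.5, -0.2], 1), ([-0.8, 0.1, 0.9], -1), ([0.9, 0.4, 0.0], 1)]
weights = [0.0, 0.0, 0.0]
for features, label in stream:
    weights = ofs_step(weights, features, label)
```

Each record is seen once and then discarded, so memory stays proportional to the number of features rather than to the length of the traffic trace, which is the property the abstract contrasts with batch feature selection.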

CLC number:

  • TP181