
Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue 4: 21-27. doi: 10.6040/j.issn.1672-3961.1.2016.078


Anomaly detection in network traffic based on online feature selection

MO Xiaoyong, PAN Zhisong*, QIU Junyang, YU Yajun, JIANG Mingchu

  1. College of Command Information System, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
  • Received: 2016-03-01 Online: 2016-08-20 Published: 2016-03-01
  • Corresponding author: PAN Zhisong (born 1973), male, from Nanjing, Jiangsu; professor, Ph.D. (postdoctoral); main research interests: pattern recognition and machine learning. E-mail: panzs@nuaa.edu.cn
  • About the author: MO Xiaoyong (born 1993), male, from Tongren, Guizhou; master's student; main research interests: pattern recognition and machine learning. E-mail: mxyliulangmeng@126.com
  • Supported by:
    National Natural Science Foundation of China (61473149)

Abstract: Traditional batch feature selection methods face time and space limitations when processing large-scale backbone network traffic. To address this, an anomaly detection method for network traffic based on online feature selection (OFS) is proposed, which integrates the idea of online learning into a linear classification model. During feature selection, the classifier is first updated by online gradient descent and projected onto an L1 ball so that its norm stays bounded, and a truncation function is then used to control the number of selected features. The results show that the proposed method makes good use of the time-sequence property of network traffic, reduces detection time while achieving accuracy close to that of batch methods, and meets the real-time requirement of network traffic anomaly detection. The method offers a new approach to network traffic classification and anomaly detection.

Key words: network traffic, anomaly detection, time-sequence, online feature selection, batch learning
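
The three steps named in the abstract (online gradient descent update, projection onto an L1 ball, truncation to a fixed number of features) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the hinge-style margin test that triggers an update, the sort-based projection routine, and the names and default values of `eta`, `radius`, and `num_features` are all assumptions made for the example.

```python
import math


def project_l1_ball(w, radius=1.0):
    """Euclidean projection of w onto {v : ||v||_1 <= radius} (sort-based method)."""
    if sum(abs(v) for v in w) <= radius:
        return list(w)
    magnitudes = sorted((abs(v) for v in w), reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, m in enumerate(magnitudes, start=1):
        cumsum += m
        candidate = (cumsum - radius) / j
        if m - candidate > 0:
            theta = candidate  # the last valid j gives the shrinkage threshold
    # Soft-threshold every coordinate by theta, keeping its sign.
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in w]


def truncate(w, num_features):
    """Keep the num_features largest-magnitude weights and zero the rest."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:num_features])
    return [v if i in keep else 0.0 for i, v in enumerate(w)]


def ofs_step(w, x, y, eta=0.2, radius=1.0, num_features=2):
    """One online round for a linear classifier with label y in {-1, +1}.

    Assumption: an update is made only when the margin y * <w, x> is at most 1.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    if margin > 1.0:
        return list(w)
    w = [wi + eta * y * xi for wi, xi in zip(w, x)]  # gradient step
    w = project_l1_ball(w, radius)                   # bound the L1 norm
    return truncate(w, num_features)                 # cap the feature count


def run_stream(stream, dim, **kwargs):
    """Process (x, y) pairs one at a time, never storing the full data set."""
    w = [0.0] * dim
    for x, y in stream:
        w = ofs_step(w, x, y, **kwargs)
    return w


if __name__ == "__main__":
    # Hypothetical toy stream: the label depends only on features 0 and 2.
    toy = [([1.0, 0.1, 1.0, 0.0], 1), ([-1.0, 0.2, -1.0, 0.1], -1)] * 20
    print(run_stream(toy, dim=4))
```

Each flow record is seen once and then discarded, so memory use depends on the feature dimension rather than the length of the stream, which is the property the abstract contrasts with batch feature selection.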

CLC number:

  • TP181