An intrusion detection model based on improved ReliefF algorithm

doi:10.6040/j.issn.1672-3961.0.2022.136

Abstract

Existing intrusion detection algorithms extract features insufficiently, ignore the influence of feature weights, and classify imprecisely. To address these problems, an intrusion detection model based on an improved ReliefF algorithm is proposed. First, the ReliefF algorithm is improved by optimizing how feature weights are calculated for intrusion data. Second, the Pearson correlation coefficients between features are calculated to build a feature correlation table, and only one feature from each set of highly correlated features is retained, which optimizes the feature set a second time. Finally, five classifiers, decision tree (DT), k-nearest neighbor (KNN), random forest (RF), naive Bayes (NB), and support vector machine (SVM), are applied to the optimal feature subset to evaluate the classification performance and accuracy of the method. Experimental results on the NSL-KDD and UNSW-NB15 datasets show that the method not only achieves good detection performance but also effectively reduces the feature dimension, which has a positive effect on the computational complexity of the classifiers.

Key words: ReliefF algorithm, weight optimization, feature selection, intrusion detection, classification
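The two-stage feature selection summarized in the abstract can be illustrated with a rough sketch. The improved weight formula is not given on this page, so the code below substitutes the basic binary Relief update (one nearest hit and one nearest miss per sampled instance) and should not be read as the authors' method. The function names, the 0.9 correlation threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic binary Relief (not the improved ReliefF of the paper).

    Assumes every class has at least two samples.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(Xs), size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to every sample
        dist[i] = np.inf  # never pick the sampled instance itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # Reward features that differ on the miss, penalize those that differ on the hit.
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter


def prune_correlated(X, weights, threshold=0.9):
    """Visit features from highest to lowest weight and keep a feature only if
    its |Pearson r| with every already-kept feature is below the threshold."""
    corr = np.abs(np.corrcoef(np.asarray(X, dtype=float), rowvar=False))
    kept = []
    for j in np.argsort(weights)[::-1]:
        if all(corr[j, k] < threshold for k in kept):
            kept.append(int(j))
    return kept


# Synthetic demo: one informative feature, a near-copy of it, and pure noise.
rng = np.random.default_rng(42)
n = 200
y = rng.integers(0, 2, size=n)
informative = y + 0.1 * rng.normal(size=n)
duplicate = 2.0 * informative + 0.01 * rng.normal(size=n)
noise = rng.normal(size=n)
X = np.column_stack([informative, duplicate, noise])

weights = relief_weights(X, y, n_iter=100)
selected = prune_correlated(X, weights, threshold=0.9)
print("weights:", np.round(weights, 3))
print("kept feature indices:", selected)
```

In this sketch the near-duplicate column is dropped because it is almost perfectly correlated with the informative one. A weight cutoff for discarding low-weight features before pruning is left out because the page does not state one.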

CLC number:

TP3-0

Caihui LIU, Qi ZHOU, Xiaowen YE. An intrusion detection model based on improved ReliefF algorithm[J]. Journal of Shandong University (Engineering Science), 2023, 53(2): 1-10.

Category	Number of samples	Percentage/%
Normal	77 054	51.88
DoS	53 385	35.95
Probe	14 077	9.48
U2R	252	0.17
R2L	3 749	2.52
Total	148 517	100.00

Attack type	Training set Tr (number of samples)	Test set Ts (number of samples)
Normal	67 343	9 711
DoS	45 927	7 458
Probe	11 656	2 421
U2R	52	200
R2L	995	2 754
Total	125 973	22 544

Feature name	Feature weight (10% of dataset)	Feature weight (20% of dataset)
dst_host_serror_rate	0.496 5	0.501 0
logged_in	0.457 2	0.438 8
serror_rate	0.428 9	0.424 3
srv_serror_rate	0.373 1	0.382 7
same_srv_rate	0.273 1	0.271 0
dst_host_srv_serror_rate	0.210 4	0.213 6
dst_host_same_srv_rate	0.182 4	0.177 3
dst_host_srv_count	0.145 4	0.139 4
protocol_type	0.131 1	0.135 3
dst_host_count	0.124 0	0.131 2
flag	0.089 6	0.089 4
service	0.085 7	0.086 8
dst_host_same_src_port_rate	0.072 6	0.087 4
count	0.045 8	0.041 7
dst_host_rerror_rate	0.034 8	0.032 5
srv_diff_host_rate	0.025 8	0.032 5
dst_host_diff_srv_rate	0.022 0	0.024 3
dst_host_srv_rerror_rate	0.020 6	0.017 0
rerror_rate	0.016 5	0.018 2
srv_count	0.012 6	0.011 5
is_guest_login	0.010 6	0.011 7
diff_srv_rate	0.010 0	0.010 1
srv_rerror_rate	0.010 0	0.012 0
wrong_fragment	0.007 1	0.006 7
dst_host_srv_diff_host_rate	0.002 5	0.004 2
duration	0.001 9	0.003 0
root_shell	0.001 6	0.000 8
su_attempted	0.000 8	0.000 0
hot	0.000 7	0.000 8
num_shells	0.000 4	0.000 0
num_failed_logins	0.000 0	0.000 0
num_access_files	0.000 0	0.000 0
src_bytes	0.000 0	0.000 0
dst_bytes	0.000 0	0.000 0
land	0.000 0	0.000 0
urgent	0.000 0	0.000 0
num_compromised	0.000 0	0.000 0
num_root	0.000 0	0.000 0
num_file_creations	0.000 0	0.000 0
is_host_login	0.000 0	0.000 0

Sample class	Predicted as attack	Predicted as normal
Attack sample	T_P	F_N
Normal sample	F_P	T_N
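The evaluation criteria reported for the classifiers can be derived from these four counts. The page does not print the formulas, so the sketch below assumes the standard definitions: accuracy A_CC, precision R_P, detection rate R_D (recall), F₁ as the harmonic mean of R_P and R_D, and false positive rate R_FP. The function name and the example counts are hypothetical.

```python
def detection_metrics(tp, fn, fp, tn):
    """Compute the evaluation criteria from confusion-matrix counts.

    Standard definitions are assumed; no guard against empty denominators.
    """
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)       # R_P: share of predicted attacks that are real attacks
    detection_rate = tp / (tp + fn)  # R_D: share of real attacks that are detected
    return {
        "A_CC": (tp + tn) / total,
        "R_P": precision,
        "R_D": detection_rate,
        "F_1": 2 * precision * detection_rate / (precision + detection_rate),
        "R_FP": fp / (fp + tn),      # share of normal samples flagged as attacks
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = detection_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name} = {100 * value:.2f}%")
```

These definitions agree with the reported figures where they can be checked: for DT on NSL-KDD, an R_P of 98.45% and an R_D of 97.94% give a harmonic mean of about 98.19%, which matches the listed F₁.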

Dataset	Classifier	A_CC/%	R_P/%	F₁/%	R_D/%	R_FP/%	Running time/s
NSL-KDD	DT	98.15	98.45	98.19	97.94	1.62	0.246 6
	MR-DT	98.01	98.32	98.05	97.79	1.75	0.172 4
	KNN	98.79	98.92	98.81	98.71	1.12	40.345 2
	MR-KNN	97.89	97.89	97.94	97.70	1.90	3.193 4
	RF	99.44	99.40	99.45	99.50	0.62	5.879 6
	MR-RF	96.06	94.28	96.24	98.28	1.49	3.489 3
	NB	87.02	84.19	87.89	91.93	18.14	0.210 5
	MR-NB	85.13	82.39	86.16	90.29	20.28	0.052 1
	SVM	96.70	94.32	96.86	99.55	6.29	2 591.549 7
	MR-SVM	95.64	93.58	95.84	98.22	6.63	70.291 3
UNSW-NB15	DT	95.03	96.32	96.35	96.38	7.80	2.519 6
	MR-DT	94.96	95.83	96.33	96.84	9.09	1.033 7
	KNN	93.82	94.47	95.50	96.56	11.99	126.400 7
	MR-KNN	94.22	95.45	95.77	96.09	9.75	59.895 2
	RF	95.91	96.19	97.02	97.86	0.62	24.542 2
	MR-RF	93.71	91.54	95.58	99.99	19.68	10.384 6
	NB	87.40	88.74	90.96	93.29	25.12	0.221 6
	MR-NB	85.68	84.06	90.30	97.55	39.95	0.118 2
	SVM	93.69	91.83	95.54	99.58	18.81	683.028 1
	MR-SVM	93.94	92.12	95.74	99.65	18.39	434.906 3
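The trade-off between running time and accuracy in the table can be made explicit with a little arithmetic. The sketch below copies the NSL-KDD running times and A_CC values from the rows above and assumes that the MR- prefix marks the classifier trained on the reduced feature subset, which the page does not state. The variable and function names are hypothetical.

```python
# (baseline time/s, MR time/s, baseline A_CC/%, MR A_CC/%) from the NSL-KDD rows.
nsl_kdd = {
    "DT": (0.2466, 0.1724, 98.15, 98.01),
    "KNN": (40.3452, 3.1934, 98.79, 97.89),
    "RF": (5.8796, 3.4893, 99.44, 96.06),
    "NB": (0.2105, 0.0521, 87.02, 85.13),
    "SVM": (2591.5497, 70.2913, 96.70, 95.64),
}


def compare(rows):
    """Return the time ratio and accuracy change (percentage points) per classifier."""
    out = {}
    for name, (t_base, t_mr, acc_base, acc_mr) in rows.items():
        out[name] = {
            "speedup": t_base / t_mr,        # > 1 means the MR variant ran faster
            "acc_change": acc_mr - acc_base,  # negative means accuracy dropped
        }
    return out


summary = compare(nsl_kdd)
for name, s in summary.items():
    print(f"{name}: {s['speedup']:.1f}x faster, A_CC change {s['acc_change']:+.2f} points")
```

On these rows every MR variant runs faster than its baseline, with the largest ratio for SVM, while A_CC falls by between 0.14 and 3.38 percentage points.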