[1] Zheng Jinbo, Wang Huiling. A Naive Bayes Network Intrusion Detection Algorithm with Mixed Feature Selection[J]. Journal of Nanjing Normal University (Natural Science Edition), 2025, 48(03): 73-83. [doi:10.3969/j.issn.1001-4616.2025.03.009]

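A Naive Bayes Network Intrusion Detection Algorithm with Mixed Feature Selection

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]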

Volume:
48
Issue:
No. 03, 2025
Pages:
73-83
Section:
Computer Science and Technology
Publication date:
2025-06-20

Article Info

Title:
A Naive Bayes Network Intrusion Detection Algorithm with Mixed Feature Selection
Article ID:
1001-4616(2025)03-0073-11
Author(s):
Zheng Jinbo (郑锦波), Wang Huiling (王慧玲)
(School of Network Security and Information Technology, Yili Normal University, Yining 835000, Xinjiang, China)
Keywords:
network intrusion detection; conditional independence; feature selection; conditional mutual information; Pearson correlation coefficient
CLC number:
TP309
DOI:
10.3969/j.issn.1001-4616.2025.03.009
Document code:
A
Abstract:
Machine learning algorithms play a crucial role in intrusion detection, and feature selection, as a key data preprocessing step, can effectively improve classifier performance. Existing feature selection algorithms, however, do not account for the pseudo-correlations that arise between features when the data distribution is imbalanced, which harms the generalization ability of classifiers. To address this issue, a naive Bayes network intrusion detection algorithm with hybrid feature selection is proposed. It introduces correlation measurement criteria into the feature extraction stage to avoid pseudo-correlations between features and to better satisfy the strong independence assumption of the naive Bayes algorithm, thereby improving the detection performance of the model. The method adopts a two-step feature selection strategy. In the first step, features that are strongly correlated with the class variable are selected from the dataset. In the second step, redundant features are removed so that the remaining features form a mutually conditionally independent subset, which is then fed into the naive Bayes algorithm for detection. Experimental results show that the proposed method outperforms six traditional machine learning algorithms in detection rate and generalization performance, and that it partially overcomes the low accuracy caused by imbalanced data distribution. Compared with two recently proposed deep learning algorithms, it achieves better accuracy and precision.
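The two-step strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not say which measure is used in which step, so the code below assumes (based only on the keywords) that step one ranks features by absolute Pearson correlation with the class label and that step two rejects a candidate whose conditional mutual information with an already selected feature, given the class, exceeds a threshold. The function names, the thresholds `rel_min` and `cmi_max`, and the toy columns are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)

def cond_mutual_info(a, b, c):
    """Empirical I(A; B | C) in nats for discrete sequences."""
    n = len(c)
    n_abc, n_c = Counter(zip(a, b, c)), Counter(c)
    n_ac, n_bc = Counter(zip(a, c)), Counter(zip(b, c))
    total = 0.0
    for (x, y, z), k in n_abc.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts
        total += (k / n) * math.log(n_c[z] * k / (n_ac[(x, z)] * n_bc[(y, z)]))
    return total

def select_features(columns, labels, rel_min=0.3, cmi_max=0.05):
    """Two-step selection; both thresholds are illustrative assumptions."""
    # Step 1 (assumed): keep features strongly correlated with the class label.
    relevance = {f: abs(pearson(col, labels)) for f, col in columns.items()}
    candidates = sorted((f for f in columns if relevance[f] >= rel_min),
                        key=lambda f: -relevance[f])
    # Step 2 (assumed): greedily keep a candidate only if it is close to
    # conditionally independent, given the class, of every feature kept so far.
    selected = []
    for f in candidates:
        if all(cond_mutual_info(columns[f], columns[s], labels) <= cmi_max
               for s in selected):
            selected.append(f)
    return selected

def train_naive_bayes(columns, labels, selected, alpha=1.0):
    """Categorical naive Bayes with Laplace smoothing over the selected features."""
    n, class_count = len(labels), Counter(labels)
    n_values = {f: len(set(columns[f])) for f in selected}
    counts = Counter()
    for f in selected:
        for v, y in zip(columns[f], labels):
            counts[(f, v, y)] += 1

    def predict(row):
        def log_posterior(y):
            score = math.log(class_count[y] / n)
            for f in selected:
                score += math.log((counts[(f, row[f], y)] + alpha)
                                  / (class_count[y] + alpha * n_values[f]))
            return score
        return max(sorted(class_count), key=log_posterior)

    return predict

# Toy data: 0 = normal traffic, 1 = attack (hypothetical binary features).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "flag_a":      [0, 0, 0, 1, 1, 1, 1, 1],  # relevant
    "flag_a_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # relevant but redundant
    "noise":       [0, 1, 0, 1, 0, 1, 0, 1],  # uncorrelated with the label
    "flag_b":      [0, 0, 0, 0, 1, 1, 1, 0],  # relevant, independent of flag_a given class
}
selected = select_features(columns, labels)
predict = train_naive_bayes(columns, labels, selected)
print(selected, predict({"flag_a": 1, "flag_b": 1}))
```

On this toy data, step one discards `noise`, and step two discards `flag_a_copy` because it shares all of its class-conditional information with `flag_a`. A greedy pass is only one way to build a mutually conditionally independent subset; the paper may use a different search or different measures.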

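Memo

Received: 2024-12-15.
Funding: Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C337); Key Discipline Open Project of Yili Normal College (XJZDXKphy202302).
Corresponding author: Wang Huiling, senior experimentalist; research interests: computer vision and image content understanding. E-mail: 37993034@qq.com
Last Update: 2025-06-20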