[1] Gu Haiyan, Zheng Qiwen. Research on Improved Algorithm of FT-Tree for Log Template Extraction[J]. Journal of Nanjing Normal University (Natural Science Edition), 2021, 44(02): 121-126. [doi:10.3969/j.issn.1001-4616.2021.02.017]

Research on Improved Algorithm of FT-Tree for Log Template Extraction

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
44
Issue:
2021, No. 02
Pages:
121-126
Column:
Computer Science and Technology
Publication date:
2021-06-30

Article Info

Title:
Research on Improved Algorithm of FT-Tree for Log Template Extraction
Article ID:
1001-4616(2021)02-0121-06
Author(s):
Gu Haiyan, Zheng Qiwen
Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210031, China
Keywords:
log template; extraction method; FT-Tree algorithm; Apriori algorithm; improved algorithm
CLC number:
TP183
DOI:
10.3969/j.issn.1001-4616.2021.02.017
Document code:
A
Abstract:
Computer log files record event information authentically and comprehensively, and they are widely used as electronic evidence in the investigation and handling of cases. Fast, automatic analysis of massive log files requires an efficient and reliable log template extraction method. To address the shortcomings of current log template extraction methods, we take network server logs as the research object and propose an improved FT-Tree algorithm. In the proposed algorithm, the pruning threshold is calculated with the Apriori algorithm, and this threshold is then used to control pruning. A template extraction experiment is carried out on real logs from a university network server. The results show that the improved algorithm significantly improves the accuracy of log template extraction compared with the FT-Tree algorithm and better meets the needs of practical applications.
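
The abstract describes the approach only at a high level. As a rough illustration of the basic FT-Tree idea it builds on (not the authors' implementation), the Python sketch below orders the words of each log line by global frequency, inserts them into a prefix tree, and prunes any node with more than `k` children, treating those children as variable fields. The function names `build_ft_tree`, `prune`, and `templates`, the sample log lines, and the choice to pass the threshold `k` in directly are all assumptions made for illustration; the paper derives the threshold with the Apriori algorithm, and that step is not reproduced here.

```python
from collections import Counter


def build_ft_tree(logs, k):
    """Build a simplified FT-tree (nested dicts), then prune with threshold k."""
    # Count each word once per log line to get its global frequency.
    freq = Counter(w for line in logs for w in set(line.split()))
    root = {}
    for line in logs:
        # Frequent words come first; ties are broken alphabetically.
        words = sorted(set(line.split()), key=lambda w: (-freq[w], w))
        node = root
        for w in words:
            node = node.setdefault(w, {})
    prune(root, k)
    return root


def prune(node, k):
    """Cut the subtree below any child that has more than k children."""
    for child in node.values():
        if len(child) > k:
            child.clear()  # many distinct continuations: assumed to be variables
        else:
            prune(child, k)


def templates(node, prefix=()):
    """Return every root-to-leaf path as a tuple of words (one per template)."""
    if not node:
        return [prefix]
    paths = []
    for word, child in node.items():
        paths.extend(templates(child, prefix + (word,)))
    return paths


# Hypothetical sample data, not taken from the paper's experiment.
SAMPLE_LOGS = [
    "Interface eth0 changed state to up",
    "Interface eth1 changed state to up",
    "Interface eth2 changed state to up",
    "Interface eth3 changed state to down",
    "Interface eth4 changed state to down",
    "Interface eth5 changed state to down",
]

if __name__ == "__main__":
    for template in templates(build_ft_tree(SAMPLE_LOGS, k=2)):
        print(" ".join(template))
```

With `k=2` the sketch keeps `up` and `down` as distinct templates and drops the interface names, whereas `k=1` collapses both into the single template `Interface changed state to`. This sensitivity to the threshold is the kind of behavior that motivates choosing the pruning threshold systematically rather than by hand.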

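Memo:
Received: 2020-08-10.
Funding: Jiangsu Province "13th Five-Year Plan" Key Discipline Construction Project (Su Jiao Yan [2016] No. 9); Jiangsu Provincial Department of Education teaching reform projects (2019JSJG006, 2019JSJG595).
Corresponding author: Gu Haiyan, associate professor; research interests: data mining, information security. E-mail: ghy7388@126.com
Last Update: 2021-06-30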