[1] Shan Zhihui, Han Meng, Han Qiang. High Utility Fuzzy Itemsets Mining Over Data Stream Based on Sliding Window Model[J]. Journal of Nanjing Normal University (Natural Science Edition), 2023, 46(01): 120-129. [doi:10.3969/j.issn.1001-4616.2023.01.016]

High Utility Fuzzy Itemsets Mining Over Data Stream Based on Sliding Window Model

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
46
Issue:
2023, No. 01
Pages:
120-129
Section:
Computer Science and Technology
Publication date:
2023-03-15

Article Info

Title:
High Utility Fuzzy Itemsets Mining Over Data Stream Based on Sliding Window Model
Article ID:
1001-4616(2023)01-0120-10
Author(s):
Shan Zhihui (单芝慧), Han Meng (韩萌), Han Qiang (韩强)
(1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China)
(2. The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China)
Keywords:
data stream mining; sliding window; high utility itemsets mining; fuzzy utility; utility list
CLC number:
TP391
DOI:
10.3969/j.issn.1001-4616.2023.01.016
Document code:
A
Abstract:
High utility itemsets (HUI) mining can provide interesting itemsets, but it cannot provide information on the quantity of individual items. High utility fuzzy itemsets are therefore proposed. However, real-world data arrives continuously, so newly arriving data must be processed in real time. To address the inability of current high utility fuzzy itemset methods to handle data streams, a fuzzy utility list (FUL) structure is proposed. For the items in the current window, it stores the batch number, the identifier of the transaction containing the item, the item's fuzzy utility, and the item's remaining fuzzy utility, and it supports efficient insertion and deletion of batches. Finally, a high utility fuzzy itemset mining algorithm over data streams is proposed based on FUL. Extensive experiments on real and synthetic datasets confirm the efficiency and feasibility of the algorithm.
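To make the four fields of the FUL concrete, the following is a minimal Python sketch of a per-item list with batch-level insertion and deletion under a sliding window. It is an illustration only, not the paper's algorithm: the class and method names (`FULEntry`, `FuzzyUtilityList`, `SlidingWindowFUL`, `add_batch`) are hypothetical, the fuzzy utility and remaining fuzzy utility values are assumed to be computed beforehand (the abstract does not give the membership functions), and the window is assumed to be measured in batches.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FULEntry:
    batch_id: int  # batch in which the transaction arrived
    tid: int       # transaction identifier
    fu: float      # fuzzy utility of the item in this transaction
    rfu: float     # remaining fuzzy utility in this transaction


@dataclass
class FuzzyUtilityList:
    item: str
    entries: deque = field(default_factory=deque)  # kept in arrival order

    def insert(self, entry: FULEntry) -> None:
        self.entries.append(entry)

    def delete_batch(self, batch_id: int) -> None:
        # The oldest batch always sits at the left end of the deque.
        while self.entries and self.entries[0].batch_id == batch_id:
            self.entries.popleft()

    def sum_fu(self) -> float:
        return sum(e.fu for e in self.entries)

    def sum_rfu(self) -> float:
        return sum(e.rfu for e in self.entries)


class SlidingWindowFUL:
    """Keeps one list per item for the most recent `window_size` batches."""

    def __init__(self, window_size: int) -> None:
        self.window_size = window_size
        self.batches = deque()  # batch ids currently in the window
        self.lists = {}         # item name -> FuzzyUtilityList

    def add_batch(self, batch_id, rows) -> None:
        # rows: iterable of (tid, item, fu, rfu) tuples, values precomputed.
        if len(self.batches) == self.window_size:
            self._evict(self.batches.popleft())
        self.batches.append(batch_id)
        for tid, item, fu, rfu in rows:
            ful = self.lists.setdefault(item, FuzzyUtilityList(item))
            ful.insert(FULEntry(batch_id, tid, fu, rfu))

    def _evict(self, batch_id) -> None:
        for item in list(self.lists):
            ful = self.lists[item]
            ful.delete_batch(batch_id)
            if not ful.entries:  # drop lists that became empty
                del self.lists[item]
```

Because entries are appended in arrival order, removing the oldest batch only touches the front of each list, which is one plausible reason a list keyed by batch number suits window sliding. A mining step would then compare sums such as `sum_fu()` and `sum_fu() + sum_rfu()` against a utility threshold, as in utility-list methods generally; that step is not shown here.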

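Memo:
Received: 2022-08-08.
Funding: National Natural Science Foundation of China (62062004, 61862001); Natural Science Foundation of Ningxia (2020AAC03216).
Corresponding author: Han Meng, Ph.D., professor; research interest: data mining. E-mail: 2003051@nmu.edu.cn
Last Update: 2023-03-15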