New Parallel Algorithm for Mining Frequent Item Sets Based on FP_Growth

Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Issue:
2016, Issue 04
Page:
0-
Research Field:
Mathematics and Computer Science
Publishing date:

Info

Title:
New Parallel Algorithm for Mining Frequent Item Sets Based on FP_Growth
Author(s):
Sun Hongyan, Ji Genlin
School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, China
Keywords:
frequent item sets; association rule; FP_Growth; Hadoop; Map/Reduce
PACS:
TP311.11
DOI:
10.3969/j.issn.1001-4616.2016.04.005
Abstract:
Frequent item set mining is used to find association rules between items. To obtain frequent item sets from big data efficiently, this paper proposes a new parallel mining algorithm based on FP_Growth, named NPFP_Growth (New Parallel algorithm based on FP_Growth). Using the Map/Reduce parallel computing model and the HDFS distributed storage system, each node builds a local frequent pattern tree with an improved storage structure, and the longest global frequent item sets are then mined from each branch of the tree. Finally, the support of item sets that do not meet the global minimum support is computed and sent to the corresponding computing node for counting. The parallel mining algorithm NPFP_Growth is implemented, and the experimental results show that the algorithm has high computing efficiency and good scalability.
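
The relationship between per-node counts and global minimum support mentioned in the abstract can be illustrated with a small sketch. This is not NPFP_Growth and does not build an FP-tree: it enumerates item sets by brute force, simulates the Map/Reduce phases with plain function calls, and all function names, the `max_len` parameter, and the sample data are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def map_local_counts(partition, max_len=2):
    """Map step (hypothetical): count every item set of up to max_len
    items inside one partition, i.e. on one computing node."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def reduce_global_counts(local_counts):
    """Reduce step (hypothetical): sum per-partition counts into global supports."""
    total = Counter()
    for counts in local_counts:
        total.update(counts)
    return total


def frequent_item_sets(partitions, min_support, max_len=2):
    """Keep only item sets whose global support reaches min_support."""
    local = [map_local_counts(p, max_len) for p in partitions]
    total = reduce_global_counts(local)
    return {s: c for s, c in total.items() if c >= min_support}


# Two partitions stand in for two computing nodes (made-up data).
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
result = frequent_item_sets(partitions, min_support=3)
```

In this toy data the item set `("a", "b")` occurs twice in the first partition and once in the second, so it falls below a threshold of 3 on each node individually but reaches it globally. This is why counts from different nodes have to be combined before an item set can be discarded.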

Memo

Memo:
-
Last Update: 2016-12-31