[1]尹婷婷,刘俊焱,周溜溜,等.基于动态抽样的图分类算法[J].南京师大学报(自然科学版),2015,38(01):113.
 Yin Tingting,Liu Junyan,Zhou Liuliu,et al.Graph Classification Algorithm Based on Dynamic Sampling[J].Journal of Nanjing Normal University(Natural Science Edition),2015,38(01):113.
点击复制

基于动态抽样的图分类算法()
分享到:

《南京师大学报》(自然科学版)[ISSN:1001-4616/CN:32-1239/N]

卷:
第38卷
期数:
2015年01期
页码:
113
栏目:
计算机科学
出版日期:
2015-06-30

文章信息/Info

Title:
Graph Classification Algorithm Based on Dynamic Sampling
作者:
尹婷婷刘俊焱周溜溜业 宁尹佟明
南京林业大学信息科学技术学院,江苏 南京 210037
Author(s):
Yin TingtingLiu JunyanZhou LiuliuYe NingYin Tongming
College of Information Science and Technology,Nanjing Forestry University,Nanjing 210037,China
关键词:
图分类动态抽样顶点平均度代表子模式
Keywords:
graph classificationdynamic samplingaverage vertex degreerepresentative sub-model
分类号:
TP311.13
文献标志码:
A
摘要:
传统的图分类算法由于支持度阈值选择过低导致频繁子模式规模过大,进而造成效率过低,阈值选择过高导致重要模式丢失而造成分类精度下降,如FSG和CEP方法. 针对这些问题,提出将动态抽样策略引入图分类领域,在保持分类准确率的前提下通过顶点平均度的计算抽样选取代表性子模式,结合CEP所给出的频繁闭显露模型,设计出一种新的图特征(分类规则)提取方法,解决了CEP算法由于支持度阈值设置过低而导致的无法计算现象,大大提高了分类效率; 并通过实验证明本文算法优于现有的一些主流算法.
Abstract:
Support threshold of traditional graph classification algorithm like FSG or CEP is too low that may cause over-sized frequent subschema and low efficiency,while too high may lead to the loss of important models or accuracy drop. To solve these problems,we introduce the strategy of dynamic sampling into the graph classification and design a new pattern feature extraction method,by which we get the representative sub-model by calculating every graph’s average vertex degree and use the frequent closed revealed model from CEP,under the premise of classification accuracy. The new method settled the problem unable to be calculated due to support threshold chosen too low in CEP and greatly improved the classification efficiency. Experiments showed that the new method surpassed a series of mainstream algorithm in this field.

参考文献/References:

[1] 汪卫,周皓峰,袁晴晴,等. 基于图论的频繁模式挖掘[J]. 计算机研究与发展,2005,42(2):230-235.
[2]刘勇,李建中,朱敬华. 一种新的基于频繁闭显露模式的图分类方法[J]. 计算机研究与发展,2007,44(7):1169-1176.
[3]周溜溜. 基于图结构的数据挖掘研究及应用[D]. 南京:南京林业大学信息科学技术学院,2013.
[4]Deshpande M,Kuramochi M,Karypis G. Frequent substructure based approaches for classifying chemical compounds[J]. IEEE Trans on Knowledge and Data Engineering,2005,17(8):1 036-1 050.
[5]Horvath T,Gartner T,Wrobel S. Cyclic pattern kernels for predictive graph mining[C]//Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD). Washington DC,USA:ACM,2004:158-167.
[6]Kashima H,Tsuda K,Inokuchi A. Marginalized kernels between labeled graphs[C]//Proceedings of the 20th International Conference on Machine Learning. Washington DC,USA:ICML,2003.
[7]Borgwardt K M,Kriegel H P. Shortest-path kernels on graphs[C]//Proceedings of the 5th IEEE International Conference on Data Mining(ICDM). Houston,Texas,USA:IEEE Computer Society,2005:74-81.
[8]Yan X,Han J. Closegraph:Mining closed frequent graph patterns[C]//Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD). Washington DC,USA:ACM,2003:286-295.
[9]Liu B,Hsu W,Ma Y. Integrating classification and association rule mining[C]//Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining(KDD-98). New York,USA:AAAI,1998:80-86.
[10]Silva A,Meira Jr W,Zaki M J. Structural correlation pattern mining for large graphs[C]//Proceedings of the Eighth Workshop on Mining and Learning with Graphs. USA:ACM,2010:119-126.
[11]赵建邦,董安国,高琳. 一种用于生物网络数据的频繁模式挖掘算法[J]. 电子学报,2010,38(8):1 803-1 807.
[12]丁悦,张阳,李战怀,等. 图数据挖掘技术的研究与进展[J]. 计算机应用,2012,32(1):182-190.
[13]薛冰,张俊峰,郑超,等. 基于分割图集的频繁闭图挖掘算法[J]. 计算机应用研究,2011,28(1):61-64,68.

备注/Memo

备注/Memo:
收稿日期:2014-08-16.
基金项目:国家973项目(2012CB114505)、国家杰青项目(31125008)、江苏省自然科学基金(BK2012815)、江苏省青蓝工程项目、江苏省六大人才高峰项目、江苏省2013年度普通高校研究生科研创新计划项目(CXZZ13_0538).
通讯联系人:业宁,教授,研究方向:数据挖掘,特征信息. E-mail:yening@njfu.edu.cn
更新日期/Last Update: 2015-03-30