[1] Chen Ming, Ji Genlin. A Subspace Clustering Algorithm for High Dimensional Data Based on Similar Dimension[J]. Journal of Nanjing Normal University (Natural Science Edition), 2010, 33(04): 119-122.

A Subspace Clustering Algorithm for High Dimensional Data Based on Similar Dimension

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
33
Issue:
2010, No. 04
Pages:
119-122
Section:
Computer Science
Publication date:
2010-12-20

Article Info

Title:
A Subspace Clustering Algorithm for High Dimensional Data Based on Similar Dimension
Author(s):
Chen Ming; Ji Genlin
School of Computer Science and Technology, Nanjing Normal University, Nanjing 210097, China; Jiangsu Research Center of Information Security and Confidential Engineering, Nanjing 210097, China
Keywords:
subspace clustering; similar dimension; Gini value
CLC number:
TP311.13
Abstract:
This paper proposes SDSCA (Similar Dimension based Subspace Clustering Algorithm), a subspace clustering algorithm based on similar dimensions. First, the Gini value is used to remove redundant attributes from the original high-dimensional data space. Next, the concept of similar dimension is used to find attributes that are close to each other. Finally, a traditional clustering algorithm is applied to the subspaces formed by these similar dimensions. Experimental results show that SDSCA is correct and effectively avoids interference from redundant attributes.
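The first two steps of the pipeline described in the abstract can be sketched roughly as follows. This record does not give the paper's definitions of the Gini value or of a similar dimension, so everything concrete here is an assumption made for illustration: the Gini value is taken to be the Gini impurity of an attribute after equal-width binning, an attribute is treated as redundant when that impurity is close to the uniform maximum, and two attributes are treated as similar when the absolute Pearson correlation between them is high. The function names and the thresholds `gini_max`, `sim_min` and `bins` are hypothetical and are not taken from SDSCA.

```python
import math
from collections import Counter


def gini_value(column, bins=5):
    """Gini impurity of one attribute after equal-width binning (assumed definition)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return 0.0  # a constant attribute falls into a single bin
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in column)
    n = len(column)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def pearson(x, y):
    """Pearson correlation, used here as a stand-in similarity between two attributes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def similar_dimension_subspaces(rows, gini_max=0.75, sim_min=0.8, bins=5):
    """Drop near-uniform attributes, then group the remaining ones by similarity.

    Returns a list of subspaces, each a sorted list of attribute indices.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    columns = list(zip(*rows))
    # Step 1 (assumed rule): an attribute whose binned distribution is
    # close to uniform shows no cluster structure and is dropped.
    kept = [j for j, col in enumerate(columns) if gini_value(col, bins) < gini_max]

    # Step 2 (assumed rule): build connected components of the "similar" relation.
    groups = []
    for j in kept:
        linked = [
            g for g in groups
            if any(abs(pearson(columns[j], columns[k])) >= sim_min for k in g)
        ]
        merged = [j]
        for g in linked:
            merged.extend(g)
            groups.remove(g)
        groups.append(sorted(merged))
    return groups


if __name__ == "__main__":
    x0 = [0.0, 0.1, 0.2, 0.1, 0.0, 10.0, 9.9, 9.8, 9.9, 10.0]  # two clumps
    x1 = [1.0, 1.2, 1.1, 1.0, 1.1, 5.0, 5.1, 4.9, 5.0, 5.2]    # follows x0
    x2 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]                        # evenly spread
    x3 = [0, 5, 0, 5, 0, 5, 0, 5, 0, 5]                        # unrelated clumps
    rows = list(zip(x0, x1, x2, x3))
    print(similar_dimension_subspaces(rows))
```

In this toy data the evenly spread attribute 2 is dropped in the first step, attributes 0 and 1 end up in one subspace, and attribute 3 forms a subspace of its own. The third step of the abstract would then run any conventional clustering algorithm on the rows projected onto each returned subspace.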

References:

[1] Micheline J H. Data Mining: Concepts and Techniques[M]. Beijing: China Machine Press, 2001.
[2] Agrawal R, Gehrke J, Gunopulos D. Automatic subspace clustering of high dimensional data for data mining applications[C]//Proceedings of the 1998 ACM-SIGMOD International Conference on Management of Data. Seattle, Washington: ACM Press, 1998, 6: 94-105.
[3] Chen C H, Fu A W C, Zhang Y. Entropy-based subspace clustering for mining numerical data[C]//Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego: ACM Press, 1999: 84-93.
[4] Procopiuc C M, Jones M, Agarwal P K, et al. A Monte Carlo algorithm for fast projective clustering[C]//Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. Madison: ACM Press, 2002: 418-427.
[5] Aggarwal C C, Procopiuc C, Wolf J L, et al. Fast algorithms for projected clustering[C]//Proceeding of the 1999 ACM SIGMOD International Conference on Management of Data. New York: ACM Press, 1999: 61-72.
[6] Aggarwal C C, Yu P S. Finding generalized projected clusters in high dimensional spaces[C]//Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas: ACM Press, 2000: 70-81.
[7] Shan Shimin, Wang Xinyan, Zhang Xianchao. Subspace clustering algorithm for high-dimensional categorical attributes[J]. Journal of Chinese Computer Systems, 2009(10): 2016-2021.
[8] Liu Ming, Wang Xiaolong, Liu Yuanchao. A fast clustering algorithm for large-scale high-dimensional data[J]. Acta Automatica Sinica, 2009, 35(7): 859-866.
[9] Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial database with noise[C]//KDD'96: Proceedings of the 2nd International Conference on Knowledge Discovering and Data Mining. Piscataway: IEEE Press, 1996: 226-231.
[10] Yang Yiling, Guan Xudong, You Jinyuan. CLOPE: a fast and effective clustering algorithm for transactional data[C]//Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Alberta: ACM Press, 2002: 682-687.
[11] Chen Jianbin. Research and Application of Key Technologies for High-Dimensional Clustering Knowledge Discovery[M]. Beijing: Publishing House of Electronics Industry, 2008.

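Memo

Funding: National Natural Science Foundation of China (40871176). Corresponding author: Ji Genlin, PhD, professor, doctoral supervisor; research interests: data mining techniques and their applications. Email: glji@njnu.edu.cn
Last Update: 2013-04-08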