
A Subspace Clustering Algorithm for High Dimensional Data Based on Similar Dimension

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Issue:
2010, No. 04
Page:
119-122
Research Field:
Computer Science
Publishing date:

Info

Title:
A Subspace Clustering Algorithm for High Dimensional Data Based on Similar Dimension
Author(s):
Chen Ming, Ji Genlin
School of Computer Science and Technology, Nanjing Normal University, Nanjing 210097, China
Keywords:
space clustering; similar dimension; Gini value
PACS:
TP311.13
DOI:
-
Abstract:
This paper proposes SDSCA, a subspace clustering algorithm based on similar dimensions. First, the Gini value is used to remove redundant attributes from the data space. After the redundant attributes are removed, the similar-dimension measure is used to find attributes that are close to each other. Finally, traditional clustering algorithms are applied to the subspaces formed by the similar dimensions. The experimental results show that SDSCA is effective and also removes redundant attributes effectively.
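
The first two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract gives neither the exact Gini computation nor the definition of "similar dimension". The sketch assumes that an attribute is redundant when its equal-width histogram is close to uniform (Gini value near the maximum), and it substitutes absolute Pearson correlation for the similarity measure. The function names, the bin count, and both thresholds (`gini_cut`, `sim_cut`) are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def gini_value(column, bins=5):
    """Gini impurity of one numeric attribute after equal-width binning."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # constant column: avoid dividing by zero
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in column)
    return 1.0 - sum((c / len(column)) ** 2 for c in counts.values())


def similarity(a, b):
    """Absolute Pearson correlation (assumed stand-in for 'similar dimension')."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    cov = mean([(x - ma) * (y - mb) for x, y in zip(a, b)])
    return abs(cov / (sa * sb))


def find_subspaces(rows, bins=5, gini_cut=0.9, sim_cut=0.8):
    """Return groups of attribute indices; each group is one candidate subspace."""
    columns = [list(col) for col in zip(*rows)]
    max_gini = 1.0 - 1.0 / bins  # Gini of a perfectly uniform histogram
    # Stage 1: drop attributes that look near-uniform (assumed redundancy test).
    keep = [j for j, col in enumerate(columns)
            if gini_value(col, bins) < gini_cut * max_gini]
    # Stage 2: merge attributes linked by high similarity into one group.
    groups = []
    for j in keep:
        linked = [g for g in groups
                  if any(similarity(columns[j], columns[k]) >= sim_cut for k in g)]
        merged = sorted([j] + [k for g in linked for k in g])
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# Toy data: attribute 1 is a linear copy of attribute 0, attribute 2 is
# uniformly spread (treated as redundant), attribute 3 is clustered on its own.
base = [0.0, 0.1, 0.2, 0.1, 0.0, 10.0, 10.1, 10.2, 10.1, 10.0]
other = [5.0, 5.1, 9.0, 9.1, 5.0, 9.0, 5.1, 9.1, 5.0, 9.0]
demo_rows = [(x, 2 * x + 1, float(i), y)
             for i, (x, y) in enumerate(zip(base, other))]
subspaces = find_subspaces(demo_rows)
print(subspaces)  # -> [[0, 1], [3]]
```

Each returned group of attribute indices would then be passed to a conventional clustering algorithm, which corresponds to the third stage named in the abstract.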

References:

[1] Micheline J H. Data Mining: Concepts and Techniques[M]. Beijing: China Machine Press, 2001.
[2] Agrawal R, Gehrke J, Gunopulos D. Automatic subspace clustering of high dimensional data for data mining applications[C]//Proceedings of the 1998 ACM-SIGMOD International Conference on Management of Data. Seattle, Washington: ACM Press, 1998, 6: 94-105.
[3] Cheng C H, Fu A W C, Zhang Y. Entropy-based subspace clustering for mining numerical data[C]//Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego: ACM Press, 1999: 84-93.
[4] Procopiuc C M, Jones M, Agarwal P K, et al. A Monte Carlo algorithm for fast projective clustering[C]//Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. Madison: ACM Press, 2002: 418-427.
[5] Aggarwal C C, Procopiuc C, Wolf J L, et al. Fast algorithms for projected clustering[C]//Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data. New York: ACM Press, 1999: 61-72.
[6] Aggarwal C C, Yu P S. Finding generalized projected clusters in high dimensional spaces[C]//Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas: ACM Press, 2000: 70-81.
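[7] Shan Shimin, Wang Xinyan, Zhang Xianchao. A subspace clustering algorithm for high-dimensional categorical attributes[J]. Journal of Chinese Computer Systems, 2009(10): 2016-2021.
[8] Liu Ming, Wang Xiaolong, Liu Yuanchao. A fast clustering algorithm for large-scale high-dimensional data[J]. Acta Automatica Sinica, 2009, 35(7): 859-866.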
[9] Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]//KDD'96: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining. Piscataway: IEEE Press, 1996: 226-231.
[10] Yang Yiling, Guan Xudong, You Jinyuan. CLOPE: a fast and effective clustering algorithm for transactional data[C]//Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Alberta: ACM Press, 2002: 682-687.
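[11] Chen Jianbin. Research and Application of Key Technologies for High-Dimensional Clustering Knowledge Discovery[M]. Beijing: Publishing House of Electronics Industry, 2008.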

Memo

Memo:
-
Last Update: 2013-04-08