[1] Ji Shanshan. Neuron Network Tree and Artificial Bee Colony Optimization Based Data Clustering Algorithm[J]. Journal of Nanjing Normal University (Natural Science Edition), 2021, 44(01): 119-127. [doi:10.3969/j.issn.1001-4616.2021.01.017]

Neuron Network Tree and Artificial Bee Colony Optimization Based Data Clustering Algorithm
Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Volume:
44
Issue:
2021, No. 01
Pages:
119-127
Section:
Computer Science and Technology
Publication date:
2021-03-15

Article Info

Title:
Neuron Network Tree and Artificial Bee Colony Optimization Based Data Clustering Algorithm
Article ID:
1001-4616(2021)01-0119-09
Author(s):
Ji Shanshan (吉珊珊)
Department of Computer Engineering, Dongguan Polytechnic, Dongguan 523808, Guangdong, China
Keywords:
high dimensional data; neural network tree; artificial bee colony optimization; clustering algorithm; feature selection
CLC number:
TP391
DOI:
10.3969/j.issn.1001-4616.2021.01.017
Document code:
A
Abstract:
To address the "curse of dimensionality" caused by high dimensional data, a clustering algorithm for high dimensional data based on a neural network tree and artificial bee colony optimization is designed. First, an improved binary artificial bee colony optimization algorithm is designed, which uses a wrapper method to maximize the accuracy of a radial basis function (RBF) network and a filter method to minimize feature redundancy. Then, an RBF network is trained on the sample set of each feature subset, and a neural tree whose nodes are RBF networks is constructed. Finally, a gating network is used to separate connected clusters and obtain the final clustering result. Simulation experiments were carried out on both high dimensional and low dimensional datasets, and the results show that the proposed algorithm achieves high clustering accuracy on high dimensional datasets.
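The feature selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a nearest-centroid classifier stands in for the radial basis function network in the wrapper term, mean absolute Pearson correlation stands in for the redundancy measure in the filter term, and the weighting `alpha`, the colony parameters, and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def redundancy(X, mask):
    """Filter term (assumed form): mean |correlation| over pairs of selected features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if len(cols) < 2:
        return 0.0
    pairs = [(a, b) for i, a in enumerate(cols) for b in cols[i + 1:]]
    total = sum(abs(pearson([r[a] for r in X], [r[b] for r in X])) for a, b in pairs)
    return total / len(pairs)


def wrapper_accuracy(X, y, mask):
    """Wrapper term: training accuracy of a nearest-centroid classifier.
    This classifier is only a stand-in for the RBF network named in the abstract."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [r for r, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def predict(r):
        return min(centroids,
                   key=lambda c: sum((r[j] - m) ** 2 for j, m in zip(cols, centroids[c])))

    return sum(predict(r) == t for r, t in zip(X, y)) / len(y)


def fitness(X, y, mask, alpha=0.5):
    """Reward accuracy (wrapper) and penalise redundancy (filter); alpha is an assumption."""
    return wrapper_accuracy(X, y, mask) - alpha * redundancy(X, mask)


def binary_abc(X, y, n_food=8, n_iter=30, limit=5, seed=0):
    """Basic binary artificial bee colony search over 0/1 feature masks."""
    rng = random.Random(seed)
    d = len(X[0])
    best = [None, -math.inf]  # best mask and fitness seen so far

    def evaluate(mask):
        f = fitness(X, y, mask)
        if f > best[1]:
            best[0], best[1] = list(mask), f
        return f

    foods = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_food)]
    fits = [evaluate(m) for m in foods]
    trials = [0] * n_food

    def try_neighbor(i):
        # Flip one random bit and keep the neighbour only if it improves source i.
        cand = list(foods[i])
        cand[rng.randrange(d)] ^= 1
        f = evaluate(cand)
        if f > fits[i]:
            foods[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_food):  # employed bees: one neighbour per source
            try_neighbor(i)
        low = min(fits)
        weights = [f - low + 1e-9 for f in fits]
        for _ in range(n_food):  # onlooker bees: favour fitter sources
            try_neighbor(rng.choices(range(n_food), weights=weights)[0])
        for i in range(n_food):  # scout bees: abandon exhausted sources
            if trials[i] > limit:
                foods[i] = [rng.randint(0, 1) for _ in range(d)]
                fits[i], trials[i] = evaluate(foods[i]), 0
    return best[0], best[1]
```

Calling `binary_abc(X, y)` with `X` as a list of feature rows and `y` as a list of class labels returns the best feature mask found and its fitness. In the approach the abstract describes, each selected feature subset would then be used to train one RBF network node of the neural tree; that stage and the gating network are not sketched here.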

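Memo

Received: 2020-07-08.
Funding: Dongguan Science and Technology Bureau Project (2020507156694); Dongguan Polytechnic Horizontal Research Project (202021189).
Corresponding author: Ji Shanshan, lecturer. Research interests: intelligent computer information processing and control, computer simulation, computer education. E-mail: jss060@163.com
Last Update: 2021-03-15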