[1] Du Xiuli, Jiang Xiaohu, Sun Chentong, et al. A New Functional Data Clustering Method Based on Directional Multiple Hypothesis Test and Information Entropy[J]. Journal of Nanjing Normal University (Natural Science Edition), 2022, 45(04): 1-9. [doi:10.3969/j.issn.1001-4616.2022.04.001]

A New Functional Data Clustering Method Based on Directional Multiple Hypothesis Test and Information Entropy

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
45
Issue:
2022, No. 04
Pages:
1-9
Section:
Mathematics
Publication date:
2022-12-15

Article Info

Title:
A New Functional Data Clustering Method Based on Directional Multiple Hypothesis Test and Information Entropy
Article ID:
1001-4616(2022)04-0001-09
Author(s):
Du Xiuli, Jiang Xiaohu, Sun Chentong, Yu Zheng
(School of Mathematical Sciences, Nanjing Normal University, Nanjing 210023, China)
Keywords:
functional data; clustering analysis; false discovery rate; directional multiple hypothesis testing; information entropy; parallelism
CLC number:
O212.1
DOI:
10.3969/j.issn.1001-4616.2022.04.001
Document code:
A
Abstract:
Clustering analysis for functional data has seen some development in recent years. However, because such data belong to an infinite-dimensional space, clustering them is difficult, and the limitations of traditional clustering methods have become increasingly apparent. This paper therefore proposes a new clustering method for functional data that adapts better to the characteristics of the data and achieves better clustering results. First, based on directional multiple hypothesis testing under false discovery rate control and on information entropy, a new parallelism statistic is proposed to describe morphological differences between functional curves. On this basis, a new formula for computing proximity is proposed, which in turn yields an improved agglomerative hierarchical clustering algorithm. The new method is applied to four functional data sets of different types, and its clustering results are analyzed and compared with those of existing methods, demonstrating the effectiveness and advantages of the improved agglomerative hierarchical clustering method.
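
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the parallelism statistic and the proximity formula are not given on this page, so the sketch only shows three generic ingredients under stated assumptions. `directional_bh` applies the standard Benjamini-Hochberg step-up procedure to z-scores (assumed to be standard normal under the null) and attaches a sign to each rejection, `shannon_entropy` computes the entropy of the resulting decision labels, and `agglomerative` runs average-linkage agglomerative clustering on a precomputed distance matrix. All function names, the choice of average linkage, and the normal approximation are assumptions made for illustration.

```python
import math
from collections import Counter


def normal_two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def directional_bh(z_scores, alpha=0.05):
    """Benjamini-Hochberg step-up test with a sign attached to each rejection.

    Returns +1 or -1 (sign of the z-score) for rejected hypotheses, 0 otherwise.
    Assumption: z-scores are standard normal under the null.
    """
    m = len(z_scores)
    pvals = [normal_two_sided_p(z) for z in z_scores]
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    # Reject the k smallest p-values and record the direction of each.
    decisions = [0] * m
    for i in order[:k]:
        decisions[i] = 1 if z_scores[i] > 0 else -1
    return decisions


def shannon_entropy(labels):
    """Shannon entropy (natural log) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix.

    Assumption: average linkage stands in for the paper's proximity formula.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean pairwise distance between the two clusters.
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

In a pipeline of the kind the abstract describes, the entries of `dist` would come from a shape-based proximity between pairs of curves rather than from a hand-written matrix. The nested search over cluster pairs is simple rather than fast, which is adequate only for small illustrative inputs.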


Memo:
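Received: 2022-01-21.
Funding: National Social Science Fund of China (21BTJ044).
Corresponding author: Yu Zheng, associate research fellow; research interest: education management. E-mail: 33022@njnu.edu.cn
Last Update: 2022-12-15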