
A Semi-supervised Clustering Algorithm Based on Anti-annealing and Gaussian Mixture Model




Wang Yao, Chai Bianfang, Li Wenbin, Lü Feng
School of Information Engineering, Hebei GEO University, Shijiazhuang 050031, China
Gaussian mixture model; expectation maximization algorithm; anti-annealing; semi-supervised clustering
A semi-supervised Gaussian mixture model (SGMM) that uses labeled nodes can improve the accuracy of model parameter estimation. However, the accuracy and convergence of the expectation maximization (EM) algorithm are affected by the amount of overlap among the Gaussian distributions and by their mixing coefficients. To improve the accuracy and speed of SGMM parameter estimation, anti-annealing is combined with the EM algorithm of the SGMM, and a clustering algorithm for the semi-supervised Gaussian mixture model based on anti-annealing (ASGMM-EM) is proposed. The inverse temperature parameter of the algorithm increases from a small value to an upper bound greater than 1 and then decreases back to 1. The semi-supervised clustering EM algorithm is run at each value of the inverse temperature parameter. Experiments on synthetic and real data show that ASGMM-EM performs better than algorithms that use only the semi-supervised technique or only the anti-annealing technique.
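The schedule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract does not give the exact update rules, so the following are assumptions: the E-step raises each weighted component density to the power of the inverse temperature (as in deterministic annealing EM), labeled points have their responsibilities clamped to their known class, and a fixed number of EM iterations is run at each temperature instead of a convergence test. The names `beta_schedule` and `asgmm_em`, the default values, and the `reg` covariance regularizer are hypothetical.

```python
import numpy as np


def beta_schedule(beta_min=0.5, beta_max=1.2, n_up=8, n_down=3):
    """Inverse temperatures: rise from beta_min to beta_max (> 1), then return to 1."""
    up = np.linspace(beta_min, beta_max, n_up)
    down = np.linspace(beta_max, 1.0, n_down)[1:]  # skip the repeated beta_max
    return np.concatenate([up, down])


def log_gauss(X, mean, cov):
    """Log density of N(mean, cov) at each row of X, via a Cholesky factor."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)  # shape (d, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))


def asgmm_em(X, labels, k, betas, n_iter=20, reg=1e-6, seed=0):
    """Tempered semi-supervised EM (assumed form).

    labels holds a class index in [0, k) for labeled points and -1 otherwise.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labeled = labels >= 0
    onehot = np.eye(k)[labels[labeled]]

    # Initialise means from labeled points where available, else from random points.
    means = X[rng.choice(n, k, replace=False)].astype(float)
    for j in range(k):
        if np.any(labels == j):
            means[j] = X[labels == j].mean(axis=0)
    covs = np.stack([np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)

    for beta in betas:
        for _ in range(n_iter):
            # E-step: responsibilities proportional to (pi_j * N_j(x)) ** beta.
            logp = np.stack(
                [np.log(weights[j]) + log_gauss(X, means[j], covs[j]) for j in range(k)],
                axis=1,
            )
            logp *= beta
            logp -= logp.max(axis=1, keepdims=True)  # numerical stability
            resp = np.exp(logp)
            resp /= resp.sum(axis=1, keepdims=True)
            resp[labeled] = onehot  # clamp labeled points to their class

            # M-step: standard weighted updates.
            nk = resp.sum(axis=0) + 1e-12
            weights = nk / n
            means = (resp.T @ X) / nk[:, None]
            for j in range(k):
                diff = X - means[j]
                covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs, resp
```

Calling `asgmm_em(X, labels, k, beta_schedule())` returns the mixing coefficients, means, covariances, and final responsibilities; cluster assignments can be read off with `resp.argmax(axis=1)`. Because the last inverse temperature is 1, the final iterations are ordinary semi-supervised EM steps.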




Last Update: 2017-09-30