A Unified Semi-supervised Community Detection Approach Integrating Network Topology and Node Contents

Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Issue:
2023, Issue 01
Page:
130-138
Research Field:
Computer Science and Technology
Publishing date:

Info

Title:
A Unified Semi-supervised Community Detection Approach Integrating Network Topology and Node Contents
Author(s):
Xu Weizhong1, Cao Jinxin1, Jin Di2, Sun Xiang3, Zhang Xiaofeng1, Liu Lu3, Ding Weiping1
(1.School of Information Science and Technology, Nantong University, Nantong 226019, China)
(2.College of Intelligence and Computing, Tianjin University, Tianjin 300350, China)
(3.School of Informatics, University of Leicester, Leicester, UK LE1 7RH)
Keywords:
community detection; node contents; prior information; stochastic block model; nonnegative matrix factorization
PACS:
TP182
DOI:
10.3969/j.issn.1001-4616.2023.01.017
Abstract:
Community detection plays an increasingly important role in complex network analysis, yet improving its performance in real applications remains an open goal. Because the content attached to nodes helps to identify communities, some methods combine network topology with node content and achieve fairly good community detection performance. In addition, some community detection enhancement methods adjust the network topology based on a designed topological similarity between nodes. To further improve the quality of community detection, we propose a unified method, Semi-supervised Community Detection with Node Contents (SCDNC), which both applies this enhancement to community detection and integrates network topology with node content. In the new method, we first propose a stochastic block model to describe the community memberships of nodes. Second, we present another stochastic model to describe the community memberships of node contents, which uses the community memberships of nodes as weight vectors of node contents; this achieves the integration of network topology and node content. Third, we calculate the topological similarity of nodes from the links and then model the prior information based on this similarity, i.e., we make nodes and their most similar neighbors share the same community membership. Finally, we present a nonnegative matrix factorization approach to obtain the parameters of the model. On both synthetic and real-world networks with ground truth, we compare the performance of the new method with state-of-the-art methods. The experimental results show that the new method obtains a significant improvement in community detection by combining node contents with network topology enhanced by prior information.
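The third and fourth steps of the abstract can be illustrated with a minimal sketch. It is not the SCDNC model itself: the abstract does not specify the similarity measure or the objective, so the Jaccard similarity of closed neighborhoods, the graph-regularized symmetric factorization, the damped multiplicative update, the weight `lam`, and the function names `most_similar_neighbors` and `fit_memberships` are all assumptions made for illustration. The node-content model is omitted.

```python
import numpy as np

def most_similar_neighbors(A):
    """For each node, index of its most similar neighbor (-1 if isolated).
    Assumption: similarity is the Jaccard index of closed neighborhoods."""
    n = A.shape[0]
    closed = ((A + np.eye(n)) > 0).astype(float)   # neighborhood including the node itself
    inter = closed @ closed.T                      # sizes of neighborhood intersections
    size = closed.sum(axis=1)
    union = size[:, None] + size[None, :] - inter  # always >= 1
    sim = np.where(A > 0, inter / union, -1.0)     # only linked pairs are candidates
    best = sim.argmax(axis=1)
    best[A.sum(axis=1) == 0] = -1
    return best

def fit_memberships(A, best, k, lam=1.0, iters=200, seed=0):
    """Nonnegative memberships H with A ~ H H^T, plus a penalty that pulls each
    node toward its most similar neighbor (hypothetical form of the prior)."""
    n = A.shape[0]
    W = np.zeros((n, n))
    idx = np.flatnonzero(best >= 0)
    W[idx, best[idx]] = 1.0
    W = np.maximum(W, W.T)                         # symmetric must-link matrix
    D = np.diag(W.sum(axis=1))
    H = np.random.default_rng(seed).random((n, k)) + 0.1
    eps = 1e-10
    for _ in range(iters):
        num = 2 * A @ H + lam * W @ H
        den = 2 * H @ (H.T @ H) + lam * D @ H + eps
        H *= np.sqrt(num / den)                    # damped multiplicative update (heuristic)
    return H

# Toy network: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

best = most_similar_neighbors(A)
H = fit_memberships(A, best, k=2)
labels = H.argmax(axis=1)                          # hard community assignment per node
```

In this toy network the bridge nodes 2 and 3 each select a neighbor inside their own triangle, so the assumed prior discourages splitting a triangle across communities.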


Last Update: 2023-03-15