Scalable Community Detection in Massive Networks Using Aggregated Relational Data

Tian Zheng, Professor of Statistics at Columbia University

Abstract: Fitting large Bayesian network models quickly becomes computationally infeasible when the number of nodes grows into the hundreds of thousands or millions. In particular, the mixed membership stochastic blockmodel (MMSB) is a popular Bayesian network model used for community detection. In this paper, we introduce a scalable inference method that leverages nodal information that often accompanies real-world networks. Conditioning on this extra information leads to a model that admits a parallel variational inference algorithm. We apply our method to a citation network with over two million nodes and 25 million edges.

Host: Ran Chen

Tian Zheng is currently Professor of Statistics at Columbia University. In her research, she develops novel methods for exploring and understanding patterns in complex data from different application domains such as biology, psychology, and climate modeling. Her research has been recognized by the 2008 Outstanding Statistical Application Award from the American Statistical Association (ASA), the Mitchell Prize from ISBA, and a Google research award. She became a Fellow of the American Statistical Association in 2014, a Fellow of the Institute of Mathematical Statistics in 2022, and a Fellow of the American Association for the Advancement of Science in 2024. From 2017 to 2020, she served as Associate Director for Education at the Columbia Data Science Institute. From 2019 to 2025, she was chair of the Department of Statistics at Columbia. Professor Zheng is the recipient of the 2017 Columbia Presidential Award for Outstanding Teaching. In 2021, she was recognized with a Lenfest Distinguished Columbia Faculty Award, which honors the excellence of faculty as teachers and mentors of both undergraduate and graduate students.