TY - GEN
T1 - Finding a dense subgraph with sparse cut
AU - Miyauchi, Atsushi
AU - Kakimura, Naonori
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/10/17
Y1 - 2018/10/17
N2 - Community detection is one of the fundamental tasks in graph mining, which has many real-world applications in diverse domains. In this study, we propose an optimization model for finding a community that is densely connected internally but sparsely connected to the rest of the graph. The model extends the densest subgraph problem, in which we maximize the density while minimizing the average cut size. We first show that our proposed model can be solved efficiently. Then we design two polynomial-time exact algorithms based on linear programming and a maximum flow algorithm, respectively. Moreover, to deal with larger-sized graphs in practice, we present a scalable greedy algorithm that runs in almost linear time with theoretical performance guarantee of the output. In addition, as our model is closely related to a quality function called the modularity density, we show that our algorithms can also be used to find global community structure in a graph. With thorough experiments using well-known real-world graphs, we demonstrate that our algorithms are highly effective in finding a suitable community in a graph. For example, for web-Google, our algorithm finds a solution with more than 99.1% density and less than 3.1% cut size, compared with a solution obtained by a baseline algorithm for the densest subgraph problem.
AB - Community detection is one of the fundamental tasks in graph mining, which has many real-world applications in diverse domains. In this study, we propose an optimization model for finding a community that is densely connected internally but sparsely connected to the rest of the graph. The model extends the densest subgraph problem, in which we maximize the density while minimizing the average cut size. We first show that our proposed model can be solved efficiently. Then we design two polynomial-time exact algorithms based on linear programming and a maximum flow algorithm, respectively. Moreover, to deal with larger-sized graphs in practice, we present a scalable greedy algorithm that runs in almost linear time with theoretical performance guarantee of the output. In addition, as our model is closely related to a quality function called the modularity density, we show that our algorithms can also be used to find global community structure in a graph. With thorough experiments using well-known real-world graphs, we demonstrate that our algorithms are highly effective in finding a suitable community in a graph. For example, for web-Google, our algorithm finds a solution with more than 99.1% density and less than 3.1% cut size, compared with a solution obtained by a baseline algorithm for the densest subgraph problem.
KW - Approximation algorithms
KW - Community detection
KW - Densest subgraph
KW - Exact algorithms
KW - Graphs
KW - Modularity density
KW - Sparsest cut
UR - http://www.scopus.com/inward/record.url?scp=85058064740&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058064740&partnerID=8YFLogxK
U2 - 10.1145/3269206.3271720
DO - 10.1145/3269206.3271720
M3 - Conference contribution
AN - SCOPUS:85058064740
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 547
EP - 556
BT - CIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
A2 - Paton, Norman
A2 - Candan, Selcuk
A2 - Wang, Haixun
A2 - Allan, James
A2 - Agrawal, Rakesh
A2 - Labrinidis, Alexandros
A2 - Cuzzocrea, Alfredo
A2 - Zaki, Mohammed
A2 - Srivastava, Divesh
A2 - Broder, Andrei
A2 - Schuster, Assaf
PB - Association for Computing Machinery
T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
Y2 - 22 October 2018 through 26 October 2018
ER -