TY - GEN
T1 - Novel Dense Subgraph Discovery Primitives
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
AU - Tsourakakis, Charalampos E.
AU - Chen, Tianyi
AU - Kakimura, Naonori
AU - Pachocki, Jakub
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2020.
PY - 2020
Y1 - 2020
N2 - In the densest subgraph problem, given an undirected graph G(V, E, w) with non-negative edge weights we are asked to find a set of nodes S⊆V that maximizes the degree density w(S)/|S|, where w(S) is the sum of the weights of the edges in the graph induced by S. This problem is solvable in polynomial time, and in general is well studied. But what happens when the edge weights are negative? Is the problem still solvable in polynomial time? Also, why should we care about the densest subgraph problem in the presence of negative weights? In this work we answer the aforementioned questions. Specifically, we provide two novel graph mining primitives that are applicable to a wide variety of applications. Our primitives can be used to answer questions such as “how can we find a dense subgraph in Twitter with lots of replies and mentions but no follows?”, “how do we extract a dense subgraph with high expected reward and low risk from an uncertain graph”? We formulate both problems mathematically as special instances of dense subgraph discovery in graphs with negative weights. We study the hardness of the problem, and we prove that the problem in general is NP-hard, but we also provide sufficient conditions under which it is poly-time solvable. We design an efficient approximation algorithm that works best in the presence of small negative weights, and an effective heuristic for the more general case. Finally, we perform experiments on various real-world datasets that verify the value of the proposed primitives, and the effectiveness of our proposed algorithms. The code and the data are available at https://github.com/nega-tivedsd.
AB - In the densest subgraph problem, given an undirected graph G(V, E, w) with non-negative edge weights we are asked to find a set of nodes S⊆V that maximizes the degree density w(S)/|S|, where w(S) is the sum of the weights of the edges in the graph induced by S. This problem is solvable in polynomial time, and in general is well studied. But what happens when the edge weights are negative? Is the problem still solvable in polynomial time? Also, why should we care about the densest subgraph problem in the presence of negative weights? In this work we answer the aforementioned questions. Specifically, we provide two novel graph mining primitives that are applicable to a wide variety of applications. Our primitives can be used to answer questions such as “how can we find a dense subgraph in Twitter with lots of replies and mentions but no follows?”, “how do we extract a dense subgraph with high expected reward and low risk from an uncertain graph”? We formulate both problems mathematically as special instances of dense subgraph discovery in graphs with negative weights. We study the hardness of the problem, and we prove that the problem in general is NP-hard, but we also provide sufficient conditions under which it is poly-time solvable. We design an efficient approximation algorithm that works best in the presence of small negative weights, and an effective heuristic for the more general case. Finally, we perform experiments on various real-world datasets that verify the value of the proposed primitives, and the effectiveness of our proposed algorithms. The code and the data are available at https://github.com/nega-tivedsd.
UR - http://www.scopus.com/inward/record.url?scp=85084808875&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084808875&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-46150-8_23
DO - 10.1007/978-3-030-46150-8_23
M3 - Conference contribution
AN - SCOPUS:85084808875
SN - 9783030461492
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 378
EP - 394
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2019, Proceedings
A2 - Brefeld, Ulf
A2 - Fromont, Elisa
A2 - Hotho, Andreas
A2 - Knobbe, Arno
A2 - Maathuis, Marloes
A2 - Robardet, Céline
PB - Springer
Y2 - 16 September 2019 through 20 September 2019
ER -